Find text from visible HTML in Python

- July 15, 2013

I'm trying to do the following:

I have a text file with one value per line.
A website generates a list of values based on the page number. The values are xxx and yyy in the example below.
A Python script reads the text file first (into a set, for efficient O(1) lookups), then searches the website page after page, incrementing the page number by 1 each time. If a matching value is found, it must print the page number.

The search must go through www.site.com/1, www.site.com/2, www.site.com/3, and so on.
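The page-by-page search described above can be sketched roughly as follows. This is only an illustration under assumptions: `find_matching_pages` and `fetch_values` are hypothetical names that do not appear in the original post, and `fetch_values(page)` stands in for whatever code downloads `www.site.com/<page>` and returns the values listed on it. A small in-memory dictionary replaces the real site so the sketch runs without network access.

```python
def find_matching_pages(wanted, fetch_values, first_page=1, last_page=3):
    """Return {page_number: matched values} for pages listing a wanted value.

    wanted       -- set of strings read from the text file
    fetch_values -- hypothetical callable: page number -> iterable of strings
    """
    matches = {}
    for page in range(first_page, last_page + 1):
        # Set intersection keeps only the values present in both collections.
        found = wanted.intersection(fetch_values(page))
        if found:
            matches[page] = found
    return matches


# Stand-in for the real site (assumed data, for illustration only).
fake_site = {1: ["xxx", "yyy"], 2: ["zzz"], 3: ["yyy"]}
wanted = {"yyy", "qqq"}

result = find_matching_pages(wanted, lambda p: fake_site.get(p, []))
for page, found in result.items():
    print(page, sorted(found))
```

Passing the fetcher in as a parameter keeps the matching logic separate from the download and parsing code, so each part can be checked on its own.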

HTML source:

    <pre class="values">
        <strong>a</strong>
        <strong>b</strong>
        <strong>c</strong>
        <span id="1">
            <a href="/#">+</a>
            <span title="1">1</span>
            <a href="/#">xxx</a>
            <a href="/#">yyy</a>
        </span>
    </pre>

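Reading the text file into a set for efficient O(1) lookups:

    with open("values.txt", "r") as f1:
        # Strip the trailing newline so "xxx\n" in the file matches "xxx" on the page.
        lines = set(line.strip() for line in f1)  # efficient O(1) lookups using a set
        # html: the values extracted from the page (not defined in this snippet)
        for line in html:
            if line in lines:
                print(line)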

    from xml.etree import ElementTree as ET

    # /path/to/file.html contains the <pre class="values"> markup shown above
    with open('/path/to/file.html') as fp:
        html = ET.fromstring(fp.read())

    # Print the text of every <a> element
    for node in html.iter():
        if node.tag == 'a':
            print(node.text)
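`ElementTree` only accepts well-formed XML, so it raises a parse error on much real-world HTML. As one possible alternative, the sketch below uses the standard library's `html.parser.HTMLParser` to collect the text of every `<a>` element. The class name `LinkTextParser` and the inline sample markup are illustrative assumptions, not part of the original answer.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Collect the text found inside each <a> element."""

    def __init__(self):
        super().__init__()
        self.in_link = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_link = True
            self.texts.append("")  # start a new entry for this link

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current entry.
        if self.in_link:
            self.texts[-1] += data


sample = """
<pre class="values">
    <strong>a</strong>
    <span id="1">
        <a href="/#">+</a>
        <span title="1">1</span>
        <a href="/#">xxx</a>
        <a href="/#">yyy</a>
    </span>
</pre>
"""

parser = LinkTextParser()
parser.feed(sample)
link_texts = [text.strip() for text in parser.texts]
print(link_texts)
```

The resulting list can then be checked against the set of wanted values, for example with `wanted & set(link_texts)`.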
