excel - VBA web data not showing entire table -
i trying download table excel sheet loop through next table.the loop working(very slow though) getting top of page up(the top 5 lines dog name trainer name etc) , main table not appear.i cookie message up. suggestion welcome:
option explicit sub macro1() sheets("sheet1").select range("a1").select dim integer dim e integer dim myurl string, shorturl string sheets("sheet1").select = 1 while < 3 myurl = "url;http://www.racingpost.com/greyhounds/dog_home.sd?dog_id=" & & "" activesheet.querytables.add(connection:=myurl, destination:=range("$a$1")) .name = shorturl .fieldnames = true .rownumbers = false .filladjacentformulas = false .preserveformatting = true .refreshonfileopen = false .backgroundquery = true .refreshstyle = xlinsertdeletecells .savepassword = false .savedata = true .adjustcolumnwidth = true .refreshperiod = 0 .webselectiontype = xlentirepage .webformatting = xlwebformattingnone .webpreformattedtexttocolumns = true .webconsecutivedelimitersasone = true .websingleblocktextimport = false .webdisabledaterecognition = false .webdisableredirections = false .refresh backgroundquery:=false .webdisabledaterecognition = false .webdisableredirections = false .refresh backgroundquery:=false end columns("a:j").select selection.copy range("k1").select selection.pastespecial paste:=xlpastevalues, operation:=xlnone, skipblanks _ :=false, transpose:=false columns("a:j").select range("j1").activate application.cutcopymode = false selection.delete shift:=xltoleft columns("a:j").select selection.columnwidth = 20.01 columns("b:b").select selection.columnwidth = 20.01 rows("1:9").select selection.insert shift:=xldown, copyorigin:=xlformatfromleftorabove = + 1 loop end sub
the table data loaded via ajax
request after initial page load.
if @ page in chrome , open developer tools (f12) -> network tab
. see additional request following url: http://www.racingpost.com/greyhounds/dog_form.sd?dog_id=
the method using retrieve data slow. 1 way speed request url's via xmlhttprequest
, parse corresponding data need yourself.
here example of xmlhttprequest
(note data returned string of source code can parse):
function xmlhttprequest(url string) string dim xml object set xml = createobject("msxml2.xmlhttp") xml.open "get", url, false xml.send xmlhttprequest = xml.responsetext end function
so requesting data via method this:
response = xmlhttprequest("http://www.somesite.com")
this fastest method know retrieve data website since doesn't involve rendering anything.
then parse given data need things in front or behind data consistent in source. (usually divs specific class names or that). generic parse might this:
loc1 = instr(response,"myclassname") loc1 = instr(loc1, response, ">") + 1 'the exact beginning of data i'd loc2 = instr(loc1, response, "</td>")' end of data i'd data = trim(mid(response,loc1,loc2-loc1))
finally here methods paste in , running. i'm not sure fields after parsed few each page examples:
option explicit sub gettrackdata() dim response string dim doghomeurl string dim dogformurl string dim integer dim x integer dim dogname string dim dogdate string dim trainer string dim breeding string dim loc1 long, loc2 long doghomeurl = "http://www.racingpost.com/greyhounds/dog_home.sd?dog_id=" dogformurl = "http://www.racingpost.com/greyhounds/dog_form.sd?dog_id=" x = 2 = 1 10 response = xmlhttprequest(doghomeurl & i) debug.print (response) 'parse overall info 'this basic of parsing web page 'just find start of data want instr 'then find end of data instr 'and use mid pull out data want 'rinse , repeat method every line of data we'd loc1 = instr(response, "popuphead") loc1 = instr(loc1, response, "<h1>") + 4 loc2 = instr(loc1, response, "</h1>") dogname = trim(mid(response, loc1, loc2 - loc1)) 'apparantly if dog name blank there data report on web site if dogname <> "" 'now lets dogdate loc1 = instr(loc2, response, "<li>") loc1 = instr(loc1, response, "(") + 1 loc2 = instr(loc1, response, ")") dogdate = trim(mid(response, loc1, loc2 - loc1)) 'now trainer loc1 = instr(loc2, response, "<strong>trainer</strong>") + 24 loc2 = instr(loc1, response, "</li>") trainer = trim(mid(response, loc1, loc2 - loc1)) response = xmlhttprequest(dogformurl & i) 'now need loop through form table , parse out values care loc1 = instr(response, "full results") while (loc1 <> 0) dim racedate string dim racetrack string dim dis string loc1 = instr(loc1, response, ">") + 1 loc2 = instr(loc1, response, "</a>") racedate = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td>") + 4 loc2 = instr(loc1, response, "</td>") racetrack = trim(mid(response, loc1, loc2 - loc1)) range("a" & x).value = dogname range("b" & x).value = dogdate range("c" & x).value = trainer range("d" & x).value = racedate range("e" & x).value = racetrack loc1 = instr(loc2, response, "full results") x = x + 1 loop debug.print (response) end if 'parse form table next end sub function xmlhttprequest(url string) string dim xml object set xml = createobject("msxml2.xmlhttp") xml.open "get", url, false xml.send xmlhttprequest = xml.responsetext end function
edit 1
the data interating on erroneous, apparantly first column isn't link. here amended example more fields being parsed. let me know if have questions:
option explicit sub gettrackdata() dim response string dim doghomeurl string dim dogformurl string dim integer dim x integer dim dogname string dim dogdate string dim trainer string dim breeding string dim loc1 long, loc2 long dim qt string qt = """" doghomeurl = "http://www.racingpost.com/greyhounds/dog_home.sd?dog_id=" dogformurl = "http://www.racingpost.com/greyhounds/dog_form.sd?dog_id=" x = 2 = 1 10 response = xmlhttprequest(doghomeurl & i) debug.print (response) 'parse overall info 'this basic of parsing web page 'just find start of data want instr 'then find end of data instr 'and use mid pull out data want 'rinse , repeat method every line of data we'd loc1 = instr(response, "popuphead") loc1 = instr(loc1, response, "<h1>") + 4 loc2 = instr(loc1, response, "</h1>") dogname = trim(mid(response, loc1, loc2 - loc1)) 'apparantly if dog name blank there data report on web site if dogname <> "" 'now lets dogdate loc1 = instr(loc2, response, "<li>") loc1 = instr(loc1, response, "(") + 1 loc2 = instr(loc1, response, ")") dogdate = trim(mid(response, loc1, loc2 - loc1)) 'now trainer loc1 = instr(loc2, response, "<strong>trainer</strong>") + 24 loc2 = instr(loc1, response, "</li>") trainer = trim(mid(response, loc1, loc2 - loc1)) response = xmlhttprequest(dogformurl & i) 'now need loop through form table , parse out values care loc1 = instr(response, "<td class=" & qt & "first" & qt) + 17 while (loc1 > 17) dim racedate string dim racetrack string dim dis string dim trp string dim splt string dim pos string dim fin string dim string dim winsec string dim remarks string dim time string dim going string dim price string dim grd string dim calc string loc1 = instr(loc1, response, ">") + 1 loc2 = instr(loc1, response, "</td>") racedate = trim(mid(response, loc1, loc2 - loc1)) if instr(racedate, "<a href") > 0 'we have link parse out date link dim tem1 long dim tem2 long tem1 = instr(racedate, ">") + 1 tem2 = instr(tem1, racedate, "</a>") racedate = trim(mid(racedate, tem1, tem2 - tem1)) end if loc1 = instr(loc2, response, "<td>") + 4 loc2 = instr(loc1, response, "</td>") racetrack = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td><span class=") + 16 loc1 = instr(loc1, response, ">") + 1 loc2 = instr(loc1, response, "</span>") dis = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td class=") loc1 = instr(loc1, response, ">") + 1 loc2 = instr(loc1, response, "</td>") trp = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td>") + 4 loc2 = instr(loc1, response, "</td>") splt = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td>") + 4 loc2 = instr(loc1, response, "</td>") pos = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<span class= " & qt & "black" & qt & ">") + 21 loc2 = instr(loc1, response, "</span>") fin = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td>") + 4 loc2 = instr(loc1, response, "</td>") = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<a href=") + 8 loc1 = instr(loc1, response, ">") + 1 loc2 = instr(loc1, response, "</a>") winsec = trim(mid(response, loc1, loc2 - loc1)) '<td><i> loc1 = instr(loc2, response, "<td><i>") + 7 loc2 = instr(loc1, response, "</i>") remarks = trim(mid(response, loc1, loc2 - loc1)) '<span class="black"> loc1 = instr(loc2, response, "<span class=" & qt & "black" & qt & ">") + 21 loc2 = instr(loc1, response, "</span>") time = trim(mid(response, loc1, loc2 - loc1)) '<td class="center"> loc1 = instr(loc2, response, "<td class=" & qt & "center" & qt & ">") + 19 loc2 = instr(loc1, response, "</td>") going = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td class=" & qt & "center" & qt & ">") + 19 loc2 = instr(loc1, response, "</td>") price = trim(mid(response, loc1, loc2 - loc1)) loc1 = instr(loc2, response, "<td class=" & qt & "center" & qt & ">") + 19 loc2 = instr(loc1, response, "</td>") grd = trim(mid(response, loc1, loc2 - loc1)) range("a" & x).value = dogname range("b" & x).value = dogdate range("c" & x).value = trainer range("d" & x).value = racedate range("e" & x).value = racetrack range("f" & x).value = dis range("g" & x).value = trp range("h" & x).value = splt range("i" & x).value = pos range("j" & x).value = fin range("k" & x).value = range("l" & x).value = winsec range("m" & x).value = remarks range("n" & x).value = time range("o" & x).value = going range("p" & x).value = price range("q" & x).value = grd loc1 = instr(loc2, response, "<td class=" & qt & "first" & qt) + 17 x = x + 1 loop debug.print (response) end if 'parse form table next end sub function xmlhttprequest(url string) string dim xml object set xml = createobject("msxml2.xmlhttp") xml.open "get", url & "&cache_buster=" & generaterandom, false xml.send xmlhttprequest = xml.responsetext end function function generaterandom() string generaterandom = int(rnd * 1000) end function
Comments
Post a Comment