Python newbie here - IndexError: list index out of range
I had a web scraper made, and it works fine up to the point where it sorts the acquired data that is dumped into e_data. I'm a complete Python newbie, so any help would be appreciated.
The error:

    Traceback (most recent call last):
      File "c:\wamp\www\_clients\dstest\web_scrape2.py", line 78, in <module>
        customer = row.find_all('td')[2].getText().split()
    IndexError: list index out of range
The faulty code:

    if re.findall('\\bnew\\b', str(e_data)) != []:
        for row in e_data.find_all('tr'):
            if re.findall('</table>', str(row)) == [] and re.findall('\\bnew\\b', str(row)) != []:
                job_no = row.find('a').string
                customer = row.find_all('td')[2].getText().split()
                move_date = row.find_all('td')[3].getText()
                result = {'job_no': job_no, 'customer': customer, 'move_date': move_date}
                print json.dumps(result)
    else:
        print "data unavailable"
Content of e_data:
<center><b><h3> total of: 1 transactions <right> <a href="javascript:window.close()">exit</a> <style type="text/css"> .xf{color: blue; text-decoration: underline;} .xn{color: red; text-decoration: underline; cursor: hand}></style> </right></h3></b></center> <table align="center" border="0" cellspacing="0" width="90%"><tr><td> <center> <table bgcolor="#eeeeee" border="1" cellpadding="3" cellspacing="0" style="font-size: 8pt" width="100"> <tr bgcolor="darkblue"><th><font color="white" face="verdana,helvetica">job_no</font></th><th><font color="white" face="verdana,helvetica">category</font></th><th><font color="white" face="verdana,helvetica">customer</font></th><th><font color="white" face="verdana,helvetica">move_date</font></th><th><font color="white" face="verdana,helvetica">deliver</font></th><th><font color="white" face="verdana,helvetica">dlv_imm</font></th><th><font color="white" face="verdana,helvetica">origin</font></th><th><font color="white" face="verdana,helvetica">destination</font></th><th><font color="white" face="verdana,helvetica">miles</font></th><th><font color="white" face="verdana,helvetica">cf_lbs</font></th><th><font color="white" face="verdana,helvetica">estimate</font></th><th><font color="white" face="verdana,helvetica">open_date</font></th><th><font color="white" face="verdana,helvetica">vip</font></th></tr><tr style="background:#ccccff" valign="top"><td><a href="/wc.dll?mprep~printselect~ltpax57752~uz2w225186" target="_blank">j4074407</a><br> <b><br><font color="#ff0000">new</font></br></b></br></td></tr></table></center></td></tr></table><td>long_dist.<br>followup<br><b><font color="#008000" size="1">reference</font></b></br></br></td><td><b>newlead2</b><br>user:sam <br>newlead2@gmail.com <br>4838484838</br></br></br></td><td>01/18/2016</td><td> / / <br> / /</br></td><td>...</td><td><b>fl fort lauderdale </b></td><td><b>ca oakland </b></td><td align="right">3068</td><td>200 cf<br>2000 lbs</br></td><td align="right">1410.00</td><td 
align="center">02/18/2015 02:27:05 pm</td><td>...</td> <tr bgcolor="darkblue" style="font-face:bold"><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td></td><td align="right"><font color="white"><b> 3068</b></font></td><td></td><td align="right"><font color="white"><b> 1410.00</b></font></td><td></td><td></td></tr>
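You are trying to access a non-existent element of the list returned by row.find_all('td'). In the e_data dump above, the row that contains "new" closes after a single td, so index 2 does not exist. The failing line is:

    customer = row.find_all('td')[2].getText().split()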
Print the length of the list and you'll see how many elements each row actually has:

    print(len(row.find_all('td')))
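Once you know some rows are shorter than expected, one option is to check the length before indexing. Here is a minimal sketch of that idea. The helper name cell_at is made up for illustration, and plain strings stand in for the td.get_text() values so the snippet runs without BeautifulSoup:

```python
def cell_at(cells, index, default=None):
    """Return cells[index], or default when the row has too few cells."""
    return cells[index] if 0 <= index < len(cells) else default


# Hypothetical stand-ins for [td.get_text() for td in row.find_all('td')].
# The first row mimics the one in e_data that holds a single cell.
short_row = ["j4074407 new"]
full_row = ["j4074407 new", "long_dist.", "newlead2 user:sam", "01/18/2016"]

print(cell_at(short_row, 2))  # None, instead of raising IndexError
print(cell_at(full_row, 3))   # 01/18/2016
```

In the scraper you would probably skip a row (or log it) when the helper returns the default, rather than building a result from it.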
Alternatively, iterate over the 'td' elements instead of indexing them directly:

    tdelements = row.find_all('td')
    for tdelement in tdelements:
        pass  # your code here
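If you go the iteration route, one way to keep track of which cell is which is to pair the cells with the column names, so nothing is ever indexed by position. This is only a sketch under assumptions: the HEADERS list is copied by hand from the first four th cells of the dump, row_to_record is a made-up helper name, and plain strings again stand in for the td.get_text() values:

```python
# Column names taken from the header row of the e_data dump (first four only).
HEADERS = ["job_no", "category", "customer", "move_date"]


def row_to_record(cell_texts, headers=HEADERS):
    """Pair each cell with its column name.

    zip stops at the shorter input, so a short row produces a partial
    record instead of raising IndexError.
    """
    return dict(zip(headers, cell_texts))


# Hypothetical rows: one short (like the broken row in e_data), one complete.
rows = [
    ["j4074407 new"],
    ["j4074407 new", "long_dist. followup", "newlead2 user:sam", "01/18/2016"],
]

records = [row_to_record(cells) for cells in rows]

# Keep only the rows that really contain the fields we need.
complete = [r for r in records if "customer" in r and "move_date" in r]
print(complete)
```

With the real page you would build each cell_texts list from row.find_all('td') and then decide what to do with the incomplete records.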