python - Beautiful Soup: Get text data from html

January 15, 2010

Here is the HTML code. I want to extract data from the following HTML using Beautiful Soup:

<tr class="tr-option">
    <td class="td-option"><a href="">a.</a></td>
    <td class="td-option">120 m</td>
    <td class="td-option"><a href="">b.</a></td>
    <td class="td-option">240 m</td>
    <td class="td-option"><a href="">c.</a></td>
    <td class="td-option">300 m</td>
    <td class="td-option"><a href="">d.</a></td>
    <td class="td-option">none of these</td>
</tr>

Here is my Beautiful Soup code:

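from bs4 import BeautifulSoup

# html_doc holds the HTML shown above
soup = BeautifulSoup(html_doc)
# Print the text of every <td class="td-option"> cell, one per line
for option in soup.find_all('td', attrs={'class': "td-option"}):
    print(option.text)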

Output of the above code:

a.
120 m
b.
240 m
c.
300 m
d.
none of these

But I want the following output:

a.120 m
b.240 m
c.300 m
d.none of these

What should I do?

Since find_all returns a list of options, you can use list comprehensions to obtain the answer you expect:

>>> a_list = [option.text for option in soup.find_all('td', attrs={'class': "td-option"})]
>>> new_list = [a_list[i] + a_list[i+1] for i in range(0, len(a_list), 2)]
>>> for option in new_list:
...     print(option)
...
a.120 m
b.240 m
c.300 m
d.none of these
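For anyone who wants to run this outside the interpreter, here is a minimal standalone sketch of the same approach. The function name paired_options and the explicit "html.parser" argument are assumptions added for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup

# The row from the question, used here as sample input
html_doc = """
<tr class="tr-option">
    <td class="td-option"><a href="">a.</a></td>
    <td class="td-option">120 m</td>
    <td class="td-option"><a href="">b.</a></td>
    <td class="td-option">240 m</td>
    <td class="td-option"><a href="">c.</a></td>
    <td class="td-option">300 m</td>
    <td class="td-option"><a href="">d.</a></td>
    <td class="td-option">none of these</td>
</tr>
"""


def paired_options(html):
    """Return each option label joined with the value in the next cell.

    Hypothetical helper; assumes the cells alternate label, value.
    """
    # Naming the parser explicitly is an assumption made for this sketch
    soup = BeautifulSoup(html, "html.parser")
    a_list = [option.text for option in soup.find_all('td', attrs={'class': "td-option"})]
    # Same pairing as in the answer: join element i with element i + 1
    return [a_list[i] + a_list[i + 1] for i in range(0, len(a_list), 2)]


for line in paired_options(html_doc):
    print(line)
```

If the real page has whitespace around the cell text, option.text.strip() may be needed before joining.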

What does it do?

[a_list[i] + a_list[i+1] for i in range(0, len(a_list), 2)] steps through a_list two elements at a time and concatenates each element with the one that follows it.
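To see the pairing in isolation, here is a small sketch on a plain list, with no Beautiful Soup involved. The zip variant is an alternative that is not in the original answer, and the names by_index and by_zip are made up for this example.

```python
# The cell texts as find_all would yield them for the row above
a_list = ["a.", "120 m", "b.", "240 m", "c.", "300 m", "d.", "none of these"]

# Index-based pairing, as in the answer: i takes the values 0, 2, 4, 6
by_index = [a_list[i] + a_list[i + 1] for i in range(0, len(a_list), 2)]

# Alternative: slice out the even and odd positions and zip them together
by_zip = [label + value for label, value in zip(a_list[0::2], a_list[1::2])]

print(by_index)
print(by_zip)
```

The two versions differ when the list has an odd number of elements: the index-based one raises IndexError on the last step, while zip silently drops the unpaired element.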
