Web scraping - Using Python to visit a link and print data

June 15, 2014

I'm writing a web scraper and trying it out on Drake lyrics. The scraper has to visit one site (the main MetroLyrics page), then visit each individual song link and print out the lyrics.

I'm having trouble visiting the second link. I've searched around on BeautifulSoup and am pretty confused. I'm wondering if anyone can help.

# Intended to print the Drake song lyrics on MetroLyrics

from pyquery import PyQuery as pq
from lxml import etree
import requests
from bs4 import BeautifulSoup

# Visits the website
response = requests.get('http://www.metrolyrics.com/drake-lyrics.html')

# Separates the different types of content
doc = pq(response.content)

# Finds the titles in the content
titles = doc('.title')

# Visits each title, prints each verse
for title in titles:
    # Visits each title
    response_title = requests.get(title)
    # Separates the content
    doc2 = pq(response_title.content)
    # Finds the song lyrics
    verse = doc2('.verse')
    # Prints the song lyrics
    print verse.text

In response_title = requests.get(title), Python isn't recognizing title as a link, which makes sense. How do I get the actual link in there, though? I'd appreciate any help.

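Iterating over a PyQuery object yields lxml elements, not URL strings, so requests.get() needs the element's href attribute rather than the element itself. Replace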

response_title = requests.get(title)

with

response_title = requests.get(title.attrib['href'])
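To see why the attribute lookup is needed, here is a minimal offline sketch. It assumes the song links are anchor elements with class "title", which is what the selector above suggests but the post does not show. The markup and example.com URLs are made up, and the standard library's xml.etree.ElementTree stands in for lxml (lxml elements expose the same .attrib mapping), so nothing here touches the real site.

```python
import xml.etree.ElementTree as ET

# Made-up markup standing in for the artist page; the URLs are placeholders.
SNIPPET = """
<div>
  <a class="title" href="http://example.com/song-one.html">Song One</a>
  <a class="title" href="http://example.com/song-two.html">Song Two</a>
  <a class="other" href="http://example.com/about.html">About</a>
</div>
""".strip()


def title_links(markup):
    """Return the href of every <a class="title"> element in markup."""
    root = ET.fromstring(markup)
    links = []
    for element in root.iter("a"):
        if element.attrib.get("class") == "title":
            # element is an Element object, not a string;
            # the URL itself lives in its attribute mapping.
            links.append(element.attrib["href"])
    return links


links = title_links(SNIPPET)
print(links)
```

Each string in links is the kind of value requests.get() expects, whereas the element objects themselves are not URLs.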

Full working script (with the fix from the comment below applied):

#!/usr/bin/python

from pyquery import PyQuery as pq
from lxml import etree
import requests
from bs4 import BeautifulSoup

# Visits the website
response = requests.get('http://www.metrolyrics.com/drake-lyrics.html')

# Separates the different types of content
doc = pq(response.content)

# Finds the titles in the content
titles = doc('.title')

# Visits each title, prints each verse
for title in titles:
    # Visits each title; title is an lxml element, so pass its href
    #response_title = requests.get(title)
    response_title = requests.get(title.attrib['href'])

    # Separates the content
    doc2 = pq(response_title.content)
    # Finds the song lyrics
    verse = doc2('.verse')
    # Prints the song lyrics; text is a method, so it must be called
    print(verse.text())
