Web scraping: using Python to visit a link and print data
I'm writing a web scraper and trying it out on Drake lyrics. The scraper has to visit one site (the main MetroLyrics page), then visit each individual song link and print out the lyrics.

I'm having trouble visiting the second link. I've searched around on BeautifulSoup and I'm pretty confused. I'm wondering if anyone can help.
    # Intended to print Drake song lyrics on MetroLyrics
    from pyquery import PyQuery as pq
    from lxml import etree
    import requests
    from bs4 import BeautifulSoup

    # Visits the website
    response = requests.get('http://www.metrolyrics.com/drake-lyrics.html')
    # Separates the different types of content
    doc = pq(response.content)
    # Finds the titles in the content
    titles = doc('.title')
    # Visits each title, prints each verse
    for title in titles:
        # Visits each title
        response_title = requests.get(title)
        # Separates the content
        doc2 = pq(response_title.content)
        # Finds the song lyrics
        verse = doc2('.verse')
        # Prints the song lyrics
        print(verse.text)

In response_title = requests.get(title), Python isn't recognizing that title is a link, which makes sense. How do I get the actual link in there, though? I appreciate the help.
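Replace

    response_title = requests.get(title)

with

    response_title = requests.get(title.attrib['href'])

Iterating over a PyQuery selection yields the underlying lxml elements rather than URL strings, so the address has to be read from each element's href attribute.

Full working script (with the fix noted in a comment):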
    #!/usr/bin/python
    from pyquery import PyQuery as pq
    from lxml import etree
    import requests
    from bs4 import BeautifulSoup

    # Visits the website
    response = requests.get('http://www.metrolyrics.com/drake-lyrics.html')
    # Separates the different types of content
    doc = pq(response.content)
    # Finds the titles in the content
    titles = doc('.title')
    # Visits each title, prints each verse
    for title in titles:
        # Visits each title (fixed: pass the element's href, not the element itself)
        #response_title = requests.get(title)
        response_title = requests.get(title.attrib['href'])
        # Separates the content
        doc2 = pq(response_title.content)
        # Finds the song lyrics
        verse = doc2('.verse')
        # Prints the song lyrics (text is a method, so it has to be called)
        print(verse.text())
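If you want to check the link-extraction step without hitting the network, here is a minimal sketch of the same idea. It deliberately uses only the standard library (html.parser and urllib.parse) instead of pyquery, so it runs offline. The sample markup, the example.com URLs, and the names TitleLinkParser and extract_title_links are made up for illustration; the real page layout may differ. It also resolves relative hrefs against the page URL, which may or may not be needed for the real site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical markup standing in for the artist page (an assumption,
# not the real MetroLyrics HTML).
SAMPLE_HTML = """
<div>
  <a class="title" href="http://www.example.com/song-one.html">Song One</a>
  <a class="title hot" href="/song-two.html">Song Two</a>
  <a class="other" href="/about.html">About</a>
</div>
"""


class TitleLinkParser(HTMLParser):
    """Collects the href of every <a> tag whose class list contains 'title'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        href = attributes.get("href")
        # The URL lives in the href attribute, not in the element itself.
        if "title" in classes and href:
            self.hrefs.append(href)


def extract_title_links(html, base_url):
    """Returns absolute URLs for all 'title' links found in html."""
    parser = TitleLinkParser()
    parser.feed(html)
    # urljoin leaves absolute URLs alone and resolves relative ones.
    return [urljoin(base_url, href) for href in parser.hrefs]


links = extract_title_links(SAMPLE_HTML, "http://www.example.com/artist.html")
for link in links:
    print(link)
```

Each string in links is something requests.get can accept directly, which is the piece the original loop was missing.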