python - Open a webpage and return a dictionary of links on that page


I want to write a function that opens a webpage and returns a dictionary of the links and their text on the page. I tried the code below, but it gives me an error. What can I do?

def process(url):
    # MyOpener is a custom opener class (its definition is not shown)
    myopener = MyOpener()
    #page = urllib.urlopen(url)
    page = myopener.open(url)

    text = page.read()
    page.close()

Example input:

<a href='http://my.computer.com/some/file.html'>link text</a>

Output:

{"http://my.computer.com/some/file.html": "link text"}

Welcome to Stack Overflow,

You haven't shown what MyOpener does, so I have used my own function to fetch the page. This code uses Python 3 and the Beautiful Soup 4 HTML parser (a personal favorite) on the Python Wikipedia article.

from bs4 import BeautifulSoup

root_url = "https://en.wikipedia.org"
# retrieve_webage is my own helper (not shown); it returns the page's HTML as a string
html_string = retrieve_webage(root_url + "/wiki/Python_%28programming_language%29")
soup = BeautifulSoup(html_string, "html.parser")
output = {}
# you can redefine soup here to parse only part of the page
for link in soup.find_all('a'):
    linkhref = link.get('href')
    if not linkhref:
        # ignore blank hyperlinks
        continue
    elif linkhref[0] == '/':
        # add the root url to relative links
        linkhref = root_url + linkhref
    output[linkhref] = link.text
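The answer does not include the body of retrieve_webage, so the following is only a guess at what such a helper might look like, using the standard library's urllib.request. The timeout parameter, the User-Agent string, and the UTF-8 fallback are all assumptions made for illustration.

```python
import urllib.request


def retrieve_webage(url, timeout=10):
    """Fetch url and return its body decoded to a str (hypothetical helper)."""
    # Some sites reject urllib's default User-Agent, so send an explicit one.
    # The value below is a made-up placeholder, not a required string.
    request = urllib.request.Request(
        url, headers={"User-Agent": "link-dict-example/0.1"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Use the charset the server declares; fall back to UTF-8 (an assumption).
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling retrieve_webage("https://en.wikipedia.org/wiki/Python_%28programming_language%29") should then return the article's HTML as a string that can be passed to the parser.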

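This script will overwrite links with identical href attributes as it reads them down the page, so only the text of the last such link is kept. You can learn more about Beautiful Soup in its documentation.

Feel free to comment below if you have questions.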
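If losing the earlier link texts is a problem, one option is to store a list of texts per href instead of a single string. The sketch below is not the answer's method: it swaps in the standard library's html.parser in place of Beautiful Soup so it runs without extra installs, and it uses urllib.parse.urljoin so that relative and protocol-relative hrefs are both resolved. The names LinkCollector and collect_links are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: gathers the text of every <a href=...> element."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = defaultdict(list)  # absolute href -> list of link texts
        self._href = None               # href of the <a> currently open, if any
        self._chunks = []               # text pieces seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors with a missing or empty href
                # urljoin resolves relative and protocol-relative hrefs
                self._href = urljoin(self.base_url, href)
                self._chunks = []

    def handle_data(self, data):
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links[self._href].append("".join(self._chunks).strip())
            self._href = None


def collect_links(html_string, base_url):
    """Return {absolute href: [text of each link pointing there]}."""
    collector = LinkCollector(base_url)
    collector.feed(html_string)
    collector.close()
    return dict(collector.links)
```

With this shape, two links that point at the same address both survive, at the cost of every value being a list even when there is only one link.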

