web scraping - Django Beautiful Soup Changing User Agent - No Effect


I am trying to run a web scraping application. However, my code is not working for some sites even though I set the user agent (I have tried several different ones). The code works on my dev site (which is hosted on a subdomain of PythonAnywhere), but not on my production site. It seems as if I am blocked by these sites (even though I have rarely, if ever, accessed them). Any ideas? Should I email the websites and see if I can be granted access, since I am not doing anything malicious?

    import requests
    from bs4 import BeautifulSoup

    # URL to scrape, passed as a query parameter to the Django view
    url = request.GET['url']

    # First attempt: plain request (no custom User-Agent is set here)
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")

    # Prefer the Open Graph title; fall back to the <title> tag
    if not soup.find('meta', property="og:title"):
        title = soup.title.string
    else:
        title = soup.find('meta', property="og:title")['content']

    # Second attempt with a browser-like User-Agent if the first looks blocked
    if not title or "403" in title:
        import urllib2
        opener = urllib2.build_opener()
        opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
        response = opener.open(url)
        page = response.read()
        soup = BeautifulSoup(page, "html.parser")

        if not soup.find('meta', property="og:title"):
            title = soup.title.string
        else:
            title = soup.find('meta', property="og:title")['content']
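One thing worth noting about the snippet: the first `requests.get(url)` call goes out with the default `python-requests` User-Agent, and a custom one is only tried in the fallback. A minimal sketch of sending browser-like headers on the very first request, and checking the HTTP status code instead of searching the title for "403", might look like the following. It is written for Python 3; the header values, the 10-second timeout, and the helper names `extract_title` and `fetch_title` are assumptions made for illustration, not part of the original code.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical browser-like headers; the exact values are an assumption.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    ),
    "Accept-Language": "en-US,en;q=0.9",
}


def extract_title(html):
    """Return the og:title if present, else the <title> text, else None."""
    soup = BeautifulSoup(html, "html.parser")
    og = soup.find("meta", property="og:title")
    if og is not None and og.get("content"):
        return og["content"].strip()
    if soup.title is not None and soup.title.string:
        return soup.title.string.strip()
    return None


def fetch_title(url):
    """Fetch url with custom headers; return (title, status_code)."""
    r = requests.get(url, headers=HEADERS, timeout=10)
    if r.status_code != 200:
        # Report the status instead of parsing an error page.
        return None, r.status_code
    return extract_title(r.content), r.status_code
```

In the view this could be called as `title, status = fetch_title(request.GET['url'])`. If the status is still 403 with these headers, the block is probably not based on the User-Agent alone; it may depend on something else, such as the IP address of the production server, which no header change would fix.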

