Problem description
import time
import requests
# Proxy settings that trigger the error below
p={'http':'http://42.56.238.251:4278',
'https':'https://42.56.238.251:4278'}
url='http://sih.tinyepc.servision.com.cn/'
html=requests.get(url=url,proxies=p,verify=False).text
time.sleep(2)
Error
SSLError: HTTPSConnectionPool(host='passport.servision.com.cn', port=443): Max retries exceeded with url: /login?service=http://sih.tinyepc.servision.com.cn/user/cas-auth (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))
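Finding the cause
Swapping the url for Baidu (an https site) crawls normally through the same proxy, so the proxy itself supports https.
Requesting the target directly from the local machine also works; the error only appears once the proxy is added.
Solutions
Solution 1
Use a proxy through selenium instead; this approach works.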
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
url = 'http://sih.tinyepc.servision.com.cn/'
chrome_options = Options()
# Uncomment to run without a visible browser window
# chrome_options.add_argument('--headless')
# chrome_options.add_argument('--disable-gpu')
# Route the browser's traffic through the SOCKS5 proxy
chrome_options.add_argument("--proxy-server=socks5://118.31.68.125:16819")
# executable_path is the Selenium 3 style; Selenium 4 passes the driver path through a Service object
driver = webdriver.Chrome(executable_path=r'D:\chrome driver\chromedriver',options=chrome_options)
driver.get(url)
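Solution 2
import time
import requests
p={'HTTP':'HTTP://42.56.238.251:4278',
'HTTPS':'HTTPS://42.56.238.251:4278'}
url='http://sih.tinyepc.servision.com.cn/'
html=requests.get(url=url,proxies=p,verify=False).text
time.sleep(2)
Change both https and http to uppercase. The error goes away, but most likely only because requests looks proxies up by the lowercase URL scheme: the uppercase keys match nothing, so the request goes out directly without the proxy, which is the case that already worked above.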
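To see why uppercase keys would be skipped, here is a minimal sketch of a scheme-based proxy lookup. pick_proxy is a hypothetical stand-in written for this post, not requests' own code (requests has a similar helper, requests.utils.select_proxy, which also checks host-specific and 'all' keys); the assumption is only that the lookup key is the lowercase scheme of the target URL.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Simplified model of a scheme-based proxy lookup (hypothetical helper)."""
    # urlparse always returns the scheme in lowercase
    scheme = urlparse(url).scheme
    # Dictionary lookup is case-sensitive, so only an exactly matching key is found
    return proxies.get(scheme)


lower = {'http': 'http://42.56.238.251:4278',
         'https': 'http://42.56.238.251:4278'}
upper = {'HTTP': 'HTTP://42.56.238.251:4278',
         'HTTPS': 'HTTPS://42.56.238.251:4278'}
url = 'http://sih.tinyepc.servision.com.cn/'

print(pick_proxy(url, lower))  # http://42.56.238.251:4278
print(pick_proxy(url, upper))  # None, no key matches so no proxy is used
```

If the real lookup behaves like this sketch, Solution 2 is not a fix for the proxy at all; it just switches the proxy off.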
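Solution 3
After checking the official documentation, I learned that a proxy should be set like this:
import time
import requests
p={'http':'http://42.56.238.251:4278',
'https':'http://42.56.238.251:4278'}
url='http://sih.tinyepc.servision.com.cn/'
html=requests.get(url=url,proxies=p,verify=False).text
time.sleep(2)
This is the officially documented way to set a proxy: the dictionary key is the scheme of the target URL, while the value is the address of the proxy itself, so an HTTP proxy keeps the http:// prefix even under the 'https' key.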
This honestly left me shaken.
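To avoid repeating the mistake, the proxies dictionary can be built in one place. The following is a rough sketch rather than anything from the requests documentation: build_proxies and fetch are hypothetical helper names, and it assumes the proxy is a plain HTTP proxy (as 42.56.238.251:4278 appears to be). Depending on the requests/urllib3 version, an https:// value may be read as "talk TLS to the proxy itself", which a plain HTTP proxy cannot do.

```python
def build_proxies(host, port):
    """Return a requests-style proxies dict for a plain HTTP proxy (hypothetical helper)."""
    # The same http:// proxy address is used for both http and https targets
    proxy_url = f'http://{host}:{port}'
    return {'http': proxy_url, 'https': proxy_url}


def fetch(url, proxies, timeout=10):
    """Fetch url through the proxy; return the page text, or None on failure."""
    import requests  # imported here so build_proxies can be used without requests

    try:
        # verify=False mirrors the original snippets; it disables certificate checks
        response = requests.get(url, proxies=proxies, timeout=timeout, verify=False)
        response.raise_for_status()
        return response.text
    except requests.exceptions.RequestException as exc:
        # SSLError and ProxyError are both subclasses of RequestException
        print(f'request failed: {exc!r}')
        return None


proxies = build_proxies('42.56.238.251', 4278)
print(proxies)
```

Calling fetch('http://sih.tinyepc.servision.com.cn/', proxies) would then replace the requests.get line from Solution 3; the timeout keeps a dead proxy from hanging the crawler.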