Thursday, April 19, 2018

Encoding Chinese (non-ASCII) characters in a URL string

Symptom: passing a URL that contains Chinese characters to urllib.request.urlopen raises:

~/anaconda3/lib/python3.6/http/client.py in putrequest(self, method, url, skip_host, skip_accept_encoding)
   1115 
   1116         # Non-ASCII characters should have been eliminated earlier
-> 1117         self._output(request.encode('ascii'))
   1118 
   1119         if self._http_vsn == 11:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-13: ordinal not in range(128)
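
The error can probably be reproduced without any network access. The sketch below assumes the request line that http.client tried to encode looked like 'GET /s?wd=无人驾驶 HTTP/1.1'. That string is a reconstruction based on the URL used in this post, not something taken from the traceback. With that assumption, the four Chinese characters sit at positions 10-13, which matches the message above.

```python
# Assumed reconstruction of the request line built for the unencoded URL.
request_line = 'GET /s?wd=无人驾驶 HTTP/1.1'

try:
    # Same kind of call as line 1117 of http/client.py in the traceback.
    request_line.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

if error is not None:
    print(error)
    # start is inclusive, end is exclusive
    print(error.start, error.end)
```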

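Solution: percent-encode the non-ASCII part of the URL with urllib.parse.quote before calling urlopen.

import urllib.request
import urllib.parse

# Quote only the query value. By default quote() leaves just '/' unescaped,
# so quoting the whole '/s?wd=...' string would also turn '?' and '=' into
# %3F and %3D and the server would no longer see a query string.
keyword = urllib.parse.quote('无人驾驶', encoding='UTF-8')
output = urllib.request.urlopen('https://www.baidu.com/s?wd=' + keyword)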
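
As an alternative to calling quote by hand, urllib.parse.urlencode can build the whole query string from a dict. The sketch below is one way to do it. The helper name build_search_url is made up for illustration, and it only constructs the URL without opening it.

```python
import urllib.parse


def build_search_url(base, keyword):
    """Hypothetical helper: return base + '/s?wd=<percent-encoded keyword>'."""
    # urlencode percent-encodes keys and values (UTF-8 by default)
    # and joins multiple pairs with '&'.
    query = urllib.parse.urlencode({'wd': keyword})
    return base + '/s?' + query


url = build_search_url('https://www.baidu.com', '无人驾驶')
print(url)
# The result is pure ASCII, so it can be passed to urllib.request.urlopen(url).
```

This keeps '?' and '=' intact, because only the key and the value are encoded, and it scales to several parameters without manual string concatenation.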