Troubleshooting Unicode string behavior when updating Python 2.x -> 3.x
I am using Python 2.7 in my project, and since I have seen that Python 3.5 is coming soon, I decided to upgrade my Python interpreter. I am using base64 encoding. Since string objects are somehow different in the new Python versions, I am getting the following error:

TypeError: 'str' does not support the buffer interface

I have found out that I have to encode the string before passing it to the function ('string'.encode()), but isn't there a way to encode the string automatically to Unicode or something?
In Python 3, str means "Unicode text": whether you express it as 'mystring' or u'mystring' makes no difference (the latter is tolerated to facilitate porting from, and coexistence with, Python 2). To indicate a binary string of bytes, you'd use b'mystring' instead.
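As a small illustration of the distinction, here is a sketch for Python 3 (the variable names are only for the example):

```python
# In Python 3 a plain literal and a u'' literal are the same type: str (Unicode text).
text = 'mystring'
assert text == u'mystring'
assert type(text) is str

# A b'' literal is a different type: bytes.
raw = b'mystring'
assert type(raw) is bytes

# Converting between the two always goes through an explicit codec.
encoded = text.encode('utf-8')   # str -> bytes
decoded = raw.decode('utf-8')    # bytes -> str
assert encoded == raw
assert decoded == text
```

Python 3 never converts between str and bytes implicitly, which is why code that mixed the two freely under Python 2 starts raising TypeError after the upgrade.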
As https://docs.python.org/3/library/base64.html puts it, this module provides functions for encoding binary data (my emphasis on "binary"), and says nothing about text (i.e. Unicode) data. As a logical consequence, the functions in the module expect and return byte strings.
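To see this concretely, here is a short sketch for Python 3. The exact wording of the error message varies between 3.x versions, so the example only checks the exception type:

```python
import base64

data = 'mystring'

# Passing text (str) is rejected, because the function wants bytes.
try:
    base64.b64encode(data)
    raised = False
except TypeError:
    raised = True

# Passing bytes works, and the result is bytes as well.
encoded = base64.b64encode(data.encode('utf-8'))

print(raised)          # True
print(type(encoded))   # <class 'bytes'>
```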
I'm not sure why you can't use byte strings (as opposed to text strings) directly in your program, but if that's a problem, the simplest approach is to wrap the functions you need from the base64 module in your own functions that provide whatever encoding (text -> bytes) or decoding (bytes -> text) you require. For example:

    import base64

    def b64encode(text, codec='utf8'):
        # Encode the text to bytes first, then Base64-encode those bytes.
        return base64.b64encode(text.encode(codec))

Then use b64encode throughout the rest of your code, rather than base64.b64encode directly, and do the same for the decoding part.
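For the decoding part, one possible counterpart might look like the sketch below. The b64decode wrapper and its codec parameter are hypothetical names chosen to mirror the b64encode wrapper above, and it assumes you want text back rather than raw bytes:

```python
import base64

def b64encode(text, codec='utf8'):
    # Same wrapper as above: text -> bytes -> Base64 bytes.
    return base64.b64encode(text.encode(codec))

def b64decode(data, codec='utf8'):
    # Hypothetical counterpart: Base64 data -> bytes -> text.
    return base64.b64decode(data).decode(codec)

# Round trip with non-ASCII text to show the codec is applied both ways.
original = 'h\u00e9llo w\u00f6rld'
restored = b64decode(b64encode(original))
assert restored == original
```

Note that b64encode here still returns bytes. If the rest of your code expects text, you could append .decode('ascii') to its result, since Base64 output only contains ASCII characters.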