Saturday, 12 October 2019

Is six.text_type the same as text.decode('utf8')?

Given a function like:

import six

def convert_to_unicode(text):
  """Converts `text` to Unicode (if it's not already), assuming utf-8 input."""
  if six.PY3:
    if isinstance(text, str):
      return text
    elif isinstance(text, bytes):
      return text.decode("utf-8", "ignore")
    else:
      raise ValueError("Unsupported string type: %s" % (type(text)))
  elif six.PY2:
    if isinstance(text, str):
      return text.decode("utf-8", "ignore")
    elif isinstance(text, unicode):
      return text
    else:
      raise ValueError("Unsupported string type: %s" % (type(text)))
  else:
    raise ValueError("Not running on Python 2 or Python 3?")

Since six handles Python 2 and Python 3 compatibility, would the convert_to_unicode(text) function above be equivalent to just six.text_type(text)? That is:

def convert_to_unicode(text):
    return six.text_type(text)

Are there cases that the original convert_to_unicode handles but six.text_type can't?
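
One way to probe the question is to pass a bytes value through both versions and compare the results. The snippet below is a minimal sketch that assumes Python 3, where six.text_type is an alias for the built-in str. To stay self-contained it uses a local stand-in named text_type instead of importing six, and it reproduces only the Python 3 branch of the original function.

```python
# Assumption: running on Python 3, where six.text_type is the built-in str.
text_type = str  # local stand-in for six.text_type


def convert_to_unicode(text):
    """Python 3 branch of the original function."""
    if isinstance(text, str):
        return text
    elif isinstance(text, bytes):
        return text.decode("utf-8", "ignore")
    raise ValueError("Unsupported string type: %s" % type(text))


raw = "café".encode("utf-8")  # UTF-8 encoded bytes

decoded = convert_to_unicode(raw)  # decodes the bytes as UTF-8
stringified = text_type(raw)       # str(bytes) gives the repr of the bytes object

print(repr(decoded))
print(repr(stringified))
print(decoded == stringified)
```

Under these assumptions the two results differ for bytes input: the original function decodes, while calling str on a bytes object without an encoding argument only produces its printable representation. The original function also raises ValueError for non-string input such as an int, whereas str converts it.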


