Non-ascii character #251

Open
tokejepsen opened this issue Aug 10, 2017 · 32 comments

@tokejepsen
Member

tokejepsen commented Aug 10, 2017

Description

Getting an error when a non-ASCII character is in play.

Reproducible

  1. Set environment variable NONASCIICHARACTER with a non-ascii character, like "ö".
  2. Log the environment variable in a plugin.
import os
import pyblish.api

class NonAcsiiCharacterCollector(pyblish.api.ContextPlugin):
    order = pyblish.api.CollectorOrder

    def process(self, context):
        self.log.info(os.environ["NONASCIICHARACTER"])

Error produced:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\test-environment\lib\threading.py", line 801, in __bootstrap_inner
    self.run()
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\test-environment\lib\threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\Users\tje\Documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-qml\pyblish_qml\ipc\server.py", line 209, in _listen
    "payload": result
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\test-environment\lib\json\__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\test-environment\lib\json\encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\test-environment\lib\json\encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6 in position 2: invalid start byte
@tokejepsen tokejepsen added the bug label Aug 10, 2017
@tokejepsen
Member Author

tokejepsen commented Aug 10, 2017

The origin of this issue was from this error:

Traceback (most recent call last):
 File "C:/Program Files/Nuke10.5v1/plugins\nuke\executeInMain.py", line 19, in executeInMainThreadWithResult
   r = _nuke.RunInMainThread.result(id)
 File "\\10.11.0.184\171000_tgb_library\personalfolders\tokejepsen\nuke-pipeline-wip\repositories\tgbvfx-environment\pyblish-qml\pyblish_qml\ipc\service.py", line 95, in process
   return formatting.format_result(result)
 File "\\10.11.0.184\171000_tgb_library\personalfolders\tokejepsen\nuke-pipeline-wip\repositories\tgbvfx-environment\pyblish-qml\pyblish_qml\ipc\formatting.py", line 43, in format_result
   "records": format_records(result["records"]),
 File "\\10.11.0.184\171000_tgb_library\personalfolders\tokejepsen\nuke-pipeline-wip\repositories\tgbvfx-environment\pyblish-qml\pyblish_qml\ipc\formatting.py", line 57, in format_records
   formatted.append(format_record(record_))
 File "\\10.11.0.184\171000_tgb_library\personalfolders\tokejepsen\nuke-pipeline-wip\repositories\tgbvfx-environment\pyblish-qml\pyblish_qml\ipc\formatting.py", line 89, in format_record
   record["message"] = str(record.pop("msg"))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 435: ordinal not in range(128)

My biggest problem is how to track down where this non-ASCII character is coming from. I know the user has "ö" in their name, but it's not in their Windows username, so it has to be coming from Ftrack.
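One way to trace the origin of such a character is to attach a logging handler that records where each non-ASCII message was logged. The following is a minimal sketch using only the standard library; `NonAsciiTracer` and the logger name `some.external.library` are hypothetical and not part of Pyblish or Ftrack.

```python
import logging


class NonAsciiTracer(logging.Handler):
    """Remember the call site of every log record with non-ASCII text."""

    def __init__(self):
        super(NonAsciiTracer, self).__init__()
        self.hits = []

    def emit(self, record):
        try:
            message = record.getMessage()
        except Exception:
            # Formatting itself may fail on mixed bytes and text
            message = repr(record.msg)
        if any(ord(ch) > 127 for ch in message):
            self.hits.append((record.name, record.pathname, record.lineno))


tracer = NonAsciiTracer()
logging.getLogger().addHandler(tracer)  # root logger sees propagated records

# Hypothetical third-party logger, standing in for the real culprit
log = logging.getLogger("some.external.library")
log.setLevel(logging.DEBUG)
log.debug(u"user: J\xf6rg")
log.debug(u"user: Bob")

logging.getLogger().removeHandler(tracer)
```

Each entry in `tracer.hits` holds the logger name, source file and line number of the offending call, which should point at the library doing the logging.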

@mottosso
Member

Nice catch. It's incredible that I hadn't encountered a single non-ASCII character and caught this earlier.

Which version of Python are you getting this error with?

@tokejepsen
Member Author

This is with Python 2.7.

@tokejepsen
Member Author

@mottosso got any ideas how to trigger the error message from Nuke, maybe outside of Nuke?

@tokejepsen
Member Author

It's not quite the same error as the one you get from the reproducible, but it would allow me to track down where the non-ASCII character is coming from.

@mottosso
Member

I'm having a look at it now, it looks like this triggers the same error.

>>> import os
>>> os.environ["NONASCIICHARACTER"] = "ä"
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/pyblish-qml/tests/test_formatting.py", line 33, in test_nonascii_plugin
    os.environ["NONASCIICHARACTER"] = "\xe4"
  File "/usr/lib/python3.5/os.py", line 730, in __setitem__
    value = self.encodevalue(value)
  File "/usr/lib/python3.5/os.py", line 799, in encode
    return value.encode(encoding, 'surrogateescape')
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 0: ordinal not in range(128)

@mottosso
Member

Possibly related: http://bugs.jython.org/issue1841

@tokejepsen
Member Author

I've updated the issue with the error message produced from the reproducible.

@mottosso
Member

Ok, could you try running Pyblish QML in Python 3?

@tokejepsen
Member Author

It's not an issue in Python 3.

@mottosso
Member

Ok, I'm not sure how to resolve it. If you can find a way of reproducing it without environment variables then I could put together a test for it and try and resolve it.

This works so far.

import sys
import logging

from pyblish import api, util
from pyblish_qml.ipc import formatting


def test_nonascii():
    record = logging.LogRecord(
        name="record1",
        level=logging.INFO,
        pathname=sys.modules[__name__].__file__,
        lineno=0,
        msg=u"ä",  # non-ascii
        args=[],
        exc_info=None,
        func=None
    )

    formatting.format_record(record)


def test_nonascii_plugin():
    class NonAcsiiCharacterCollector(api.ContextPlugin):
        order = api.CollectorOrder

        def process(self, context):
            self.log.info("ä")

    util.publish(plugins=[NonAcsiiCharacterCollector])

@tokejepsen
Member Author

I've narrowed down where the problem is a bit. The problem is a user's last name in Ftrack that contains a non-ASCII character. A debug message gets logged with the last name, which causes the Unicode error.

@mottosso
Member

Could you post one of the characters causing the trouble?

@tokejepsen
Member Author

tokejepsen commented Aug 10, 2017

I don't quite know how the pyblish-qml is working with json objects, so I can't get a test together that produces the error we need.
Seems to be a problem dumping the formatted records to json objects:

import sys
import logging
import os
import json

from pyblish_qml.ipc import formatting


def test_nonascii():
    record = logging.LogRecord(
        name="record1",
        level=logging.INFO,
        pathname=sys.modules[__name__].__file__,
        lineno=0,
        msg=os.environ["NONASCIICHARACTER"],
        args=[],
        exc_info=None,
        func=None
    )

    json.dumps(formatting.format_record(record))


test_nonascii()

Error:

Traceback (most recent call last):
  File "C:\Users\tje\Documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-qml\test.py", line 24, in <module>
    test_nonascii()
  File "C:\Users\tje\Documents\conda-git-deployment\repositories\tgbvfx-environment\pyblish-qml\test.py", line 21, in test_nonascii
    json.dumps(formatting.format_record(record))
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\python2\lib\json\__init__.py", line 244, in dumps
    return _default_encoder.encode(obj)
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\python2\lib\json\encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "C:\Users\tje\AppData\Local\Continuum\Miniconda2\envs\python2\lib\json\encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe4 in position 0: unexpected end of data
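The two tracebacks report different reasons for what is the same underlying problem: Python 2's `json.dumps` decodes byte strings as UTF-8 by default, and a lone Latin-1 or cp1252 byte is not valid UTF-8. A small sketch of why the wording differs (the helper name `decode_failure` is hypothetical):

```python
def decode_failure(raw):
    """Return the reason UTF-8 decoding fails, or None if it succeeds."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return None


# 0xf6 ("ö" in Latin-1) can never begin a UTF-8 sequence
reason_o = decode_failure(b"\xf6")

# 0xe4 ("ä" in Latin-1) is a valid lead byte of a three-byte sequence,
# so on its own it reads as a truncated character
reason_a = decode_failure(b"\xe4")

# The same character encoded as UTF-8 decodes without error
valid = decode_failure(u"\xe4".encode("utf-8"))
```

In both failing cases the bytes are in a single-byte code page rather than UTF-8, so decoding them with the right codec before serialising avoids the error.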

@tokejepsen
Member Author

Could you post one of the characters causing the trouble?

ö

@tokejepsen
Member Author

If you can find a way of reproducing it without environment variables

When the non-ASCII character is written directly in the Python test file, you have to declare an encoding such as # -*- coding: utf-8 -*-, and at that point it all works. But the issue is with external non-ASCII characters, such as those in environment variables, and I don't know how to simulate those without an environment variable.
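One possible way to simulate external bytes without an environment variable is to spell them with escape sequences, which keeps the source file pure ASCII so no coding header is involved. This is only a sketch using the standard library; `serialize_message` is a hypothetical helper and does not call pyblish-qml.

```python
import json

# Bytes as they might arrive from outside Python, e.g. "ö" in a Windows
# code page. The escape sequence keeps this source file pure ASCII.
EXTERNAL = b"J\xf6rg"


def serialize_message(raw, encoding):
    """Decode raw bytes with the given codec, then dump them as JSON."""
    return json.dumps({"message": raw.decode(encoding)})


try:
    serialize_message(EXTERNAL, "utf-8")
    utf8_reason = None
except UnicodeDecodeError as exc:
    # Wrong codec: the bytes are not UTF-8
    utf8_reason = exc.reason

# Correct codec: the text survives a JSON round trip
payload = serialize_message(EXTERNAL, "cp1252")
restored = json.loads(payload)["message"]
```

A test built this way should behave the same on every platform, since it does not depend on the system's default encoding.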

@mottosso
Member

Oo, I'm having a look at this but it cooks my brain.

From what I gather, when files don't specify a -*- coding: at the header of a Python module, Python automatically uses the system default, which varies by platform and sometimes individual interpreter settings and environment variables.

In your case, the plug-in you've written that logs that character is writing a unicode character that pyblish.discover() then interprets as something other than unicode, based on system defaults. In my case, the error doesn't happen under Ubuntu, but it does on Windows.

What you'll need to do, is decode data you gather externally before passing into self.log.

self.log.info(os.environ["NONASCII"].decode('ascii', 'ignore'))

It'll remove characters it doesn't understand. It's not ideal, but Unicode is my Achilles heel and it's the best I've got. Let me know how that works for you.
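A less lossy variant would be to try a couple of likely encodings before giving up. The sketch below is an assumption on my part, not something Pyblish provides: `to_text` is a hypothetical helper, and the Latin-1 fallback may guess wrong for other code pages.

```python
def to_text(value, encodings=("utf-8", "latin-1")):
    """Return value as text, trying each encoding in turn for bytes."""
    if not isinstance(value, bytes):
        return value  # already text
    for encoding in encodings:
        try:
            return value.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reached if the caller passed encodings that all failed
    return value.decode("ascii", "replace")


raw = u"J\xf6rg".encode("latin-1")

lossy = raw.decode("ascii", "ignore")  # drops the "ö"
kept = to_text(raw)                    # keeps the "ö"
```

The plug-in would then call something like `self.log.info(to_text(os.environ["NONASCII"]))` instead of discarding the characters.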

@tokejepsen
Member Author

Hey @mottosso

Thanks for looking into this, that sounds very logical. I think what's even worse about this situation is that I'm not actually making the log call. It's Ftrack that has a debug log call with the non-ASCII characters.

For now we have just made sure that all names in Ftrack are ascii friendly.

@mottosso
Member

That is odd. Pyblish shouldn't collect log messages from anything but logs coming from a pyblish handler. That sounds like a bug, and maybe that one is one we could tackle.

This is the one responsible, would you like to have a look? It should collect logs, but only from those coming from the plugin it's currently running.

@tokejepsen
Member Author

I'm unsure what the intended functionality of the handler is. Should it only take in logs from the pyblish logger?
Is this test supposed to fail?

import logging

from pyblish import util, api


class collect(api.ContextPlugin):

    def process(self, context):
        log = logging.getLogger("temp_logger")
        log.info("I should not be recorded!")


context = util.publish(plugins=[collect])
for result in context.data["results"]:
    for record in result["records"]:
        print(record)
        print(record.name)
        assert record.name.startswith("pyblish")

@mottosso
Member

It should only collect log messages made with self.log from within the plug-in.

It's possible we'll need to replace the current logger if it cannot steer clear of accidentally picking up other log messages.

In your example, the "I should not be recorded!" message should indeed not be recorded in the results list.
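As a rough illustration of that intent, the standard library's `logging.Filter` can restrict a handler to one logger hierarchy. This is a sketch only, not the current Pyblish handler; `RecordCollector` is hypothetical, and it assumes plug-in loggers are named with a `pyblish.` prefix.

```python
import logging


class RecordCollector(logging.Handler):
    """Collect records, but only from one logger hierarchy."""

    def __init__(self, prefix):
        super(RecordCollector, self).__init__()
        self.records = []
        # logging.Filter(name) passes `name` and its children only
        self.addFilter(logging.Filter(prefix))

    def emit(self, record):
        self.records.append(record)


collector = RecordCollector("pyblish")
root = logging.getLogger()
root.addHandler(collector)

plugin_log = logging.getLogger("pyblish.MyPlugin")  # assumed naming
other_log = logging.getLogger("temp_logger")
for log in (plugin_log, other_log):
    log.setLevel(logging.INFO)

plugin_log.info("I should be recorded")
other_log.info("I should not be recorded!")

root.removeHandler(collector)
names = [record.name for record in collector.records]
```

Here only the record from `pyblish.MyPlugin` ends up in `collector.records`, because the filter rejects `temp_logger` before `emit` is called.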

@tokejepsen
Member Author

It's possible we'll need to replace the current logger if it cannot steer clear of accidentally picking up other log messages.

Got any ideas of how to approach this?

Maybe I should make a test first and then fix it?

@tokejepsen
Member Author

The specific issue of getting the Ftrack non-ascii user names will be solved with pyblish/pyblish-base#319, so I'm happy :)

Can't say that it solves the underlying issue with serializing non-ascii characters with pyblish-qml. So should we close this issue or keep it open?

@mottosso
Member

Exactly, this is still an issue if (for whatever reason) the Python script performing the logging uses e.g. UTF-8 and includes a character not in the ASCII table.

The ideal solution is for me to understand Unicode and write Unicode-compatible code. One can dream.

Let's leave it open.

@virtualengineer

virtualengineer commented Jun 18, 2019

I would like to add to this, as I am dealing with unicode / utf-8 strings in Pyblish and have noticed differences between Pyblish-Lite and Pyblish-QML.

OS: Windows 10
IDE : PyCharm / MayaCharm plugin
Python : 3.7
Pyblish: pyblish-lite, pyblish-qml, pyblish-maya

Update:
The script is being tested under Maya 2018, Python version 2.7.11
However, the pyblish-qml Python executable is set to a Python 3.7 exe file.
ex: pyblish_qml.api.register_python_executable(r"C:\python37-64\python.exe")

I have a Validator plugin to validate a model in Maya.

In my Python file, encoding type is at top:

# -*- coding: utf-8 -*-

import pyblish.api


class ValidateModel(pyblish.api.InstancePlugin):
    # pyblish label
    label = "Validate Models"
    # pyblish order
    order = pyblish.api.ValidatorOrder
    # pyblish hosts
    hosts = ["maya"]

    def process(self, instance):
        from maya import cmds
        model_name = instance.data["name"]
        model_node = instance

        # example of differences with assert / log

        # this works on pyblish-lite gui,  but will crash on pyblish-qml
        assert model_node == ["model"], (u"{0}: メッシュ名は 'model'でなければなりません。".format(model_name))
        self.log.info(u"{0}: Model検証に合格しました。".format(model_name))
         
        # this works on pyblish-qml gui, but does not display anything in the pyblish-lite gui
        assert model_node == ["model"], (u"{0}: メッシュ名は 'model'でなければなりません。".format(model_name).encode("utf-8"))
        self.log.info(u"{0}: Model検証に合格しました。".format(model_name).encode("utf-8"))

The strings are marked as unicode strings with the u before the quotes.
The difference is that the string must be encoded as utf-8 when using pyblish-qml, or else it will crash.
The utf-8 encoded string does not display at all in the pyblish-lite gui, so all error messages (assert) are missing.

@mottosso
Member

mottosso commented Jun 18, 2019

Thanks for reporting this @virtualengineer.

differences between Pyblish-Lite and Pyblish-QML.

Do you mean between Python 2 and 3? If you're running Lite, you are probably (?) running Python 2, and the error suggests Python 3.

Would you be able to test this at a lower level, e.g. by using a QListView like Lite does?

# Untested
# From e.g. Maya Script Editor
from Qt import QtWidgets, QtCore
model = QtCore.QStringListModel([u"{0}: Model検証に合格しました。".format("model name")])
view = QtWidgets.QListView()
view.setModel(model)

Then try running this from e.g. a separate script, with a new QApplication.

@virtualengineer

virtualengineer commented Jun 18, 2019

I am currently testing under Maya 2018 using PyCharm / MayaCharm plugin.
However, the Pyblish-QML python executable is set to a python 3.7 .exe file.
pyblish_qml.api.register_python_executable(r"C:\python37-64\python.exe")

@mottosso
Member

Ok, don't forget to test without MayaCharm, as it may also affect encoding as text is transferred between PyCharm and Maya.

Also keep in mind pyblish-qml is tested with Python 2.7-3.6 and that you may encounter other issues using 3.7.

@davidlatwe
Collaborator

davidlatwe commented Aug 23, 2019

Seems I have fixed it :O
But I'm not sure I resolved this correctly. Is anyone able to have a look or test?
It's in my forked branch Fix251, and here's my test case.

Edit:
The error encoding simply takes the message from args[0]. This is not enough, but that's the idea for now.

I ran this in my Sublime Text editor with the build system set to Python 2.7, and ran QML in Python 3.6.
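The idea of reading the message from `args[0]` could be sketched roughly as below. This is not the code from the Fix251 branch; `exception_message` is a hypothetical helper, and decoding byte messages as UTF-8 is an assumption.

```python
def exception_message(exc, encoding="utf-8"):
    """Return the first exception argument as text."""
    if not exc.args:
        return u""
    first = exc.args[0]
    if isinstance(first, bytes):
        # Byte messages are assumed to be UTF-8; bad bytes are replaced
        return first.decode(encoding, "replace")
    return u"%s" % (first,)


text_error = AssertionError(u"\u30e1\u30c3\u30b7\u30e5")
bytes_error = AssertionError(u"\u30e1\u30c3\u30b7\u30e5".encode("utf-8"))

from_text = exception_message(text_error)
from_bytes = exception_message(bytes_error)
```

Both calls return the same text, so a message would look the same whether or not the plug-in called `.encode("utf-8")` on it first.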

@virtualengineer

Seems I have fixed it :O
But I'm not sure I resolved this correctly. Is anyone able to have a look or test?
It's in my forked branch Fix251

Thanks David.
I will test this out after it is merged.
I hope it works, because I currently have to use a safe_string util function so that I can safely switch back and forth between Lite and QML without crashing.

@mottosso
Member

@virtualengineer If you are able, it would be better if you could test it before we merge and release; that way we'll know that the released version actually works.

cd some_install_dir
pip install git+https://github.com/davidlatwe/pyblish-qml.git@Fix251 --target some_install_dir

This will automatically clone and install his fork into that specific folder, in case you don't want to use a virtualenv or pollute your global packages.

@virtualengineer

virtualengineer commented Aug 30, 2019

Well, I can't test with python3 at the moment due to an issue with the custom python setup I am working with. After the issue is resolved I will test the above code with python3.

As for Python27,
I tried installing PyQt5 for Python27 and disabled my "pyblish-gui safe" utf8 string encoding function.
Then in Maya, I tried using both the lite gui and the qml gui.
Under Python27, there were no problems displaying utf8 non-english strings in either gui.
No crashes, all good.

Note for other readers: PyQt5 for Python27 was installed using:
c:\python27\python -m pip install git+git://github.com/pyqt/python-qt5.git
