Debugging #33

dejmail · 2020-09-10T11:40:11Z

Hi there

This might be a question for DRF instead, but how exactly does one use pdb with this library? If I insert a set_trace(), the Django server output keeps scrolling past, so even though I can interact with pdb, the command prompt disappears under a torrent of HTTP requests. Is there any way to pause everything so I can debug?

Thanks


techdragon · 2021-11-10T04:25:18Z

I wouldn't mind some debugging insights either, though not for the reasons you asked. Even with "BACKEND": "data_wizard.backends.immediate", I couldn't get my IDE (PyCharm) to catch any errors from data_wizard. Combined with the complexity of the test setup, this makes the library harder to work with than it needs to be. I'm trying to fix #31 because I use the very common django-storages library, and I'm making zero progress because I get no useful output from tests or debugging.

(see wq/django-data-wizard#33)

sheppard · 2021-11-18T07:25:39Z

I will add some documentation on debugging tips, but here are a few things to start:

General Tips

Given the wide variety of use cases and failure points, Data Wizard traps most errors by default, so that the user gets a short, hopefully informative message rather than a generic 500 error. The trapped errors are logged via Python's logging module.
The threading backend (enabled by default) adds another layer of indirection when trying to identify an exception.
Thus, if you are writing a custom Iter or Serializer class, make sure each component works in isolation before trying to debug within the Data Wizard stack. (See the examples below.)
Once you have confirmed that itertable and the serializer are working individually, try running data_wizard without any web UI traffic via the CLI (./manage.py runwizard).
Once that is working, try running through the web UI with ./manage.py runserver and the immediate backend:

DATA_WIZARD = {
    "BACKEND": "data_wizard.backends.immediate"
}
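Since the trapped errors go to Python's logging module, it can help to make sure those log records actually reach the runserver console. The following is a minimal sketch of a Django LOGGING setting, not something taken from the Data Wizard docs. It configures the root logger on purpose, so it does not rely on any assumption about which logger names Data Wizard uses internally; the WARNING level is also an assumption and may need to be lowered.

```python
# settings.py (sketch): route log records, including tracebacks from
# trapped exceptions, to the console where runserver is running.
LOGGING = {
    "version": 1,
    # Keep Django's own loggers active alongside this configuration.
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    # Configuring the root logger avoids guessing which logger name
    # the library uses. Lower the level to "DEBUG" if the expected
    # messages do not appear.
    "root": {"handlers": ["console"], "level": "WARNING"},
}
```

With this in place, anything logged at WARNING or above by any library should be printed to stderr, alongside the request log.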

Debugging File Loading/Parsing (IterTable)

To debug issues loading and parsing files, try using itertable directly:

from itertable import load_file

for row in load_file('/path/to/file.xlsx'):
    print(row)

Note that existing releases of itertable automatically suppress the OSError raised when a file is inaccessible, so it doesn't even make it back to Data Wizard. For the next release, I changed this to raise itertable.exceptions.LoadFailed unless require_existing is explicitly set to False.

If you are writing a custom Iter class, test the class with a similar loop:

from myapp import CustomIter

for row in CustomIter(filename='/path/to/file.xlsx'):
    print(row)
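Because existing itertable releases swallow the OSError for an inaccessible file, one way to rule out path and permission problems is to check the file yourself before handing it to itertable or a custom Iter class. The sketch below uses only the standard library; check_readable is a hypothetical helper name, not part of itertable or Data Wizard.

```python
from pathlib import Path


def check_readable(path):
    """Raise a descriptive error if `path` cannot be opened for reading.

    Hypothetical helper for debugging; not part of itertable.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    # Opening the file surfaces permission problems as an OSError
    # before any parsing library gets a chance to hide them.
    with p.open("rb"):
        pass
    return p


# Possible usage with the loop shown earlier (requires itertable):
#   for row in load_file(str(check_readable('/path/to/file.xlsx'))):
#       print(row)
```

If the helper raises, the problem is with the path or permissions rather than with parsing.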

Debugging the Serializer (DRF)

To investigate validation issues, try instantiating the DRF serializer class directly.

from data_wizard import registry
Serializer = registry.get_serializer("My Model")
serializer = Serializer(data={"test": "data"})
serializer.is_valid(raise_exception=True)

Note that data_wizard traps any and all serializer errors for individual rows, saving only the error text to the Record table. The full stack trace is still sent to the Python logging module.
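To illustrate why the Record table shows only a short message while the log holds the full traceback, here is a generic sketch of the trap-and-log pattern described above. It is not Data Wizard's actual code; process_rows, handle_row, and the logger name are all hypothetical.

```python
import logging

logger = logging.getLogger("row_import_demo")  # hypothetical logger name


def process_rows(rows, handle_row):
    """Apply handle_row to each row, trapping per-row failures.

    Returns a list of (row_index, error_text) pairs. The traceback for
    each failure is sent to the logging module instead.
    """
    errors = []
    for index, row in enumerate(rows):
        try:
            handle_row(row)
        except Exception as exc:
            # Keep only the short message, as one might store in a table.
            errors.append((index, str(exc)))
            # logger.exception logs at ERROR level and appends the traceback.
            logger.exception("Row %d failed", index)
    return errors
```

In a setup like this, the stored error text alone rarely says where the failure happened, which is why checking the log output is worth doing first.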

sheppard added a commit to wq/itertable that referenced this issue Nov 18, 2021

require_existing=True by default

b7b5ddc

(see wq/django-data-wizard#33)

sheppard pinned this issue Nov 18, 2021

sheppard added the question label Nov 19, 2021

sheppard closed this as completed in 466a240 Jun 8, 2023

sheppard unpinned this issue Jun 8, 2023
