ClickHouse some ENUM types doesn't stores in "datasources" - psycopg2.errors.StringDataRightTruncation: value too long for type character varying(32) #13572
Comments
@hodgesrm when we try to add a "dataset", Superset itself tries to store the field names and types of the selected table in an internal PostgreSQL / MySQL / SQLite table with name
Here's a detailed reproduction of this issue. It does not come up in SQLite when I use a virtual env and dev setup. It seems to be related to the VARCHAR(32) column size in PostgreSQL.
ClickHouse Enum types are potentially very large because the type name lists every value. It's hard to give an upper bound for ClickHouse types, but if you are storing ClickHouse native type names in a VARCHAR I would allocate at least 500 characters. Another option, obviously, would be to convert to standard SQL types within Superset, which I understand is possible to do.
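To make the size problem concrete, here is a minimal Python sketch. It is not Superset code: the Enum definition, the helper names `fits` and `to_generic_type`, and the choice of `STRING` as the generic name are all assumptions for illustration. Only the 32-character limit comes from the error message in the issue title.

```python
import re

# Width taken from the "character varying(32)" error in the issue title.
MAX_TYPE_LENGTH = 32

# Hypothetical Enum definition; real ones can list many more values.
ENUM_TYPE = "Enum8('pending' = 1, 'running' = 2, 'finished' = 3, 'failed' = 4)"


def fits(type_name: str, limit: int = MAX_TYPE_LENGTH) -> bool:
    """Return True if the type name would fit in a VARCHAR(limit) column."""
    return len(type_name) <= limit


def to_generic_type(type_name: str) -> str:
    """Map a ClickHouse Enum8/Enum16 definition to a short generic name.

    Illustrative only: "STRING" is an assumed target name, not a mapping
    that Superset or the ClickHouse driver is known to use.
    """
    if re.match(r"^Enum(8|16)\s*\(", type_name):
        return "STRING"
    return type_name


if __name__ == "__main__":
    print(len(ENUM_TYPE), fits(ENUM_TYPE))
    print(to_generic_type(ENUM_TYPE), fits(to_generic_type(ENUM_TYPE)))
```

Either approach avoids the truncation error: a wider column stores the full definition, while the conversion stores only a short name and discards the list of values.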
I have encountered similar issues with Trino columns. Trino supports nested data types such as an array of structs with many fields. Some of these fields can themselves be arrays of other structs, and so on, so the type of the column can be quite long.
When you try to save a dataset containing columns like this, it can fail because Superset's dataset model only allocates 32 characters to store the type of the column. In the UI you only see an error that does not explain why the save failed. The server logs show the cause, which is that a column type is too long to fit into the 32 characters, but as a user you have no way of knowing that. If the type is shorter than 32 characters, say STRUCT< F1 INT , F2 STRING>, Superset will accept it and will treat it as a string when rendering the value.
Instead of trying to store the full type of such columns, it might be better to fall back to a type that indicates it's a complex type. For example, if a column is of type ARRAY or STRUCT, save it as COMPLEX or JSON. @villebro did you encounter this issue before? Any other solution?
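The fallback proposed above could look roughly like the following Python sketch. This is an illustration under assumptions, not Superset's implementation: the function name `storable_type`, the prefix list, and the choice of `COMPLEX` over `JSON` are all hypothetical.

```python
# Prefixes that mark a nested type; the comment above names ARRAY and STRUCT.
COMPLEX_PREFIXES = ("ARRAY", "STRUCT")

# Assumed placeholder name; "JSON" is the other option mentioned above.
FALLBACK_TYPE = "COMPLEX"


def storable_type(native_type: str) -> str:
    """Return a short type name that is safe to store for a column.

    Nested types are replaced by a fixed placeholder, so their length no
    longer depends on how many fields they contain. Other types are
    returned unchanged apart from surrounding whitespace.
    """
    cleaned = native_type.strip()
    if cleaned.upper().startswith(COMPLEX_PREFIXES):
        return FALLBACK_TYPE
    return cleaned


if __name__ == "__main__":
    for name in ("STRUCT< F1 INT , F2 STRING>", "array(row(a integer))", "VARCHAR"):
        print(name, "->", storable_type(name))
```

Note that this only handles the two prefixes mentioned in this thread; other long type names, such as the ClickHouse Enum definitions from the original report, would still need a wider column or a separate mapping.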
Thanks for raising this issue. This isn't the first time people have bumped into the 32-character limit, and with NoSQL/complex datatypes becoming ever more prevalent, we should definitely address this shortcoming. Let's just change it to On a related note, @betodealmeida did amazing work to improve support for complex data types on BigQuery in #16822. I'm personally a big fan of ClickHouse and Trino, and would love to see similar functionality and tests added for ClickHouse and Trino, too. So let's collaborate on this!
Sounds good. We will explore falling back to TEXT when the field is of STRUCT/ARRAY type. This should eliminate the 32-character issue while keeping the current functionality.
Fixed by #17360, closing.
I set up Superset 1.0.1 with the master version of the ClickHouse driver from git: https://github.com/xzkostyan/clickhouse-sqlalchemy
I successfully added a "database", then tried to add a new "dataset" and pressed "Add".
Expected results
Dataset added successfully
Actual results
Additional context
Stacktrace
How exactly should the ENUM8 type from ClickHouse be converted for Superset?