docs: csv encode format #3881

Merged 2 commits on Apr 22, 2024
4 changes: 4 additions & 0 deletions docs/en/openmldb_sql/dml/LOAD_DATA_STATEMENT.md
@@ -201,6 +201,10 @@ Importing supports two data formats: CSV and Parquet. Special attention should b
2. Date values cannot include a time component (`yyyy-MM-dd HH:mm:ss`); for instance, `2022-2-2 00:00:00` results in a parsing error.
5. Local mode does not support quote escaping in strings. If your strings contain quote characters, use cluster mode.
6. If cluster mode hits a parsing failure while reading CSV, it sets the failed column value to NULL and continues the import. In local mode, a parsing failure raises an error immediately and the import stops.
7. Use UTF-8 encoding for CSV files; GB-family Chinese encodings such as GBK are not supported. If Chinese characters in the table data appear garbled after a CSV import, convert the source CSV to UTF-8 before importing. For example:
```bash
iconv -f GBK -t UTF-8 gbk.csv > utf8.csv
```
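
When it is not known in advance whether a CSV file is already UTF-8, the conversion can be guarded by a validity check. The following is a minimal sketch, not part of OpenMLDB: `to_utf8` is a hypothetical helper name, and it assumes that any file which is not valid UTF-8 is GBK-encoded. The check is a heuristic, since some GBK byte sequences also happen to be valid UTF-8, so spot-check the output before importing.

```bash
# Hypothetical helper (not part of OpenMLDB): write a UTF-8 copy of a CSV file.
# Assumption: any input that is not valid UTF-8 is GBK-encoded.
to_utf8() {
  src=$1
  dst=$2
  if iconv -f UTF-8 -t UTF-8 "$src" > /dev/null 2>&1; then
    # Input decodes cleanly as UTF-8: copy it unchanged.
    cp "$src" "$dst"
  else
    # Input is not valid UTF-8: convert it, assuming GBK as the source encoding.
    iconv -f GBK -t UTF-8 "$src" > "$dst"
  fi
}
```

For example, `to_utf8 data.csv data_utf8.csv` (both file names are placeholders) leaves an already-UTF-8 file untouched and converts a GBK file, so the same step can be run on every source file before `LOAD DATA`.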

## PutIfAbsent Explanation

4 changes: 4 additions & 0 deletions docs/zh/openmldb_sql/dml/LOAD_DATA_STATEMENT.md
@@ -200,6 +200,10 @@ curl http://<ns_endpoint>/NameServer/UpdateOfflineTableInfo -d '{"db":"<db_name>
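2. Date values cannot include a time component (year, month, day, hour, minute, second); for example, `2022-2-2 00:00:00` fails to parse.
5. Local mode does not support quote escaping in strings, so if your strings contain quote characters, use cluster mode.
6. If cluster mode fails to parse a value while reading CSV, it sets the failed column value to NULL and continues the import, whereas local mode reports an error immediately and does not continue.
7. UTF-8 is the recommended encoding for CSV files; GB-family Chinese character encodings are not supported. If Chinese characters in the table data appear garbled after a CSV import, convert the source CSV data before importing. Reference command: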
```bash
iconv -f GBK -t UTF-8 gbk.csv > utf8.csv
```

## PutIfAbsent Explanation
