Add UTF8 BOM to csv export files #8637

LeeDr · 2016-10-12T14:41:07Z

Kibana version: 5.0

Elasticsearch version: 5.0

Server OS version:

Browser version:

Browser OS version:

Original install method (e.g. download page, yum, from source, etc.):

Description of the problem including expected versus actual behavior:
There have been some GitHub issues (#8598)
and Discuss threads (https://discuss.elastic.co/t/resolved-korean-text-is-broken-when-open-the-exported-csv-with-excel/53436/4)
about Excel not displaying UTF-8 characters properly when it opens the CSV export files. There's a workaround, but it seems like we should be writing the byte order mark (BOM) into the CSV files we create. If we do that, Excel opens the files properly in my testing.

The workaround is:

Instead of just clicking the CSV to open it with Excel,
open Excel, use Data > Get External Data > From File, and check that the
File origin is 65001: Unicode (UTF-8).
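
The workaround amounts to telling Excel which decoder to use. As a rough illustration in Python (not Kibana's or Excel's code), the sketch below decodes the same UTF-8 bytes two ways. Windows-1252 (`cp1252`) is only an assumed stand-in for whatever legacy code page Excel falls back to when there is no BOM; the real fallback depends on the system locale.

```python
# Illustration only: cp1252 is an assumed fallback code page, not
# necessarily the one Excel picks on a given machine.
header = "idle,sys,漢字user"
raw = header.encode("utf-8")      # the bytes as they sit in the CSV file

as_utf8 = raw.decode("utf-8")     # what choosing "65001: Unicode (UTF-8)" does
as_legacy = raw.decode("cp1252")  # mis-decoding with a legacy code page

print(as_utf8)    # the original header, Kanji intact
print(as_legacy)  # ASCII parts survive, the Kanji turn into garbage
```

The ASCII part of the line is identical under both decoders, which is why a purely ASCII export opens fine either way.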

Kibana does seem to export UTF-8 characters correctly into the CSV, but it doesn't write the BOM to the file. Here's how you can see that:

Steps to reproduce:

Go to any Kibana visualization.
Click the little arrow to collapse the visualization and see the Export links.
Export the data. I exported Raw, but I'm pretty sure the only difference between Raw and Formatted is how dates are exported.

In my case (CPU Usage visualization from Metricbeat), my data was this:

$ cat '/c/Users/Lee/Downloads/CPU Usage.csv'
idle,sys,user
"0.9934333333333334","0.0018000000000000006","0.0015555555555555557"

And you can see there's no BOM like this:

$ head -c 3 '/c/Users/Lee/Downloads/CPU Usage.csv' | hexdump -C
00000000  69 64 6c                                          |idl|
00000003
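
The same check as the `head -c 3 ... | hexdump -C` command can be sketched in Python for use in a script or test. The function name `has_utf8_bom` is hypothetical; `codecs.BOM_UTF8` is the standard library constant for the bytes `EF BB BF`.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:
        # Read only the first three bytes, like `head -c 3`.
        return f.read(3) == codecs.BOM_UTF8
```

Calling `has_utf8_bom` on the exported `CPU Usage.csv` would be expected to return `False`, since its first three bytes are `69 64 6c` ("idl").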

This data contains only 7-bit ASCII characters, so it opens in Excel just fine. So I modified it by pasting some Kanji characters into it using Notepad++, and now I have this:
```
idle,sys,漢字user
"0.9934333333333334","0.0018000000000000006","0.0015555555555555557"
```
Just saving with Notepad++ doesn't add the byte order mark, and when I open the file with Excel I see the problem reported in the Discuss thread and GitHub issues.
But Notepad++ has a menu item, Encoding > Convert to UTF-8-BOM. When I use that, save the result as a new file, and check the first 3 bytes, I see the BOM:
```
$ head -c 3 '/c/Users/Lee/Downloads/CPUUsageBOM.csv' | hexdump -C
00000000  ef bb bf                                          |...|
00000003
```
And when I open that with Excel (2016 on Windows 10), it appears correctly.
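
For files that have already been exported, the manual Notepad++ step can be approximated in a script. This is a minimal sketch, assuming the input is already valid UTF-8; the function name `add_utf8_bom` and its parameters are hypothetical, and Notepad++ may do more than this when it converts a file.

```python
import codecs


def add_utf8_bom(src_path, dst_path):
    """Copy src_path to dst_path, prepending the UTF-8 BOM if it is missing."""
    with open(src_path, "rb") as src:
        data = src.read()
    # Avoid writing a second BOM if the file already starts with one.
    if not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data
    with open(dst_path, "wb") as dst:
        dst.write(data)
```

Working on raw bytes keeps the rest of the file untouched, so only the first three bytes differ between the input and the output.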

So I conclude that we should be writing that BOM in the CSV Export output file. It's proper i18n handling of data.
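
To make the proposal concrete, here is a minimal sketch of writing the BOM at export time. It is Python rather than Kibana's JavaScript and is not the actual Kibana change; the function name `export_csv` and its parameters are hypothetical. Python's `utf-8-sig` codec writes the bytes `EF BB BF` once at the start of the file and then encodes the rest as plain UTF-8.

```python
import csv


def export_csv(path, header, rows):
    """Write header and rows as a UTF-8 CSV file that starts with a BOM."""
    # "utf-8-sig" emits the BOM before the first character written.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
```

Consumers that read the file back with the same `utf-8-sig` codec get the header without the BOM character, while a plain `utf-8` reader sees a leading U+FEFF on the first field.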


LeeDr · 2016-10-12T14:43:19Z

cc @CharlesLdy

LeeDr · 2016-10-12T14:43:49Z

cc @Bargs

CharlesLdy · 2016-10-13T01:55:35Z

I think this is a better way to avoid the garbled text. My question is whether I will only be able to use it in Kibana 5.0 once the feature is added. My production version is currently Kibana 4.5.4.

LeeDr · 2016-10-13T14:32:03Z

Since this would be a bug fix and not a new feature, it could be back-ported to the 4.6 branch and come out in the next release of that branch.

kobelb · 2016-10-13T14:51:12Z

It looks like adding the UTF-8 BOM doesn't fix all versions of Excel, specifically Excel 2011 for Mac. However, it's fixing it for Microsoft Excel 2016 on both Mac and Windows, and it's the technically correct way to format the file.

LeeDr added the bug (Fixes for quality problems that affect the customer experience) and P2 labels Oct 12, 2016

LeeDr mentioned this issue Oct 12, 2016

The export of Chinese data become messy code #8598

Closed

kobelb self-assigned this Oct 13, 2016

kobelb mentioned this issue Oct 13, 2016

Specifying the utf-8 charset when exporting aggregate tables #8662

Merged

jbudz added the PR sent label Oct 17, 2016

kobelb closed this as completed in #8662 Oct 28, 2016

LeeDr mentioned this issue Nov 21, 2016

Vislib Point Series updates #9044

Merged

1Copenut mentioned this issue Oct 4, 2024

[Alerts > Landing][SCREEN READER]: EuiButtonGroup with nested headings would be better as EuiTabs tabbed content #195077

Open

2 tasks
