Add UTF8 BOM to csv export files #8637
Comments
cc @CharlesLdy
cc @Bargs
I think this is a better way to avoid the garbled text. My question is whether I will only be able to use it in Kibana 5.0 once the feature is added. My production version is currently Kibana 4.5.4.
Since this would be a bug fix and not a new feature, it could be back-ported to the 4.6 branch and come out in the next release of that branch.
It looks like adding the UTF-8 BOM doesn't fix all versions of Excel, specifically Excel 2011 for Mac. However, it does fix the problem in Microsoft Excel 2016 on both Mac and Windows, and it's the technically correct way to format the file.
Kibana version: 5.0
Elasticsearch version: 5.0
Server OS version:
Browser version:
Browser OS version:
Original install method (e.g. download page, yum, from source, etc.):
Description of the problem including expected versus actual behavior:
There have been some GitHub issues (#8598)
and Discuss threads (https://discuss.elastic.co/t/resolved-korean-text-is-broken-when-open-the-exported-csv-with-excel/53436/4)
about Excel opening the CSV export files without showing the UTF-8 characters properly. There's a workaround, but it seems like we should be writing the byte order mark (BOM) into the CSV we create. If we do that, Excel opens the files properly in my testing.
The workaround is:
Instead of just clicking the CSV to open it with Excel,
open Excel, use Data > Get External Data > From File, and check that the
File origin is 65001: Unicode (UTF-8)
Kibana does seem to export UTF-8 characters correctly into the CSV, but it doesn't write the BOM at the start of the file. Here's how you can see that:
Steps to reproduce:
Go to any Kibana visualization.
Click the little arrow to collapse the visualization and see the Export links.
I exported Raw, but I'm pretty sure the only difference between Raw and Formatted is how dates are exported.
In my case (CPU Usage visualization from Metricbeat) my data was this:
and you can see there's no BOM, like this:
This data only contains 7-bit ASCII characters, so it opens in Excel just fine. So I modified it by pasting some Kanji characters into it using Notepad++, so now I have this:
Just saving with Notepad++ doesn't add the byte order mark. And when I open the file with Excel I see the problem reported in the Discuss and GitHub issues:
But Notepad++ has a menu item
Encoding > Convert to UTF-8-BOM
and when I use that, save it as a new file, and check the first 3 bytes, I see the BOM:
And when I open that with Excel (2016 on Windows 10) it appears correctly:
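The first-3-bytes check described above can also be done without a hex viewer. The following is a minimal Python sketch, not Kibana code; the function names `has_utf8_bom` and `file_has_utf8_bom` are hypothetical and chosen only for illustration. It tests whether the data begins with the UTF-8 BOM bytes `EF BB BF`.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes begin with the UTF-8 BOM (EF BB BF)."""
    # codecs.BOM_UTF8 is the constant b'\xef\xbb\xbf' from the standard library.
    return data.startswith(codecs.BOM_UTF8)


def file_has_utf8_bom(path: str) -> bool:
    """Read only the first three bytes of a file and check them for the BOM."""
    # Hypothetical helper: `path` would be the exported CSV file.
    with open(path, "rb") as f:
        return has_utf8_bom(f.read(3))
```

Running `file_has_utf8_bom` on the original export and on the file saved with Convert to UTF-8-BOM should distinguish the two cases described in the steps.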
So I conclude that we should be writing that BOM in the CSV Export output file. It's proper i18n handling of data.
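As a rough illustration of the proposed change, the sketch below prepends the UTF-8 BOM to CSV text before it is written out. This is an assumption-laden Python example rather than Kibana's actual export code (Kibana is written in JavaScript), and `with_utf8_bom` is a hypothetical name used only here.

```python
import codecs


def with_utf8_bom(csv_text: str) -> bytes:
    """Encode CSV text as UTF-8 and make sure it starts with the BOM."""
    data = csv_text.encode("utf-8")
    # Avoid writing a second BOM if the text already begins with U+FEFF.
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data


# Hypothetical sample row containing non-ASCII characters.
exported = with_utf8_bom("name,value\n漢字,1\n")
```

In Python, encoding with the built-in `utf-8-sig` codec produces the same leading bytes, and decoding with `utf-8-sig` strips the BOM again, so readers that understand the BOM see the original text unchanged.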