Fix: characters with special characters are secretly two characters #440

nienna73 · 2023-04-05T00:20:41Z

When importing pretty much all of Jean's recordings, any character with a circumflex or a macron gets stored as two characters: e + ˆ instead of ê. There's a script in progress that's supposed to find the combining circumflex (U+0302, which is \xcc\x82 in UTF-8) together with its base letter and replace the pair with the correct single character.

This script isn't working.

The only way I've done this successfully is by doing it manually.

The script is here: https://github.com/UAlbertaALTLab/recording-validation-interface/blob/production/validation/management/commands/lookforcombinedcharacters.py
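For reference, here is a minimal sketch of one way the two-character sequences could be detected and collapsed, using Unicode NFC normalization from Python's standard `unicodedata` module. It is not the contents of the linked script. The helper names `has_combining_marks` and `compose` are hypothetical, and it assumes the stored text really contains a base letter followed by a combining mark (U+0302 for the circumflex, U+0304 for the macron) rather than a spacing accent character.

```python
import unicodedata

# Combining marks named in the issue (assumed code points).
COMBINING_CIRCUMFLEX = "\u0302"  # UTF-8 bytes: \xcc\x82
COMBINING_MACRON = "\u0304"      # UTF-8 bytes: \xcc\x84


def has_combining_marks(text: str) -> bool:
    """Return True if any character in text is a combining mark."""
    # unicodedata.combining() returns 0 for non-combining characters.
    return any(unicodedata.combining(ch) != 0 for ch in text)


def compose(text: str) -> str:
    """Collapse base letter + combining mark pairs into single characters."""
    # NFC composes sequences that have a precomposed equivalent.
    return unicodedata.normalize("NFC", text)


decomposed = "e" + COMBINING_CIRCUMFLEX  # looks like ê, but is two characters
fixed = compose(decomposed)

print(len(decomposed), has_combining_marks(decomposed))
print(len(fixed), has_combining_marks(fixed))
```

If something like this works on the imported data, the same `compose` call could probably be applied to each affected field before saving. If NFC leaves a string unchanged, the accent in the data may be a different character than U+0302 and would need separate handling.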

fbanados self-assigned this Jul 17, 2024
