Use Unicode Escape Sequence to replace encoded characters #2814

drasticactions · 2023-08-26T14:22:12Z

Special characters in source files can break compilation on computers with different region and language settings. I have a ja-JP Windows 11 setup, and compiling the current master branch fails on find_bpe_rank because of the special characters recently introduced. Note that running an already compiled build is fine; only compilation itself fails.

Using Unicode escape sequences should allow the code to compile on all setups without changing your computer's settings or switching regions. I tried out my changes and everything seems to process as it should, but hopefully others with more C++ experience can tell me if I broke something else here.

Edit: Searching through other repos, I found that similar techniques have been used before, so I'm feeling more confident in this now.
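To illustrate the idea, here is a minimal sketch rather than the actual diff. It assumes the character being replaced is U+0120 (`Ġ`), and both `replace_all` and `escape_spaces` are hypothetical helpers written for this example. Writing the character as `\u0120` keeps the source file pure ASCII, so the compiler no longer has to guess the file's encoding from the system locale. The bytes produced for a narrow string literal still depend on the execution character set: with UTF-8 (the default for GCC and Clang) the escape becomes the two bytes `C4 A0`, while with MSVC the result depends on options such as `/utf-8`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; replaces every occurrence of
// `from` in `s` with `to`, scanning left to right.
static void replace_all(std::string & s, const std::string & from, const std::string & to) {
    if (from.empty()) {
        return; // nothing to search for; avoids an infinite loop
    }
    std::size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.length(), to);
        pos += to.length(); // skip past the text just inserted
    }
}

// Hypothetical example: map each space to U+0120 using a universal
// character name, so no non-ASCII byte appears in the source file.
// Assumption: the execution character set is UTF-8.
static std::string escape_spaces(std::string token) {
    replace_all(token, " ", "\u0120");
    return token;
}
```

Under a UTF-8 execution character set, calling `escape_spaces("a b")` yields the four bytes `61 C4 A0 62`, the same result a literal `Ġ` would give in a correctly decoded UTF-8 source file.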

The use of special characters within source files can break compiling on some computers with different region and language settings. Using Unicode escape sequences should allow the code to be compiled on all setups without needing to change your computer's settings or switch regions.

llama.cpp

* master: (773 commits)
  server : add `/detokenize` endpoint (ggerganov#2802)
  convert.py : advanced option (ggerganov#2753)
  llama : use Unicode Escape Sequence to replace encoded characters (ggerganov#2814)
  flake.nix : add rocm support and cleanup (ggerganov#2808)
  llama : move #includes out of _GNU_SOURCE conditional (ggerganov#2817)
  main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)
  llama : use std::abs in llama_sample_tail_free (ggerganov#2800)
  k-quants : remove unnecessary tensor shape restrictions (ggerganov#2811)
  Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (ggerganov#2807)
  Fix HellaSwag (ggerganov#2805)
  flake : build llama.cpp on Intel with nix (ggerganov#2795)
  Handle null rope scaling value (ggerganov#2793)
  Fix spm whitespaces (ggerganov#2806)
  examples : skip unnecessary external lib in server README.md how-to (ggerganov#2804)
  llama : fix struct decl (ggerganov#2790)
  Faster perplexity computation (ggerganov#2786)
  llama : add llama_beam_search() (ggerganov#2267)
  convert.py : Get rope scale from HuggingFace models (ggerganov#2772)
  llama-bench : add model sizes (ggerganov#2771)
  convert.py : export rope freq_base when converting CodeLlama from an HF model (ggerganov#2773)
  ...

cebtenzzre reviewed Aug 26, 2023

klosax approved these changes Aug 26, 2023

ggerganov merged commit c7d92e6 into ggerganov:master Aug 26, 2023
25 checks passed

drasticactions deleted the unicode-escape-sequence branch August 27, 2023 00:58
