Unicode NFC normalisation for Rclone on macOS
TL;DR: macOS devices create all filenames in Unicode Decomposed Normalisation Form (NFD), while every other major OS uses Composed Normalisation Form (NFC). This makes you, as a Mac user, the bad guy, because it is you who is incompatible with the rest of the world.
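To make the difference between the two forms concrete, here is a minimal Python sketch using the standard `unicodedata` module. The sample filename is purely illustrative, and the NFD produced here is the standard Unicode form, which is assumed to be close to (but not necessarily identical with) what macOS writes.

```python
import unicodedata

# Hypothetical sample filename; escapes are used so the literal is unambiguous.
name = "\u00e1.txt"  # "á.txt"

nfc = unicodedata.normalize("NFC", name)  # composed: á is the single code point U+00E1
nfd = unicodedata.normalize("NFD", name)  # decomposed: a (U+0061) + combining acute (U+0301)

# Both strings look the same on screen but differ code point by code point.
print(len(nfc), len(nfd))
print([f"U+{ord(c):04X}" for c in nfd[:2]])
print(nfc == nfd)

# Normalising the decomposed name back to NFC restores the composed form.
print(unicodedata.normalize("NFC", nfd) == nfc)
```

The last line is the property that any NFC-normalising step relies on: composing a decomposed name yields the same string other systems would have created directly.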
The upshot is that when you create files and folders with diacritics and transfer them through Rclone, they are copied to your clouds, and from there to other users' devices, with filenames stored as decomposed strings, which is non-standard for those OSes. This creates three different problems:
- First, if users on other OSes rename the files you created (or renamed), they need to press Backspace twice to remove a letter with a diacritic (e.g. á or ü). Renaming a file whose name contains a lot of diacritics, like XXX, can become a pretty lengthy process. Note that this also applies to you when you access these clouds from a web client (i.e. your browser).
- Second,
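The Backspace behaviour from the first point can be modelled with a short Python sketch. The helper `backspace` is hypothetical and assumes an editor that deletes one code point per keypress; editors that delete whole grapheme clusters would behave differently.

```python
import unicodedata


def backspace(text: str) -> str:
    """Remove the last code point, as a naive per-code-point Backspace would."""
    return text[:-1]


composed = unicodedata.normalize("NFC", "\u00fc")    # "ü" as one code point
decomposed = unicodedata.normalize("NFD", "\u00fc")  # "u" + combining diaeresis

after_one_nfc = backspace(composed)       # the whole letter is gone
after_one_nfd = backspace(decomposed)     # only the diaeresis is gone
after_two_nfd = backspace(after_one_nfd)  # the second press removes the "u"

print(repr(after_one_nfc), repr(after_one_nfd), repr(after_two_nfd))
```

With the composed form one keypress removes the letter; with the decomposed form the first keypress only strips the mark and leaves the base letter behind.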
Technical background
Due to some under-the-hood changes that Apple made when it switched from the HFS+ to the APFS file system on its devices back in 2017,
Text…
$ nm -gU /usr/lib/libiconv.2.dylib
00000000000f2700 D __libiconv_version
0000000000002360 T _iconv
000000000000267a T _iconv_canonicalize
0000000000002382 T _iconv_close
0000000000001049 T _iconv_open
000000000000238f T _iconvctl
0000000000002488 T _iconvlist
0000000000013ff8 T _libiconv_set_relocation_prefix
$ nm -gU /usr/local/lib/libiconv.2.dylib
00000000000e3290 D __libiconv_version
0000000000003430 T _iconv_canonicalize
0000000000002ce0 T _libiconv
0000000000002d10 T _libiconv_close
00000000000016b0 T _libiconv_open
0000000000002d20 T _libiconv_open_into
0000000000015eb0 T _libiconv_set_relocation_prefix
0000000000003160 T _libiconvctl
0000000000003270 T _libiconvlist
0000000000015dd0 T _locale_charset
Solution: libiconv with UTF-8-MAC support
There is a patched libiconv library on GitHub which adds support for UTF‑8‑MAC encoding. Installing it allows you not only to convert between real UTF‑8 and UTF‑8‑MAC encodings
Testing whether the patched iconv works correctly
$ echo "test📖" | /usr/bin/iconv -f utf-8 -t utf-8-mac
test�
$ echo "test📖" | /usr/local/bin/iconv -f utf-8 -t utf-8-mac
test📖
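As a cross-check that does not depend on either iconv binary, the expected result of the test above can be sketched in Python. This is only an approximation: it assumes UTF-8-MAC can be modelled by standard NFD for these inputs, and the helper `to_nfd_bytes` is hypothetical.

```python
import unicodedata


def to_nfd_bytes(text: str) -> bytes:
    """Approximate a UTF-8 to UTF-8-MAC conversion using standard NFD (an assumption)."""
    return unicodedata.normalize("NFD", text).encode("utf-8")


sample = "test\U0001F4D6"  # "test" followed by the open-book emoji used in the test above
converted = to_nfd_bytes(sample)

# The emoji has no canonical decomposition, so a correct conversion leaves the bytes unchanged.
print(converted == sample.encode("utf-8"))

# An accented letter, by contrast, is split into a base letter plus a combining mark.
print(to_nfd_bytes("\u00e1"))
```

If the converter under test returns anything other than the original emoji for the first sample, as the system iconv does in the transcript above, the output is corrupted rather than merely decomposed.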
