Post History
#2: Post edited
I don't think you can persuade Bash to use non-UTF-8 characters internally; I expect much of its code assumes that strings are ASCII-compatible. So `$'\u00FF'` will always translate, inside Bash, to the UTF-8 byte sequence `c3 bf`.

If you want to take a UTF-8 byte sequence that Bash (or another utility) has produced and translate it to UTF-16, use iconv:
```
$ printf %s $'\u1234\u00FF' | iconv -t utf-16 | xxd -g2 -e
00000000: feff 1234 00ff ..4...
```
As you can see, you'll have to contend with the `\uFEFF` BOM if you do this.

Note that this will barf if you give it `$'\xFF'`, which isn't valid UTF-8. Mixing the two encodings is a recipe for sadness.
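
If the BOM is unwanted, one option is to name the byte order explicitly. The following is a minimal sketch, not part of the original answer. It assumes an `iconv` that accepts the `UTF-16LE` encoding name (glibc and GNU libiconv both do), and `to_utf16le` is a hypothetical helper name. The input bytes are written as octal escapes so the result does not depend on the current locale, and `od` stands in for `xxd` in case the latter is not installed.

```shell
#!/usr/bin/env bash
# Sketch: convert UTF-8 input to UTF-16 little-endian without a BOM.
# to_utf16le is a hypothetical helper, named here only for illustration.
to_utf16le() {
  # Naming the byte order (UTF-16LE) tells iconv not to emit a BOM.
  # od prints the bytes as hex; tr strips the spaces and newlines.
  iconv -f UTF-8 -t UTF-16LE | od -An -tx1 | tr -d ' \n'
}

# U+1234 U+00FF encoded as UTF-8 is e1 88 b4 c3 bf (octal escapes below).
printf '\341\210\264\303\277' | to_utf16le; echo
# prints 3412ff00: the two code units in little-endian order, no ff fe prefix

# A lone 0xFF byte is not valid UTF-8, so iconv should exit non-zero.
if printf '\377' | iconv -f UTF-8 -t UTF-16LE >/dev/null 2>&1; then
  echo "converted"
else
  echo "rejected invalid UTF-8"
fi
```

Using `UTF-16BE` instead should give the big-endian byte order, again without a BOM.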
#1: Initial revision
I don't think you can persuade Bash to use non-UTF-8 characters internally; I expect much of its code assumes that strings are ASCII-compatible. So `$'\u00FF'` will always translate, inside Bash, to the UTF-8 byte sequence `c3 bf`, and that's what

If you want to take a UTF-8 byte sequence that Bash (or another utility) has produced and translate it to UTF-16, use iconv:
```
$ printf %s $'\u1234\u00FF' | iconv -t utf-16 | xxd -g2 -e
00000000: feff 1234 00ff ..4...
```
As you can see, you'll have to contend with the `\uFEFF` BOM if you do this.

Note that this will barf if you give it `$'\xFF'`, which isn't valid UTF-8. Mixing the two encodings is a recipe for sadness.