Text makeup – About

text.makeup is made by me, Marcin Wichary. Contact me, or file a bug / suggest an idea.

Manifesto

I kind of love Unicode. There are so many stories hidden within all the codepoints, and so much strange complexity.

I want this tool to be somewhere at the intersection of “useful” and “fun.” You might want to just paste a string that’s giving you trouble, but not just that. Hopefully, you will also want to click around, learn, explore. I want information, but I also want stories.

This is meant to be a site for nerds, but specifically not Unicode nerds. (Many sites for Unicode nerds, filled with technical info and jargon, already exist!)

Proof of concept

This site is a proof of concept. Only some aspects and some specific examples work (more information about coverage).

Do you think it could be useful for you or others? What would you want to see? Debugging horror stories? Escaping? Regional stuff? RegExps? Python? RTL? Should it be open sourced? More performant? Better designed? Please send encouragement and bug reports if you’d like it to become a real thing!

Completeness of things

General Unicode information: Complete + Current as of Sep 2024
Emoji: Complete + Current as of Sep 2024 (but few stories)
HTML escaping: Complete + Current as of Sep 2024
JavaScript escaping: Complete + Current as of Sep 2024
URL escaping: Almost complete, lacks support for international domains (Punycode encoding)
Precombined/normalized characters: Complete as per JavaScript natural support, might need to be extended
Related characters: Complete as per Unicode, might need to be extended
Quoted printable: Complete, but no advanced MIME or character set support
Homoglyphs: Complete set, but few stories yet
Control characters (and symbols for): Complete, could use more stories
Emoji shortcodes: Slack only, complete + current as of Sep 2024
Typographical details: Very limited
URL parameters: Few only (just UTM and basic text fragments)
Languages: Polish only
“Fake fonts”: One only
Obsolete sequences: Few only (mostly deprecated flags)
Emoticons/Kaomoji: Few only
Custom/Proprietary emoji: Few only
Mojibake: Only code page 1252
RTL: No support yet
What else should be here? Contact me or file an issue on GitHub.
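
The precombined/normalized characters entry above relies on JavaScript's built-in normalization. As a rough sketch of what that built-in support offers (the helper names below are hypothetical and not taken from text.makeup's source), String.prototype.normalize converts a string between its composed (NFC) and decomposed (NFD) forms:

```javascript
// Hypothetical helpers for illustration; not text.makeup's actual code.

// Format each code point of a string as U+XXXX.
function toCodepoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Return the code points of the composed (NFC) and decomposed (NFD) forms.
function normalizationForms(text) {
  return {
    nfc: toCodepoints(text.normalize("NFC")),
    nfd: toCodepoints(text.normalize("NFD")),
  };
}

// "é" written as "e" followed by a combining acute accent.
console.log(normalizationForms("e\u0301"));
```

For this input, the composed form is the single code point U+00E9, while the decomposed form is U+0065 followed by U+0301. Iterating with Array.from walks the string by code point rather than by UTF-16 unit, so emoji outside the Basic Multilingual Plane are reported as one code point.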

Sources

Data sources:

Unicode Data 16.0.0 · Current as of Sep 2024
Unicode confusables 16.0.0 · Current as of Sep 2024
HTML entities · Current as of Aug 2024, not planned to ever be updated
Emoji 16.0.0 · Current as of Sep 2024
Emoji zero-width joiner sequences 15.1 · Current as of Aug 2024
Emoji shortcodes by Cal Henderson 15.1.2 · Current as of Sep 2024
Variation Selector 15/16 + Emoji 16.0.0 · Current as of Sep 2024
Obsolete emoji flags · Current as of Sep 2024
Apple SF Symbols 6 beta · Current as of Sep 2024

Fonts:

Noto Color Emoji 2.042. Supports Unicode up to 15.1. Current as of Sep 2024
Noto Emoji 3.002. Supports Unicode up to 15.1. Current as of Sep 2024
Noto Sans Symbols 2.003. Current as of Sep 2024
Noto Sans Symbols 2 2.008. Current as of Sep 2024
Unifont 15.1.05. Current as of Sep 2024

Libraries:

punycode.js 2.3.1. Current as of Sep 2024. Thank you to Mathias Bynens

Changelog

16 September 2024 · 0.22 · added compatibility with older Firefox browsers without Intl.Segmenter
15 September 2024 · 0.21 · a quick fix to stop IDN domains from breaking the tool (thank you to Stefan Ihringer), plus small mobile/keyboard/UI fixes
15 September 2024 · 0.2 · proof of concept (current)
9 August 2024 · 0.1 · pre-release
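
The 0.22 entry mentions browsers that lack Intl.Segmenter. One way to handle that case, shown here only as a sketch and not as the fix the site actually shipped, is to feature-detect the API and fall back to splitting by code point. The function names are hypothetical:

```javascript
// Hypothetical helpers for illustration; not text.makeup's actual code.

// Fallback: split by code point. This keeps surrogate pairs together but
// separates combining marks and multi-code-point emoji sequences.
function splitByCodePoints(text) {
  return Array.from(text);
}

// Split into user-perceived characters when Intl.Segmenter is available,
// otherwise use the coarser code point fallback.
function splitGraphemes(text) {
  if (typeof Intl !== "undefined" && typeof Intl.Segmenter === "function") {
    const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
    return Array.from(segmenter.segment(text), (part) => part.segment);
  }
  return splitByCodePoints(text);
}

console.log(splitGraphemes("abc"));
```

The fallback is deliberately lossy: without a segmenter, a sequence such as a flag or a family emoji is shown as its individual code points rather than as one unit.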

Privacy

All the string processing happens on the client. (Very slowly right now.)

Nods and acknowledgements

Thank you to Manuel Strehl for creating the site Codepoints!

Keyboard shortcuts

Shift+Esc: Open/Close sidebar
Esc: Defocus/Show all
Tab and Shift+Tab: Jump between characters of interest
↑↓ or PgUp/PgDn: Increment or decrement a selected character
+click: More precise selection
+click: Open Codepoints.net with more information