Unicode Escape/Unescape
Escape and unescape Unicode characters for various programming contexts
Enter plain text to escape or \uXXXX sequences to unescape
What is Unicode Escaping?
Unicode escaping converts characters to \uXXXX format, where XXXX is the four-digit
hexadecimal value of a UTF-16 code unit. For characters up to U+FFFF, this value is simply the
Unicode code point. Escaping is useful for representing special characters,
international text, and emoji in source code, JSON, and other contexts where direct Unicode
input might not be supported or desired. For example, 'é' becomes \u00E9.
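As a minimal illustration in JavaScript, a single character can be turned into its \uXXXX form by reading its UTF-16 code unit and padding the hexadecimal value to four digits. The helper name `toEscape` is hypothetical and is not part of this tool:

```javascript
// Hypothetical helper: format the first UTF-16 code unit of a string as \uXXXX.
function toEscape(ch) {
  const unit = ch.charCodeAt(0); // numeric value of the first code unit
  return "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
}

console.log(toEscape("\u00E9")); // \u00E9  (the character é)
console.log(toEscape("\u2615")); // \u2615  (the character ☕)
```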
How to Use
- Enter or paste text in the input field
- Click "Escape to Unicode" to convert characters to \uXXXX format
- Click "Unescape Unicode" to convert escape sequences back to characters
- Copy the result using the "Copy Output" button
Escaping Example
Input:
Café ☕ 你好
Escaped Output:
Caf\u00E9 \u2615 \u4F60\u597D
Unescaping Example
Input:
Hello \u0057orld \uD83D\uDE00
Unescaped Output:
Hello World 😀
Common Use Cases
- Creating portable source code with international characters
- Embedding Unicode in JSON strings for API responses
- Working with legacy systems that don't support UTF-8
- Representing emoji and special symbols in code
- Debugging Unicode-related issues in applications
- Ensuring cross-platform compatibility of text data
Surrogate Pairs for Emoji
Characters beyond U+FFFF (including most emoji) require surrogate pairs - two
\uXXXX sequences. For example, 😀 (U+1F600) becomes
\uD83D\uDE00. The first code (high surrogate, D800-DBFF) and second code
(low surrogate, DC00-DFFF) combine to represent the character. This tool handles
surrogate pairs automatically.
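The arithmetic behind a surrogate pair can be sketched as follows. This is the standard UTF-16 calculation; the function names `toSurrogatePair` and `fromSurrogatePair` are hypothetical and only illustrate the idea, not how this tool is implemented:

```javascript
// Split a code point above U+FFFF into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;    // leaves a 20-bit value
  const high = 0xD800 + (offset >> 10);  // top 10 bits -> high surrogate
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits -> low surrogate
  return [high, low];
}

// Recombine a surrogate pair into the original code point.
function fromSurrogatePair(high, low) {
  return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
}

const [hi, lo] = toSurrogatePair(0x1F600);
console.log(hi.toString(16), lo.toString(16));       // d83d de00
console.log(fromSurrogatePair(hi, lo).toString(16)); // 1f600
```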
Character Escaping Rules
This tool keeps printable ASCII characters (space through ~, codes 32-126) as-is for readability, except the backslash, which is always escaped. All other characters, including:
- Control characters (newline, tab, etc.)
- Non-ASCII letters (é, ñ, 你, etc.)
- Special symbols (©, ™, €, etc.)
- Emoji and pictographs (😀, 🎉, etc.)
are converted to \uXXXX format.
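The rules above can be sketched in JavaScript roughly as follows. The function name `escapeUnicode` is hypothetical, and the choice to emit the backslash as \u005C is an assumption; the tool may encode it differently (for example as a doubled backslash):

```javascript
// Sketch of the escaping rules: keep printable ASCII, escape everything else.
function escapeUnicode(text) {
  let out = "";
  // Iterate UTF-16 code units, so emoji naturally come out as surrogate pairs.
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit === 0x5C) {
      out += "\\u005C"; // assumption: backslash is written as \u005C
    } else if (unit >= 32 && unit <= 126) {
      out += text[i]; // printable ASCII stays readable
    } else {
      out += "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}

console.log(escapeUnicode("Caf\u00E9 \u2615")); // Caf\u00E9 \u2615
console.log(escapeUnicode("\u{1F600}"));        // \uD83D\uDE00
```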
Language Support
Unicode escape sequences (\uXXXX) are supported in many programming languages:
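- JavaScript/JSON: Native support for \uXXXX
- Java: Native support for \uXXXX
- C#: Native support for \uXXXX
- Python: Use \uXXXX in string literals; for characters beyond U+FFFF use \UXXXXXXXX, because Python does not combine surrogate pairs
- C/C++: Use \uXXXX in string literals; surrogate values are not allowed, so use \UXXXXXXXX for characters beyond U+FFFF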
Privacy Note
All escaping and unescaping happens in your browser. Your text never leaves your device, ensuring complete privacy and security.
Frequently Asked Questions
What is Unicode escaping and why is it used?
Unicode escaping converts characters to \uXXXX format, where XXXX is the four-digit hexadecimal code point (or, for characters beyond U+FFFF, one half of a surrogate pair). It's used in programming to represent special characters, non-ASCII text, and emoji in source files and environments that might not support UTF-8 encoding. For example, 'é' becomes '\u00E9'. This keeps code portable across different systems and editors.
How do surrogate pairs work for emoji?
Characters beyond U+FFFF (like emoji) require surrogate pairs - two \uXXXX sequences. For example, 😀 (U+1F600) becomes '\uD83D\uDE00'. The first code (high surrogate, D800-DBFF) and second code (low surrogate, DC00-DFFF) combine to represent the character. This tool handles surrogate pairs automatically.
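A minimal sketch of unescaping in JavaScript shows why surrogate pairs need no special handling there: each \uXXXX sequence maps to one UTF-16 code unit, and two adjacent surrogates in a JavaScript string form the full character. The function name `unescapeUnicode` is hypothetical, and this sketch deliberately ignores the case of an escaped backslash preceding a sequence:

```javascript
// Replace every \uXXXX sequence with the UTF-16 code unit it names.
function unescapeUnicode(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

const result = unescapeUnicode("Hello \\u0057orld \\uD83D\\uDE00");
console.log(result);                              // Hello World 😀
console.log(result.codePointAt(12).toString(16)); // 1f600
```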
Which characters are escaped vs kept as-is?
ASCII printable characters (space through ~, codes 32-126) are kept as-is for readability, except backslash which is always escaped. All other characters including newlines, tabs, Unicode letters, and emoji are converted to \uXXXX format. This balances readability with proper escaping.
Can I use this for JSON strings?
Yes! Unicode escape sequences are valid in JSON strings. This tool is perfect for creating JSON with international characters or emoji that need to work across all systems. JSON parsers automatically decode \uXXXX sequences back to the original characters.
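As a rough illustration, the built-in `JSON.parse` decodes \uXXXX sequences, while `JSON.stringify` leaves non-ASCII characters unescaped, so an ASCII-only result needs a post-processing step. The helper `toAsciiJson` below is a hypothetical sketch, not this tool's code:

```javascript
// JSON.parse turns \uXXXX sequences back into characters.
const parsed = JSON.parse('{"drink":"Caf\\u00E9 \\u2615"}');
console.log(parsed.drink); // Café ☕

// Hypothetical helper: serialize to JSON, then escape every non-ASCII code unit.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u0080-\uFFFF]/g, (ch) =>
    "\\u" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const asciiJson = toAsciiJson({ drink: "Caf\u00E9 \u2615" });
console.log(asciiJson); // {"drink":"Caf\u00E9 \u2615"}
```

Parsing `asciiJson` again yields the original object, since the escapes and the raw characters are equivalent inside a JSON string.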
What's the difference between \uXXXX and \xXX?
\uXXXX is Unicode escape with 4 hex digits, supporting characters up to U+FFFF (or surrogate pairs for higher). \xXX is byte escape with 2 hex digits, limited to 0-255. Unicode escaping is more versatile and works with all Unicode characters including international text and emoji.
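In JavaScript string literals, for instance, the two forms are interchangeable for values up to 255, and only the \u form reaches beyond that range. The constant names here are purely illustrative:

```javascript
const viaHex = "\xE9";       // 2 hex digits, values 0-255 only
const viaUnicode = "\u00E9"; // 4 hex digits, same character é
console.log(viaHex === viaUnicode); // true

const coffee = "\u2615";     // ☕ has no \xXX form: U+2615 is above 255
console.log(coffee.length);  // 1

const viaPair = "\uD83D\uDE00";  // surrogate pair
const viaBraces = "\u{1F600}";   // ES2015 code point escape
console.log(viaPair === viaBraces); // true
```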
Does this work with all programming languages?
Most modern languages support \uXXXX Unicode escapes, including JavaScript,
Java, C#, and Python (the 'u' prefix is only needed in Python 2), as does the JSON format. Some languages use different formats like
\U00XXXXXX (8 digits) or \x{XXXX}, and some do not accept surrogate pairs. Check your
language's documentation for the exact syntax it supports.
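For languages that expect the 8-digit form for characters beyond U+FFFF, the conversion can iterate by code point instead of by UTF-16 code unit. The sketch below uses hypothetical helper names (`toHex`, `escapeByCodePoint`) and does not describe an output mode of this tool:

```javascript
// Hypothetical helper: uppercase hex, zero-padded to the given width.
function toHex(value, width) {
  return value.toString(16).toUpperCase().padStart(width, "0");
}

// Escape by code point: \uXXXX up to U+FFFF, \UXXXXXXXX above it.
function escapeByCodePoint(text) {
  let out = "";
  for (const ch of text) { // for...of walks whole code points
    const cp = ch.codePointAt(0);
    if (cp >= 32 && cp <= 126 && ch !== "\\") {
      out += ch; // printable ASCII stays as-is
    } else if (cp <= 0xFFFF) {
      out += "\\u" + toHex(cp, 4);
    } else {
      out += "\\U" + toHex(cp, 8);
    }
  }
  return out;
}

console.log(escapeByCodePoint("Hi \u{1F600}")); // Hi \U0001F600
```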
Is my data sent to a server?
No, all escaping and unescaping happens in your browser using JavaScript. Your text never leaves your device, ensuring complete privacy and security. This is especially important when working with sensitive content or user data.