Unicode Escape/Unescape

Escape and unescape Unicode characters for various programming contexts

Enter plain text to escape, or \uXXXX sequences to unescape

What is Unicode Escaping?

Unicode escaping converts characters to \uXXXX format, where XXXX is a four-digit hexadecimal value: the character's Unicode code point for characters up to U+FFFF, or one half of a surrogate pair for characters above that range. This is useful for representing special characters, international text, and emoji in source code, JSON, and other contexts where direct Unicode input might not be supported or desired. For example, 'é' becomes \u00E9.

How to Use

  1. Enter or paste text in the input field
  2. Click "Escape to Unicode" to convert characters to \uXXXX format
  3. Click "Unescape Unicode" to convert escape sequences back to characters
  4. Copy the result using the "Copy Output" button

Escaping Example

Input:

Café ☕ 你好

Escaped Output:

Caf\u00E9 \u2615 \u4F60\u597D

Unescaping Example

Input:

Hello \u0057orld \uD83D\uDE00

Unescaped Output:

Hello World 😀
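
The unescaping step can be sketched in JavaScript as follows. This is a minimal illustration, not the tool's actual source: the function name unescapeUnicode is hypothetical, and the sketch assumes that anything not matching \u followed by exactly four hex digits should be left untouched.

```javascript
// Hypothetical helper: replaces each \uXXXX sequence with the UTF-16 code
// unit it names. Adjacent high/low surrogates end up next to each other in
// the result string, so they form a single character again.
function unescapeUnicode(text) {
  return text.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// String.raw keeps the backslashes literal, so the function sees the same
// text a user would paste into the input field.
console.log(unescapeUnicode(String.raw`Hello \u0057orld \uD83D\uDE00`));
// Hello World 😀
```

Malformed sequences such as \u12 do not match the pattern and pass through unchanged in this sketch.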

Common Use Cases

  • Creating portable source code with international characters
  • Embedding Unicode in JSON strings for API responses
  • Working with legacy systems that don't support UTF-8
  • Representing emoji and special symbols in code
  • Debugging Unicode-related issues in applications
  • Ensuring cross-platform compatibility of text data

Surrogate Pairs for Emoji

Characters beyond U+FFFF (including most emoji) require a surrogate pair: two consecutive \uXXXX sequences. For example, 😀 (U+1F600) becomes \uD83D\uDE00. The first value is the high surrogate (D800-DBFF) and the second is the low surrogate (DC00-DFFF); together they encode a single character. This tool handles surrogate pairs automatically.
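
The arithmetic behind a surrogate pair is the standard UTF-16 encoding rule, sketched below in JavaScript. The function name toSurrogatePair is a hypothetical name chosen for illustration.

```javascript
// Hypothetical helper: splits a code point above U+FFFF into its UTF-16
// high and low surrogate values.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;     // leaves a 20-bit value
  const high = 0xd800 + (offset >> 10);   // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);  // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1f600);
console.log(
  high.toString(16).toUpperCase(),
  low.toString(16).toUpperCase()
); // D83D DE00
```

For U+1F600 the offset is 0xF600, whose top 10 bits are 0x3D and bottom 10 bits are 0x200, which gives D83D and DE00.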

Character Escaping Rules

This tool keeps printable ASCII characters (space through ~, codes 32-126) as-is for readability, except the backslash, which is always escaped. All other characters, including:

  • Control characters (newline, tab, etc.)
  • Non-ASCII letters (é, ñ, 你, etc.)
  • Special symbols (©, ™, €, etc.)
  • Emoji and pictographs (😀, 🎉, etc.)

are converted to \uXXXX format.
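
The rules above can be sketched in JavaScript roughly as follows. This is an illustration under stated assumptions, not the tool's actual source: the name escapeUnicode is hypothetical, and the sketch assumes the backslash is emitted as \u005C, which the text above does not specify.

```javascript
// Hypothetical helper mirroring the escaping rules described above.
function escapeUnicode(text) {
  let out = "";
  // Walk the string by UTF-16 code unit, so a character above U+FFFF
  // naturally produces two escapes (its surrogate pair).
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const printableAscii = unit >= 0x20 && unit <= 0x7e;
    if (printableAscii && unit !== 0x5c) {
      out += text[i]; // keep readable ASCII as-is
    } else {
      // Assumption: backslash becomes \u005C like any other escaped unit.
      out += "\\u" + unit.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}

console.log(escapeUnicode("Café ☕ 你好")); // Caf\u00E9 \u2615 \u4F60\u597D
```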

Language Support

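Unicode escape sequences (\uXXXX) are supported in many programming languages, though handling of characters above U+FFFF differs:

  • JavaScript/JSON: Native support for \uXXXX; JavaScript also accepts \u{XXXXX} for code points above U+FFFF
  • Java: Native support for \uXXXX, with surrogate pairs for characters above U+FFFF
  • C#: Native support for \uXXXX, plus \UXXXXXXXX for characters above U+FFFF
  • Python: Use \uXXXX in string literals and \UXXXXXXXX for characters above U+FFFF (surrogate pairs are not combined into one character)
  • C/C++: Use \uXXXX or \UXXXXXXXX in character and string literals; surrogate values (D800-DFFF) are not allowed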

Privacy Note

All escaping and unescaping happens in your browser. Your text never leaves your device, ensuring complete privacy and security.

Frequently Asked Questions

What is Unicode escaping and why is it used?

Unicode escaping converts characters to \uXXXX format, where XXXX is a four-digit hexadecimal value (the code point for characters up to U+FFFF). It's used in programming to represent special characters, non-ASCII text, and emoji in source files that might pass through editors or toolchains that do not handle UTF-8 reliably. For example, 'é' becomes '\u00E9'. This keeps code portable across different systems and editors.

How do surrogate pairs work for emoji?

Characters beyond U+FFFF (like most emoji) require a surrogate pair: two consecutive \uXXXX sequences. For example, 😀 (U+1F600) becomes '\uD83D\uDE00'. The first value is the high surrogate (D800-DBFF) and the second is the low surrogate (DC00-DFFF); together they encode a single character. This tool handles surrogate pairs automatically.

Which characters are escaped vs kept as-is?

Printable ASCII characters (space through ~, codes 32-126) are kept as-is for readability, except the backslash, which is always escaped. All other characters, including newlines, tabs, non-ASCII letters, and emoji, are converted to \uXXXX format. This balances readability with proper escaping.

Can I use this for JSON strings?

Yes. \uXXXX escape sequences are valid in JSON strings, and JSON represents characters above U+FFFF as surrogate pairs, which matches this tool's output. Escaping is useful when JSON containing international characters or emoji must pass through systems that only handle ASCII reliably. JSON parsers automatically decode \uXXXX sequences back to the original characters.
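
As a rough JavaScript illustration, JSON.parse decodes the escapes, while JSON.stringify leaves well-formed non-ASCII characters as-is, so producing ASCII-only JSON takes an extra pass. The variable names and the sample object below are made up for this sketch.

```javascript
// Decoding: the parser turns \uXXXX sequences back into characters.
const decoded = JSON.parse('"Caf\\u00E9 \\uD83D\\uDE00"');
console.log(decoded); // Café 😀

// Encoding: escape every non-ASCII UTF-16 code unit in the serialized text.
// The regex has no "u" flag, so each half of a surrogate pair is matched
// and escaped separately.
const asciiJson = JSON.stringify({ greeting: "Café 😀" }).replace(
  /[\u0080-\uffff]/g,
  (ch) => "\\u" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0")
);
console.log(asciiJson); // {"greeting":"Caf\u00E9 \uD83D\uDE00"}
```

Parsing asciiJson again yields the original greeting, since the escaped and unescaped forms are equivalent JSON.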

What's the difference between \uXXXX and \xXX?

\uXXXX is a Unicode escape with 4 hex digits, covering characters up to U+FFFF directly (or higher ones via surrogate pairs). \xXX is a 2-digit hex escape limited to values 0-255; depending on the language, it denotes either a single byte or a code point in the range U+0000 to U+00FF. Unicode escaping is more versatile and can represent all Unicode characters, including international text and emoji.
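
In JavaScript, for instance, the escape forms can be compared directly. The constant names below are chosen only for this sketch, and the behavior shown is specific to JavaScript string literals.

```javascript
const viaHexEscape = "\xE9";        // 2-digit form, reaches U+0000 to U+00FF
const viaUnicodeEscape = "\u00E9";  // 4-digit form
console.log(viaHexEscape === viaUnicodeEscape); // true

const viaCodePoint = "\u{1F600}";   // braces form for code points above U+FFFF
const viaSurrogates = "\uD83D\uDE00";
console.log(viaCodePoint === viaSurrogates);    // true
console.log(viaCodePoint.length);               // 2 (UTF-16 code units)
```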

Does this work with all programming languages?

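Most modern languages and formats support \uXXXX Unicode escapes, including JavaScript, Java, C#, Python, and JSON (Python 2 requires the 'u' string prefix; Python 3 does not). Some languages use different formats such as \U00XXXXXX (8 digits), \u{XXXX}, or \x{XXXX}, and some (Python and C/C++, for example) do not combine surrogate pairs into a single character. Check your language's documentation for the exact syntax it supports.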

Is my data sent to a server?

No, all escaping and unescaping happens in your browser using JavaScript. Your text never leaves your device, ensuring complete privacy and security. This is especially important when working with sensitive content or user data.