Unicode code points are unique numeric identifiers assigned to every character in the Unicode standard, represented in hexadecimal notation as U+XXXX. For example, the letter 'A' is U+0041, the copyright symbol © is U+00A9, and the emoji 👋 is U+1F44B. Unicode encompasses over 149,000 characters across 161 scripts, including Latin, Chinese, Arabic, emoji, mathematical symbols, and historical scripts. Each assigned code point identifies exactly one character, providing a universal character identification system that works across all platforms and languages.
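To make the mapping concrete, here is a minimal Python sketch of the notation (the `to_code_point` helper name is illustrative, not part of any tool); it uses the built-in `ord` function to print the U+XXXX form of the examples above:

```python
def to_code_point(ch: str) -> str:
    # ord() returns the numeric code point; format it as uppercase hex,
    # zero-padded to at least four digits, with the conventional U+ prefix.
    return f"U+{ord(ch):04X}"

for ch in ["A", "©", "👋"]:
    print(f"{ch} = {to_code_point(ch)}")
# A = U+0041
# © = U+00A9
# 👋 = U+1F44B
```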
The converter performs bidirectional transformation: characters to code points and code points to characters. Encoding text displays the Unicode notation for each character: 'Hello' becomes 'H = U+0048, e = U+0065, l = U+006C, l = U+006C, o = U+006F'. Decoding code points renders them as actual characters: entering 'U+1F44B' displays the waving hand emoji 👋. This tool helps developers verify character encoding, find the correct code point for symbols, and debug encoding issues where unexpected characters appear.
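A rough sketch of both directions in Python follows; the `encode_text` and `decode_code_points` names are hypothetical and only illustrate the idea, not the converter's actual implementation:

```python
import re

def encode_text(text: str) -> str:
    # Characters -> code points: list each character with its U+XXXX notation.
    return ", ".join(f"{ch} = U+{ord(ch):04X}" for ch in text)

def decode_code_points(notation: str) -> str:
    # Code points -> characters: find every U+ token of 4-6 hex digits and
    # convert its value back to a character with chr().
    return "".join(chr(int(cp, 16))
                   for cp in re.findall(r"U\+([0-9A-Fa-f]{4,6})", notation))

print(encode_text("Hello"))
# H = U+0048, e = U+0065, l = U+006C, l = U+006C, o = U+006F
print(decode_code_points("U+1F44B"))  # 👋
```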
Unicode code points are organized into planes and blocks. The Basic Multilingual Plane (BMP, U+0000 to U+FFFF) contains most common characters including Latin, Greek, Cyrillic, and CJK. Supplementary planes (U+10000 to U+10FFFF) include emoji, rare CJK characters, and historical scripts. Code points are written with 4-6 hex digits, zero-padded to the appropriate length (U+0041 for ASCII 'A', U+1F44B for emoji). Understanding this structure helps navigate Unicode documentation and identify why certain characters require more bytes in UTF-8 or UTF-16 encoding.
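The short sketch below (assuming Python's built-in `ord` and `str.encode`; the `describe` helper is hypothetical) shows how the plane a character lives in relates to its size in UTF-8 and UTF-16:

```python
def describe(ch: str) -> str:
    cp = ord(ch)
    # Plane = code point // 0x10000; plane 0 (U+0000-U+FFFF) is the BMP.
    plane = "BMP" if cp <= 0xFFFF else f"supplementary plane {cp >> 16}"
    utf8_bytes = len(ch.encode("utf-8"))
    utf16_units = len(ch.encode("utf-16-le")) // 2  # count of 16-bit code units
    return f"U+{cp:04X} ({plane}): {utf8_bytes} UTF-8 bytes, {utf16_units} UTF-16 code units"

for ch in ["A", "©", "中", "👋"]:
    print(f"{ch} -> {describe(ch)}")
# A -> U+0041 (BMP): 1 UTF-8 bytes, 1 UTF-16 code units
# © -> U+00A9 (BMP): 2 UTF-8 bytes, 1 UTF-16 code units
# 中 -> U+4E2D (BMP): 3 UTF-8 bytes, 1 UTF-16 code units
# 👋 -> U+1F44B (supplementary plane 1): 4 UTF-8 bytes, 2 UTF-16 code units
```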