 
 
| Unicode code point | Character | 
| U+0041 | A | 
| U+00E9 | é | 
| U+03B8 | θ (the Greek theta) | 
| U+20AC | € (the euro) | 

| Character | Unicode code point | Byte values in file (UCS-2) | 
| A | U+0041 | 0x00, 0x41 | 
| é | U+00E9 | 0x00, 0xE9 | 
| θ (theta) | U+03B8 | 0x03, 0xB8 | 
| € (euro) | U+20AC | 0x20, 0xAC | 
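
The byte values in the UCS-2 table can be reproduced with a few lines of C. The following is only a minimal sketch: `ucs2_encode_be` is a hypothetical helper name, not something defined by this document, and it assumes the high byte is written first, as the table shows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the original text): store one code
 * point as two bytes, high byte first, matching the order in the table.
 * Returns 2 on success, or 0 if the code point does not fit in 16 bits
 * or lies in the surrogate range U+D800..U+DFFF. */
size_t ucs2_encode_be(uint32_t cp, unsigned char out[2])
{
    assert(out != NULL);
    if (cp > 0xFFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    out[0] = (unsigned char)(cp >> 8);   /* high byte */
    out[1] = (unsigned char)(cp & 0xFF); /* low byte  */
    return 2;
}
```

Calling `ucs2_encode_be(0x20AC, buf)` fills `buf` with 0x20, 0xAC, the euro row of the table. Note that every ASCII character gets a leading 0x00 byte in this encoding.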

char type.  This is why C also defines the wchar_t type, which can hold a 32-bit character (at least in GNU systems).  To avoid both of these disadvantages, UTF-8 was introduced.

| Character | Unicode code point | Byte values in file (UTF-8) | 
| A | U+0041 | 0x41 | 
| é | U+00E9 | 0xC3, 0xA9 | 
| θ (theta) | U+03B8 | 0xCE, 0xB8 | 
| € (euro) | U+20AC | 0xE2, 0x82, 0xAC | 
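
The UTF-8 byte values in the table follow from the standard bit layout of the encoding, which can be sketched in C as below. The names `utf8_encode` and `utf8_length` are hypothetical helpers introduced for illustration, and `utf8_length` assumes its input is well-formed UTF-8. It counts characters by skipping continuation bytes (those of the form 10xxxxxx), since the byte count of a UTF-8 string no longer equals its character count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-8 into out.
 * Returns the number of bytes written (1 to 4), or 0 if the value is
 * a surrogate or lies beyond U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0x80) {                       /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 11110xxx followed by three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: count the code points in a NUL-terminated string,
 * assuming it is well-formed UTF-8. Every byte that is not a
 * continuation byte (10xxxxxx) starts a new character. */
size_t utf8_length(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For instance, the four characters of the table stored back to back occupy eight bytes, so `strlen` would report 8 while `utf8_length` reports 4. ASCII text is left unchanged by this encoding, which is why it remains usable with the ordinary char type.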

wchar_t type), although it is a little more difficult to get the length of the string.

$Id: encoding.html,v 1.3 2002/01/13 12:20:00 verthezp Exp $
$Name: R0_90_0 $