ezEngine  Milestone 7
ezUnicodeUtils Class Reference

Helper functions to work with Unicode.

#include <UnicodeUtils.h>

Classes

struct  UtfInserter
 [internal] Small helper class to append bytes to some arbitrary container. Used for Utf8 string building.
 


Static Public Member Functions

static bool IsASCII (ezUInt32 uiChar)
 Returns whether a character is a pure ASCII character (only the lowest 7 bits are used).
 
static bool IsUtf8ContinuationByte (char uiByte)
 Checks whether the given byte is a continuation byte of a UTF-8 multi-byte sequence, i.e. any byte of the sequence other than the first.
 
static ezUInt32 GetUtf8SequenceLength (char uiFirstByte)
 Returns the length in bytes of a UTF-8 sequence, as encoded in the first byte of the sequence.
 
static ezUInt32 ConvertUtf8ToUtf32 (const char *pFirstChar)
 Converts the UTF-8 character that starts at pFirstChar into a UTF-32 character.
 
static ezUInt32 GetSizeForCharacterInUtf8 (ezUInt32 uiCharacter)
 Computes how many bytes the character would require, if encoded in UTF-8.
 
static void MoveToNextUtf8 (const char *&szUtf8, ezUInt32 uiNumCharacters=1)
 Moves the given string pointer ahead to the next Utf8 character sequence.
 
static void MoveToPriorUtf8 (const char *&szUtf8, ezUInt32 uiNumCharacters=1)
 Moves the given string pointer backwards to the previous Utf8 character sequence.
 
static bool IsValidUtf8 (const char *szString, const char *szStringEnd=ezMaxStringEnd)
 Returns false if the given string does not contain a completely valid Utf8 string.
 
static bool SkipUtf8Bom (const char *&szUtf8)
 If the given string starts with a Utf8 Bom, the pointer is advanced past the Bom, and the function returns true.
 
static bool SkipUtf16BomLE (const ezUInt16 *&szUtf16)
 If the given string starts with a Utf16 little endian Bom, the pointer is advanced past the Bom, and the function returns true.
 
static bool SkipUtf16BomBE (const ezUInt16 *&szUtf16)
 If the given string starts with a Utf16 big endian Bom, the pointer is advanced past the Bom, and the function returns true.
 
template<typename ByteIterator >
static ezUInt32 DecodeUtf8ToUtf32 (ByteIterator &szUtf8Iterator)
 Decodes the next character from the given Utf8 sequence to Utf32 and increments the iterator as far as necessary.
 
template<typename UInt16Iterator >
static ezUInt32 DecodeUtf16ToUtf32 (UInt16Iterator &szUtf16Iterator)
 Decodes the next character from the given Utf16 sequence to Utf32 and increments the iterator as far as necessary.
 
template<typename WCharIterator >
static ezUInt32 DecodeWCharToUtf32 (WCharIterator &szWCharIterator)
 Decodes the next character from the given wchar_t sequence to Utf32 and increments the iterator as far as necessary.
 
template<typename ByteIterator >
static void EncodeUtf32ToUtf8 (ezUInt32 uiUtf32, ByteIterator &szUtf8Output)
 Encodes the given Utf32 character to Utf8 and writes as many bytes to the output iterator as necessary.
 
template<typename UInt16Iterator >
static void EncodeUtf32ToUtf16 (ezUInt32 uiUtf32, UInt16Iterator &szUtf16Output)
 Encodes the given Utf32 character to Utf16 and writes as many 16-bit code units to the output iterator as necessary.
 
template<typename WCharIterator >
static void EncodeUtf32ToWChar (ezUInt32 uiUtf32, WCharIterator &szWCharOutput)
 Encodes the given Utf32 character to wchar_t and writes as many wchar_t elements to the output iterator as necessary.
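
The iterator contract of the decode and encode templates listed above (read or write one character, leave the iterator just behind it) can be illustrated with a standalone sketch. The names DecodeUtf8Sketch and EncodeUtf8Sketch are hypothetical and are not part of ezUnicodeUtils; the sketch assumes well-formed input and performs no validation, which may differ from the engine's behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>

// Hypothetical stand-in: decodes one UTF-8 sequence and leaves the
// iterator just behind it. Assumes well-formed input.
template <typename ByteIterator>
std::uint32_t DecodeUtf8Sketch(ByteIterator& it)
{
  const std::uint8_t first = static_cast<std::uint8_t>(*it);
  ++it;

  std::uint32_t codePoint = 0;
  int numContinuation = 0;

  if (first < 0x80)                { codePoint = first;        numContinuation = 0; } // 0xxxxxxx
  else if ((first & 0xE0) == 0xC0) { codePoint = first & 0x1F; numContinuation = 1; } // 110xxxxx
  else if ((first & 0xF0) == 0xE0) { codePoint = first & 0x0F; numContinuation = 2; } // 1110xxxx
  else                             { codePoint = first & 0x07; numContinuation = 3; } // 11110xxx

  for (int i = 0; i < numContinuation; ++i)
  {
    // Each continuation byte (10xxxxxx) contributes six payload bits.
    codePoint = (codePoint << 6) | (static_cast<std::uint8_t>(*it) & 0x3F);
    ++it;
  }
  return codePoint;
}

// Hypothetical stand-in: encodes one code point and writes one to four
// bytes through the output iterator.
template <typename ByteIterator>
void EncodeUtf8Sketch(std::uint32_t cp, ByteIterator& out)
{
  assert(cp <= 0x10FFFF && "outside the Unicode code point range");

  if (cp < 0x80)
  {
    *out = static_cast<char>(cp); ++out;
  }
  else if (cp < 0x800)
  {
    *out = static_cast<char>(0xC0 | (cp >> 6)); ++out;
    *out = static_cast<char>(0x80 | (cp & 0x3F)); ++out;
  }
  else if (cp < 0x10000)
  {
    *out = static_cast<char>(0xE0 | (cp >> 12)); ++out;
    *out = static_cast<char>(0x80 | ((cp >> 6) & 0x3F)); ++out;
    *out = static_cast<char>(0x80 | (cp & 0x3F)); ++out;
  }
  else
  {
    *out = static_cast<char>(0xF0 | (cp >> 18)); ++out;
    *out = static_cast<char>(0x80 | ((cp >> 12) & 0x3F)); ++out;
    *out = static_cast<char>(0x80 | ((cp >> 6) & 0x3F)); ++out;
    *out = static_cast<char>(0x80 | (cp & 0x3F)); ++out;
  }
}
```

Because both functions take the iterator by reference, the same code works with a raw const char* for reading and with something like std::back_inserter on a std::string for writing.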
 

Static Public Attributes

static const ezUInt16 Utf16BomLE = 0xfffe
 Byte Order Mark for Little Endian Utf16 strings.
 
static const ezUInt16 Utf16BomBE = 0xfeff
 Byte Order Mark for Big Endian Utf16 strings.
 

Detailed Description

Helper functions to work with Unicode.
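
As a rough illustration of what the byte-level queries (IsASCII, IsUtf8ContinuationByte, GetUtf8SequenceLength, GetSizeForCharacterInUtf8) compute, the following standalone sketch applies the standard UTF-8 bit patterns. All function names ending in Sketch are hypothetical stand-ins, not the ezUnicodeUtils implementations, and the handling of invalid leading bytes here is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// ASCII characters use only the lowest 7 bits.
inline bool IsAsciiSketch(std::uint32_t ch)
{
  return ch <= 0x7F;
}

// Continuation bytes have the bit pattern 10xxxxxx.
inline bool IsContinuationByteSketch(char byte)
{
  return (static_cast<std::uint8_t>(byte) & 0xC0) == 0x80;
}

// The number of leading 1-bits in the first byte gives the sequence length.
// Assumption: any non-continuation byte that is not 1, 2 or 3 bytes long
// is treated as a 4-byte leader.
inline std::uint32_t SequenceLengthSketch(char firstByte)
{
  assert(!IsContinuationByteSketch(firstByte) && "not a leading byte");

  const std::uint8_t b = static_cast<std::uint8_t>(firstByte);
  if (b < 0x80) return 1;           // 0xxxxxxx
  if ((b & 0xE0) == 0xC0) return 2; // 110xxxxx
  if ((b & 0xF0) == 0xE0) return 3; // 1110xxxx
  return 4;                         // 11110xxx
}

// Number of bytes needed to encode a code point in UTF-8.
inline std::uint32_t SizeInUtf8Sketch(std::uint32_t codePoint)
{
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}
```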

Member Function Documentation

void ezUnicodeUtils::MoveToNextUtf8 (const char *&szUtf8, ezUInt32 uiNumCharacters=1)
inline static

Moves the given string pointer ahead to the next Utf8 character sequence.

The pointer may point to an invalid position (in the middle of a character sequence). It must not already point to the zero terminator.
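
One way such forward movement can be implemented is to step off the current byte and then skip any continuation bytes, which also resynchronizes a pointer that started in the middle of a sequence. The sketch below is standalone; MoveToNextSketch is a hypothetical name and the logic is an assumption about the general technique, not the engine's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for forward movement over UTF-8 characters.
inline void MoveToNextSketch(const char*& p, std::uint32_t numCharacters = 1)
{
  while (numCharacters > 0)
  {
    // Precondition from the description: not already at the terminator.
    assert(*p != '\0' && "must not start on the zero terminator");

    // Leave the current byte, whatever kind of byte it is.
    ++p;

    // Skip continuation bytes (10xxxxxx) until the next leading byte.
    // The zero terminator is not a continuation byte, so the loop stops there.
    while ((static_cast<unsigned char>(*p) & 0xC0) == 0x80)
      ++p;

    --numCharacters;
  }
}
```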

void ezUnicodeUtils::MoveToPriorUtf8 (const char *&szUtf8, ezUInt32 uiNumCharacters=1)
inline static

Moves the given string pointer backwards to the previous Utf8 character sequence.

The pointer may point to an invalid position (in the middle of a character sequence), or even to the \0 terminator, as long as there is a valid string before it and the caller knows when to stop.
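
Backward movement can be sketched in the same spirit: step back at least one byte, then keep stepping back while the pointer rests on a continuation byte. MoveToPriorSketch is a hypothetical name, and the sketch deliberately has no lower bound check, mirroring the requirement that the caller knows when to stop. That a pointer starting mid-sequence lands on the start of that same sequence is a property of this sketch and an assumption about the real function.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for backward movement over UTF-8 characters.
// The caller must guarantee that enough valid characters precede p.
inline void MoveToPriorSketch(const char*& p, std::uint32_t numCharacters = 1)
{
  assert(p != nullptr);

  while (numCharacters > 0)
  {
    // Always step back at least one byte; starting on the terminator is fine.
    --p;

    // Walk back over continuation bytes (10xxxxxx) to the leading byte.
    while ((static_cast<unsigned char>(*p) & 0xC0) == 0x80)
      --p;

    --numCharacters;
  }
}
```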

bool ezUnicodeUtils::SkipUtf16BomBE (const ezUInt16 *&szUtf16)
inline static

If the given string starts with a Utf16 big endian Bom, the pointer is advanced past the Bom, and the function returns true.

Otherwise the pointer is unchanged and false is returned.

bool ezUnicodeUtils::SkipUtf16BomLE (const ezUInt16 *&szUtf16)
inline static

If the given string starts with a Utf16 little endian Bom, the pointer is advanced past the Bom, and the function returns true.

Otherwise the pointer is unchanged and false is returned.

bool ezUnicodeUtils::SkipUtf8Bom (const char *&szUtf8)
inline static

If the given string starts with a Utf8 Bom, the pointer is advanced past the Bom, and the function returns true.

Otherwise the pointer is unchanged and false is returned.
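
The UTF-8 byte order mark is the three-byte sequence EF BB BF. A minimal standalone sketch of the described contract (advance and return true on a match, otherwise leave the pointer alone) could look as follows; SkipUtf8BomSketch is a hypothetical name and not the engine's implementation.

```cpp
#include <cassert>

// Hypothetical stand-in: skips a leading UTF-8 BOM in a zero-terminated string.
inline bool SkipUtf8BomSketch(const char*& p)
{
  assert(p != nullptr);

  const unsigned char* b = reinterpret_cast<const unsigned char*>(p);

  // The comparisons short-circuit, so a string shorter than three bytes
  // is never read past its zero terminator.
  if (b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF)
  {
    p += 3; // advance past the BOM
    return true;
  }

  return false; // pointer left unchanged
}
```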

