FirebirdSQL logo

Unicode

Most current development tools support Unicode, implemented in Firebird with the character sets UTF8 and UNICODE_FSS.UTF8 comes with collations for many languages.UNICODE_FSS is more limited and was previously used mainly by Firebird internally for storing metadata.Keep in mind that one UTF8 character occupies up to 4 bytes, thus limiting the size of CHAR fields to 8,191 characters (32,767/4).

Note

The actual “bytes per character” value depends on the range the character belongs to.Non-accented Latin letters occupy 1 byte, Cyrillic letters from the WIN1251 encoding occupy 2 bytes in UTF8, characters from other encodings may occupy up to 4 bytes.

The UTF8 character set implemented in Firebird supports the latest version of the Unicode standard, thus recommending its use for international databases.

Client Character Set

While working with strings, it is essential to keep the character set of the client connection in mind.If there is a mismatch between the character sets of the stored data and that of the client connection, the output results for string columns are automatically re-encoded, both when data are sent from the client to the server and when they are sent back from the server to the client.For example, if the database was created in the WIN1251 encoding but KOI8R or UTF8 is specified in the client’s connection parameters, the mismatch will be transparent.