Man page of auc_nconv(3) and auc_conv(3)
Index
- NAME
- SYNOPSIS
- DESCRIPTION
- UNICODE TRANSFORMATION FORMATS (auc_utf_t)
- FLAGS TO ALTER MODE OF OPERATION (auc_flag_t)
- RETURN VALUE (auc_bytes_t)
- SUPPORTED CHARSETS
- NOTES
- SEE ALSO
- COPYRIGHT AND LICENSE
NAME
auc_conv, auc_nconv - automatically convert text to Unicode
SYNOPSIS
C/C++

    #include <auc.h>

    auc_bytes_t * auc_conv(const char *str, auc_utf_t utf, auc_flag_t flags);

    auc_bytes_t * auc_nconv(const char *bstr, size_t blen, auc_utf_t utf, auc_flag_t flags);
DESCRIPTION
AutoUniConv provides functions that automatically detect and convert text from a variety of charsets to one of the common Unicode Transformation Formats.
auc_conv() automatically converts plain C strings that may contain text encoded in any supported 8-bit charset. The function takes a pointer to a plain C string str, the desired Unicode Transformation Format utf and a specification of flags as arguments.
auc_nconv() automatically converts byte strings that may contain text encoded in any supported charset. It should be preferred whenever a string could contain UTF-16 or UTF-32 encoded text, or whenever the length of the byte string is already known. The function takes a pointer to a byte string bstr, its length blen (excluding NUL termination), the desired Unicode Transformation Format utf and a specification of flags as arguments.
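Why the length matters for UTF-16 and UTF-32 input can be seen without the library itself: such text routinely contains 0x00 bytes, so anything that treats it as a plain C string stops at the first of them. The following is a minimal sketch in plain C; the names hi_utf16le and c_string_length are illustrative only and are not part of AutoUniConv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "Hi" encoded as UTF-16LE: each ASCII character is followed by a 0x00 byte. */
const char hi_utf16le[] = { 'H', 0x00, 'i', 0x00 };

/*
 * Length as seen by NUL-terminated string handling:
 * counting stops at the first 0x00 byte.
 */
size_t c_string_length(const char *bytes)
{
    assert(bytes != NULL);
    return strlen(bytes);
}
```

Here c_string_length(hi_utf16le) yields 1, although the buffer holds sizeof hi_utf16le (4) bytes. Passing the buffer together with its real length, as in auc_nconv(hi_utf16le, sizeof hi_utf16le, AUC_UTF8, AUC_DEFAULT), avoids the truncation.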
Whenever the functions are invoked, auc_errno(3) is reset to AUC_NOERR.
UNICODE TRANSFORMATION FORMATS (auc_utf_t)
auc_utf_t provides named constants for all Unicode Transformation Formats supported by AutoUniConv. The set comprises the following constants:
- AUC_UTF8: UTF-8
- AUC_UTF16LE: UTF-16LE
- AUC_UTF16BE: UTF-16BE
- AUC_UTF32LE: UTF-32LE
- AUC_UTF32BE: UTF-32BE
FLAGS TO ALTER MODE OF OPERATION (auc_flag_t)
AutoUniConv provides a set of named constant flags that are evaluated by auc_conv() and auc_nconv(). These flags alter the functions' mode of operation and may be combined with each other to best suit the user's requirements.
The set comprises the following constants:
- AUC_DEFAULT: Default mode of operation. In this mode, the functions replace each character that cannot be decoded with a predefined placeholder, the tilde character ("~"). No error is generated in this case, and no warning is printed to stderr.
- AUC_STRICT: Require the functions to terminate on the first decoding error that occurs.
- AUC_WARN: Print a warning to stderr whenever a decoding error occurs.
Flags are combined by adding them together (e.g. "AUC_STRICT + AUC_WARN").
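The effect of the three modes can be illustrated with a toy decoder. This is only a sketch of the documented behaviour, not AutoUniConv's implementation: the function toy_decode_ascii, the TOY_* constants and their numeric values are hypothetical, and the real auc_flag_t values may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag values for this sketch only. */
enum { TOY_DEFAULT = 0, TOY_STRICT = 1, TOY_WARN = 2 };

/*
 * Toy decoder: copies 7-bit ASCII bytes from in to out and treats every
 * other byte as a decoding error. out must have room for n bytes.
 * Returns the number of bytes written, or -1 if TOY_STRICT is set and a
 * decoding error was encountered.
 */
long toy_decode_ascii(const unsigned char *in, size_t n, char *out, int flags)
{
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            out[i] = (char)in[i];
            continue;
        }
        if (flags & TOY_WARN)       /* warn mode: report the error */
            fprintf(stderr, "warning: cannot decode byte 0x%02X at offset %lu\n",
                    (unsigned)in[i], (unsigned long)i);
        if (flags & TOY_STRICT)     /* strict mode: stop at the first error */
            return -1;
        out[i] = '~';               /* default: substitute the placeholder */
    }
    return (long)n;
}
```

With the input bytes 'a', 0xE4, 'b', the default mode produces "a~b" silently, whereas TOY_STRICT + TOY_WARN prints one warning to stderr and returns -1.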
RETURN VALUE (auc_bytes_t)
Both auc_conv() and auc_nconv() return a pointer to an auc_bytes_t data structure, which comprises the byte string bytes encoded in the requested Unicode Transformation Format, the byte string's length len and the format used, utf.
The data structure is defined as follows:
C/C++

    typedef struct {
        char *bytes;      /* byte string */
        size_t len;       /* its length */
        auc_utf_t utf;    /* used UTF */
    } auc_bytes_t;
The memory allocated by an auc_bytes_t structure should be freed using auc_free_bytes_t(3).
If an error occurs during processing, the functions return NULL and set auc_errno(3) to an appropriate value.
For in-depth information on AutoUniConv's error handling facilities, see auc_errno(3).
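Because the returned byte string may be UTF-16 or UTF-32 encoded, it can contain 0x00 bytes, so output code should rely on len rather than on NUL termination. The helper below is a minimal sketch of that idea; write_byte_string is an illustrative name and not part of AutoUniConv.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write exactly len bytes to out, including any embedded 0x00 bytes.
 * Returns 0 on success, -1 if fewer than len bytes could be written.
 */
int write_byte_string(FILE *out, const char *bytes, size_t len)
{
    assert(out != NULL && bytes != NULL);
    return fwrite(bytes, 1, len, out) == len ? 0 : -1;
}
```

Given a non-NULL result res from auc_conv() or auc_nconv(), a caller would invoke write_byte_string(stdout, res->bytes, res->len) and afterwards release res as described in auc_free_bytes_t(3).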
SUPPORTED CHARSETS
For a list of all supported charsets, see the AutoUniConv User Manual.
NOTES
auc_conv() and auc_nconv() are thread-safe.
SEE ALSO
auc_free_bytes_t(3), auc_errno(3), auc_strerror(3), auc_version(3), auc_version_string(3), auc_utf_t_to_name(3)
AutoUniConv User Manual, AutoUniConv Software Specification
http://www.lingua-systems.com/unicode-converter/autouniconv-library/
COPYRIGHT AND LICENSE
Copyright (c) 2010 Lingua-Systems Software GmbH


