Package com.ibm.icu.text
Class CharsetRecog_Unicode
java.lang.Object
com.ibm.icu.text.CharsetRecognizer
com.ibm.icu.text.CharsetRecog_Unicode
Direct Known Subclasses:
CharsetRecog_Unicode.CharsetRecog_UTF_16_BE, CharsetRecog_Unicode.CharsetRecog_UTF_16_LE, CharsetRecog_Unicode.CharsetRecog_UTF_32
This class matches UTF-16 and UTF-32 input in both big- and little-endian byte orders.
If a byte order mark (BOM) is present, it is used.
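As a rough illustration of how a byte order mark can identify these encodings, the following is a minimal sketch using only the Java standard library. It is not the ICU implementation: the class name `BomSketch`, the method `charsetFromBom`, and the choice to test the 4-byte UTF-32 marks before the 2-byte UTF-16 marks are all assumptions made for this example.

```java
import java.util.Optional;

// Hypothetical helper, not part of ICU: maps a leading BOM to a charset name.
final class BomSketch {
    private BomSketch() {}

    /** Returns the charset name implied by a leading BOM, or empty if none is found. */
    static Optional<String> charsetFromBom(byte[] in) {
        // Assumption of this sketch: check the 4-byte UTF-32 marks first, because
        // the UTF-32LE mark (FF FE 00 00) begins with the UTF-16LE mark (FF FE).
        if (startsWith(in, 0x00, 0x00, 0xFE, 0xFF)) return Optional.of("UTF-32BE");
        if (startsWith(in, 0xFF, 0xFE, 0x00, 0x00)) return Optional.of("UTF-32LE");
        if (startsWith(in, 0xFE, 0xFF)) return Optional.of("UTF-16BE");
        if (startsWith(in, 0xFF, 0xFE)) return Optional.of("UTF-16LE");
        return Optional.empty();
    }

    // Compares the first bytes of the input, treated as unsigned, with the prefix.
    private static boolean startsWith(byte[] in, int... prefix) {
        if (in.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((in[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

Input without a BOM yields an empty result here; a real recognizer would then have to fall back on inspecting the content itself.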
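Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
(package private) static class | CharsetRecog_Unicode.CharsetRecog_UTF_16_BE |
(package private) static class | CharsetRecog_Unicode.CharsetRecog_UTF_16_LE |
(package private) static class | CharsetRecog_Unicode.CharsetRecog_UTF_32 |
(package private) static class | CharsetRecog_Unicode.CharsetRecog_UTF_32_BE |
(package private) static class | CharsetRecog_Unicode.CharsetRecog_UTF_32_LE |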
Constructor Summary
Constructors
CharsetRecog_Unicode()
Method Summary
Modifier and Type | Method | Description
(package private) static int | adjustConfidence(int codeUnit, int confidence) |
(package private) static int | codeUnit16FromBytes(byte hi, byte lo) |
(package private) abstract String | getName() | Get the IANA name of this charset.
(package private) abstract CharsetMatch | match(CharsetDetector det) | Test the match of this charset with the input text data which is obtained via the CharsetDetector object.
Methods inherited from class com.ibm.icu.text.CharsetRecognizer
getLanguage
Constructor Details
CharsetRecog_Unicode
CharsetRecog_Unicode()
Method Details
getName
abstract String getName()
Description copied from class: CharsetRecognizer
Get the IANA name of this charset.
Specified by:
getName in class CharsetRecognizer
Returns:
the charset name.
match
abstract CharsetMatch match(CharsetDetector det)
Description copied from class: CharsetRecognizer
Test the match of this charset with the input text data which is obtained via the CharsetDetector object.
Specified by:
match in class CharsetRecognizer
Parameters:
det - The CharsetDetector, which contains the input text to be checked for being in this charset.
Returns:
A CharsetMatch object containing details of the match with this charset, or null if there was no match.
codeUnit16FromBytes
static int codeUnit16FromBytes(byte hi, byte lo)
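This page gives no description for the method, so the following is only a sketch of what the signature suggests: combining a high and a low byte into one unsigned 16-bit code unit. The class name `CodeUnitSketch` is hypothetical, and the body is an assumption about the intended behavior rather than a copy of the ICU source.

```java
// Hypothetical stand-alone sketch; the body is an assumption, not the ICU source.
final class CodeUnitSketch {
    private CodeUnitSketch() {}

    /** Combines two bytes into an unsigned 16-bit code unit, high byte first. */
    static int codeUnit16FromBytes(byte hi, byte lo) {
        // Mask with 0xFF so that negative byte values are not sign-extended.
        return ((hi & 0xFF) << 8) | (lo & 0xFF);
    }

    public static void main(String[] args) {
        // The bytes FE FF, read high byte first, form U+FEFF (the BOM code point).
        System.out.println(Integer.toHexString(codeUnit16FromBytes((byte) 0xFE, (byte) 0xFF))); // prints "feff"
    }
}
```

For little-endian input, a caller would presumably pass the second byte of each pair as `hi` and the first as `lo`.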
adjustConfidence
static int adjustConfidence(int codeUnit, int confidence)