Predefined Character Classes
The following predefined character classes can also be used in a class definition:
[:ALPHA:] - Latin letters a..z and A..Z. With an accent-insensitive collation, this class also matches accented forms of these characters.
[:DIGIT:] - Decimal digits 0..9.
[:ALNUM:] - Union of [:ALPHA:] and [:DIGIT:].
[:UPPER:] - Uppercase Latin letters A..Z. Also matches lowercase with case-insensitive collation and accented forms with accent-insensitive collation.
[:LOWER:] - Lowercase Latin letters a..z. Also matches uppercase with case-insensitive collation and accented forms with accent-insensitive collation.
[:SPACE:] - Matches the space character (ASCII 32).
[:WHITESPACE:] - Matches horizontal tab (ASCII 9), linefeed (ASCII 10), vertical tab (ASCII 11), formfeed (ASCII 12), carriage return (ASCII 13) and space (ASCII 32).
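The difference between [:SPACE:] and [:WHITESPACE:] is easiest to see with a tab character. The query below is a minimal sketch, assuming a Firebird version that provides SIMILAR TO together with the IIF and ASCII_CHAR functions, and using the system table RDB$DATABASE as a one-row source. The column aliases are made up for illustration.

```sql
-- ASCII_CHAR(9) yields a horizontal tab; the aliases are illustrative only.
select
  -- A tab is not the space character, so [:SPACE:] does not match it.
  iif(('a' || ascii_char(9) || 'b') similar to 'a[[:SPACE:]]b', 'true', 'false') as tab_is_space,            -- false
  -- [:WHITESPACE:] includes the horizontal tab.
  iif(('a' || ascii_char(9) || 'b') similar to 'a[[:WHITESPACE:]]b', 'true', 'false') as tab_is_whitespace,  -- true
  -- An ordinary blank is matched by [:SPACE:].
  iif('a b' similar to 'a[[:SPACE:]]b', 'true', 'false') as blank_is_space                                    -- true
from rdb$database;
```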
Including a predefined class has the same effect as including all its members. Predefined classes are only allowed within class definitions. If you need to match against a predefined class and nothing more, place an extra pair of brackets around it.
'Erdbeere' similar to 'Erd[[:ALNUM:]]eere' -- true
'Erdbeere' similar to 'Erd[[:DIGIT:]]eere' -- false
'Erdbeere' similar to 'Erd[a[:SPACE:]b]eere' -- true
'Erdbeere' similar to '[[:ALPHA:]]' -- false
'E' similar to '[[:ALPHA:]]' -- true
If a class definition starts with a caret, everything that follows is excluded from the class. All other characters match:
'Framboise' similar to 'Fra[^ck-p]boise' -- false
'Framboise' similar to 'Fr[^a][^a]boise' -- false
'Framboise' similar to 'Fra[^[:DIGIT:]]boise' -- true
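A leading caret can negate a predefined class just as it negates individual characters. The following sketch assumes the IIF function and the one-row system table RDB$DATABASE are available; the test strings and column aliases are invented for illustration.

```sql
-- The middle position must hold a character that is neither a letter nor a digit.
select
  iif('A-1' similar to 'A[^[:ALNUM:]]1', 'true', 'false') as hyphen_in_middle,  -- true:  '-' is not in [:ALNUM:]
  iif('AB1' similar to 'A[^[:ALNUM:]]1', 'true', 'false') as letter_in_middle,  -- false: 'B' is a letter
  iif('A21' similar to 'A[^[:ALNUM:]]1', 'true', 'false') as digit_in_middle    -- false: '2' is a digit
from rdb$database;
```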
If the caret is not placed at the start of the sequence, the class contains everything before the caret, except for the elements that also occur after the caret:
'Grapefruit' similar to 'Grap[a-m^f-i]fruit' -- true
'Grapefruit' similar to 'Grap[abc^xyz]fruit' -- false
'Grapefruit' similar to 'Grap[abc^de]fruit' -- false
'Grapefruit' similar to 'Grap[abe^de]fruit' -- false
'3' similar to '[[:DIGIT:]^4-8]' -- true
'6' similar to '[[:DIGIT:]^4-8]' -- false
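A caret in the middle of a class behaves like a set difference: members listed before the caret, minus members listed after it. As a minimal sketch (assuming IIF and RDB$DATABASE are available, with made-up column aliases), the class below describes lowercase letters other than the vowels a, e, i, o and u:

```sql
-- [[:LOWER:]^aeiou] = lowercase Latin letters, minus the five listed vowels.
select
  iif('b' similar to '[[:LOWER:]^aeiou]', 'true', 'false') as consonant,  -- true:  in [:LOWER:], not excluded
  iif('e' similar to '[[:LOWER:]^aeiou]', 'true', 'false') as vowel,      -- false: in [:LOWER:], but excluded
  iif('7' similar to '[[:LOWER:]^aeiou]', 'true', 'false') as digit       -- false: not in [:LOWER:] at all
from rdb$database;
```

The third column shows that the part after the caret only removes members; a character that was never in the class before the caret still does not match.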
Lastly, the already mentioned wildcard ‘_’ is a character class of its own, matching any single character.
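Because the underscore stands for exactly one character and SIMILAR TO has to match the entire string, it behaves like any other single-character class. The query below is a short sketch under the same assumptions as the earlier examples in this section's style (IIF available, RDB$DATABASE as a one-row source, illustrative aliases):

```sql
select
  iif('Framboise' similar to 'Fra_boise', 'true', 'false') as one_char_gap,  -- true:  '_' stands in for 'm'
  iif('x' similar to '_', 'true', 'false') as single_char,                   -- true:  exactly one character
  iif('xy' similar to '_', 'true', 'false') as two_chars                     -- false: the pattern covers one character only
from rdb$database;
```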