U+002D(1); # HYPHEN-MINUS
U+002E(1); # FULL STOP
U+0030(1); # DIGIT ZERO
U+0031(1); # DIGIT ONE
U+0032(1); # DIGIT TWO
U+0033(1); # DIGIT THREE
U+0034(1); # DIGIT FOUR
U+0035(1); # DIGIT FIVE
U+0036(1); # DIGIT SIX
U+0037(1); # DIGIT SEVEN
U+0038(1); # DIGIT EIGHT
U+0039(1); # DIGIT NINE

# Required by Catalan
U+00B7; # MIDDLE DOT

# remove ligatures as we do not yet support canonicalizing to sequences
-U+00E6; # LATIN SMALL LETTER AE
-U+0153; # LATIN SMALL LIGATURE OE