http://www.perlmonks.org?node_id=1021221


in reply to Remove unicode "whitespace"

$ unichars -au '\s'
---- U+00009 CHARACTER TABULATION
---- U+0000A LINE FEED (LF)
---- U+0000C FORM FEED (FF)
---- U+0000D CARRIAGE RETURN (CR)
---- U+00020 SPACE
---- U+00085 NEXT LINE (NEL)
---- U+000A0 NO-BREAK SPACE
---- U+01680 OGHAM SPACE MARK
---- U+0180E MONGOLIAN VOWEL SEPARATOR
---- U+02000 EN QUAD
---- U+02001 EM QUAD
---- U+02002 EN SPACE
---- U+02003 EM SPACE
---- U+02004 THREE-PER-EM SPACE
---- U+02005 FOUR-PER-EM SPACE
---- U+02006 SIX-PER-EM SPACE
---- U+02007 FIGURE SPACE
---- U+02008 PUNCTUATION SPACE
---- U+02009 THIN SPACE
---- U+0200A HAIR SPACE
---- U+02028 LINE SEPARATOR
---- U+02029 PARAGRAPH SEPARATOR
---- U+0202F NARROW NO-BREAK SPACE
---- U+0205F MEDIUM MATHEMATICAL SPACE
---- U+03000 IDEOGRAPHIC SPACE

$ uniprops -a U+200E
U+200E ‹U+200E› \N{LEFT-TO-RIGHT MARK}
    \pC \p{Cf}
    All Any Assigned Bidi_C Bidi_Control BidiC InGeneralPunctuation C Other
    Case_Ignorable CI Cf Format Changes_When_NFKC_Casefolded CWKCF Common Zyyy
    Default_Ignorable_Code_Point DI General_Punctuation Graph Pat_WS
    Pattern_White_Space PatWS Print X_POSIX_Graph X_POSIX_Print
    Age=1.1 Bidi_Class=L Bidi_Class=Left_To_Right BC=L
    Block=General_Punctuation Canonical_Combining_Class=0
    Canonical_Combining_Class=Not_Reordered CCC=NR
    Canonical_Combining_Class=NR Script=Common Decomposition_Type=None DT=None
    East_Asian_Width=Neutral Grapheme_Cluster_Break=CN
    Grapheme_Cluster_Break=Control GCB=CN Hangul_Syllable_Type=NA
    Hangul_Syllable_Type=Not_Applicable HST=NA
    Joining_Group=No_Joining_Group JG=NoJoiningGroup Joining_Type=T
    Joining_Type=Transparent JT=T Line_Break=CM Line_Break=Combining_Mark
    LB=CM Numeric_Type=None NT=None Numeric_Value=NaN NV=NaN
    Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1
    Present_In=3.0 IN=3.0 Present_In=3.1 IN=3.1 Present_In=3.2 IN=3.2
    Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 Present_In=5.0 IN=5.0
    Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2 Present_In=6.0 IN=6.0
    SC=Zyyy Script=Zyyy Sentence_Break=FO Sentence_Break=Format SB=FO
    Word_Break=FO Word_Break=Format WB=FO _Case_Ignorable