perl handling of utf8
evilgoblin
created: 2006-05-01 18:25:15
I have a couple of questions. How do I single out one character by its Unicode code point for some operation (say, replacement or removal)? In a regex, do I use \x or \X? What is the difference between the two?
I also have a question about the "eq" operator. Say $var1 is a byte sequence with the internal UTF-8 flag on, and $var2 is the exact same byte sequence with the UTF-8 flag off. What would "$var1 eq $var2" return? I tested this by reading in a string, calling Encode::_utf8_on($string) on it, and then comparing the two. The result is true, but could someone explain the behaviour? I would have thought that one variable having the flag on and the other off would give a false result regardless of the byte sequence. Thanks.
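
For reference, here is a minimal sketch of the comparison described above. It assumes the string that was tested was plain ASCII (the post does not say); the variable names are made up for illustration. eq compares strings character by character rather than byte by byte, so the flag only changes the outcome when the bytes include something outside ASCII.

```perl
use strict;
use warnings;
use Encode ();

# Case 1: pure ASCII. The same bytes mean the same characters with the
# flag on or off, so eq reports the strings as equal.
my $ascii_off = "hello";
my $ascii_on  = "hello";
Encode::_utf8_on($ascii_on);    # flips the internal flag, bytes untouched
print "ASCII:     ", ($ascii_on eq $ascii_off ? "equal" : "not equal"), "\n";

# Case 2: the two bytes C3 A9 (the UTF-8 encoding of U+00E9).
# Flag off: two characters, U+00C3 and U+00A9.
# Flag on:  one character, U+00E9.
my $bytes_off = "\xC3\xA9";
my $bytes_on  = "\xC3\xA9";
Encode::_utf8_on($bytes_on);
print "non-ASCII: ", ($bytes_on eq $bytes_off ? "equal" : "not equal"), "\n";

# length() counts characters, which makes the difference visible.
printf "length with flag off: %d, with flag on: %d\n",
    length($bytes_off), length($bytes_on);

# Prints:
#   ASCII:     equal
#   non-ASCII: not equal
#   length with flag off: 2, with flag on: 1
```

Encode::_utf8_on is an internal function; for real input, Encode::decode('UTF-8', $bytes) is the usual way to turn bytes into characters.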

Re: perl handling of utf8
created: 2006-05-01 20:38:04
in the regex do I use \x or \X ? what is the difference between the 2?

RTFM: see perlre (perldoc perlre).
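
To make the difference concrete, here is a small sketch; the sample string and variable names are made up for illustration. \x{...} names one specific code point by its hex value (without braces, \x takes at most two hex digits, e.g. \xE9). \X takes no argument: it matches one whole "user-visible" character, that is, a base character together with any combining marks that follow it.

```perl
use strict;
use warnings;

# "e" + COMBINING ACUTE ACCENT (U+0301), WHITE SMILING FACE (U+263A), "z"
my $s = "e\x{301}\x{263A}z";

# \x{...}: single out one code point, here to remove or replace it.
(my $removed  = $s) =~ s/\x{263A}//g;
(my $replaced = $s) =~ s/\x{263A}/:)/g;

# "." matches one code point at a time; \X matches a base character
# plus its combining marks as a single unit.
my @codepoints = $s =~ /(.)/gs;
my @clusters   = $s =~ /(\X)/g;

printf "code points: %d, clusters: %d\n",
    scalar(@codepoints), scalar(@clusters);
print "after removal: ",
    join(" ", map { sprintf "U+%04X", ord($_) } split //, $removed), "\n";
printf "first cluster has %d code points\n", length($clusters[0]);

# Prints:
#   code points: 4, clusters: 3
#   after removal: U+0065 U+0301 U+007A
#   first cluster has 2 code points
```

So for singling out a character by code point, \x{...} is the one to use; \X is for stepping through text one displayed character at a time.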

Re: perl handling of utf8
created: 2006-05-01 23:44:32

I give an example of replacing unprintable ASCII characters with utf-8 in node 354858.

After Compline,
Zaxo
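
The code from node 354858 is not reproduced here. As a rough illustration of that kind of substitution, the following sketch maps ASCII control characters to the visible symbols in the Unicode "Control Pictures" block (U+2400 to U+241F, plus U+2421 for DEL). The choice of replacement characters and the sample data are assumptions, not taken from that node.

```perl
use strict;
use warnings;
use Encode ();

# Sample input: TAB, NUL and ESC mixed in with printable text.
my $raw = "a\tb\x00c\x1B";

# Each C0 control character 0x00-0x1F has a picture at U+2400 + its code.
(my $visible = $raw) =~ s/([\x00-\x1F])/chr(0x2400 + ord($1))/ge;

# DEL (0x7F) has its own symbol at U+2421.
$visible =~ s/\x7F/\x{2421}/g;

# Show the resulting code points rather than relying on the terminal.
print join(" ", map { sprintf "U+%04X", ord($_) } split //, $visible), "\n";

# Encode to UTF-8 bytes for output; each replacement symbol takes 3 bytes.
my $utf8_bytes = Encode::encode('UTF-8', $visible);
printf "characters: %d, UTF-8 bytes: %d\n",
    length($visible), length($utf8_bytes);

# Prints:
#   U+0061 U+2409 U+0062 U+2400 U+0063 U+241B
#   characters: 6, UTF-8 bytes: 12
```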

perlmonks.org content © perlmonks.org and Anonymous Monk, evilgoblin, Zaxo
