Substr cannot extract last special character

by juo (Curate)
I noticed that substr cannot extract the last character of a string when that character is a special symbol (for example microns, ohm, ...). If I have a string like 24Ω and want to get its unit, I use:

my $ohm = '24Ω';
$unit = substr $ohm, -1;
print "$unit\n";
# This will return a ? in my CMD; be aware that you cannot see the Ohm character

Does anybody have any idea how to resolve this? I noticed that a previous post stated that we could use the hex code for things like this (\x{00A1}), but does anybody have any idea how to get the hex code for a given special symbol? I found Unibook rather difficult to use for finding the right code; it is like searching a labyrinth for the right one.

Re: Substr cannot extract last special character
by Samy_rio (Vicar) on Aug 19, 2005 at 03:54 UTC

    Hi, if I understood your question correctly, here is my code:

    my $ohm = '24Ω';
    my $unit;
    print "Before subtract : $ohm";
    # Find the first character outside printable ASCII and whitespace
    if ($ohm =~ /([^!-~\s])/) {
        $unit = $1;
        $ohm =~ s/\Q$unit\E//;    # remove the unit from the string
    }
    print "\nUnit : $unit\nAfter subtract : $ohm";
    # This will return a ? in my CMD; be aware that you cannot see the Ohm character

    I hope this helps.

    Velusamy R.

      Sorry juo, I misunderstood the question. Here is my suggestion:

      my $ohm = '24 34 234 32 23 1 23';
      my ($unit, $foo);
      print "Before subtract : $ohm\n";
      # Find each character outside printable ASCII and whitespace
      while ($ohm =~ /([^!-~\s])/) {
          $unit = $1;
          $ohm =~ s/\Q$unit\E//;    # remove this occurrence of the unit
          $foo = '&#x' . sprintf("%04X", ord($unit)) . ';';
          print "Unit : $unit\tHex : $foo\n";
      }

      In CMD, this displays the following hexadecimal values:

      Hex : &#x00D9;
      Hex : &#x00C3;
      Hex : &#x00C6;
      Hex : &#x00CF;
      Hex : &#x00D0;
      Hex : &#x00BD;
      Hex : &#x00A7;

      When these hexadecimal values are viewed in IE as an HTML or XML file, I get the exact symbols that are present in the code.

      Please try this.

      Velusamy R.
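
      A note on the loop above: without any decoding, the pattern matches one byte at a time, so a multi-byte character is reported as several separate values. A minimal sketch of one way around that, assuming the input arrives as UTF-8 encoded bytes (the helper name non_ascii_codes and the sample string are made up for illustration), decodes first and then reports one code point per character:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return "U+XXXX" for every non-ASCII character.
# Assumption: $bytes holds UTF-8 encoded data.
sub non_ascii_codes {
    my ($bytes) = @_;
    my $chars = decode('UTF-8', $bytes);    # bytes -> characters
    return map { sprintf('U+%04X', ord $_) } $chars =~ /([^\x00-\x7F])/g;
}

# "24" + GREEK CAPITAL LETTER OMEGA, " 10" + MICRO SIGN, written as UTF-8 bytes
my @codes = non_ascii_codes("24\xCE\xA9 10\xC2\xB5");
print "@codes\n";    # prints: U+03A9 U+00B5
```

      The number after "U+" is the value that goes inside \x{...} in a Perl string.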

Re: Substr cannot extract last special character
by newroz (Monk) on Aug 19, 2005 at 08:09 UTC
    Hi, it shows Ω when a -2 offset is used, because this Unicode character is two bytes long and substr is counting bytes here.
    #!/usr/bin/perl
    my $ohm = '24Ω';
    $unit = substr $ohm, -2;    # take the last two bytes
    print "$unit\n";
      In particular, if you embed utf8 chars in the program source, you have to let perl know with the utf8 pragma:
      my $x = 'Ω';
      print length($x), "\n";    # prints 2

      compared with

      use utf8;
      my $x = 'Ω';
      print length($x), "\n";    # prints 1
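
      Putting the pieces of this thread together, here is a rough sketch of how the original question could be handled. It assumes the script file is saved as UTF-8 and that the symbol is the Greek capital omega (U+03A9); both are assumptions, not something the original post confirms. With the utf8 pragma, substr works on characters, and ord gives the code to use in \x{...}:

```perl
use strict;
use warnings;
use utf8;                              # string literals below are characters
use charnames ();                      # for charnames::viacode
binmode STDOUT, ':encoding(UTF-8)';    # encode characters when printing

my $ohm  = '24Ω';                      # assumes this file is saved as UTF-8
my $unit = substr $ohm, -1;            # last character, not last byte

printf "unit: %s\n", $unit;
printf "code: \\x{%04X}\n", ord $unit;                  # code: \x{03A9}
printf "name: %s\n", charnames::viacode(ord $unit);     # name: GREEK CAPITAL LETTER OMEGA

# The same string written with the escape instead of the literal symbol
my $same = "24\x{03A9}";
print "identical\n" if $same eq $ohm;
```

      Whether the symbol displays correctly in CMD still depends on the console code page and font, which this sketch does not address.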


