frac12 to decimal

by CColin (Scribe)
on Feb 09, 2009 at 03:38 UTC ( #742338=perlquestion )
CColin has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am trying to parse some HTML with TreeBuilder, which returns the fractional values '1/2', '1/4', '3/4', and I would like to represent these as decimals. I hoped I could look in HTML::Entities, find where the HTML entities are mapped to fractions, and change that mapping to decimals. I can see in the source that the "frac" entity names are mapped to the following characters (and vice versa):
frac14 => chr(188), frac12 => chr(189), frac34 => chr(190),
But I can't see how the "frac" values are converted to the displayed fractions, or how I can convert these fractions to decimals instead. Ideally I want frac14, frac12 and frac34 to convert to the values
.25, .5, .75
respectively. I suppose I could ignore the modules and go and try to change all the HTML by regexes, but that kind of misses the point and makes everything else harder to maintain. Thanks for any guidance.

Replies are listed 'Best First'.
Re: frac12 to decimal
by karoshi (Novice) on Feb 09, 2009 at 03:52 UTC

    Behold, the hash holding those is declared with "use vars":

    use vars qw(%entity2char %char2entity);

    So, after that package has been loaded and compiled, you could reach into that package and change just the values of those keys:

    # switch package
    package HTML::Entities;
    $entity2char{frac14} = '.25';
    $entity2char{frac12} = '.5';
    $entity2char{frac34} = '.75';
    # switch back to your package
    package main;
    # go on with your code
      Hi, thanks. I guess that answers the question of how to change the module's behaviour without changing the module, but this particular code did not bring about the desired behaviour. What is actually happening is still the piece I can't figure out. The hash in the module contains the following key-value pairs, which seem to govern this behaviour:
      frac14 => chr(188), frac12 => chr(189), frac34 => chr(190),
      Presumably chr(188) and its brethren are taken straight from the HTML, and the fracNN name each one is mapped to is then somehow translated for display as the tiny, one-character-wide fraction symbol. But stating:
      # switch package
      package HTML::Entities;
      $entity2char{frac14} = '.25';
      $entity2char{frac12} = '.5';
      $entity2char{frac34} = '.75';
      # switch back to your package
      package main;
      # go on with your code
      does not seem to change that behaviour.
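One way to narrow this down is to test the override against decode_entities directly, outside of TreeBuilder. The following is a minimal sketch, assuming HTML::Entities is installed; it uses fully qualified hash names instead of switching packages, and the variable names are illustrative only. It also checks a numeric character reference, which is resolved by number and so is not expected to go through the named-entity table at all. If the HTML being parsed contains numeric references or the raw characters themselves, that may be why the override appears to do nothing.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Override the named-entity table via its fully qualified name
# (equivalent to the "package HTML::Entities; ..." approach above).
$HTML::Entities::entity2char{frac14} = '.25';
$HTML::Entities::entity2char{frac12} = '.5';
$HTML::Entities::entity2char{frac34} = '.75';

# A named entity is looked up by name in %entity2char.
my $named_in = '&frac12;';
my $named    = decode_entities($named_in);

# A numeric reference is resolved by its number, not by name,
# so the override above is not expected to affect it.
my $numeric_in = '&#189;';
my $numeric    = decode_entities($numeric_in);

printf "named entity decodes to:      %s\n", $named;
printf "numeric ref is still chr(189): %s\n",
    ($numeric eq chr(189) ? 'yes' : 'no');
```

If the named entity decodes to the decimal string here but the TreeBuilder output still shows the fraction character, the input probably does not use the named entities.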

        from HTML/

        # Make the opposite mapping
        while (my($entity, $char) = each(%entity2char)) {
            $entity =~ s/;\z//;
            $char2entity{$char} = "&$entity;";
        }

        so try: change %char2entity as well

        package HTML::Entities;
        $char2entity{'.25'} = '&frac14;';
        $char2entity{'.5'}  = '&frac12;';
        $char2entity{'.75'} = '&frac34;';
        package main;
        ...

        or post some code to look at.

        &frac14; is an HTML entity. It maps directly to chr(188)/U+00BC, which in ISO-8859-1 is VULGAR FRACTION ONE QUARTER, and every program that knows how to display ISO-8859-1 will draw it on the screen as ¼
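The relationship between the entity names, chr() values and Unicode code points mentioned in this subthread can be printed with core Perl alone. This is only an illustrative sketch: the hash %codepoint_for and the helper describe() are hypothetical names, and the three pairs are the ones quoted from the module source earlier in the thread.

```perl
use strict;
use warnings;

# Entity names and code points as quoted from HTML::Entities in the thread.
my %codepoint_for = (
    frac14 => 188,
    frac12 => 189,
    frac34 => 190,
);

# Format one entity as "&name; -> chr(N) -> U+XXXX".
sub describe {
    my ($name) = @_;
    my $cp = $codepoint_for{$name};
    return sprintf '&%s; -> chr(%d) -> U+%04X', $name, $cp, $cp;
}

# Print the entities in code point order.
for my $name (sort { $codepoint_for{$a} <=> $codepoint_for{$b} } keys %codepoint_for) {
    print describe($name), "\n";
}
```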
Re: frac12 to decimal
by kennethk (Abbot) on Feb 09, 2009 at 03:47 UTC
    Any time you see chr, you are dealing with character codes: chr returns the character with the given number in Perl's internal character set. The displayed values are therefore not the strings 1/2, etc., but single characters that look similar. As such, to convert to a decimal, you need to associate each fraction character with the appropriate decimal value. The inverse function of chr is ord, so you could go through the Unicode character set (Unicode tables) to find all the appropriate values. I would recommend checking the CPAN modules HTML::Fraction and String::Fraction for some mappings you could "borrow".
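The association between fraction characters and decimal values described above could be sketched as a lookup hash plus a substitution, applied to the text after it has been decoded. This uses core Perl only; the hash %decimal_for and the function fractions_to_decimals() are hypothetical names, and only the three characters from the question are covered.

```perl
use strict;
use warnings;

# Map each single fraction character to the decimal text wanted in the question.
my %decimal_for = (
    chr(188) => '.25',   # U+00BC, one quarter
    chr(189) => '.5',    # U+00BD, one half
    chr(190) => '.75',   # U+00BE, three quarters
);

# Replace every fraction character in a copy of the text with its decimal form.
sub fractions_to_decimals {
    my ($text) = @_;
    $text =~ s/([\xBC-\xBE])/$decimal_for{$1}/g;
    return $text;
}

# Build the sample input with chr() so the source file stays pure ASCII.
my $sample = '1' . chr(188) . ' cups and ' . chr(189) . ' tsp';
print fractions_to_decimals($sample), "\n";
```

This leaves HTML::Entities untouched, so other code that relies on the module's standard mappings keeps working.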
