
passing token output to a variable

by SilverShadow (Initiate)
on Jan 05, 2013 at 23:29 UTC (#1011827)
SilverShadow has asked for the wisdom of the Perl Monks concerning the following question:

How can I store the output of "$token->as_is" in a variable in the following code, so that I can strip out extra spaces before printing it to the screen and also do other things with the output later?

I'd rather not use extra modules, to keep the code small, so I prefer to apply a regex on the fly at the final stage.

The commented-out lines (#) are my earlier attempts, so you can ignore them.

I also wonder why code is displayed in such a small font on this site. It is very hard to read unless I click the download link, and having to keep clicking to display code is not a comfortable way to follow a thread.


    use HTML::TokeParser::Simple;

    my $p = HTML::TokeParser::Simple->new(url => ' +xx');
    my $level;

    while (my $tag = $p->get_tag('div')) {
        my $class = $tag->get_attr('id');
        next unless defined($class) and $class eq 'content';
        $level += 1;
        while (my $token = $p->get_token) {
            $level += 1 if $token->is_start_tag('div');
            $level -= 1 if $token->is_end_tag('div');
            #$_ = s/<([\w-\:]+)>(.*?)<\/\1>/$2 /g;
            #print $_;
            next unless $token->is_text;
            #$cleaned = $token->as_is =~ s/\s{2,}/ /gs; # should remove extra spaces
            #print $cleaned;
            print $token->as_is;
            unless ($level) {
                last;
            }
        }
    }

Replies are listed 'Best First'.
Re: passing token output to a variable
by frozenwithjoy (Priest) on Jan 05, 2013 at 23:50 UTC
    The way you have it written, $cleaned just gets set to the number of substitutions that occurred. See this example:

        #!/usr/bin/env perl
        use strict;
        use warnings;
        use feature 'say';

        my $string = "This   is  a    string  with variable   numbers    of spaces.";
        say "original: $string";

        my $number_of_substitutions = $string =~ s|\s{2,}| |g;
        say "cleaned: $string";
        say "# of substitutions: $number_of_substitutions";

        __END__
        original: This   is  a    string  with variable   numbers    of spaces.
        cleaned: This is a string with variable numbers of spaces.
        # of substitutions: 6
    One alternative for you would be:
        my $cleaned = $token->as_is;
        $cleaned =~ s/\s{2,}/ /g;   # I took out the /s modifier. I thought it was only for transliteration (e.g., $cleaned =~ tr/ //s).
    A second alternative, if you are using 5.14+, is non-destructive substitution (with the /r modifier):
    my $cleaned = $token->as_is =~ s/\s{2,}/ /gr;
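    A note on the /s modifier mentioned above: on s/// it is valid, but it only changes what "." matches (it lets "." match a newline). The pattern \s{2,} contains no ".", and \s already matches newlines, so /s makes no difference here. The /s on tr/// is a separate feature that squeezes repeated characters. A minimal core-Perl sketch, assuming Perl 5.14+ for /r and using a made-up sample string in place of $token->as_is:

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for $token->as_is:
# "foo", 2 spaces, newline, 2 spaces, "bar", 3 spaces, "baz".
my $text = "foo" . (" " x 2) . "\n" . (" " x 2) . "bar" . (" " x 3) . "baz";

# With and without /s the result is identical, because the pattern has no '.'.
# /r returns the modified copy and leaves $text untouched.
my $with_s    = $text =~ s/\s{2,}/ /gsr;
my $without_s = $text =~ s/\s{2,}/ /gr;

# tr/ //s squeezes runs of literal spaces only, so the newline survives.
(my $squeezed = $text) =~ tr/ //s;

print "same result with and without /s\n" if $with_s eq $without_s;
print "[$with_s]\n";
print "[$squeezed]\n";
```

    Here the substitution collapses the newline together with the spaces around it, while tr/ //s leaves the newline in place.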
      Thank you, frozenwithjoy. Your code works perfectly on its own, but when I use it in my code it doesn't: the script just returns to the command prompt without any output, warnings, or error messages!
        Never mind, fixed now! Kind regards
Re: passing token output to a variable
by Athanasius (Bishop) on Jan 06, 2013 at 00:41 UTC
      Thank you very much, it works. What about e-mail notifications for node replies? :) I can't find that anywhere in the settings.
Re: passing token output to a variable (parens force assignment before substitution)
by Anonymous Monk on Jan 06, 2013 at 02:57 UTC

    To see your program the way perl sees it, run

    perl -MO=Deparse,-p

    Then write that expression as

    ( $cleaned = $token->as_is ) =~ s/\s{2,}/ /gs;

    parens force assignment before substitution
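
    A minimal core-Perl sketch of the difference, using made-up sample strings in place of $token->as_is (the variable names are only for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample text: "a", 3 spaces, "b", 4 spaces, "c".
my $src1 = "a" . (" " x 3) . "b" . (" " x 4) . "c";
my $src2 = $src1;

# Without parens, =~ binds tighter than =, so this is parsed as
# $count = ($src1 =~ s/.../ /g): $src1 is changed in place and
# $count receives the number of substitutions.
my $count = $src1 =~ s/\s{2,}/ /g;

# With parens, the assignment happens first and the substitution
# is applied to the copy; $src2 keeps its original spacing.
(my $cleaned = $src2) =~ s/\s{2,}/ /g;

print "count=$count src1=[$src1]\n";
print "cleaned=[$cleaned] src2=[$src2]\n";
```

    The parenthesized form also works on Perl versions older than 5.14, where the /r modifier is not available.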

      Thanks, I will try that
