Re^4: Parsing with RegEx into Array

by mr_p (Scribe)
on Jun 25, 2010 at 22:09 UTC

in reply to Re^3: Parsing with RegEx into Array
in thread Parsing with RegEx into Array

What does the 's' mean? Can you point me to any documentation on it, if you know of any?

I am also unable to print the utf8. Below is my code.

my $curLink1 = utf8::decode($curLink); # Use UNICODE semantics
my $item1 = utf8::decode($item);
my $fileName="/tmp/out_file.html";
use open OUT => ':utf8';
open OUT_FILE, "> $fileName";
print OUT_FILE "<item>$item1</item>";
close OUT_FILE;
#open (my $fh, '>:encoding (UTF-8)', $fileName);
#print $fh "<item>$item1</item>";
#close $fh;

I tried the commented-out code too. I also tried to print $item, which is encoded. Neither $item nor $item1 gets printed to the file, but both print on STDOUT.

Thanks for your help.

Replies are listed 'Best First'.
Re^5: Parsing with RegEx into Array
by ikegami (Pope) on Jun 25, 2010 at 22:19 UTC

    It's a modifier for the match operator. "/s" causes "." to match any byte/character. Without it, "." matches any byte/character except 0x0A/newline. Operators are documented in perlop. There's probably more info in perlre.
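
    To illustrate the effect of "/s", here is a minimal sketch. The sample string and variable names are made up for the example and are not taken from the original poster's data.

```perl
use strict;
use warnings;

# Hypothetical sample data: the second <item> spans two lines.
my $text = "<item>one</item>\n<item>two\nlines</item>\n";

# Without /s, "." cannot match the newline, so the multi-line item is missed.
my @without_s = $text =~ m{<item>(.*?)</item>}g;

# With /s, "." also matches the newline, so both items are captured.
my @with_s = $text =~ m{<item>(.*?)</item>}sg;

printf "without /s: %d item(s)\n", scalar @without_s;
printf "with /s:    %d item(s)\n", scalar @with_s;
```

    The non-greedy ".*?" keeps each capture from running past the first closing tag, which matters more once "." is allowed to cross line boundaries.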

    open(my $fh_in, '<:encoding(UTF-8)', ...) or die ...;
    ...
    my @allItems = $file_in =~ m{<item>(.*?)</item>}sg;
    ...
    open(my $fh_out, '>:encoding(UTF-8)', ...) or die ...;
    print $fh_out ...;
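
    One difference from the posted code that may be worth checking: utf8::decode converts its argument in place and returns a success flag, not the decoded string, so assigning its result to a new variable does not give you the text. A minimal sketch, assuming a hypothetical byte string as input:

```perl
use strict;
use warnings;

# Hypothetical input: "caf" followed by the UTF-8 bytes for e-acute (0xC3 0xA9).
my $item = "caf\xC3\xA9";
my $length_before = length $item;    # counts bytes at this point

# utf8::decode modifies $item itself; the return value only reports success.
my $ok = utf8::decode($item);
my $length_after = length $item;     # counts characters after decoding

print "decoded ok\n" if $ok;
print "length before: $length_before, after: $length_after\n";
```

    With an ':encoding(UTF-8)' layer on the input handle, as in the code above, the data is already decoded when it is read, so no separate utf8::decode call should be needed.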
      Isn't that the same thing I have for writing the file, in the code that is commented out?
Re^5: Parsing with RegEx into Array
by Corion (Pope) on Jun 25, 2010 at 22:16 UTC

    You never check whether opening the output file succeeded. See open. For your original query about regular expressions, see perlre and maybe perlretut.
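
    As a rough sketch of a checked open, something like the following could be used. The temporary file and the sample string are assumptions made so the example is self-contained; they are not the poster's actual path or data.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical output path: a temp file stands in for the real one.
my (undef, $file_name) = tempfile(UNLINK => 1);

# Hypothetical, already-decoded character string.
my $item = "caf\x{e9} \x{263A}";

# Three-argument open with the result checked; $! carries the OS error.
open(my $fh_out, '>:encoding(UTF-8)', $file_name)
    or die "Can't open '$file_name' for writing: $!";
print {$fh_out} "<item>$item</item>\n";
close($fh_out)
    or die "Can't close '$file_name': $!";

# Read the file back through the same layer to confirm the round trip.
open(my $fh_in, '<:encoding(UTF-8)', $file_name)
    or die "Can't open '$file_name' for reading: $!";
my $round_trip = do { local $/; <$fh_in> };
close($fh_in);

print "round trip ok\n" if $round_trip eq "<item>$item</item>\n";
```

    Note that "or" is used instead of "||". Without parentheses around open's arguments, "||" binds to the last argument (the file name string), so the die would never run when open fails.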

      I did have || die, but I removed it when I posted the code on PerlMonks.

