http://www.perlmonks.org?node_id=963950

Ekimino has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to parse an RTF file as plain text. I have succeeded in separating things, but I've been going around in circles with the output, which doesn't work the way I want it to. I think the abnormal behaviour comes from the RTF file. The output should look like this: $user:$hash . But it will only output $hash\n unless I separate them with a \t. Here is the code:

#!/usr/bin/perl
use strict;
use warnings;

chomp(my @file = <>);
my @users;
my @hashes;

#sub listar{
#    while(@file){
#        if($_ =~ /Data Found/g){
#            $_ =~ s/\\par//g;
#            $_ =~ s/Data Found: //g;
#            printf "$_\n";
#        }
#    }
#}

# Separate Users and add each one @users.
foreach(@file){
    chomp $_;
    if($_ =~ /User=/g){
        $_ =~ s/Data Found: //g;
        $_ =~ s/\\par//g;
        $_ =~ s/User=//g;
        unless($_ =~ /magela/){
            push(@users, $_);
        }
    }
}

# Separate Hashes and add each one to array @hashes
foreach(@file){
    chomp $_;
    if($_ =~ /Pass=/g){
        $_ =~ s/Data Found: //g;
        $_ =~ s/\\par//g;
        $_ =~ s/Pass=//g;
        push(@hashes, $_);
    }
}

pop @users;
my $n = @users;
printf "Numero de Usuarios: $n\n";
my $h = @hashes;
printf "Numero de Hashes: $h\n";
my $f = 0;
while($f < $n){
    <b>printf "$users[$f]\t$hashes[$f]\n"</b>;
    $f += 1;
}

Output with \t:

marianonc624d56fbf18eb79236e942c1478bc4e
fermins 8e6c5623ad9a544731661e3f872bb5f2
monicar 1cf5bd31c0bf0cb33eae5d75adfc2094

Output without \t

:5c0b18186c48d9e29f773cca0939b9c1
:c624d56fbf18eb79236e942c1478bc4e
:8e6c5623ad9a544731661e3f872bb5f2
:1cf5bd31c0bf0cb33eae5d75adfc2094

How I want it

username:hash

Before suspicious minds come along: yes, these are password hashes from a VM inside my lab. How should I handle RTF without using a module? Thank you.

Re: Parsing an RTF File as plain text.
by chromatic (Archbishop) on Apr 08, 2012 at 04:46 UTC

    Are you only printing this to a screen or to a file? Is it possible these files contain \r or \b characters? What if you added s/\s/ /g; after you chomp?
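
One way to check for such characters is to print each field with its non-printable bytes made visible. The sketch below is only an illustration: `show_hidden` is a hypothetical helper, and the sample value assumes the input has CRLF line endings, in which case `chomp` on a Unix-like system removes the `\n` but leaves the `\r` in place.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace every byte outside printable ASCII
# with a visible \xNN escape so stray control characters show up.
sub show_hidden {
    my ($text) = @_;
    $text =~ s/([^\x20-\x7e])/sprintf('\\x%02x', ord $1)/ge;
    return $text;
}

# Assumed sample: a username with the carriage return that a CRLF
# line ending leaves behind after chomp.
my $user = "marianon\r";
print show_hidden($user), "\n";    # prints: marianon\x0d
```

If a trailing `\x0d` appears, it would likely explain both outputs in the question: the terminal moves the cursor back to column 0 at the `\r`, so `:hash` overwrites the username, while a `\t` pushes the hash to the next tab stop and leaves a short username visible.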

    How should I handle RTF without using a module?

    The only reliable way to do so is to read the (competing) specifications, experiment a lot, build up a comprehensive test suite, and consider some form of prayer and fasting. If the whitespace experiment doesn't fix things, you're probably much better off using a module.


    Improve your skills with Modern Perl: the free book.

      The file did contain \r..\b line endings. I read about it in the llama, but I couldn't find the exact page.

      For the record: the solution was to remove all types of whitespace with s/\s//g, as superdoc said, and adjust the code accordingly. Thank you everyone for your time.
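
A minimal sketch of that fix, assuming the fields never contain meaningful internal whitespace. The helper names `clean_field` and `pair_up` are hypothetical, and the sample values are taken from the output shown in the question with a `\r` appended to mimic the CRLF input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove all whitespace, including the carriage
# return that chomp leaves behind on CRLF input.
sub clean_field {
    my ($field) = @_;
    $field =~ s/\s+//g;
    return $field;
}

# Hypothetical helper: pair users and hashes by index as "user:hash",
# stopping at the end of the shorter list.
sub pair_up {
    my ($users, $hashes) = @_;
    my $last = $#$users < $#$hashes ? $#$users : $#$hashes;
    my @pairs;
    for my $i (0 .. $last) {
        push @pairs, clean_field($users->[$i]) . ':' . clean_field($hashes->[$i]);
    }
    return @pairs;
}

my @users  = ("marianon\r", "fermins\r");
my @hashes = ("c624d56fbf18eb79236e942c1478bc4e\r",
              "8e6c5623ad9a544731661e3f872bb5f2\r");
print "$_\n" for pair_up(\@users, \@hashes);
```

Cleaning each field before it is pushed onto its array would work equally well; doing it at print time just keeps the sketch short.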

Re: Parsing an RTF File as plain text.
by tangent (Parson) on Apr 08, 2012 at 00:06 UTC
    This may or may not be connected, but why do you use printf? Try it with plain old print.
Re: Parsing an RTF File as plain text.
by NetWallah (Canon) on Apr 08, 2012 at 00:14 UTC
    Questions:
    • Why are you using 'printf' instead of 'print' ?
    • Where is the ":" in your output coming from ?
    Please do not put html formatting (<b>) in your code. It makes it hard to pinpoint the mistakes.

                 All great truths begin as blasphemies.
                       ― George Bernard Shaw, writer, Nobel laureate (1856-1950)

      print, printf, say: every one of those gives me the same error. The : comes from

      while($f < $n){ print "$users[$f]:$hashes[$f]\n"; $f += 1; }
        Ekimino - Are you posting as Anonymous ?
        That makes tracking responses difficult.

        Anyway - without some view of test data, it is difficult to see what the problem may be - for instance, your @users array could well be empty.

        The best bet is to post a small sample of pre-populated data for @users and @hashes, and your print code.

        If that does not work as you expect, we can help.


Re: Parsing an RTF File as plain text.
by james2vegas (Chaplain) on Apr 08, 2012 at 09:21 UTC
    Why not use something like RTF::Tokenizer to do the RTF heavy lifting?