Parsing an RTF File as plain text.

by Ekimino (Novice)
on Apr 07, 2012 at 22:46 UTC ( #963950=perlquestion )
Ekimino has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to parse an RTF file as plain text. I have succeeded in separating things, but I've been going around in circles with the output, which doesn't work the way I want it to. I think the abnormal behaviour comes from the RTF file. The output should look like this: $user:$hash. But it will only output $hash\n unless I separate them with a \t. Here is the code:

#!/usr/bin/perl
use strict;
use warnings;

chomp(my @file = <>);
my @users;
my @hashes;

#sub listar{
#    while(@file){
#        if($_ =~ /Data Found/g){
#            $_ =~ s/\\par//g;
#            $_ =~ s/Data Found: //g;
#            printf "$_\n";
#        }
#    }
#}

# Separate Users and add each one @users.
foreach(@file){
    chomp $_;
    if($_ =~ /User=/g){
        $_ =~ s/Data Found: //g;
        $_ =~ s/\\par//g;
        $_ =~ s/User=//g;
        unless($_ =~ /magela/){
            push(@users, $_);
        }
    }
}

# Separate Hashes and add each one to array @hashes
foreach(@file){
    chomp $_;
    if($_ =~ /Pass=/g){
        $_ =~ s/Data Found: //g;
        $_ =~ s/\\par//g;
        $_ =~ s/Pass=//g;
        push(@hashes, $_);
    }
}

pop @users;
my $n = @users;
printf "Numero de Usuarios: $n\n";
my $h = @hashes;
printf "Numero de Hashes: $h\n";
my $f = 0;
while($f < $n){
    <b>printf "$users[$f]\t$hashes[$f]\n"</b>;
    $f += 1;
}

Output with \t:

marianonc624d56fbf18eb79236e942c1478bc4e
fermins 8e6c5623ad9a544731661e3f872bb5f2
monicar 1cf5bd31c0bf0cb33eae5d75adfc2094

Output without \t:

:5c0b18186c48d9e29f773cca0939b9c1
:c624d56fbf18eb79236e942c1478bc4e
:8e6c5623ad9a544731661e3f872bb5f2
:1cf5bd31c0bf0cb33eae5d75adfc2094

How I want it:

username:hash

Before suspicious minds come along: yes, these are password hashes from a VM inside my lab. How should I handle RTF without using a module? Thank you.

Re: Parsing an RTF File as plain text.
by tangent (Curate) on Apr 08, 2012 at 00:06 UTC
    This may or may not be connected, but why do you use printf? Try it with plain old print.
Re: Parsing an RTF File as plain text.
by NetWallah (Abbot) on Apr 08, 2012 at 00:14 UTC
    Questions:
    • Why are you using 'printf' instead of 'print' ?
    • Where is the ":" in your output coming from ?
    Please do not put html formatting (<b>) in your code. It makes it hard to pinpoint the mistakes.

                 All great truths begin as blasphemies.
                       ― George Bernard Shaw, writer, Nobel laureate (1856-1950)

      print, printf, say: every one of those gives me the same error. The : comes from

      while($f < $n){
          print "$users[$f]:$hashes[$f]\n";
          $f += 1;
      }
        Ekimino - Are you posting as Anonymous ?
        That makes tracking responses difficult.

        Anyway - without some view of test data, it is difficult to see what the problem may be - for instance, your @users array could well be empty.

        The best bet is to post a small sample of pre-populated data for @users and @hashes, and your print code.

        If that does not work as you expect, we can help.
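
One way to get the "view of test data" asked for here is to dump the arrays with control characters made visible. The following is a minimal sketch, not the poster's code: the sample values and the trailing "\r" are assumptions, and `visible` is a hypothetical helper name. `$Data::Dumper::Useqq` is the core Data::Dumper setting that prints strings in double quotes with escapes such as \r and \t.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: return a one-line dump in which control
# characters show up as escapes instead of acting on the terminal.
sub visible {
    my ($value) = @_;
    local $Data::Dumper::Useqq  = 1;   # double-quoted output with \r, \t, \n escapes
    local $Data::Dumper::Terse  = 1;   # no "$VAR1 = " prefix
    local $Data::Dumper::Indent = 0;   # keep everything on one line
    return Dumper($value);
}

# Assumed sample data; the "\r" is a guess at what the RTF lines carry.
my @users  = ("fermins\r", "monicar\r");
my @hashes = ("8e6c5623ad9a544731661e3f872bb5f2",
              "1cf5bd31c0bf0cb33eae5d75adfc2094");

print visible(\@users),  "\n";
print visible(\@hashes), "\n";
```

If a stray carriage return is present, it appears in the dump as a literal backslash-r at the end of the affected strings, which is much easier to spot than a line that silently overwrites itself.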

                     All great truths begin as blasphemies.
                           ― George Bernard Shaw, writer, Nobel laureate (1856-1950)

Re: Parsing an RTF File as plain text.
by chromatic (Archbishop) on Apr 08, 2012 at 04:46 UTC

    Are you only printing this to a screen or to a file? Is it possible these files contain \r or \b characters? What if you added s/\s/ /g; after you chomp?

    How should I handle RTF without using a module?

    The only reliable way to do so is to read the (competing) specifications, experiment a lot, build up a comprehensive test suite, and consider some form of prayer and fasting. If the whitespace experiment doesn't fix things, you're probably much better off using a module.
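
    The \r hypothesis can be checked with a few lines. This is a sketch under assumptions: the username and hash are taken from the output shown in the question, and the CRLF line ending is assumed rather than confirmed from the original file. chomp removes only the trailing "\n" (the default $/), so a "\r" stays on the string; on a terminal it moves the cursor back to column 0, and whatever is printed next overwrites the username.

```perl
use strict;
use warnings;

# Assumed input: one line read from a file with CRLF line endings.
my $user = "fermins\r\n";
my $hash = "8e6c5623ad9a544731661e3f872bb5f2";

chomp $user;                    # removes "\n" only; "\r" is still there

# On a terminal the "\r" returns the cursor to column 0, so ":$hash"
# is drawn over the username.
my $broken = "$user:$hash";

# Strip trailing whitespace (including "\r") before joining.
(my $clean = $user) =~ s/\s+\z//;
my $fixed = "$clean:$hash";

printf "broken contains CR: %s\n", ($broken =~ /\r/ ? "yes" : "no");
print "$fixed\n";
```

    The same effect would account for the tab-separated output in the question: after the cursor returns to column 0, the tab advances it to column 8 without erasing anything, so a seven-letter name is followed by a gap and an eight-letter name runs straight into its hash.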


    Improve your skills with Modern Perl: the free book.

      The file did contain \r..\b line endings. I read about it in the Llama, but I couldn't find the exact page.

      For the record: the solution was to remove all types of whitespace with s/\s//g, as superdoc said, and adjust the code accordingly. Thank you, everyone, for your time.
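
      The adjusted code was not posted, so the following is only a sketch of what "remove all whitespace and adjust accordingly" might look like. The line layout ("Data Found: User=...\par") is an assumption inferred from the regexes in the question, `extract_pairs` is a hypothetical name, and the `magela` filter and the `pop @users` from the original are left out because they depend on the real data.

```perl
use strict;
use warnings;

# Hypothetical helper: turn raw lines into "user:hash" strings.
sub extract_pairs {
    my @lines = @_;
    my (@users, @hashes);

    for my $line (@lines) {
        (my $text = $line) =~ s/\\par//g;   # drop the RTF paragraph control word
        $text =~ s/Data Found: //g;         # drop the label before whitespace goes
        $text =~ s/\s//g;                   # remove all whitespace: \r, \n, \t, spaces

        if    ($text =~ /User=(.*)/) { push @users,  $1 }
        elsif ($text =~ /Pass=(.*)/) { push @hashes, $1 }
    }

    # Pair by position, stopping at the shorter list.
    my $count = scalar(@users) < scalar(@hashes) ? scalar(@users) : scalar(@hashes);
    return map { "$users[$_]:$hashes[$_]" } 0 .. $count - 1;
}

# Assumed sample lines, built from values shown in the question.
my @sample = (
    "Data Found: User=fermins\\par\r\n",
    "Data Found: Pass=8e6c5623ad9a544731661e3f872bb5f2\\par\r\n",
    "Data Found: User=monicar\\par\r\n",
    "Data Found: Pass=1cf5bd31c0bf0cb33eae5d75adfc2094\\par\r\n",
);

print "$_\n" for extract_pairs(@sample);
```

      Working on a copy of each line, rather than modifying $_ in place as the original loops do, keeps @file intact for the second pass. Pairing by position still assumes every User line has a matching Pass line in the same order.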

Re: Parsing an RTF File as plain text.
by james2vegas (Chaplain) on Apr 08, 2012 at 09:21 UTC
    Why not use something like RTF::Tokenizer to do the RTF heavy lifting?
