PerlMonks |
I frequently have to parse all kinds of output, and every case is different, but I tend to go through the following process. Once you get the hang of it, each step makes the next one trivial:
1) identify the lexical structure of the material -- can it be multiline, does indentation matter, etc.?
2) create a simple lexical analyser out of a hash of regexes and token names.
3) create a thrower or two that ejects whitespace and/or empty lines, comments etc.
4) create a trivial parser that calls the trivial lexer and thrower and has a subroutine to manage each type of opening landmark (encounter with an identifying string), typically loading it into a suitable structure or printing directly at the end of the section (via closing landmark)
5) if not printing as we go, traverse and print the structure

Update: code example of a lexer

1;

One world, one people

In reply to Re: Perl: Extracting specific text from a .txt file and outputting into a new format by anonymized user 468275
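
For illustration, the five steps above might be sketched roughly as follows. This is not the code from the original update; it is a minimal sketch that assumes a made-up, line-oriented input of `[section]` headers and `key = value` pairs where indentation does not matter. The token names, the sample data and the subroutine names (`lex`, `next_token`, `parse`, `report`) are all hypothetical.

```perl
use strict;
use warnings;

# Step 1 (assumed): input is line-oriented and indentation is not significant.

# Step 2: a hash of token names and regexes. The patterns are mutually
# exclusive, so the order in which they are tried does not matter.
my %TOKEN = (
    SECTION => qr/^\[(\w+)\]$/,
    PAIR    => qr/^(\w+)\s*=\s*(.*)$/,
    COMMENT => qr/^#/,
    BLANK   => qr/^$/,
);

# Trivial lexer: turn one line into [token_name, captures...].
sub lex {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    for my $name (sort keys %TOKEN) {
        if (my @caps = $line =~ $TOKEN{$name}) {
            return [$name, @caps];
        }
    }
    return ['UNKNOWN', $line];
}

# Step 3: the thrower discards comments and blank lines and hands back
# the next interesting token, or nothing when the input is exhausted.
my %THROW = map { $_ => 1 } qw(COMMENT BLANK);

sub next_token {
    my ($lines) = @_;
    while (defined(my $line = shift @$lines)) {
        my $tok = lex($line);
        return $tok unless $THROW{ $tok->[0] };
    }
    return;
}

# Step 4: trivial parser with one handler per landmark. A SECTION token
# opens a new hash; PAIR tokens are loaded into the currently open one.
sub parse {
    my (@lines) = @_;
    my (%data, $current);
    my %handler = (
        SECTION => sub { $current = $data{ $_[0] } = {} },
        PAIR    => sub { $current->{ $_[0] } = $_[1] if $current },
        UNKNOWN => sub { warn "skipping unrecognised line: $_[0]\n" },
    );
    while (my $tok = next_token(\@lines)) {
        my ($name, @caps) = @$tok;
        $handler{$name}->(@caps);
    }
    return \%data;
}

# Step 5: traverse the structure and produce the new output format.
sub report {
    my ($data) = @_;
    my @out;
    for my $section (sort keys %$data) {
        for my $key (sort keys %{ $data->{$section} }) {
            push @out, "$section.$key=$data->{$section}{$key}";
        }
    }
    return @out;
}

# Hypothetical sample input.
my @input = split /\n/, <<'END';
# settings
[server]
host = alpha
port = 8080

[client]
host = beta
END

print "$_\n" for report(parse(@input));
```

Adding a new kind of line to a sketch like this means adding one regex to `%TOKEN` and one handler in `parse`; the lexer and thrower stay untouched, which is the sense in which each step makes the next one trivial.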