PerlMonks  

Re: Parsing a Formatted Text File

by GotToBTru (Prior)
on Mar 23, 2015 at 14:51 UTC ( [id://1120994]=note )


in reply to Parsing a Formatted Text File

One of the keys to processing files like this is knowing how confident you can be in the location of any piece of data. The most likely things to mess this up are data that exceeds its usual field size and spills onto an extra line in the output, or missing data that results in a shorter document than you expect. You have very few fields with tags to help you identify them, so position is going to be how you identify what you're seeing at any given location in the document. I would use regexes and code very defensively, making sure dates look like dates, phone numbers look like phone numbers, and prices look like prices.
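
As a rough illustration of "making sure dates look like dates", here is a minimal sketch of per-field checks. The patterns, the `%looks_like` table, and the `checked_field` helper are all hypothetical; the real formats depend on what the client's program actually writes, so treat these as placeholders to adjust.

```perl
use strict;
use warnings;

# Hypothetical field patterns (assumptions, not taken from the real file).
my %looks_like = (
    date  => qr/\A\d{2}\/\d{2}\/\d{4}\z/,                 # e.g. 03/23/2015
    phone => qr/\A\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\z/,   # e.g. (555) 123-4567
    price => qr/\A\$?\d+(?:,\d{3})*\.\d{2}\z/,            # e.g. $1,234.56
);

# Trim the raw text and return it if it matches the expected shape.
# Otherwise die with the line number so the bad record can be found.
sub checked_field {
    my ($type, $raw, $line_no) = @_;
    my $pattern = $looks_like{$type}
        or die "No pattern defined for field type '$type'\n";
    (my $value = $raw) =~ s/\A\s+|\s+\z//g;
    die "Line $line_no: '$value' does not look like a $type\n"
        unless $value =~ $pattern;
    return $value;
}
```

Failing loudly on the first field that does not match is a design choice: if an overflowing field has shifted everything down a line, it is usually better to stop than to load a phone number into a price column.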

Dum Spiro Spero

Replies are listed 'Best First'.
Re^2: Parsing a Formatted Text File
by marinersk (Priest) on Mar 23, 2015 at 20:28 UTC

    Huge fan of defensive coding.

    I saw the "skip 58 lines" concept above and shuddered. More war stories than I can count stem from inheriting that kind of blind-faith logic.

    Give me something -- anything I can rely on -- in the data itself, and the shakes subside.

    I know it works. It works a lot. But it also fails a lot. Line counting for me is a tool of last resort unless I can be convinced the data format is solid.

    And I'm a hard sell in that department. Too many war wounds.

    :: shudder ::
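
One way to lean on "something in the data itself" rather than a fixed line count is to scan forward for a recognizable marker and fail if it never turns up. The sketch below assumes a hypothetical `Invoice Date:` tag and a made-up `find_anchor` helper; neither comes from the original file, so substitute whatever reliable text the real format offers.

```perl
use strict;
use warnings;

# Scan forward from $start until a line matches $anchor; return its index.
# Dies if the anchor never appears, instead of silently reading the wrong line.
sub find_anchor {
    my ($lines, $anchor, $start) = @_;
    $start //= 0;
    for my $i ($start .. $#{$lines}) {
        return $i if $lines->[$i] =~ $anchor;
    }
    die "Anchor $anchor not found from line $start onward\n";
}

# Hypothetical sample data; the "Invoice Date:" tag is an assumed marker.
my @lines = (
    "ACME Widgets",
    "",
    "Invoice Date: 03/23/2015",
    "Total: \$19.99",
);

my $i = find_anchor(\@lines, qr/^Invoice Date:/);
my ($date) = $lines[$i] =~ /^Invoice Date:\s*(\S+)/;
```

If a field overflows and pushes everything down a line, the scan still lands on the tagged line, where a hard-coded skip would quietly read its neighbor.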

Re^2: Parsing a Formatted Text File
by Pharazon (Acolyte) on Mar 23, 2015 at 14:57 UTC
    Overall, I would say the positioning will be fairly good, because I know our client has a program that outputs these files. But I am absolutely prepped to regex the crap out of the fields that I can, because that formatting is human controlled, and on the other end of the process that formatting will matter.
