PerlMonks  

Re^5: Converting a Text file to XML

by graff (Chancellor)
on Nov 18, 2011 at 03:07 UTC


in reply to Re^4: Converting a Text file to XML
in thread Converting a Text file to XML

If you're going to manually insert field delimiters, then you could just switch to using split:

my ( $title, $date, $this, $that ) = split /\|/;

But it's likely that the data will mostly fall into a few dominant format groups, with a long tail of "outliers". You could either apply a list of regex matches in order (if the first one doesn't match, try the next one, and so on), or you could run some simple diagnostics to divide the data into subsets according to the absence, presence, or type of difficulty: more than one 4-digit string is one problem; no double quotes (or an odd number of quotes) is another; and so on. Either way, this will reduce the number of cases that need to be fixed by hand before they can be parsed.
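
Here is a rough sketch of both ideas, just to show the shape of it. I haven't seen the real data, so the two patterns, the sample lines, and the names parse_line and diagnose are all made up for illustration; you would swap in patterns that match your own dominant formats.

```perl
use strict;
use warnings;

# Hypothetical record layouts, most common first. These are assumptions,
# not the actual formats in the original poster's file.
my @patterns = (
    qr/^"([^"]+)"\s+(\d{4})\s+(.*)$/,    # "Quoted Title" 1999 rest
    qr/^([^,]+),\s*(\d{4}),\s*(.*)$/,    # Title, 1999, rest
);

# Try each pattern in order; return the captures from the first that matches,
# or an empty list if none of them do.
sub parse_line {
    my ($line) = @_;
    for my $re (@patterns) {
        if ( my @fields = $line =~ $re ) {
            return @fields;
        }
    }
    return;
}

# Cheap checks that label why a line is likely to be hard to parse.
sub diagnose {
    my ($line) = @_;
    my @problems;
    my $years  = () = $line =~ /\b\d{4}\b/g;    # count 4-digit strings
    my $quotes = ( $line =~ tr/"// );           # count double quotes
    push @problems, 'multiple 4-digit strings' if $years > 1;
    push @problems, 'no double quotes'         if $quotes == 0;
    push @problems, 'odd number of quotes'     if $quotes % 2;
    return @problems;
}

# Invented sample lines, only to exercise the two subs.
my @samples = (
    '"Perl Basics" 2004 intro text',
    'Perl Basics, 2004, intro text',
    'Perl Basics 1999 reissued 2004',
);

for my $line (@samples) {
    if ( my ( $title, $date, $rest ) = parse_line($line) ) {
        print "parsed: $title | $date | $rest\n";
    }
    else {
        # Group the leftovers by problem type so they can be fixed in batches.
        print "unparsed (", join( '; ', diagnose($line) ), "): $line\n";
    }
}
```

Ordering the patterns from most to least common keeps the usual case cheap, and sorting the unparsed lines by their diagnose labels tells you which new pattern (or which manual fix) would clear the biggest batch.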

