PerlMonks  


The first thing I want to point out is that splitting on commas is not enough to parse CSV. Your parser will break on any field that contains a comma, such as "Bring your water, beer, and trash bag." That alone is enough of a reason for me to say: use Text::CSV_PP instead so you get a real parser.
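A minimal sketch of the difference, assuming a made-up record whose middle field is quoted because it contains commas. The sample line and variable names are hypothetical; the Text::CSV_PP calls (new, parse, fields) are the module's documented interface.

```perl
use strict;
use warnings;
use Text::CSV_PP;

# Hypothetical record: the middle field contains commas, so it is quoted.
my $line = 'Company Picnic,"Bring your water, beer, and trash bag",noon';

# Naive approach: the quoted field is torn apart at its embedded commas.
my @naive = split /,/, $line;

# Real parser: quoted fields come back intact.
my $csv = Text::CSV_PP->new( { binary => 1 } )
    or die "Cannot create a Text::CSV_PP parser";
$csv->parse($line) or die "Could not parse line: $line";
my @fields = $csv->fields;

printf "split found %d fields, Text::CSV_PP found %d\n",
    scalar @naive, scalar @fields;
print "$fields[1]\n";
```

The split version also leaves the quote characters stuck to the fragments, so even gluing the pieces back together would not recover the original value cleanly.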

Your XML generator is fragile and does lots of unnecessary work. You can see that it's fragile because you're already going through a lot of pain trying to change your generator when the requirements change. I have to second dHarry and recommend you use XML::Simple instead. Even when the hash structure changes, generating output can be as simple as print XMLout($hash_ref);
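Roughly what that looks like, as a sketch rather than a drop-in replacement. The hash contents, the map_link field and the example.com URL are invented for illustration, and I'm assuming you want nested elements rather than attributes, which is what the NoAttr option asks for.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical event data. Keys become element names, so they must be
# valid XML names (no spaces).
my %event = (
    theme    => 'Luau',
    location => 'Riverside Park',
);

# NoAttr => 1 emits values as nested elements instead of attributes;
# RootName sets the name of the outer element.
my $xml = XMLout( \%event, RootName => 'package', NoAttr => 1 );
print $xml;

# Adding a field later needs no change to the generator itself.
$event{map_link} = 'http://example.com/map';
my $xml_with_map = XMLout( \%event, RootName => 'package', NoAttr => 1 );
print $xml_with_map;
```

Note that a key like 'picnic theme' would not survive this as-is, because an element name can't contain a space; you would have to rename such keys first.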

The bulk of the unnecessary work in the XML generator is that you assign a variable for each of the values in the hash when there's no reason to do so. Just use the hash directly...

    my $gXtext = <<"GXEOF";
    <?xml version="1.0" encoding="UTF-8"?>
    <package xmlns="http://greateventbulatine/event/organizer"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <theme>${$gXHHRef}{'picnic theme'}</theme>
    </package>
    GXEOF

For escaping HTML entities, you might want HTML::Entities.
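For example, a small sketch with a made-up theme value. I'm passing the optional second argument to encode_entities so that only the characters that are unsafe in XML get escaped; without it, HTML::Entities may also produce named HTML entities that an XML parser won't know.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical value containing characters that would break the markup.
my $theme = 'Beer & <Brats>';

# Escape only <, >, & and the double quote.
my $safe = encode_entities( $theme, '<>&"' );

print "<theme>$safe</theme>\n";    # <theme>Beer &amp; &lt;Brats&gt;</theme>
```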

Now, if, after all that, you still want fragile code that's hard to maintain, well, you've already spelled out what you want to do, so do it. You have to store the last header and add a conditional that loops through the file until your conditions are met. Assuming that each record is in a separate file, I suppose I'd go at it something like this...

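    use strict;
    use warnings;
    use XML::Simple;

    # assumption: the CSV file name is the first command-line argument
    open my $csv, '<', $ARGV[0] or die "Can't open $ARGV[0]: $!";

    # scalar forces a single-line read; in list context <$csv> slurps the file
    my @headers = parseCSVLine( scalar <$csv> );
    my @record  = parseCSVLine( scalar <$csv> );

    my @extra_records;
    while ( my $line = <$csv> ) {
        my @xrec = parseCSVLine($line);
        push @extra_records, \@xrec if @xrec;
    }
    close $csv;

    my %data;
    @data{@headers} = @record;

    # but I still don't know what to do with those extra records so ...
    print XMLout( \%data );

    # yet another lame CSV "parser", use Text::CSV
    sub parseCSVLine {
        my ($line) = @_;
        return unless defined $line;
        chomp $line;
        return split /,/, $line;
    }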

I still don't know what the boundary between records is. Normally, you would expect a line terminator but with multiline records, you need to use something other than the typical line terminator. In the code above, I assumed that the boundary was the file but perhaps that's wrong. In any case, you really ought to use a clear record boundary.
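To show what I mean by a clear boundary, here is one possibility, purely as an assumption about your data: separate records with a blank line and let Perl's paragraph mode hand you one whole record per read. The sample data is made up and is read from an in-memory filehandle so the sketch stands on its own.

```perl
use strict;
use warnings;

# Hypothetical input: two multiline records separated by a blank line.
my $input = <<"END";
theme,location
Luau,Riverside Park

theme,location
Chili Cookoff,Town Hall
END

open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = "";    # paragraph mode: a record ends at one or more blank lines
    while ( my $record = <$fh> ) {
        chomp $record;    # in paragraph mode this strips all trailing newlines
        push @records, [ split /\n/, $record ];
    }
}
close $fh;

printf "%d records, %d lines in the first\n",
    scalar @records, scalar @{ $records[0] };
```

Each entry in @records is then a list of raw lines that still needs a real CSV parser; the only thing this settles is where one record stops and the next begins.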

I really think you should use the modules so you don't end up where you are now: with a collection of scripts that all have fragile parsers and you have to go tweak every script every time there's a minor change to the data format. Like, what happens when people want a link to the map to get to the picnic? Because you've locked yourself into this format, you'll have to tweak every file that touches that data. Try to take a more fluid approach where you can. My lame example script above doesn't care at all what data is in the files and will still just work.

Above all, my number one reason to recommend the modules is, you could have been done by now and you would have something that's stable, robust, easy to read and, therefore, easy to maintain.


In reply to Re^5: Multiline CSV and XML by rowdog
in thread Multiline CSV and XML by sanju7
