PerlMonks  

Re: Parsing an file that has xml like syntax

by choroba (Cardinal)
on Apr 02, 2014 at 18:58 UTC (id://1080822)


in reply to Parsing an file that has xml like syntax

I usually process similar files line by line, keeping the state (current project, current jobs) in variables declared outside the loop, so that they survive from one line to the next. It is not clear from your description whether a job can have more than one type and file, so I assumed it can't.
#!/usr/bin/perl
use warnings;
use strict;

sub output {
    my ($id, @jobs) = @_;
    print join("\t", $id, @{$_}{qw(JOBID TYPE FILE)}), "\n" for @jobs;
}

my @jobs;
my $id;
while (<>) {
    my ($tag, $value) = m{<(.*)>(.*)</\1>} or next;
    if ('PROJECT_ID' eq $tag) {
        output($id, @jobs);
        $id = $value;
        @jobs = ();
    } else {
        $tag =~ s/[0-9]+$//;
        $#jobs++ if 'JOBID' eq $tag;
        $jobs[-1]{$tag} = $value;
    }
}
output($id, @jobs); # Don't forget to output the last job.
لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Re^2: Parsing an file that has xml like syntax
by Anonymous Monk on Apr 02, 2014 at 19:20 UTC

    Wow, choroba, that's great, and you wrote it very quickly. Thanks ever so much. It looks like this will do exactly what I am looking for. Yes, you are right: there can only be one TYPE* and one FILE* associated with each JOBID. Once I have these values for a JOBID, I ultimately need to supply them as arguments to a program and execute it. So after reading the first JOBID, I need the values of TYPE1 and FILE1, and I then execute another program with those values. Once that has finished, I read the next JOBID in the file, get the values of TYPE2 and FILE2, and execute the program again with those. This continues until there are no more JOBIDs defined in the file.
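
The per-job execution described above could be sketched roughly as follows. This is only an illustration built on the line-by-line approach from the parent node, not a tested solution: the helper names `read_jobs` and `command_for` are made up for this example, the program name and its `-action`/`-file` options are copied from the examples later in this thread, and the sample data is an assumption about what the real file looks like. The sketch collects the jobs first and only prints the commands; the commented-out `system` call shows where real execution would go.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: turn one job record into an argument list.
# 'myprogram', '-action' and '-file' come from the examples in this thread.
sub command_for {
    my ($job) = @_;
    return ('myprogram', '-action', $job->{TYPE}, '-file', $job->{FILE});
}

# Hypothetical helper: read tag lines, return one hash per JOBID in file order.
sub read_jobs {
    my ($fh) = @_;
    my @jobs;
    while (my $line = <$fh>) {
        my ($tag, $value) = $line =~ m{<(\w+)>(.*)</\1>} or next;
        next if $tag eq 'PROJECT_ID';
        $tag =~ s/[0-9]+$//;                 # TYPE1 -> TYPE, FILE2 -> FILE
        push @jobs, {} if $tag eq 'JOBID';   # a JOBID line starts a new job
        next unless @jobs;                   # ignore TYPE/FILE before any JOBID
        $jobs[-1]{$tag} = $value;
    }
    return @jobs;
}

# Assumed sample input, kept in a string so the sketch runs on its own.
my $sample = <<'END';
<PROJECT_ID>12345</PROJECT_ID>
<JOBID>101</JOBID>
<TYPE1>add</TYPE1>
<FILE1>/tmp/file_data_gros</FILE1>
<JOBID>104</JOBID>
<TYPE2>delete</TYPE2>
<FILE2>/tmp/file_myvalues</FILE2>
END

open my $fh, '<', \$sample or die "Cannot read sample: $!";
my @commands = map { [ command_for($_) ] } read_jobs($fh);
close $fh;

for my $cmd (@commands) {
    print "would run: @$cmd\n";
    # To really execute, replace the print with something like:
    # system(@$cmd) == 0 or warn "'@$cmd' failed: $?\n";
}
```

Passing the command to `system` as a list rather than as one string avoids having the shell re-interpret file names that contain spaces or metacharacters.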

Re^2: Parsing an file that has xml like syntax
by crusty_collins (Friar) on Apr 02, 2014 at 19:33 UTC
    I think this is the way to do it.
    use strict;
    use warnings;
    use Data::Dumper;

    my $config;
    while (<DATA>) {
        chomp;
        my ($key, $variable) = ( $_ =~ /<(\w+?)>([^<]*)/ );
        next unless defined $key;    # skip lines that contain no tag
        if ($key =~ m/JOBID/) {
            # JOBID can occur several times, so collect the values in an array
            push (@{$config->{JOBID}}, $variable);
        } else {
            $config->{$key} = $variable;
        }
    }
    print Dumper $config;

    __DATA__
    <PROJECT_ID>12345</PROJECT_ID>
    <JOBID>101</JOBID>
    <JOBID>102</JOBID>
    <JOBID>103</JOBID>
    <TYPE1>add</TYPE1>
    <FILE1>/tmp/file_data_gros</FILE1>
    <JOBID>102</JOBID>
    <TYPE2>delete</TYPE2>
    <FILE2>/tmp/file_myvalues</FILE2>
    Output:
    $VAR1 = {
        'JOBID' => [
            '101',
            '102',
            '103',
            '102'
        ],
        'FILE2' => '/tmp/file_myvalues',
        'TYPE2' => 'delete',
        'FILE1' => '/tmp/file_data_gros',
        'TYPE1' => 'add',
        'PROJECT_ID' => '12345'
    };

      Hi crusty_collins, I think I understand what you are doing here. Are you creating an array of JOBID values and then associating each JOBID value with the corresponding values for TYPE and FILE? So, for instance, if this was seen in the file:

      <PROJECT_ID>12345</PROJECT_ID>
      <JOBID>101</JOBID>
      <TYPE1>add</TYPE1>
      <FILE1>/tmp/file_data_gros</FILE1>
      <JOBID>104</JOBID>
      <TYPE2>delete</TYPE2>
      <FILE2>/tmp/file_myvalues</FILE2>

      I would need to process this file like so: once the first JOBID and its associated TYPE1 and FILE1 values have been found, execute my program with these values:

      `myprogram -action add -file /tmp/file_data_gros`

      The loop would then carry on to see whether there is another JOBID. If there is, as in this example, my program would be executed again:

      `myprogram -action delete -file /tmp/file_myvalues`

      The loop would then carry on to see whether there is yet another JOBID, and if so, it would find the values of the next TYPE* and FILE* fields and execute my program again.

      I don't fully understand your code but how would I loop over this file to do the above and execute my program?
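
One way the flat hash from the reply above might be turned into such a loop is sketched below. It rests on an assumption that is not confirmed anywhere in the thread: that the Nth JOBID in the file belongs to the keys TYPE<N> and FILE<N>. The helper names `read_config` and `jobs_from_config` are invented for this illustration, and the program is only printed, not executed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse every <TAG>value line into one flat hash,
# e.g. TYPE1 => 'add'; repeated JOBID values are collected in an array.
sub read_config {
    my ($fh) = @_;
    my %config;
    while (my $line = <$fh>) {
        my ($key, $value) = $line =~ /<(\w+)>([^<]*)/ or next;
        if ($key eq 'JOBID') { push @{ $config{JOBID} }, $value }
        else                 { $config{$key} = $value }
    }
    return \%config;
}

# Assumption: JOBID number N (1-based position in the file) pairs with
# the keys TYPE<N> and FILE<N>.
sub jobs_from_config {
    my ($config) = @_;
    my @ids = @{ $config->{JOBID} || [] };
    my @jobs;
    for my $n (1 .. scalar @ids) {
        my ($type, $file) = @{$config}{"TYPE$n", "FILE$n"};
        next unless defined $type && defined $file;   # incomplete job: skip
        push @jobs, { id => $ids[$n - 1], type => $type, file => $file };
    }
    return @jobs;
}

# Sample input taken from the example in this post.
my $sample = <<'END';
<PROJECT_ID>12345</PROJECT_ID>
<JOBID>101</JOBID>
<TYPE1>add</TYPE1>
<FILE1>/tmp/file_data_gros</FILE1>
<JOBID>104</JOBID>
<TYPE2>delete</TYPE2>
<FILE2>/tmp/file_myvalues</FILE2>
END

open my $fh, '<', \$sample or die "Cannot read sample: $!";
my @jobs = jobs_from_config(read_config($fh));
close $fh;

for my $job (@jobs) {
    my @cmd = ('myprogram', '-action', $job->{type}, '-file', $job->{file});
    print "job $job->{id}: @cmd\n";
    # system(@cmd) == 0 or warn "job $job->{id} failed: $?\n";
}
```

If the numeric suffixes do not follow the order of the JOBID lines, this pairing breaks down, and keeping the current job as state while reading line by line is likely the safer choice.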
