
Re: Parse a file into a hash

by raybies (Chaplain)
on Apr 05, 2012 at 13:01 UTC (#963670)

in reply to [Resolved] Parse a file into a hash

kazak, we're going to need a whole lot more information than this in order to help you. Sometimes in the process of giving out that information, the answers become obvious, so don't be afraid to share.

My first question would be, how do I determine whether a data entry in a file is a key or a value?

Then I would think about how to write a regex that could detect a key, a value, and a key/value pair.

From your example data given, it's not clear why key5 and key 6 are empty, when you put value 5 and value 6 in your example. Shouldn't they go together? If not, how do you determine that a data value goes with a key?

Is the data somehow identifiable?

Are key5 and key6 empty because they appear alone? Is that what makes them empty?

Here's a thought:

Suppose you could detect key or value, would this pseudocodish outer loop help?

    my %hashnew;
    my $current_key   = undef;
    my $current_value = undef;
    while (<DATA>) {
        my ($key, $value) = assign_key_value_from_current_line($_);
        $current_key = $key if defined $key;
        $hashnew{$current_key} = $value;
    }

You'd then have to write the sub assign_key_value_from_current_line so that, if only a key appears on the line, it returns undef for the value, and if only a value appears, it returns undef for the key, in which case the loop keeps using the most recent key. You may want an additional check in the main loop for the case where there's no key yet but there are values, which would otherwise be tossed on the ground. When both appear, the current key is reassigned and the new value is recorded with it.
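As a rough sketch of what such a sub might look like: the version below assumes, purely for illustration, that each line has the form `key;value` with either field possibly empty. The delimiter, the sample lines, and the extra "no key yet" check are assumptions, not something the original question specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: splits one line of the ASSUMED "key;value" format.
# Returns (key, value); either element is undef when that field is
# missing or empty.
sub assign_key_value_from_current_line {
    my ($line) = @_;
    chomp $line;
    my ($key, $value) = split /;/, $line, 2;
    for ($key, $value) {
        next unless defined;
        s/^\s+|\s+$//g;            # trim surrounding whitespace
        $_ = undef if $_ eq '';    # treat an empty field as missing
    }
    return ($key, $value);
}

# Same outer loop as in the post, plus a guard for values seen before any key.
my %hashnew;
my $current_key;
for my $line ("key1;value1\n", ";value2\n", "key3\n") {   # made-up sample lines
    my ($key, $value) = assign_key_value_from_current_line($line);
    $current_key = $key if defined $key;
    next unless defined $current_key;    # no key yet: skip the orphaned value
    $hashnew{$current_key} = $value;     # a later value overwrites an earlier one
}
```

Because the loop stores a single scalar per key, `value2` replaces `value1` under `key1`, and `key3` ends up with an undefined value. Keeping every value per key would need a different structure, such as an array per key.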

If you can't figure out the difference between a key and a value, then it may be impossible, but from what you've given I can't tell. Also, start small: if you can't get the whole solution at once, try a few simple steps. Perhaps you can organize your data, or get it "part way" completed. In that case, you might discover that it's good enough for what you're trying to do.

And as moritz notes, it'd be nice to see what you've tried already.

Replies are listed 'Best First'.
Re^2: Parse a file into a hash
by kazak (Beadle) on Apr 05, 2012 at 20:21 UTC
    Thank you for your reply; yes, it helped a bit. The file I need to parse is a .csv file whose fields must be converted to ACLs. There are two fields: user name and asset name. One user can have one or multiple asset names, but each user name is unique. So I'm trying to use the unique user name as a key and the asset names as the values for that key. One user, John Doe, may have either one asset name: John Doe = { "KJhkh23"} or multiple: John Doe = { "KJhkh23", "0jUfh4631",....."N"}. The file was populated manually and irregularly: the "User Name" field may hold one name, "John Doe", while the "Asset name" field holds a whole column of values. The problem is to treat the first field as a key (John Doe) and the second as a value (KJhkh23), and, if no key can be detected on the next line, to assume that the asset belongs to the last available key (John Doe). I'm new to Perl, so this code may look like a total mess to experienced coders.
    #!/usr/bin/perl -w
    use warnings;
    use strict;

    my @tmp;

    ### Start Configuration
    my $src_file = "Servers.csv";
    ### End Configuration

    my %stack = ();
    my $current_key = undef;
    my $current_value = undef;

    open( SRC, "<", $src_file );
    while (<SRC>) {
        chomp;
        s/#.*//;
        s/\"//g;
        s/\;/\-/;
        push @tmp, $_ if $_ =~ m/^\;/ and next;
        my ($key,$value) = split /\-/, $_;
        push @tmp, $value;
        if (defined $key) {
            $current_key = $key;
            push @tmp, $value;
            $stack{$current_key} = @tmp;
        }
    }
    close(SRC);

      You should probably show us some sample data, so we can tell what "can't detect a key" means. But in general, you're talking about creating a hash of arrays. So for every key/value pair, you'll want to do something like this:

      push @{$stack{$key}}, $value;

      That pushes the value onto the array referenced by the key within the hash. Later, you'll be able to go through them with:

      for my $key (keys %stack){
        for my $value (@{$stack{$key}}){
          # do stuff with $key and $value
        }
      }
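To make the hash-of-arrays idea concrete, here is a small self-contained sketch that builds the structure with `push @{ $stack{$key} }, $value` and then walks it. The key/value pairs are invented for illustration and do not come from the file in question.

```perl
use strict;
use warnings;

# Hypothetical key/value pairs, invented for illustration only.
my @pairs = (
    [ 'alice', 'asset1' ],
    [ 'bob',   'asset2' ],
    [ 'alice', 'asset3' ],
);

my %stack;
for my $pair (@pairs) {
    my ($key, $value) = @$pair;
    push @{ $stack{$key} }, $value;   # the array springs into existence on first push
}

# Walk the structure; keys are sorted only to make the output order stable.
for my $key (sort keys %stack) {
    print "$key: ", join(', ', @{ $stack{$key} }), "\n";
}
# prints:
# alice: asset1, asset3
# bob: asset2
```

Each hash value here is a reference to an array, so one key can hold any number of values without overwriting earlier ones.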

      Aaron B.
      My Woefully Neglected Blog, where I occasionally mention Perl.

        Thanks for your reply. I don't know how to preserve the initial format of the file here, so I'll replace the empty cells of the .csv with null.

        Username; Asset Name

        Corinna Mayer;JKDef4574

        Janek Huska;NAdf8f48g5

        Eric Shtits;JSd5345kPFl



        Erik Fisher;UiO8trve

        As you can see, Eric Shtits owns 3 asset names according to this .csv file, so two fields that are supposed to be keys are empty. I need to implement a solution that can detect that all three asset names belong to one person. Regards, Kazak
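One possible way to combine the two ideas from this thread (remember the last user seen, and push each asset onto that user's array) is sketched below. It rests on several assumptions: fields are separated by `;`, the first non-blank row is a header, and a row with an empty user field looks like `;ASSET` (or has the literal word `null` in the user field). The sub name `parse_assets` and the two `HYPOTHETICAL-ASSET` rows are made up, since the rows with empty user fields did not survive in the sample above.

```perl
use strict;
use warnings;

# Hypothetical helper: reads "user;asset" rows from a filehandle and returns
# a hash reference mapping each user to an array reference of asset names.
# A row whose user field is empty is credited to the most recent user.
sub parse_assets {
    my ($fh) = @_;
    my (%assets, $current_user);
    my $seen_header = 0;
    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r$//;                  # tolerate Windows line endings
        next if $line =~ /^\s*$/;          # skip blank lines
        if (!$seen_header) {               # ASSUMPTION: first non-blank row is a header
            $seen_header = 1;
            next;
        }
        my ($user, $asset) = split /;/, $line, 2;
        for ($user, $asset) {
            $_ = '' unless defined;
            s/^\s+|\s+$//g;                # trim surrounding whitespace
        }
        $user = '' if lc $user eq 'null';  # ASSUMPTION: "null" marks an empty cell
        $current_user = $user if $user ne '';
        next unless defined($current_user) && $asset ne '';
        push @{ $assets{$current_user} }, $asset;
    }
    return \%assets;
}

# Sample input: the rows with an empty user field are invented placeholders.
my $sample = <<'CSV';
Username; Asset Name
Corinna Mayer;JKDef4574
Eric Shtits;JSd5345kPFl
;HYPOTHETICAL-ASSET-2
;HYPOTHETICAL-ASSET-3
Erik Fisher;UiO8trve
CSV

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $assets = parse_assets($fh);
close $fh;

print "$_: @{ $assets->{$_} }\n" for sort keys %$assets;
```

For a real file, the in-memory handle would be replaced by something like `open my $fh, '<', 'Servers.csv' or die ...`. If asset names can themselves contain `;` or quoted fields, a proper CSV parser would be a safer choice than `split`.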
