Re: Help Extract these lines
by hdb (Monsignor) on Jun 17, 2013 at 19:33 UTC
|
For any of the proposed solutions to work, the "id" and "name" entries have to come on alternating lines. Under this assumption, we do not need to bother reading line by line:
use strict;
use warnings;
use Data::Dumper;
{ local $/; $_=<DATA>; }
my %rec = reverse /"([^:]*?)",/g;
print Dumper \%rec;
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
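To make the reverse trick above a little more explicit, here is a minimal sketch that splits it into two steps. The heredoc stands in for the DATA section and the @fields name is only for illustration; it assumes the same strict id/name alternation.

```perl
use strict;
use warnings;

# Sample input in the same shape as the thread's data (assumed to alternate strictly).
my $text = <<'END';
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "97f921-a312-fas2",
"name": "ball",
END

# In list context, //g returns every capture in order: id, name, id, name, ...
my @fields = $text =~ /"([^:]*?)",/g;

# reverse flips the list to name, id, name, id, ... so the names become the keys.
my %rec = reverse @fields;

print "$_ => $rec{$_}\n" for sort keys %rec;
```

If the ids should be the keys instead, dropping the reverse should be enough.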
Re: Help Extract these lines
by Preceptor (Deacon) on Jun 17, 2013 at 18:25 UTC
|
while ( <DATA> ) ...
Takes one line at a time.
print "Line is:", $_, " END\n";
Will give you a hint as to what's going on:
Line is:id: xx-ada-qwebasd,
END
Line is:name: telphone,
END
Line is:id: fasda-asd-123123-fkja123a,
END
Line is:name: car,
END
Line is:id: 97f921-a312-fas2,
END
Line is:name: ball, END
What you need to do is grab two lines at a time, and there are various ways of doing that. Off the top of my head: read a second line inside the loop (if the format is fixed), or use grep. I'll have a think and see if I can come up with something elegant.
|
I thought of using a range operator to grab the two lines. But I'm confused as to why it is not grabbing the two lines.
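A guess at what is happening, assuming the attempt looked something like `if (/"id"/ .. /"name"/)` inside the read loop (the actual code isn't shown here): the range operator only reports whether the current line falls inside the range. It never joins lines, so $_ still holds a single line on each pass. A small sketch, with an array standing in for the filehandle:

```perl
use strict;
use warnings;

# Two lines in the thread's format; the array stands in for reading <DATA>.
my @lines = (
    qq{"id": "xx-ada-qwebasd",\n},
    qq{"name": "telphone",\n},
);

my @seen;
for (@lines) {
    # The flip-flop is true from the "id" line through the "name" line,
    # but each pass of the loop still sees exactly one line in $_.
    if ( /"id"/ .. /"name"/ ) {
        push @seen, $_;
    }
}

# Two separate one-line strings, not one two-line string.
print scalar(@seen), " separate lines\n";
```

So a regex that expects both lines in one string cannot match against $_ here; the lines have to be concatenated first.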
|
#!/usr/bin/perl
use strict;
use warnings;
my %rec;
while ( my $line = <DATA>)
{
$line .= <DATA>;
$line =~ s/\"//g;
$line =~s/,//g;
my ( $id, $name ) = ( $line =~ m/id: (\S+)\nname: (\S+)/mg );
print "$id = $name\n";
$rec{$id} = $name;
}
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
That, I think, does the trick. Basically, it grabs two lines at a time, which isn't ideal if your data structure is more complicated. I suspect there's something more clever you could do to parse the file and pull out the matching patterns, but I think most of those approaches would involve slurping the whole file into a scalar (which may be fine, but can go wrong with large files).
Re: Help Extract these lines
by 2teez (Vicar) on Jun 17, 2013 at 19:04 UTC
|
use warnings;
use strict;
use Data::Dumper;
my $value;
my %data;
while (<DATA>) {
chomp;
$value = $1 if /.id.+\s"(.+?)"/;
if (/.name.+\s"(.+?)"/) {
$data{$1} = $value;
}
}
$Data::Dumper::Pair = ":"; # specify hash key/value separator
print Dumper \%data;
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
Output:
$VAR1 = {
'ball':'97f921-a312-fas2',
'car':'fasda-asd-123123-fkja123a',
'telphone':'xx-ada-qwebasd'
};
If you tell me, I'll forget.
If you show me, I'll remember.
If you involve me, I'll understand.
--- Author unknown to me
Re: Help Extract these lines
by Laurent_R (Canon) on Jun 17, 2013 at 18:52 UTC
|
Maybe something like this (quick proposal, untested):
my %my_hash;
while <DATA> {
chomp;
my $id = (split /:/, $_)[1] if /id/;
if (/name/) {
my $name =~ /: "(\d+)"/;
$my_hash{$name} = $id;
}
}
|
"untested."
Yup, the code in Re: Help Extract these lines doesn't even compile. Syntax error at line 2 (which, in turn, produces a spurious error about closing curly braces).
If you didn't program your executable by toggling in binary, it wasn't really programming!
|
Alright, I just typed an approach to a solution directly on the web page, not having access to a Perl compiler when I did it (and I said that I could not test it). What I typed does not compile, but it takes less than 5 seconds to find out why and to correct it. There was one compile error that is really easy to find (the missing parens after the while keyword) and one other small, stupid mistake that is just as easy to find and correct. With both fixed, it basically works:
use Data::Dumper;
my %my_hash;
my $id;
while (<DATA>) {
chomp; s/\r//g;
$id = (split /:/, $_)[1] if /id/;
if (/name/) {
my ($name) = /: "(\w+)"/; # list-context match; avoids "my ... if", whose behaviour is undefined
$my_hash{$name} = $id;
}
}
print Dumper \%my_hash;
__DATA__
"id": "xx-ada-qwebasd",
"name": "telphone",
"id": "fasda-asd-123123-fkja123a",
"name": "car",
"id": "97f921-a312-fas2",
"name": "ball",
which prints:
$ perl 2_lines.pl
$VAR1 = {
'ball' => ' "97f921-a312-fas2",',
'car' => ' "fasda-asd-123123-fkja123a",',
'telphone' => ' "xx-ada-qwebasd",'
};
I admit the data still needs a bit more cleaning, but that's really boilerplate code, and I think that this untested code basically pointed well toward the solution.
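For what it's worth, the remaining cleanup could look something like the sketch below. The clean_value helper is a made-up name for illustration, and it assumes each value sits between one pair of double quotes, as in the sample data.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between the first pair of double quotes,
# which drops the leading space, the quotes and the trailing comma in one go.
sub clean_value {
    my ($raw) = @_;
    my ($value) = $raw =~ /"([^"]*)"/;
    return $value;    # undef if no quoted value was found
}

my $id = clean_value(' "97f921-a312-fas2",');
print "$id\n";
```

In the loop above, that would mean storing clean_value($id) rather than the raw split result.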
Re: Help Extract these lines
by ww (Archbishop) on Jun 17, 2013 at 21:37 UTC
|
Perhaps worth noting: most of the proposed solutions above do NOT extend easily or gracefully if the sample data is just one element of a multi-record set.
To me (YMMV), that raises the question: "Is the data stored in a reasonable fashion?" (If so, it's easy enough to extend by initially storing the data in arrays, and then using an AoA or HoA, i.e. an array of arrays or a hash of arrays.)
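One possible reading of that suggestion, as a rough sketch only: collect the ids in a hash of arrays keyed by name, so that a name which shows up in more than one record keeps all of its ids. The third record and the %ids_by_name name are invented for illustration and are not part of the sample data.

```perl
use strict;
use warnings;

# Assumed input: id/name pairs in which a name may repeat across records.
# The "0000-hypothetical" record is made up to show a repeated name.
my @lines = (
    '"id": "xx-ada-qwebasd",',
    '"name": "telphone",',
    '"id": "97f921-a312-fas2",',
    '"name": "ball",',
    '"id": "0000-hypothetical",',
    '"name": "ball",',
);

my ( %ids_by_name, $id );
for (@lines) {
    # Remember the most recent id, then file it under the name that follows.
    if    ( /"id":\s*"([^"]*)"/ )   { $id = $1 }
    elsif ( /"name":\s*"([^"]*)"/ ) { push @{ $ids_by_name{$1} }, $id }
}

print "$_: @{ $ids_by_name{$_} }\n" for sort keys %ids_by_name;
```

A plain name => id hash would silently keep only the last id for "ball"; the hash of arrays keeps both.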
If you didn't program your executable by toggling in binary, it wasn't really programming!