Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re^4: Relational table with perl DBI

by Neighbour (Friar)
on Mar 13, 2013 at 12:30 UTC ( #1023184=note: print w/ replies, xml ) Need Help??


in reply to Re^3: Relational table with perl DBI
in thread Relational table with perl DBI

First off, very nice how you improved on the insert query, for example when building the $field_placeholders.

In order to get the relations properly, you'll learn about another nifty feature of perl: hashes. A hash is an unsorted list of key-value pairs.
We're going to loop through the files, accumulating the data per file in $hr_output and accumulating all per-file data in $ar_data.
The prefixes hr_ and ar_ are not required by perl, but they help me to distinguish between references to hashes (hashref, or hr) and references to arrays (arrayref, or ar).

my $ar_data = []; #Opening, reading, and extracting column content of each concord.n.txt + file foreach my $output_concord_file (@output_concord_files){ #Note : for the concord file, no processing implied -> stored by d +efault in $output_concord_file open (my $fh, '<:utf8', "$output_concord_file") || die "Couldn't o +pen $output_concord_file : $!"; my $hr_output = {}; $hr_output->{concord_file} = $output_concord_file; while (<$fh>){ if ($_ =~ /=\[=(.*)=\]=/){ $hr_output->{url} = $1; } if ($_ =~ /=\[\+(.*)\+\]=/){ $hr_output->{sys_time} = $1; } if ($_ =~ /=E-(.*)=event/){ push(@{$hr_output->{events}}, $1); } } push(@{$ar_data}, $hr_output); close ($fh); }
I'm not quite sure what you intended to do to those html-pages, and i'm not sure that what's happening is what you really want, so I'm just going to skip that :P
Next up is inserting all accumulated data:
if (@{$ar_data}) { # If there is any data, prepare the query my @fields = keys %{$ar_data->[0]}; # Extract all the keys from th +e first element of ar_data my $fieldlist = join(',', @fields); my $field_placeholders = join(',', map {'?'} @fields); my $sth_insert_article = $dbh->prepare( qq( INSERT INTO article ($fieldlist) VALUES ($field_placeholders) )); my $sth_select_last_insert_id = $dbh->prepare( qq( SELECT LAST_INS +ERT_ID )); my $sth_insert_event = $dbh->prepare( qq( INSERT INTO event (event) VALUES (?) )); my $sth_insert_article_event = $dbh->prepare( qq( INSERT INTO article_event_index (id_article, id_event) VALUES +(?,?) )); # Now we loop over and insert all data foreach my $hr_output (@{$ar_data}) { # Insert article data my $inserted_records = $sth_insert_article->execute(@{$hr_outp +ut}{@fields}); # Ok, I admit, this one is really tricky to read :) if ($inserted_records != 1) { die "Error inserting article [$h +r_output->{url}], only [$inserted_records] got inserted: " . $sth_ins +ert_article->errstr; } # Get article ID my $article_id = ($dbh->selectcol_arrayref($sth_select_last_in +sert_id))->[0]; # Insert events foreach my $event (@{$hr_output->{events}}) { $inserted_records = $sth_insert_event->execute($event); if ($inserted_records != 1) { die "Error inserting event [ +$event], only [$inserted_records] got inserted: " . $sth_insert_event +->errstr; } # Get event ID my $event_id = ($dbh->selectcol_arrayref($sth_select_last_ +insert_id))->[0]; # Insert article_event combo $inserted_records = $sth_insert_article_event->execute($ar +ticle_id, $event_id); if ($inserted_records != 1) { die "Error inserting article +-event [$article_id, $event_id], only [$inserted_records] got inserte +d: " . $sth_insert_article_event->errstr; } } } }
This follows the loop I described in my previous post. There are a few tricks here, which I'll leave for you to figure out (you can still ask though :)).
The most unreadable trick is @{$hr_output}{@fields} which produces an array of the value-parts of the hashref for all @fields, in order of those @fields.


Comment on Re^4: Relational table with perl DBI
Select or Download Code
Replies are listed 'Best First'.
Re^5: Relational table with perl DBI
by M15U (Acolyte) on Mar 13, 2013 at 13:55 UTC

    Thank you very much for your answer. I though that at some point I'll have to use hashes, but I was always afraid of them. I understand your code and I tried to incorporate it in my own. I've already seen on the web the @{$hr_output}{@fields} syntax, though in this context it's the only time when I get it.

    However I have a small program in the code. I get :

    "DBD::mysql::st execute failed: Unknown column 'event' in 'field list' at db2.pl line 289. DBD::mysql::db selectcol_arrayref failed: Unknown column 'LAST_INSERT_ID' in 'field list' at db2.pl line 292. Can't use an undefined value as an ARRAY reference at db2.pl line 292. "

    I don't really see where do the script bug.

    And also the first "if" error managment loop gives me :

    "Use of uninitialized value $inserted_records in numeric ne (!=) at db2.pl line 290. Use of uninitialized value $inserted_records in concatenation (.) or string at db2.pl line 290. Error inserting article http://rss.feedsportal.com/c/32788/f/524037/s/29456ceb/l/0Llci0Btf10Bfr0Cfrance0Cfaits0Edivers0Chautes0Epyrenees0Eemportes0Epar0Eune0Eavalanche0Eils0Es0Een0Esortent0Epresque0E78664640Bhtml/story01.htm, only [] got inserted: Unknown column 'event' in 'field list' at db2.pl line 290."

    Though thank you very much again.

      I think I know where the problem is. When I print "@fields" if get : concord_file sys_time url event

      The thing is I have two tables:

      article : id_article url html_extr concord_file sys_time

      event : id_event event

      I don't really know how to reorder this two fields

        Now I really got it. Using the placeholders method for filling the tables requires the precise order of the column. Using a hash means that all data is scrambled

        Plus, in this aproach the 'event' column is in the same table as the article.

        Is there a away to reorder a hash index following a specific pattern ? how can I get the 'event' column to be treate separetly of the article table ?

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1023184]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (13)
As of 2015-07-29 21:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (269 votes), past polls