jdlev has asked for the wisdom of the Perl Monks concerning the following question:
Hi Monks!
I'm trying to do something pretty simple, but haven't worked with Perl in a while so was looking for some recommendations.
Basically, I have a string that looks like this:
"QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A.J. Green WR John Brown WR Davante Adams TE Martellus Bennett FLEX Jeremy Hill DST Panthers"
Then I have a database with columns QB, RB1, RB2, WR1, WR2, WR3, TE, FLEX, DST.
What methods would you guys use to insert the data into a database? Would you split each string using the split function, regex it, load it into an array, or use an entirely different method and then submit everything using DBI?
TIA :)
I love it when a program comes together - jdhannibal
Re: Recommendations for breaking up string?
by GotToBTru (Prior) on Sep 16, 2015 at 18:58 UTC
|
See this post for an idea that could be adapted to this. It's a simple approach to parse the line into a hash in position => name form. In fact, your application would be simpler. The hash could be used to create the DML.
Update: maybe not simpler - the repetition of positions will cause my approach problems!
Update 2:
use strict;
use warnings;
use Data::Dumper;
my $string = "QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A.J. Gr
+een WR John Brown WR Davante Adams TE Martellus Bennett FLEX Jeremy H
+ill DST Panthers";
my @words = split ' ',$string;
my $i = 0;
while ($i <= $#words) {
if ($words[$i] =~ m/^[A-Z]{2,4}$/) {
$words[$i++] .= ' ->';
} else {
$words[$i-1] .= ' ' . splice @words,$i,1
}
}
print Dumper(\@words);
Output:
$VAR1 = [
'QB -> Carson Palmer',
'RB -> Chris Ivory',
'RB -> Eddie Lacy',
'WR -> A.J. Green',
'WR -> John Brown',
'WR -> Davante Adams',
'TE -> Martellus Bennett',
'FLEX -> Jeremy Hill',
'DST -> Panthers'
];
| [reply] [d/l] [select] |
Re: Recommendations for breaking up string? (sexeger)
by jeffa (Bishop) on Sep 16, 2015 at 23:57 UTC
|
Fun problem! Here is my take. The bulk of the work is done via a regex, and since the target delimiters appear first, reversing the string makes the matching easier. I do not recommend this solution for production, but rather as a learning experience. It assumes the roles are uppercase characters between 2 and 4 chars in length.
use strict;
use warnings;
my $str = scalar reverse "QB Carson Palmer RB Chris Ivory RB Eddie Lac
+y WR A.J. Green WR John Brown WR Davante Adams TE Martellus Bennett F
+LEX Jeremy Hill DST Panthers";
for ($str =~ /(.*?[A-Z][A-Z][A-Z]?[A-Z]?)/g) {
$_ = scalar reverse $_;
chomp( my ($role,@name) = split );
print "$role -> @name\n";
}
__DATA__
DST -> Panthers
FLEX -> Jeremy Hill
TE -> Martellus Bennett
WR -> Davante Adams
WR -> John Brown
WR -> A.J. Green
RB -> Eddie Lacy
RB -> Chris Ivory
QB -> Carson Palmer
jeffa
L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)
| [reply] [d/l] |
Re: Recommendations for breaking up string?
by trippledubs (Deacon) on Sep 16, 2015 at 22:24 UTC
|
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dump;
my @positions = qw/QB RB WR WR WR TE FLEX DST/;
my $str = 'QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A.J. Green
+ WR John
Brown WR Davante Adams TE Martellus Bennett FLEX Jeremy Hill DST Panth
+ers';
my $ds;
my $currentPosition;
for my $word (split /\s+/,$str) {
if (my ($position) = grep { $word eq $_ } @positions) {
$currentPosition = $position;
push @{$ds->{$currentPosition}}, [];
} else {
push @{$ds->{$currentPosition}[-1]}, $word;
}
}
dd $ds;
{
DST => [["Panthers"]],
FLEX => [["Jeremy", "Hill"]],
QB => [["Carson", "Palmer"]],
RB => [["Chris", "Ivory"], ["Eddie", "Lacy"]],
TE => [["Martellus", "Bennett"]],
WR => [["A.J.", "Green"], ["John", "Brown"], ["Davante", "Adams"]]
+,
}
| [reply] [d/l] |
Re: Recommendations for breaking up string?
by Anonymous Monk on Sep 16, 2015 at 20:34 UTC
|
I changed the DB schema - suggest you rethink yours. Anytime there are column names FOO1 FOO2 FOO3 that's a red flag of an awkward schema, however SQL Schema theory is beyond the scope of this reply.
#!/usr/bin/perl
# http://perlmonks.org/?node_id=1142225
use strict;
use warnings;
use DBI;
$_ = 'QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A.J. Green WR J
+ohn Brown WR Davante Adams TE Martellus Bennett FLEX Jeremy Hill DST
+Panthers';
my $dbfilename = 'db.1142225';
unlink $dbfilename;
my $db = DBI->connect_cached("dbi:SQLite(RaiseError=>1):$dbfilename");
$db->do('create table players (team text, position text, player text)'
+);
my %dups = /\b([A-Z]{2,})\b(?=.*\b(\1)\b)/g; # insert counts if needed
for my $position (keys %dups)
{
my $n = 1;
s/\b$position\K\b/$n++/ge;
}
# make hash
my %db = /\b([A-Z]{2,}\d*) (.*?)(?= [A-Z]{2}|$)/g;
#use Data::Dump 'pp'; pp \%db;
# remove team (this is a guess)
my $team = delete $db{'DST'} or die "no team";
for my $pos (keys %db)
{
$db->do('insert into players values (?, ?, ?)', {}, $team, $pos, $db
+{$pos});
}
Here's the DB as dumped by sqlite db.1142225 .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE players (team text, position text, player text);
INSERT INTO "players" VALUES('Panthers','QB','Carson Palmer');
INSERT INTO "players" VALUES('Panthers','TE','Martellus Bennett');
INSERT INTO "players" VALUES('Panthers','WR2','John Brown');
INSERT INTO "players" VALUES('Panthers','RB1','Chris Ivory');
INSERT INTO "players" VALUES('Panthers','FLEX','Jeremy Hill');
INSERT INTO "players" VALUES('Panthers','WR3','Davante Adams');
INSERT INTO "players" VALUES('Panthers','WR1','A.J. Green');
INSERT INTO "players" VALUES('Panthers','RB2','Eddie Lacy');
COMMIT;
DB Hint: come up with the several ways you will query the DB and that might help your DB schema (sample query: "get all RB". This is
slightly awkward with this schema, how is it with the RB1 RB2 column name schema? )
| [reply] [d/l] [select] |
Re: Recommendations for breaking up string?
by poj (Abbot) on Sep 17, 2015 at 07:53 UTC
|
#!perl
use strict;
use DBI;
use Data::Dump 'pp';
my $dbh = get_dbh(); # connect
# field names
my @fname = qw(QB RB1 RB2 WR1 WR2 WR3 TE FLEX DST);
my $fld = join ',',@fname;
# placeholders
my $ph = '?'.(',?' x $#fname);
# build regex from field names without numbers
my $re = join '\s*(.*?)\s*',(map { s/\d// ; $_}@fname),'' ;
# prepare
my $sql = "INSERT INTO test_table (ID,$fld)
VALUES (NULL, $ph)";
my $sth = $dbh->prepare($sql);
# input data
while (<DATA>){
if ( my @f = /$re$/ ){
$sth->execute(@f);
}
}
# print results
pp $dbh->selectall_arrayref('SELECT * FROM test_table');
# connect / create test table
sub get_dbh{
my $database = "test"; my $user = "user"; my $pw = "password";
my $dsn = "dbi:mysql:$database:localhost:3306";
my $dbh = DBI->connect($dsn, $user, $pw,
{ RaiseError=>1, AutoCommit=>1 } );
#$dbh->do('drop table test_table');
$dbh->do("
CREATE TABLE IF NOT EXISTS test_table (
ID int(11) NOT NULL AUTO_INCREMENT,
QB varchar(45) DEFAULT NULL, RB1 varchar(45) DEFAULT NULL,
RB2 varchar(45) DEFAULT NULL, WR1 varchar(45) DEFAULT NULL,
WR2 varchar(45) DEFAULT NULL, WR3 varchar(45) DEFAULT NULL,
TE varchar(45) DEFAULT NULL, FLEX varchar(45) DEFAULT NULL,
DST varchar(45) DEFAULT NULL, UNIQUE KEY ID_UNIQUE (ID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;");
#$dbh->do('delete from test_table');
return $dbh;
}
__DATA__
QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A.J. Green WR John Br
+own WR Davante Adams TE Martellus Bennett FLEX Jeremy Hill DST Panthe
+rs
poj | [reply] [d/l] |
Re: Recommendations for breaking up string?
by sundialsvc4 (Abbot) on Sep 16, 2015 at 19:49 UTC
|
use strict;
use warnings;
my $str = "QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A.J. Green
+ WR John Brown WR Davante Adams TE Martellus Bennett FLEX Jeremy Hill
+ DST Panthers";
my $key = undef;
my $result = {};
for my $wd (split(/\b(QB|RB|WR|TE|FLEX|DST)\b/,$str)) {
# As writ, "split()" will return the key followed by the string
# for that key. We abuse "defined($key)" to separate the two.
# Probably awful way to do it but it works.
# '$result' has a hash-bucket for each key and an array-of-strings
# for the content of the bucket. Perl's "auto-vivification"
# makes this easy, if cryptic, to write. (If a bucket does not yet
# exist, it magically appears.)
# The following statement is a further smelly hack.
# (The first string returned is empty.)
next if $wd eq "";
print "wd $wd\n";
if (defined($key)) {
push @{$result->{$key}}, $wd;
$key = undef;
}
else {
$key = $wd;
}
}
foreach my $k (keys $result) {
print "$k is: " . join(", ", @{$result->{$k}}) . "\n";
}
It does not update a database, of course, and you will also notice that it stacks any number of (say ...) RB entries into a list, instead of distinguishing between RB1 and RB2 (or assuming there are only two). Probably the only saving grace of it, if any, is the use of split with a particular regular-expression.
| [reply] [d/l] |
|
c:\@Work\Perl>perl -wMstrict -MData::Dump -le
"my $str = 'QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A. J. Gre
+en '
. 'WR John Brown WR WR Grace WR Pele TE Billy Bob Bennett '
. 'FLEX A.P. Hill DST Panthers';
print qq{[[$str]]};
;;
my $result = {};
;;
my $position = qr{ \b (?: QB | RB | WR | TE | FLEX | DST) \b }xms;
my $player = qr{ \S+ (?: \s+ \S+)*? }xms;
;;
while ($str =~ m{ ($position) \s+ ($player) (?= \s+ $position | \z) }
+xmsg) {
;;
my ($posn, $name) = ($1, $2);
;;
push @{$result->{$posn}}, $name;
;;
}
;;
dd $result;
"
[[QB Carson Palmer RB Chris Ivory RB Eddie Lacy WR A. J. Green WR John
+ Brown WR WR Grace WR Pele TE Billy Bob Bennett FLEX A.P. Hill
+DST Panthers]]
{
DST => ["Panthers"],
FLEX => ["A.P. Hill"],
QB => ["Carson Palmer"],
RB => ["Chris Ivory", "Eddie Lacy"],
TE => ["Billy Bob Bennett"],
WR => ["A. J. Green", "John Brown", "WR Grace", "Pele"],
}
The $player regex could be refined to make it more discriminative of human names, but that's always tricky.
Give a man a fish: <%-{-{-{-<
| [reply] [d/l] [select] |
|
|