http://www.perlmonks.org?node_id=417584

This is a script that will import data into a Access-Database, using Active States "PerlScript"-Extension, along with a part in VBScript. This is very fast, at least faster than using win32::odbc or even win32::ole working directly with access-objects).

it is also a nice example for mixing script-languages for the MS-Scripting Host.

Installation:

1) Copy the source to a file file (eg. txt2mdb.wsf) and add this files` directory to %PATH (the windows system path, not the hash %PATH!)

2) Create 2 subdirs in that directory named XDS and MDB.

  • XDS is the dir where the recordSpecs go into
  • MDB will hold empty databases

    Invocation:

    txt2mdb.wsf <recordSpec> <[new[:template]:]Database> <Table> file1 file2 file3
    #Simple Invocation, the database exists: txt2mdb.wsf XXSUM.XDS c:\temp\xxum.mdb "EK" c:\satnt\xxsum.e #simply the same: txt2mdb.wsf XXSUM c:\temp\xxum.mdb EK c:\satnt\xxsum.e #Create the Database from an empty one <recordSpec>.mdb: txt2mdb.wsf XXSUM new:c:\temp\xxum.mdb EK c:\satnt\xxsum.e #if the name of the empty database differs from the recordSpec: txt2mdb.wsf OLAP_1 neu:OLAP:c:\temp\xxum.mdb O1 c:\satnt\olap.txt

    Searching

    The script will search the subdirs "mdb" and "xds" (for every member of %PATH) for appropriate files. The first will be used.

    record Specifications

    The script supports three kinds of specs

  • recordSpecs for CSV-files
  • recordSpecs for line-based files with fixed length
  • special recordSpecs using perl code

    Example for CSV-files

    Input-File: ------------- 730112200;12345;XXX;F999087 recordSpec: ----------------- <?xml version="1.0" encoding="ISO-8859-1"?> <record kind="csv" separator=";"> <ignore regex="(^$|^[ \t]+$|^[ \t]*#)/" /> <field name="ANR" > $_ =~ s/00$//; </Field> <field name="VKNR" /> <filler > <field name="fbpos" /> </datensatz>

  • Fields will be separated at ";"
  • Lines that match the regex in the ignore-tags will be ignored.
  • Fields must be listed in the same order they have in the table.
  • If there is perl-code in the field-tag, this code will be evaluated for every record.
  • The actual value of the field is $_, it is possible to change that value.

    Example for fixed length files

    Inputfile: ------------- #Arztnr;VKNR;fbpos 730112200---01234----F999087 recordSpec: ----------------- <?xml version="1.0" encoding="ISO-8859-1"?> <record kind="csv" separator=";"> <ignore regex="(^$|^[ \t]+$|^[ \t]*#)/" /> <field name="Arztnr" pack="A7" /> <filler pack="X5" /> <field name="VKNR" pack="A5" /> <filler pack="X4" /> <field name="fbpos" pack="A7" /> </datensatz>

  • Lines that match the regex in the ignore-tags will be ignored.
  • Fields must be listed in the same order they have in the table.
  • The length of the field will be determined by the pack-attribute.
  • If there is perl-code in the field-tag, this code will be evaluated for every record.
  • The actual value of the field is $_, it is possible to change that value.

    Pack-Atribute:

  • A - normal Ascii-char
  • X - Ignore (for Filler)
  • x - Step back

    Special Example

    Inputfile: ------------- ... ... Arzt: 73011220 ... ... K 012345\tF999087 ... recordSpec: ----------------- <?xml version="1.0" encoding="ISO-8859-1"?> <record kind="special"> <special> <![CDATA[ # remember Arztnr if ( /^Arzt: ([0-9]{7})/ ) { $satz{"ArztNr"}=$1; } # record is fully read if ( /^K ([0-9]{5})\t([A-Z][0-9]{6})/ ) { # remember VKNR and fbpos $satz{"VKNR"} = $1; $satz{"fbpos"} = $2; # write record &update(); # Important!: reset data for next record %record=(); } ]]> </special> </record>
  • The code in the special-tag will be packed into a Function, that is called for every line of the inputfile.
  • The Hash %record is defined outside the function, so it is there to store values for the record.
  • calling &update() writes the values to the databes (keys are the field-names of the database)

    Caveats:

  • Currently the CSV-lines are simply splitted at the separator. What would you prefer: Text::CSV, regex, ...?
  • No support for "real" fixed length, records (will come if there is somebody interested. i didnīt need it so i didnīt implement it.)

    Finally, here is code:
    <job id="PERLandVBS"> <script language="VBScript"> dim Fields() dim Constants() dim Constants_init dim Constants_values() dim jet dim db dim rs function openDatabase (Database, Table) on error resume next set jet = CreateObject ("DAO.DBEngine.36") if Err.Description <> "" then openDatabase = Err.Descr +iption: exit function set db = jet.OpenDatabase(Database) if Err.Description <> "" then openDatabase = Err.Descr +iption: exit function set rs = db.OpenRecordset(Table) if Err.Description <> "" then openDatabase = Err.Descr +iption: exit function openDatabase = 1 end function function closeDatabase () on error resume next rs.Close if Err.Description <> "" then closeDatabase = Err.Desc +ription: exit function db.Close if Err.Description <> "" then closeDatabase = Err.Desc +ription: exit function closeDatabase = 1 end function sub rememberFields (Names) redim preserve Fields(Ubound(Names)) for i=0 to Ubound(Names) Fields(i) = Names(i) next end sub sub rememberConstant (name, wert) 'WScript.Echo ("rememberConstant " & name & " -> " & wert) ' if ( Constants_init ) then redim preserve Constants(Ubound(Constants)) redim preserve Constants_values(Ubound(Constants)) else redim Constants (0) redim Constants_values(0) Constants_init = 1 end if Constants(Ubound(Constants)) = name Constants_values(Ubound(Constants)) = wert end sub function importFields (values) on error resume next dim i rs.AddNew if Err.Description <> "" then importFields = Err.Descripti +on: exit function dim Fieldname for i=0 to UBound(values) Fieldname = Fields(i) if values(i) <> "" then rs.fields(Fieldname).value = values(i) if Err.Description <> "" then importFields = Err.D +escription & " (" & Fields(i) & ": " & values(i) & ")" : exit functio +n end if next if Constants_init then for i=0 to UBound(Constants) if Constants(i) <> "" then rs.fields(Constants(i)).value = Constants_valu +es(i) if Err.Description <> "" then importFields = E +rr.Description & " (" & Fields(i) & ": " & values(i) & ")" : exit fun +ction end if next end if rs.Update if Err.Description <> "" then importFields = Err.Descripti +on: exit function importFields = 1 end function </script> <script language="PerlScript"> use strict; use XML::Parser; use Unicode::String qw ( utf8 ); use Data::Dumper; DESTROY {} our $VERSION = '0.3'; wshprint ("TXT2DB Version $VERSION\n"); use Win32::API; use File::Copy; Win32::API->Import( "winmm.dll", "LONG timeGetTime()" ); my @PATH; for ( keys %ENV ) { if ( /^path$/i ) { @PATH = split ";", $ENV{$_}; last; } } wshdie ( "Too less arguments:\n" . "invocation: cscript.exe txt2db.wsf <recordSpec> <[new[:te +mplate]:]Database> <Table> <file1..n>\n" ) if $main::WScript->Arguments->Count < 3; my $recordSpec = $main::WScript->Arguments(0); wshdie ("No recordSpec specified!\n") unless defined $recordSp +ec; my $frecordSpec = findxds ( $recordSpec ); my $Database = $main::WScript->Arguments(1); wshdie ("No Database specified!\n") unless defined $Database; $Database = finddb ( $Database ); my $Table = $main::WScript->Arguments(2); wshdie ("No Table specified!\n") unless defined $Table; my @files; my $packstr; for (my $i=3; $i<$main::WScript->Arguments->Count;$i++) { push @files, $main::WScript->Arguments($i); } wshdie ("No files specified for import !\n") unless @files; my $kind; my $separator; my @Fields; my @Constants; my @ignore; my $ret; my $lines_file = 0; my $records_file = 0; my $saetze_gesamt = 0; my $specialcode = ""; my %satz; my $firstupdate = 1; my $startzeit; my $laufzeit; my $timeLeft; my $pzeit; my $bytes_gesamt = 0; my $bytes_gelesen = 0; &event ("read recordSpec"); read_recordSpec(); for my $k ( @Constants ) { my @dup = grep { (not defined $_->{'filler'}) && ($_->{'na +me'} eq $k->{'name'}) } @Fields; if ( scalar @dup > 0 ) { wshdie ("Constant '$k->{'name'}' already declared as F +ield!\n"); } &rememberConstant ($k->{name}, $k->{wert}); } if ( $specialcode ) { $specialcode .= "\n}"; eval $specialcode; } wshdie ("Error compiling <special>-code: $@") if $@; &event ("open Database <$Database>, Table: <$Table>"); $ret = openDatabase($Database, $Table); wshdie ("Error opening Database: $ret\n") unless $ret == 1; rememberFields ( [ map { $_->{name} } grep { not defined $_->{ +'filler'} } @Fields ] ); &event ("starting Import..."); $startzeit = timeGetTime() / 1000; import(); &event ("close Database"); $ret = closeDatabase(); wshdie ("Error closing the Database: $ret\n") unless $ret == 1 +; &event ("end Import"); sub read_recordSpec { my $parser = new XML::Parser (Style=>'Subs',Handlers => {C +har=>\&code}); $parser->parsefile ($frecordSpec); if ( $kind eq "fix" ) { wshdie ("Error reading recordSpec: specified \"separat +or\" along with kind=\"fix\"!\n") if $separator; wshdie ("Error reading recordSpec: No fields specified +!\n") unless @Fields; } elsif ( $kind eq "csv" ) { wshdie ("Error reading recordSpec: No \"separator\" sp +ecified!\n") unless $separator; wshdie ("Error reading recordSpec: no Fields specified +!\n") unless @Fields; } elsif ( $kind eq "special" ) { } else { wshdie ("Error reading recordSpec: Unknown value for f +ür \"kind\"!\n"); } } sub field { my $expat = shift; my $Field = utf8(shift)->latin1; my %attribute = map { utf8($_)->latin1 } @_; wshdie ("Error reading recordSpec: No fieldname specified! +\n") unless defined $attribute{'name'}; if ( $kind eq "fix" ) { wshdie ("Error reading recordSpec: No Attribute \"pack +\" specified ($attribute{'name'})!\n") unless defined $attribute{'pac +k'}; } $packstr.= $attribute{'pack'}; push @Fields, \%attribute; } sub filler { my $expat = shift; my $Field = utf8(shift)->latin1; my %attribute = map { utf8($_)->latin1 } @_; if ( $kind eq "fix" ) { wshdie ("Error reading recordSpec: No Attribute \"pack +\" specified ($attribute{'name'})!\n") unless defined $attribute{'pac +k'}; $packstr.= $attribute{'pack'}; } if ( $kind eq "csv" ) { push @Fields, { filler => 1 }; } } sub record { my $expat = shift; my $Field = utf8(shift)->latin1; my %attribute = map { utf8($_)->latin1 } @_; $kind = lc($attribute{kind}) if defined $attri +bute{kind}; $separator = qr/$attribute{separator}/ if defined $attribu +te{separator}; } sub ignore { my $expat = shift; my $Field = utf8(shift)->latin1; my %attribute = map { utf8($_)->latin1 } @_; my $re = qr/$attribute{regex}/; push @ignore, $re; } sub code { my $expat = shift; my $text = utf8(shift)->latin1; return if $text =~ /^$/; return if $text =~ /^[ \t\n]+$/ms; my @context = $expat->context; if ( $context[-1] =~ /^special$/i ) { &specialc ($text); } else { $Fields[-1]->{code} .= $text; } } sub specialc { my $text = shift; $specialcode = "\$specialcode = sub { \$_ = \$_[0]; " unle +ss $specialcode; $specialcode .= "$text"; } sub constant { my $expat = shift; shift; my %attribute = map { utf8($_)->latin1 } @_; wshdie ("Error reading recordSpec: No Attribute \"name\" s +pecified!\n") unless defined $attribute{'name'}; wshdie ("Error reading recordSpec: No Attribute \"value\" +specified ($attribute{'name'})!\n") unless defined $attribute{'wert'} +; my $ts = $main::WScript->StdIn; if ( $attribute{'wert'} =~ /!([^!]+)!$/ ) { my $a = ""; while ( $a eq "" ) { wshprint ("Enter constant value ($attribute{'name' +}):"); $a = $ts->ReadLine(); chomp $a; if ( $attribute{"match"} ) { $a = "" unless $a =~ m/$attribute{"match"}/; } } $attribute{'wert'} = $a; } push @Constants, \%attribute; } sub import { $saetze_gesamt = 0; for my $Field ( @Fields ) { if ( $Field->{'code'} ) { my $code = "\$Field->{'code'}=sub { \$_=\$_[0]; $F +ield->{code}; return \$_; }"; eval $code; wshdie ("Error reading recordSpec: Error in Userco +de: $@\n") if $@; } } for ( @files ) { for ( glob ($_) ) { $bytes_gesamt += (stat $_)[7]; } } for ( @files ) { $records_file = 0; $lines_file = 0; for ( glob ($_) ) { import_file ($_); } } } my $i; sub import_file { my $file = shift; $pzeit = timeGetTime()/1000; %satz = (); &event ("open file <$file>"); open IN, $file or wshdie ("Import-Error: Cannot open <$fil +e>!\n"); while ( my $line = <IN> ) { $lines_file++; $bytes_gelesen += length($line); chomp ($line); my $no = 0; for ( @ignore ) { $no = 1, next if $line =~ $_; } next if $no; my @f; if ( $kind eq "fest" ) { $records_file++; $saetze_gesamt++; eval { @f = unpack ($packstr, $line); }; wshdie ("Error using unpack: $@") if $@; } elsif ( $kind eq "special" ) { chomp $line; my $s = &$specialcode ($line) || 0; $records_file+=$s; $saetze_gesamt+=$s; } else { $records_file++; $saetze_gesamt++; @f = split ($separator, $line); } if ( timeGetTime() / 1000 > int($pzeit) + 5 ) { $laufzeit = (timeGetTime() / 1000) - $startzeit; $timeLeft = (($bytes_gesamt/$bytes_gelesen) * $lau +fzeit) - $laufzeit; &event ( sprintf ("%.2f", ($bytes_gelesen/$bytes_g +esamt) * 100), "% imported, (", ft (int($timeLeft)), " minutes left). +"); $pzeit = timeGetTime()/1000; } next if $kind eq "special"; my $j = 0; for my $Field ( @Fields ) { my $sub = $Field->{'code'}; $f[$j] = &$sub ($f[$j]) if $sub; if ( defined $Field->{filler} ) { $f[$j] = undef; } $j++; } my $r = importFields ([grep { defined $_ } @f]); wshdie ("Error writing record: $r in line $lines_file\ +n") unless $r == 1; } &event ("close file <$file>, $records_file records importe +d."); close IN; } sub update { my @f; if ( $firstupdate ) { $firstupdate = 0; my $f = [ sort keys %satz ]; rememberFields ($f); } for ( sort keys %satz ) { push @f, $satz{$_}; } my $r = importFields (\@f); wshdie ("Error writing record: $r in line $lines_file\n") +unless $r == 1; } sub status { my $laufzeit; my $percent; return (timeLeft => $timeLeft, percent => $percent); } sub findxds { my $recordSpec = shift; $recordSpec =~ s/\.xds$//i; my $e = ""; for ( "./", @PATH) { $e = "$_/xds/$recordSpec.xds" if -f "$_/xds/$recordSpe +c.xds" && -e "$_/xds/$recordSpec.xds"; } wshdie ("recordSpec '$recordSpec' not found!\n") unless $e + ; return $e; } sub finddb { my $Database = shift; my $sb; my $e; if ( $Database =~ /^new:([^:]+:)?(.+)$/ ) { if ( $1 ) { $sb = $1; $Database = $2; $sb =~ s/:$//; } else { $sb = $recordSpec; $Database = $2; } for ( "./", @PATH) { if ( -f "$_/mdb/$sb.mdb" && -e "$_/mdb/$sb.mdb" ) { if ( -e "$Database" ) { wshdie ("Database \"$Database\" existiert +bereits!\n"); } else { copy "$_/mdb/$sb.mdb", "$Database"; last; } } } } wshdie ("Database '$Database' not found!\n") unless -e $Da +tabase; return $Database; } sub wshdie { my $message = join "", @_; unless ( $message =~ /\n$/ms ) { my @caller = caller; $message .= " in Module $caller[0] line $caller[2]!"; } wshprint ($message); $main::WScript->Quit (-1); exit; } sub wshprint { $main::WScript->Echo ( join ("", @_) ); } sub ft { my $t = $_[0]; my $m = int($t / 60); my $s = $t-($m*60); return sprintf("%2d", $m) . ":" . sprintf("%2d", $s); } sub event { my @t = localtime; wshprint (sprintf("%02d", $t[2]) . ":" . sprintf("%02d", +$t[1]) . ":" . sprintf("%02d", $t[0]) . " ". join ("", @_) ); }