If you can extract the strings you describe, you can use them directly as numbers. Perl does not need a conversion function: if a string looks like a number, you can just use it like one. Here I add 10 to the "string" to show that feature. Of course, once $string is used as a number, leading zeroes are dropped unless you use something like printf to add them back in the output. A common idiom to strip leading zeroes is $number_string += 0;
#!/usr/bin/perl
use warnings;
use strict;
my @input = qw /0001144204-09-017358
0001144204-10-065610
0001042167-15-000175
0000053669-16-000051 /;
foreach my $string (@input)
{
$string =~ tr/-//d;
print "string = $string\n";
print "string +10 as number: ", $string + 10,"\n";
}
__END__
prints:
string = 000114420409017358
string +10 as number: 114420409017368
string = 000114420410065610
string +10 as number: 114420410065620
string = 000104216715000175
string +10 as number: 104216715000185
string = 000005366916000051
string +10 as number: 5366916000061
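To illustrate the printf remark, here is a minimal sketch that does the addition and then restores the original width with sprintf. The helper name add_and_pad is made up for this example, and it assumes the padded width should equal the length of the input string and that the sum has no more digits than the input.

```perl
use strict;
use warnings;

# Hypothetical helper: add $increment to a zero-padded numeric string
# and pad the result back to the width of the original string.
sub add_and_pad {
    my ($string, $increment) = @_;
    my $width = length $string;          # 18 for the IDs in this thread
    my $sum   = $string + $increment;    # numeric context drops leading zeroes
    # The 0 flag makes sprintf right-justify with zeroes instead of spaces.
    return sprintf "%0${width}s", $sum;
}

print add_and_pad('000114420409017358', 10), "\n";   # 000114420409017368

# The += 0 idiom: force numeric context to strip leading zeroes.
my $number_string = '000005366916000051';
$number_string += 0;
print "$number_string\n";                            # 5366916000051
```

Using %s rather than %d for the padding is a deliberate choice here: it formats whatever Perl's normal stringification of the number gives, so it does not depend on the size of the native integer.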
Update: I ran this on Win XP, 32 bit.
Normally 2,147,483,647 would be the max int, but Perl 5.22 was able to get 104,216,715,000,185 from the addition. (Perl falls back to a floating-point value when a result overflows the native integer.)
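As a rough check on why the addition still works on a 32-bit build: when a result does not fit a native integer, Perl stores it as a floating-point double, which holds whole numbers exactly up to 2**53 (about 9.0e15). The sketch below reports the integer size of the running perl via the core Config module and tests whether a value lies in that range. fits_in_double is a made-up helper name for this example.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true if a whole number lies within the range a
# double can represent exactly (magnitude at most 2**53).
sub fits_in_double {
    my ($n) = @_;
    return abs($n) <= 2**53;
}

print "native integer size: $Config{ivsize} bytes\n";

for my $value (2_147_483_647, 104_216_715_000_185) {
    printf "%s is %s\n", $value,
        fits_in_double($value) ? 'exactly representable'
                               : 'too large to be exact';
}
```

The 18-digit strings in this thread only stay safe because their leading zeroes leave at most 15 significant digits; a full 18-digit value could lose precision on a perl without 64-bit integers.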
Those are the strings you are transforming, but it looks like you are struggling to extract them from your lines. What do your literal lines look like?
#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.
#!/usr/bin/perl -w
use strict;
use warnings;
use File::stat;
use lib "c:/strawberry/perl/site/lib";
#This program will extract the header information in 10K and 10Q filings
#as well as file sizes.
#Specify the directory containing the files that you want to read;
#my $files_dir = 'C:\Rick Francis\Data\SEC Filings\Filing Doc';
my $files_dir = 'E:\research\audit fee models\filings\Test';
#Specify the directory containing the results/output;
#my $write_dir = 'C:\Rick Francis\Data\SEC Filings\Header Data\Revised\DataTest.txt';
my $write_dir = 'E:\research\audit fee models\filings\filenames\filenames.txt';
#Open the directory containing the files you plan to read;
opendir(my $dir_handle, $files_dir) or die "Can't open directory $!";
#Initialize file counter variable;
my $file_count = 0;
#Loop for reading each file in the input directory;
while (my $filename = readdir($dir_handle)) {
next unless -f $files_dir.'/'.$filename;
print "Processing $filename\n";
#Initialize the variable names.
my $line_count=0;
my $access_num=-99;
my $cik=-99;
my $form_type="";
my $form="";
my $report_date=-99;
my $file_date=-99;
my $name="";
#my $sic=-99;
#my $sic1=-99;
my $file_name="";
my $htm="";
my $url="";
my $slash='/';
#Open the input file;
open my $FH_IN, '<',$files_dir.'/'.$filename or die "Can't open $filename";
#Within the file loop, read each line of the current file;
while (my $line = <$FH_IN>) {
next unless -f $files_dir.'/'.$filename;
if ($line_count > 500000) { last;}
#The following steps obtain basic data from various lines in the file;
if($line=~m/^\s*ACCESSION\s*NUMBER:\s*/m){$access_num=$1; $access_num =~ tr/-//d;}
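One thing that stands out in the posted code: the ACCESSION NUMBER match has no capturing parentheses, so $1 is not set by it. A minimal sketch of that step with a capture group follows. The sample line and the helper name extract_accession are assumptions for illustration, since the actual lines in the files have not been shown.

```perl
use strict;
use warnings;

# Hypothetical helper: return the accession number with dashes removed,
# or undef if the line is not an ACCESSION NUMBER line.
sub extract_accession {
    my ($line) = @_;
    # The parentheses capture the digits-and-dashes value into $1.
    if ($line =~ /^\s*ACCESSION\s*NUMBER:\s*([\d-]+)/) {
        my $access_num = $1;
        $access_num =~ tr/-//d;    # strip the dashes, keep leading zeroes
        return $access_num;
    }
    return;
}

# Assumed sample line; the real header layout may differ.
my $line = "ACCESSION NUMBER:\t\t0001144204-09-017358\n";
my $access_num = extract_accession($line);
print defined $access_num ? "access_num = $access_num\n" : "no match\n";
```

Keeping the result as a string preserves the leading zeroes; it only becomes a number if it is later used in numeric context.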