Re: Reading tab/whitespace delimited text file

Re^2: Reading tab/whitespace delimited text file by reaper9187 (Scribe) on Oct 22, 2012 at 05:31 UTC
Thanks a lot for helping me. As I said earlier, the above code is only a section of the entire file. There are multiple such sections:

    SCTYPE SSDESDL QDESDL LCOMPDL QCOMPDL
    UL 90 30 5 55
    BSPWRMINP BSPWRMINN
    20

. .

    CELL SCTYPE
    LWACH1A
    ACTIVE CHTYPE CHRATE SPV LVA ACL NCH
    YES BCCH 1 A3 1
    SDCCH 0 A3 15
    TCH FR 1 0 A3 13
    TCH FR 2 0 A3 13
    TCH FR 3 0 A3 13
    TCH HR 1 0 A3 26
    TCH HR 3 0 A3 26
    CBCH 0 A3 1

. .

    CELL LOL LOLHYST TAOL TAOLHYST
    LUC082A 120 3 61 0
    DTCBP DTCBN DTCBHYST NDIST NNCELLS
    4 2 10 1

. .

    ACTIVE CHTYPE CHRATE SPV LVA ACL NCH
    YES BCCH 0 A3 0
    SDCCH 0 A3 0
    TCH FR 1 16 A3 32
    TCH FR 2 0 A3 32

Again, the file is pretty large and I cannot list all of the formats. I just need an idea of how to do it; I can then extend it over the entire file.
Re^3: Reading tab/whitespace delimited text file by BrowserUk (Patriarch) on Oct 22, 2012 at 06:31 UTC
Yuck! I thought (hoped) that this type of file format -- mixed, fixed-format records -- had died long ago; but they seem to keep reinventing it :)

For your first example, the trick is to define a regex that will match the fields in the header line:

    my $reHeader = '(\b\w+\s*)?' x 10; ## Adjust the repeat value to cover the maximum no of fields

and use that to construct an unpack template to parse the following values line. This is not 'nice code', but it demonstrates the technique:

    #! perl -slw
    use strict;
    use Data::Dump qw[ pp ];

    my $reHeader = '(\b\w+\s*)?' x 10;

    my %data;
    until( eof( DATA ) ) {
        ## Read the header line and remove the newline
        chomp( my $header = <DATA> );

        ## parse the fields using the regex, ignoring undefined fields
        my @keys = grep defined, $header =~ $reHeader;

        ## Use the capture position arrays (@- & @+)
        ## to work out the field widths and construct a template
        ## (done before any other match can overwrite them)
        my $tmpl = join ' ', map{
            defined( $-[$_] ) ? do{ my $n = $+[$_] - $-[$_]; "a$n" } : ()
        } 1 .. $#+;

        ## trim the trailing whitespace from the keys
        s[\s+$][] for @keys;

        ## read and chomp the values line
        chomp( my $vals = <DATA> );

        ## Extract the value fields using the template
        my @vals = unpack $tmpl, $vals;

        ## trim leading & trailing whitespace
        s[^\s+][], s[\s+$][] for @vals;

        ## Add the key/value pairs to the hash
        @data{ @keys } = @vals;

        ## discard the blank line between the grouped pairs of lines.
        <DATA>;
    }

    pp \%data; ## display the hash constructed

    __DATA__
    TRHYST  TROFFSETP  TROFFSETN  AWOFFSET  BQOFFSET
    2       0                     5         3

    HIHYST  LOHYST  OFFSETP  OFFSETN  BQOFFSETAFR
    5       3       0                 3

    CELLR    DIR     CAND  CS
    LUC083A  MUTUAL  BOTH  NO

Outputs:

    C:\test>junk79
    {
      AWOFFSET    => 5,
      BQOFFSET    => 3,
      BQOFFSETAFR => 3,
      CAND        => "BOTH",
      CELLR       => "LUC083A",
      CS          => "NO",
      DIR         => "MUTUAL",
      HIHYST      => 5,
      LOHYST      => 3,
      OFFSETN     => "",
      OFFSETP     => 0,
      TRHYST      => 2,
      TROFFSETN   => "",
      TROFFSETP   => 0,
    }

Extending that to apply it to all your other sections will require a little ingenuity and a lot of painstaking testing.
I do hope for your sake that the number and ordering of the different sections is well-defined, else you've got an even worse task on your hands.

Note: This assumes that field names do not contain spaces. If they do, you are in shit street.
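One way the same idea might be extended to sections where a single header line is followed by several value rows (such as the ACTIVE/CHTYPE listing earlier in the thread) is sketched below. This is only a sketch under stated assumptions: the helper names `template_from_header` and `parse_section` are hypothetical, the first field name is assumed to start at column 0, and the column alignment of the sample rows is assumed, since the original spacing is not visible in the posted data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): derive the field names and an
# unpack template from where each name starts in a fixed-width header line.
# Assumes the first name starts at column 0 and names contain no spaces.
sub template_from_header {
    my ($header) = @_;
    my ( @names, @starts );
    while ( $header =~ /(\w+)/g ) {
        push @names,  $1;
        push @starts, $-[1];    # offset at which this column begins
    }
    return unless @names;

    # Each column runs up to the start of the next one; the last column is
    # taken to be as wide as its own name.
    my @widths = map { $starts[ $_ + 1 ] - $starts[$_] } 0 .. $#starts - 1;
    push @widths, length $names[-1];

    # "A" (unlike "a") strips trailing spaces from each extracted field.
    return ( \@names, join ' ', map {"A$_"} @widths );
}

# Hypothetical helper: parse one header line followed by any number of
# value rows into a list of hashes, one per row.
sub parse_section {
    my ( $header, @rows ) = @_;
    my ( $names, $tmpl ) = template_from_header($header);
    my @parsed;
    for my $line (@rows) {
        my %row;
        @row{@$names} = unpack $tmpl, $line;
        for ( values %row ) {
            $_ = '' unless defined;    # columns past the end of a short row
            s/^\s+//;                  # drop any leading padding
        }
        push @parsed, \%row;
    }
    return \@parsed;
}

# Sample rows built with sprintf so the assumed column alignment is explicit;
# real input would be read from the file instead.
my $fmt     = '%-8s%-8s%-8s%-5s%-5s%-5s%s';
my @section = (
    sprintf( $fmt, qw( ACTIVE CHTYPE CHRATE SPV LVA ACL NCH ) ),
    sprintf( $fmt, 'YES', 'BCCH',  '',   '',  '0',  'A3', '0' ),
    sprintf( $fmt, '',    'SDCCH', '',   '',  '0',  'A3', '0' ),
    sprintf( $fmt, '',    'TCH',   'FR', '1', '16', 'A3', '32' ),
);

my $rows = parse_section(@section);
for my $row (@$rows) {
    print join( ', ', map {"$_=$row->{$_}"} sort keys %$row ), "\n";
}
```

Each row becomes its own hash, so fields left blank in a row (for example ACTIVE on the continuation rows) come back as empty strings rather than shifting the later values into the wrong columns.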
Re^4: Reading tab/whitespace delimited text file by reaper9187 (Scribe) on Oct 22, 2012 at 06:55 UTC
Hi, I can't thank you enough for the help. I know it looks pretty messy, but the good part is I don't need to read every section (thank god for that!). I would have been in deep shit otherwise. Anyway, thanks for the heads up.
Re^4: Reading tab/whitespace delimited text file by reaper9187 (Scribe) on Nov 01, 2012 at 12:38 UTC
Why is the code not able to read the following?

    CELL
    LUC325C
    CELLR DIR CAND CS
    LUC325B MUTUAL BOTH NO
    KHYST KOFFSETP KOFFSETN LHYST LOFFSETP LOFFSETN
    3 0 3 0
    TRHYST TROFFSETP TROFFSETN AWOFFSET BQOFFSET
    2 0 5 3
    HIHYST LOHYST OFFSETP OFFSETN BQOFFSETAFR
    5 3 0 3

The value for the CELL key should be LUC325C, but I keep getting LUC3 and nothing more. Help appreciated!
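A likely explanation, assuming the unpack template is derived from the header widths as in the code earlier in the thread: the header `CELL` is four characters wide, so its field becomes `a4`, and unpack then returns only the first four characters of `LUC325C`. The sketch below shows the effect and one possible workaround, letting the last field take the rest of the line with `a*`. The helper name `widen_last_field` is hypothetical, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the width of the final field in an unpack
# template with "*", so the last value may be wider than its header name.
sub widen_last_field {
    my ($tmpl) = @_;
    $tmpl =~ s/\d+\s*$/*/;
    return $tmpl;
}

my $header = 'CELL';
my $values = 'LUC325C';

my $tmpl = 'a' . length $header;        # "a4", taken from the header width
my ($short) = unpack $tmpl, $values;    # only the first four characters

my $wide = widen_last_field($tmpl);     # "a*" takes the rest of the line
my ($full) = unpack $wide, $values;

print "$tmpl => $short\n";    # a4 => LUC3
print "$wide => $full\n";     # a* => LUC325C
```

This only helps for the last field on a line; a value wider than a header in the middle of a line would still need the real column widths from the file format.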
Re^5: Reading tab/whitespace delimited text file by BrowserUk (Patriarch) on Nov 01, 2012 at 13:08 UTC
Re^6: Reading tab/whitespace delimited text file by reaper9187 (Scribe) on Nov 02, 2012 at 10:57 UTC