Re: Fixed Position Column Records
by BrowserUk (Patriarch)
on Jul 22, 2007 at 02:12 UTC ( [id://628059]=note )
I'm assuming that you don't know the lengths of each field in advance; i.e. that you are hoping to use the same code to process similar files of fixed-length records, where the lengths of the fields can vary from one file to the next? To that end, I've come up with an (imperfect) mechanism that lets the program determine the offsets by inspection. It requires two passes over the file.
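For illustration, here is a minimal sketch of one way such a two-pass inspection could work. It is not necessarily the code originally posted. It assumes that a field boundary is any position that holds a space in every record, and the name infer_template and the sample records are invented for the example.

```perl
use strict;
use warnings;

# Pass 1: read every record from $fh and return an unpack template.
# Assumption: a position that holds a space in every record separates fields.
sub infer_template {
    my ($fh) = @_;
    my @blank;    # $blank[$i] stays true while position $i is a space in every record
    while ( my $line = <$fh> ) {
        chomp $line;
        # Positions past the end of earlier, shorter records count as blank.
        push @blank, (1) x ( length($line) - @blank ) if length($line) > @blank;
        for my $i ( 0 .. length($line) - 1 ) {
            $blank[$i] = 0 if substr( $line, $i, 1 ) ne ' ';
        }
    }

    # A field starts wherever a non-blank position follows a blank one.
    my @starts = grep { !$blank[$_] && ( $_ == 0 || $blank[ $_ - 1 ] ) } 0 .. $#blank;
    shift @starts;    # the first field always begins at offset 0

    my ( $prev, @items ) = (0);
    for my $start (@starts) {
        push @items, 'a' . ( $start - $prev );    # width includes trailing blanks
        $prev = $start;
    }
    return join ' ', @items, 'a*';                # last field takes the rest of the record
}

# Hypothetical sample data, held in an in-memory file so the sketch is self-contained.
my $data = join '', map { "$_\n" }
    'apple   red    10',
    'kiwi    green   7',
    'banana  yellow 12';

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $template = infer_template($fh);    # 'a8 a7 a*' for the sample above

# Pass 2: rewind and split each record with the inferred template.
seek $fh, 0, 0;
while ( my $line = <$fh> ) {
    chomp $line;
    print join( '|', unpack( $template, $line ) ), "\n";
}
```

With a real file, pass 2 would rewind or reopen that file instead of the in-memory handle. Swapping 'a' for 'A' in the generated template would strip the trailing spaces from each field.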
The output
The above shows why it is imperfect. It 'found' an extra column at the end of the second column. However, the more lines in the file, the lower the likelihood, statistically, of word breaks 'lining up' throughout the file. It shouldn't happen too often on files of any great length. (Famous last words :) Whether that's a flaw you can live with is your decision.

I tried to think of a heuristic to determine when a column should be combined with a neighbour, but it will depend entirely on the file and the data.

I've used the 'a' template, which leaves each field's padding spaces in place, because it makes for ease of alignment for printing, but use 'A' if you want the trailing spaces stripped.

Update: I thought of a heuristic that would probably work, but it would require at least a third pass. Left or right justified, one end or the other of every field should contain a non-space char in every record. Another pass that inspected the first and last chars of each field could detect 'false' columns.

You'd then need to decide whether that column should be combined with the preceding or the following field. Another heuristic is called for, but whatever you come up with, it is possible to dream up scenarios in which it would fail. In the case above, the fact that the field that follows the false field has a non-space in the first char in every record in the file is a strong indication that the false field should be combined with its predecessor. But had the following field been a right-justified field, then things would be less clear cut.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
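The third-pass heuristic from the update could be sketched roughly as follows. This is only an illustration under stated assumptions: the template skips the all-blank separator positions with 'x' so that each field covers data positions only, every record is padded to full width, and the name false_columns and the sample records are made up for the example.

```perl
use strict;
use warnings;

# Return the indices of fields that look like 'false' columns.
# Assumption: a genuine left- or right-justified field has a non-space
# first char in every record, or a non-space last char in every record.
sub false_columns {
    my ( $template, @records ) = @_;
    my ( @first_filled, @last_filled );
    my $nfields = 0;
    for my $rec (@records) {
        my @fields = unpack( $template, $rec );
        $nfields = @fields if @fields > $nfields;
        for my $i ( 0 .. $#fields ) {
            $first_filled[$i]++ if $fields[$i] =~ /^\S/;     # left end occupied
            $last_filled[$i]++  if $fields[$i] =~ /\S\z/;    # right end occupied
        }
    }
    my $n = @records;
    # A field is suspect when neither end is occupied in every record.
    return grep {
        ( $first_filled[$_] || 0 ) < $n && ( $last_filled[$_] || 0 ) < $n
    } 0 .. $nfields - 1;
}

# Hypothetical records: the word break in the second column lines up,
# so inspection alone would split it into two fields.
my @records = (
    'Smith  Red car  10',
    'Jones  Big       7',
    'Brown  Old van  12',
);

my @false = false_columns( 'a5 x2 a3 x a3 x2 a2', @records );
print "Suspect field indices: @false\n";
```

Here only the field at index 2 ('car', blank, 'van') is flagged. The right-justified number field passes because its last char is non-space in every record. The sketch only detects the suspect field; deciding which neighbour to merge it with is left open, as discussed above.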