Re: Problem with larger files (and s/)

by jethro (Monsignor)
on Jun 24, 2008 at 22:40 UTC

in reply to Problem with larger files (and s/)

The NOT\t\tNULL issue is trivially corrected with
if (not s/NOT NULL/\t\tNOT NULL/s) { s/NULL/\t\tNULL/s; }
(at least I hope so, didn't test it).
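For anyone who wants to try it before trusting it, here is a minimal sketch of that two-step substitution on a couple of made-up column definitions. The helper name `indent_null_clause` and the sample lines are assumptions for illustration, not part of the original script. The `/s` modifier is left off because it only changes what `.` matches, and there is no `.` in these patterns.

```perl
use strict;
use warnings;

# Hypothetical helper: indent the nullability clause of one column definition.
sub indent_null_clause {
    my ($line) = @_;

    # Try the longer phrase first. s/// returns a false value when nothing
    # was replaced, so the bare NULL substitution only runs as a fallback.
    if (not $line =~ s/NOT NULL/\t\tNOT NULL/) {
        $line =~ s/NULL/\t\tNULL/;
    }
    return $line;
}

# Made-up sample input, one column definition per line.
for my $line ("id INT NOT NULL,", "note VARCHAR(20) NULL,", "qty INT,") {
    print indent_null_clause($line), "\n";
}
```

Because the `NOT NULL` substitution is tried first, a line containing `NOT NULL` never reaches the bare `NULL` substitution, which is what avoids the NOT\t\tNULL result.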

The following regex works too:

To track down the other problem, I would suggest loading that 9k file into an editor (not Word, not WordPad, but you probably know that), cutting it in half, and feeding it to your program again. If the problem is still there when the file is down to 300 bytes, you know it isn't caused by the size but probably by the wrong file format.

If not, put another print statement before all the regexes to see whether Perl already reads the file incorrectly or whether your code (which looks perfectly OK to me) mangles it.
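A rough sketch of one way to check the "wrong file format" theory, assuming the format in question is a Unicode encoding that announces itself with a byte order mark. The helper names `sniff_bom` and `first_bytes` are made up for this example. It reads the first bytes of the file raw, before any regex touches them, and prints them in hex.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the encoding from the leading bytes of a file.
sub sniff_bom {
    my ($bytes) = @_;
    return 'UTF-8 (with BOM)' if $bytes =~ /^\xEF\xBB\xBF/;
    return 'UTF-16LE'         if $bytes =~ /^\xFF\xFE/;
    return 'UTF-16BE'         if $bytes =~ /^\xFE\xFF/;
    return 'no BOM';
}

# Hypothetical helper: read the first $count bytes without any decoding.
sub first_bytes {
    my ($path, $count) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    read $fh, my $buf, $count;
    close $fh;
    return $buf // '';
}

# Usage: perl sniff.pl yourfile.sql
if (@ARGV) {
    my $head = first_bytes($ARGV[0], 16);
    printf "%s: %s\n", $ARGV[0], sniff_bom($head);
    # Hex dump of the same bytes, to see what Perl actually gets.
    print join(' ', map { sprintf '%02X', ord } split //, $head), "\n";
}
```

If the hex dump shows a 00 byte after every letter, the file is most likely UTF-16, and a pattern like NOT NULL will never match the raw bytes. A file without a BOM can still be in an unexpected encoding, so 'no BOM' does not prove the file is plain ASCII.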

Re^2: Problem with larger files (and s/)
by Cloudster (Novice) on Jun 25, 2008 at 14:54 UTC
    Thank you very much for your trivial correction, it was spot-on. I'm learning Perl via an older copy of O'Reilly's Perl CD collection, and had not yet seen an example of using s/// as an if test. Works like a charm.

    As for the Unicode problem, see my reply to the previous reply.

    *facepalm again*
      Sorry, the 'trivially' wasn't meant as in 'trivial problem' but more like 'less sophisticated solution', i.e. without packing it into one regex or using negative lookahead (which I wanted to mention, but I couldn't remember the name, which was ample proof to me that it is a non-trivial solution). ;-)
        I had replied to your message, but apparently it vanished. I had no problem with your usage of the term 'trivial', and I appreciate you not giving a highly dense solution, as I'm still learning the language and high density would require more brainpower than I'm willing to devote to it. I also plan on posting this code for other DBAs to use and modify, and with a simpler, 'trivial' implementation, the code will be easier for others to modify. If they know how to make it more dense, let 'em.

