PerlMonks  

Re: input record separator and split

by Laurent_R (Monsignor)
on May 28, 2014 at 21:21 UTC


in reply to input record separator and split

Not only does $/ not accept a regex (it is always treated as a literal string), but adding the "\s+" pattern also looks fairly useless in this context. At most, it would remove additional whitespace from the chunks you get, and that can easily be done as a second step.
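As a rough illustration of that two-step approach, here is a minimal sketch. The sub name each_record and the sample data are made up for the example, and it assumes that a plain "Query=" string is a good enough separator for your file:

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per "Query="-delimited record.
# Records are handled one at a time, so the whole file is never held in memory.
sub each_record {
    my ($fh, $callback) = @_;
    local $/ = "Query=";               # literal string, NOT a pattern
    while (defined(my $chunk = <$fh>)) {
        chomp $chunk;                  # step 1: drop the trailing "Query=" (current $/)
        $chunk =~ s/\A\s+|\s+\z//g;    # step 2: strip leading/trailing whitespace
        next unless length $chunk;     # skip the empty chunk before the first "Query="
        $callback->($chunk);
    }
    return;
}

# Demo on an in-memory filehandle with made-up data.
my $sample = "Query=  seq1\nhit A\nQuery=\tseq2\nhit B\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
each_record($fh, sub { print "[$_[0]]\n" });
close $fh;
```

The whitespace cleanup happens in an ordinary substitution after the read, which is where a regex is actually allowed.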

The second thing that I don't get is that you split your file on "Query" and then try to split your lines on almost the same pattern. Unless I missed something, it does not seem to me to make much sense with the data sample you provided.

Lastly, a 72320825-line file is pretty big, but I would not qualify it as huge (unless the lines are really very long). I am using much larger files on an almost daily basis and don't get any trouble so long as I am not doing something stupid such as trying to load everything into memory (it might just take some time, but it does not fail). Anyway, since this line:

@blastblock = split(/Query=/, $_);
is overwriting the @blastblock array each time through the loop, I don't really believe that you ran out of memory because of the size of the input data. I would suggest that you try to look at line 54725380 to figure out if there is something wrong with it. One possible way to view it might be a one-liner such as this one:
perl -e '$/ = "\nQuery="; while (<>) { print and last if $. == 54725380; }' file.txt
It might have to be adapted depending on your data, but see if this works. More generally, I suspect that your split fails because your data might have a very large section (possibly the whole file) without ever matching the record splitting pattern. So the first thing to be done is to remove the \s+ from your input record delimiter and see whether that works.
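If you want to check that suspicion without risking another out-of-memory failure, one option is to read the file with the default line-by-line separator and measure how far apart the "Query=" lines are. This is only a sketch: the sub name longest_block and the sample data are made up, and it assumes that each record starts with "Query=" at the beginning of a line and that individual lines are of reasonable length:

```perl
use strict;
use warnings;

# Hypothetical diagnostic: return the starting line number and the length
# (in lines) of the longest block between two lines that begin with "Query=".
sub longest_block {
    my ($fh) = @_;
    local $/ = "\n";                       # plain line-by-line reading
    my ($start, $len)         = (1, 0);    # block currently being scanned
    my ($max_start, $max_len) = (1, 0);    # longest block seen so far
    while (my $line = <$fh>) {
        if ($line =~ /^Query=/) {
            ($start, $len) = ($., 0);      # a new block starts on this line
        }
        $len++;
        ($max_start, $max_len) = ($start, $len) if $len > $max_len;
    }
    return ($max_start, $max_len);
}

# Demo on an in-memory filehandle with made-up data.
my $sample = "header\nQuery= a\nx\ny\nQuery= b\nz\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my ($from, $lines) = longest_block($fh);
close $fh;
print "Longest block starts at line $from and spans $lines lines\n";
```

A block spanning millions of lines (or the whole file) would suggest that the separator is not matching where you expect it to.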

