PerlMonks  

Re^2: split with a delimiter, every 4 time it occurs.

by Anonymous Monk
on Jul 01, 2013 at 08:09 UTC (#1041778)


in reply to Re: split with a delimiter, every 4 time it occurs.
in thread split with a delimiter, every 4 time it occurs.

Thanks davido, that worked exactly as I wanted. I have one more question, though: is this approach okay for very big data, in terms of speed and performance?


Re^3: split with a delimiter, every 4 time it occurs.
by sundialsvc4 (Monsignor) on Jul 01, 2013 at 11:34 UTC

    I would suggest that it really depends on how “very” your “big” is. You might have to read a large file in arbitrarily sized sections, which can get messy when you are doing this sort of thing. Usually it’s best to cross such bridges if and when you get there. What sort of data volume and speed requirements might you be dealing with?

Re^3: split with a delimiter, every 4 time it occurs.
by davido (Archbishop) on Jul 01, 2013 at 15:55 UTC

    There are several things that could be improved upon if memory is an issue. In fact, if memory is an issue, split probably shouldn't play any part in your solution, since it returns a list, which consumes a lot more space than the original string. natatime will be reasonably memory efficient, but expanding that nice, compact string into a list will be memory-expensive.

    As for time, or computational efficiency (which is what you were asking about), it's an O(n) solution. Where you will get into trouble is where "n" grows large enough to send you into swap memory. There may be O(n) solutions with smaller per-iteration costs, but you won't ever turn this into O(log n) or O(1); you'll always be at least O(n).

    If I were designing it with memory in mind, I would probably use index and substr together to find every fifth comma, and to extract the portion of the string that resides between every fifth comma. I would iterate one small segment at a time so that I'm never making a copy of a large string. If that approach turned out not to be fast enough, I would re-implement the same simple algorithm using Inline::C.
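A minimal sketch of the index/substr idea described above, not necessarily the code the author had in mind. The helper name `segment_iterator` and the sample data are hypothetical, and the count is left as a parameter so it can be set to 4, 5, or whatever the data requires. The string is passed by reference, so the only copies made are the small segments themselves.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns an iterator that yields
# the text between every $n-th occurrence of $delim, one segment per call.
# The large string is only read through a reference and is never copied.
sub segment_iterator {
    my ($str_ref, $delim, $n) = @_;
    die "delimiter must not be empty\n" unless length $delim;
    die "count must be at least 1\n"    unless $n >= 1;

    my $start = 0;                  # offset where the next segment begins
    my $len   = length $$str_ref;
    my $dlen  = length $delim;

    return sub {
        return if $start >= $len;   # nothing left to hand out

        # Walk forward over $n delimiters with index().
        my $pos = $start;
        my $end = -1;
        for (1 .. $n) {
            $end = index $$str_ref, $delim, $pos;
            last if $end < 0;       # fewer than $n delimiters remain
            $pos = $end + $dlen;
        }

        my $segment;
        if ($end < 0) {
            # Final, possibly shorter, segment: take the rest of the string.
            $segment = substr $$str_ref, $start;
            $start   = $len;
        }
        else {
            # Extract up to (not including) the $n-th delimiter, then skip it.
            $segment = substr $$str_ref, $start, $end - $start;
            $start   = $end + $dlen;
        }
        return $segment;
    };
}

# Sample data is made up for illustration.
my $data = 'a,b,c,d,e,f,g,h,i,j,k,l';
my $next = segment_iterator(\$data, ',', 5);

while (defined(my $segment = $next->())) {
    print "$segment\n";
}
# prints:
# a,b,c,d,e
# f,g,h,i,j
# k,l
```

This is still O(n) overall, since every character is scanned once, but only one segment is held in memory beyond the original string at any time. A trailing delimiter at the very end of the string does not produce an empty final segment in this sketch.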


    Dave
