Re: regex on gigabyte string

by BrowserUk (Pope)
on Jan 26, 2013 at 18:27 UTC


in reply to regex on gigabyte string

Whilst loading strings > 4GB is no problem on a 64-bit Perl (assuming you have the memory), unfortunately, there are still many places in the core where such huge strings are simply not supported.

Two examples:

  1. substr doesn't accept offsets > 2GB
  2. Regexes don't operate on strings > 2GB.

It's a pain in the lower lumbar region, but probably won't change any time soon.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: regex on gigabyte string
by focusonz (Initiate) on Jan 26, 2013 at 19:34 UTC

    Whoa back!

    I am using a construct like if( substr($bigstring, $begtagidx, 5) eq "<c r=" ), where $begtagidx runs out to 4 billion, and I have not seen a problem.
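For readers who want to try the construct above on a small scale, here is a minimal sketch. The helper name tag_offsets and the sample string are hypothetical, not from the post. It locates each tag with index() and confirms it with the same substr comparison. Whether index() accepts offsets beyond 2GB on a given Perl build is not established in this thread, so treat that as an assumption to verify.

```perl
use strict;
use warnings;

# Hypothetical helper: return the offsets at which $tag starts in the
# string referenced by $sref. A reference is passed to avoid copying
# a very large string.
sub tag_offsets {
    my ($sref, $tag) = @_;
    my @offsets;
    my $idx = 0;
    while (($idx = index($$sref, $tag, $idx)) >= 0) {
        # Same comparison as in the post, kept as a sanity check.
        push @offsets, $idx if substr($$sref, $idx, length $tag) eq $tag;
        $idx += length $tag;    # continue the search after this tag
    }
    return @offsets;
}

# Small stand-in for the multi-gigabyte string.
my $bigstring = '<row><c r="A1"/><c r="B1"/></row>';
my @hits = tag_offsets(\$bigstring, '<c r=');
print "@hits\n";
```

Running the same loop against the real data on the Perl build in question would show whether offsets past 2GB behave there.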

    But the data verification process has not yet finished, so I will have to get back to you cloistered people on this.

    Thanks for the pearls of scripture!
      Where $begtagidx is out to 4 billion and have not seen problem.

      Okay. It seems that limitation has been lifted with 5.16 (I still use 5.10.1 as my primary Perl, where the limitation still applies):

      say $];;
      5.016001
      $s = 'fred'; $s x= 1024**3;;
      print substr( $s, -4 );;
      fred

      But the 2GB limit on regex still persists in 5.16:

      [19:51:25.70] C:\test>\perl64-16\bin\perl \perl64\bin\p1.pl
      [0] Perl> say $];;
      5.016001
      [0] Perl> $s = 'fred'; $s x= 1024**3;;
      [0] Perl> ++$n while $s =~ /fred/g; say $n;;
      Use of uninitialized value $n in say at (eval 9) line 1, <STDIN> line 3.
      [0] Perl> $s = 'fr'; $s x= 1024**3;;
      [0] Perl> ++$n while $s =~ /fr/g; say $n;;
      Use of uninitialized value $n in say at (eval 11) line 1, <STDIN> line 5.
      [0] Perl> $s = 'fr'; $s x= 1020**3;;
      [0] Perl> ++$n while $s =~ /fr/g; say $n;;
      1061208000
      [0] Perl>
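In the session above the match silently finds nothing once the string reaches 2GB (2 * 1024**3 bytes), while the slightly smaller 2 * 1020**3 byte string is counted correctly. One possible way around the limit, sketched here as an assumption and not as anything proposed in the thread, is to run the regex over bounded chunks read from a filehandle and carry the unmatched tail across chunk boundaries. The function name count_literal_chunked and the tiny chunk size are made up for illustration, and the sketch handles only a fixed literal, not an arbitrary pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping occurrences of the literal
# $needle in the data readable from $fh, holding at most
# $chunk_size + length($needle) - 1 bytes in the buffer at a time.
sub count_literal_chunked {
    my ($fh, $needle, $chunk_size) = @_;
    my $keep  = length($needle) - 1;   # longest prefix a chunk edge can split off
    my $re    = qr/\Q$needle\E/;
    my $count = 0;
    my $buf   = '';

    while (read($fh, my $chunk, $chunk_size)) {
        $buf .= $chunk;
        my $last_end = 0;
        while ($buf =~ /$re/g) {
            ++$count;
            $last_end = pos($buf);     # end of the most recent match
        }
        # Carry over only bytes that could still begin a match, and
        # never bytes that were already consumed by a match.
        my $from = length($buf) - $keep;
        $from = $last_end if $from < $last_end;
        $buf = substr($buf, $from);
    }
    return $count;
}

# Small demonstration using an in-memory filehandle.
my $data = 'fred' x 1000;
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";
print count_literal_chunked($fh, 'fred', 7), "\n";
```

With real data the handle would be opened on the file itself and the chunk size set to something like a few hundred megabytes, so that no single buffer comes near the 2GB boundary.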

