
Re: non-exact regexp matches

by Abigail-II (Bishop)
on Jun 23, 2004 at 15:19 UTC (#369067)

in reply to Re^4: non-exact regexp matches
in thread non-exact regexp matches

You are talking about regexes, but your example shows the most trivial regex one can imagine, namely one that doesn't contain any special characters. Do you want to match any possible regex, or are you just looking for matching strings? The latter is far, far easier than the former, and it doesn't need the regex engine at all.
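To illustrate the "plain strings" case, here is a minimal sketch of fuzzy literal matching that never touches the regex engine. The helper name `fuzzy_positions` and the sample strings are made up for illustration, and it assumes that "fuzzy" means substituted characters only (no insertions or deletions) and that the comparison is case-sensitive.

```perl
use strict;
use warnings;

# Hypothetical helper: return the start offsets where $needle occurs in
# $haystack with at most $max_mismatch substituted characters.
sub fuzzy_positions {
    my ($haystack, $needle, $max_mismatch) = @_;
    my $len = length $needle;
    my @positions;
    for my $start (0 .. length($haystack) - $len) {
        my $window = substr($haystack, $start, $len);
        # String XOR yields a NUL byte wherever the two strings agree;
        # tr with /c counts the non-NUL bytes, i.e. the mismatches.
        my $mismatches = ($window ^ $needle) =~ tr/\0//c;
        push @positions, $start if $mismatches <= $max_mismatch;
    }
    return @positions;
}

# "ACCTAC" at offset 2 differs from "ACCAAC" in one position.
my @hits = fuzzy_positions("GGACCTACTT", "ACCAAC", 2);
print "@hits\n";    # prints "2"
```

The loop compares one window per start offset, so the cost grows with the product of the two string lengths; that is usually acceptable for short motifs.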


Re^2: non-exact regexp matches
by vinforget (Beadle) on Jun 23, 2004 at 15:32 UTC
    Ideally, I want to match any regexp, but I am not sure if regexps in perl can handle this in a stable fashion. I've been using a regexp to report all nested pattern matches, with the position of each match taken from $-[0]:
    m/(regexp)(?{ print $-[0] })(?!)/;

    but everywhere I look most people say to stay away from this stuff because:
    1) it is not stable
    2) it may not be supported in newer versions of perl.
    So I'm not sure if I should take a more specific yet stable approach, or a generalisable yet potentially unstable approach.
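If the worry is the stability of `(?{ })`, one possible alternative (a different technique, not the one posted above) is to wrap the pattern in a zero-width lookahead and loop with `/g`. The helper name `match_starts` is hypothetical. Note the difference in behaviour: this sketch reports each start position once, whereas the `(?{ })(?!)` form visits every way the pattern can match, including different lengths at the same position.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the start offset of every match of $re in
# $text, overlapping matches included, without embedded code blocks.
sub match_starts {
    my ($text, $re) = @_;
    my @starts;
    # The lookahead consumes nothing, so /g moves on one character at a
    # time and overlapping occurrences are not skipped.
    while ($text =~ /(?=$re)/g) {
        push @starts, $-[0];
    }
    return @starts;
}

my @starts = match_starts("ACACACA", qr/ACA/);
print "@starts\n";    # prints "0 2 4"
```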
      I don't see the connection between using the (?{ }) and (?!) to report all matches and your original question of finding "partial" matches.

      So you want any regexp to match fuzzily. However, then your example is unclear: it picks out positions in the regex (not in the string) to indicate where characters should be changed. Do you also want to be able to change special characters? Is it OK to introduce characters in the regex to make it match? (That would be easy: just add a | as the first character in the regex.)


        I refined my question a little more. I have a string of letters [ACGTacgtNn] from which I want to find a particular instance of a regexp, let's say:

        I can do this just fine, but what if I want to match the above regexp with a tolerance of 2 mismatches for single characters? Below I have an example:
        $buf =~ m/(A)(C)(C)(A)(A)(C)([ACGTacgtNn]{6})(CTA[ACGTacgtNn]{1})(A)(T)(G)([ACGTacgtNn]{1,2})(G)(A)(T)(G)(T)(T)(?{ print $-[0], " ", scalar(@-), "\n"; })(?!)/;
        This will print the position of the match in $buf, followed by 19 (the number of submatches). I want to be able to return a match with 17-19 submatches, not just all 19. Thanks. Vince
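One way to read this request is "the fixed letters of the motif may disagree with the string in at most 2 places". A rough sketch under that assumption follows. It does not use the regex engine for the fuzzy part: the motif is expanded into fixed-length templates in which `.` means "any character of [ACGTacgtNn]", one template per allowed length of the `{1,2}` gap. The names `@templates` and `fuzzy_motif_hits` and the sample `$buf` are hypothetical, and only substitutions are tolerated, not insertions or deletions.

```perl
use strict;
use warnings;

# One template per length of the {1,2} gap; '.' marks a wildcard position.
my @templates = map {
    "ACCAAC" . ("." x 6) . "CTA" . "." . "ATG" . ("." x $_) . "GATGTT"
} 1, 2;

# Hypothetical helper: return [start, length, mismatches] for every window
# of $buf that fits a template with at most $max_mismatch wrong letters.
sub fuzzy_motif_hits {
    my ($buf, $max_mismatch, @tmpl) = @_;
    my @hits;
    for my $tmpl (@tmpl) {
        my $len = length $tmpl;
        for my $start (0 .. length($buf) - $len) {
            my $window = substr($buf, $start, $len);
            # Every position must at least be a legal nucleotide code.
            next unless $window =~ /\A[ACGTacgtNn]+\z/;
            my $mismatches = 0;
            for my $i (0 .. $len - 1) {
                my $want = substr($tmpl, $i, 1);
                next if $want eq '.';    # wildcard: nothing to compare
                $mismatches++ if substr($window, $i, 1) ne $want;
            }
            push @hits, [$start, $len, $mismatches]
                if $mismatches <= $max_mismatch;
        }
    }
    my @sorted = sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] } @hits;
    return @sorted;
}

# Sample data: the motif with two substituted letters, starting at offset 2.
my $buf = "TT" . "ACGAAC" . "GGGGGG" . "CTAG" . "ATG" . "C" . "GATGTA" . "AA";
for my $hit (fuzzy_motif_hits($buf, 2, @templates)) {
    printf "start=%d length=%d mismatches=%d\n", @$hit;
}
```

Each mismatch here corresponds to one single-character submatch that would have failed in the original regex, so a budget of 2 lines up with the "17-19 submatches" wording.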
