http://www.perlmonks.org?node_id=11159751

Bod has asked for the wisdom of the Perl Monks concerning the following question:

As I mentioned in Yet another Encoding issue..., I am writing an AI chatbot based around AI::Chat that holds a conversation in Turkish and corrects any mistakes in the Turkish supplied by the user. Of course, there are not always mistakes so a correction is not always needed.

I've prompted the AI that
"if there are no mistakes that need correcting reply with the single word "Perfect" and do not add any other words to your reply."

But being AI, it can be unpredictable! Sometimes, it will quote the Turkish and then write "Perfect" on a separate line.

Currently I check for whether there is a correction like this:

if ($reply !~ /^perfect/i) { $chatReply->{'correction'} = $reply; }

I don't want to check for "Perfect" anywhere in the reply as it might form part of a valid correction. So I am thinking of checking that "Perfect" appears either at the start of the reply or at the end like this:

if ($reply !~ /^perfect/i and $reply !~ /perfect$/i) { $chatReply->{'correction'} = $reply; }

But is there a way of combining those two regexps into just one? It seems there should be...

Re: Regexp match start or end
by hippo (Archbishop) on Jun 02, 2024 at 18:09 UTC

    Adjust the good/bad sets as required.

    use strict;
    use warnings;
    use Test::More;

    my @good = (
        'perfect weather',
        'Nobody\'s Perfect',
        'PeRfEcT!!',
    );
    my @bad = (
        'imperfect',
        'Perfecto!',
        'perfection',
    );

    my $re = qr/^perfect\b|\bperfect$/i;

    plan tests => @good + @bad;
    like   $_, $re, "match for '$_'"    for @good;
    unlike $_, $re, "no match for '$_'" for @bad;

    🦛

      ++

      Muy bien. Using \b as delimiters will eliminate most rogue issues such as:

      No es perfecto. Esto es imperfecto.

      More kudos for promoting testing utilities.
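
A minimal sketch of how hippo's combined pattern might slot back into the original check. The sub name needs_correction and the sample replies are hypothetical, and trimming surrounding whitespace first is an added assumption (so that a stray space after "Perfect" does not defeat the $ anchor), not something from the thread.

```perl
use strict;
use warnings;

# hippo's combined pattern: "perfect" as a whole word at the start or the end
my $re = qr/^perfect\b|\bperfect$/i;

# Hypothetical helper: true when the reply should be stored as a correction
sub needs_correction {
    my ($reply) = @_;
    $reply =~ s/^\s+|\s+$//g;    # assumption: ignore leading/trailing whitespace
    return $reply !~ $re;
}

my $chatReply = {};
my $reply     = "\"Merhaba!\"\nPerfect \n";    # quoted text, then "Perfect" on its own line

$chatReply->{'correction'} = $reply if needs_correction($reply);
print exists $chatReply->{'correction'} ? "correction stored\n" : "no correction needed\n";
```

Without the /m flag, ^ only matches at the very start of the string and $ only at the very end (or just before a final newline), so a "Perfect" buried in the middle of a multi-line reply is still treated as part of a correction.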

Re: Regexp match start or end
by syphilis (Archbishop) on Jun 02, 2024 at 13:40 UTC
    But is there a way of combining those two regexps into just one?

    I think (untested):
     =~ /^perfect|perfect$/i will return true if and only if "perfect" (case-insensitive) appears at either the beginning or the end (or both).
     !~ /^perfect|perfect$/i will return true if and only if "perfect" (case-insensitive) appears neither at the beginning nor the end.

    Update: As LanX has noted, beginnings such as "perfection" and endings such as "imperfect" will also be matched.

    Cheers,
    Rob
Re: Regexp match start or end
by LanX (Saint) on Jun 02, 2024 at 13:39 UTC
    > But being AI, it can be unpredictable!

    And because of this I'd rather stick with separated, simple rules which can be easily maintained.

    Better expect more trouble to come.

    > with the single word "Perfect"

    Your logic is flawed though. This

    $reply !~ /^perfect/i

    will match (well, reject) more than just a "single word" at the beginning: any reply that merely starts with "perfect" passes.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery
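
One possible reading of LanX's advice, sketched as a short list of separate, named rules rather than a single combined regex. The rule names, the matched_rule helper and the optional trailing "." or "!" are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Each rule is a small, independently maintainable check (names are hypothetical)
my @rules = (
    [ only_word => qr/\A\s*perfect[.!]?\s*\z/i ],    # whole reply is just "Perfect"
    [ last_line => qr/^\s*perfect[.!]?\s*\z/im ],    # "Perfect" alone on the final line
);

# Returns the name of the first rule that matches, or undef if none does
sub matched_rule {
    my ($reply) = @_;
    for my $rule (@rules) {
        my ($name, $re) = @$rule;
        return $name if $reply =~ $re;
    }
    return;    # no rule matched: treat the reply as a correction
}

for my $reply ("Perfect", "\"Merhaba!\"\nPerfect\n", "Perfectly fine, but use 'ben' here.") {
    my $rule = matched_rule($reply) // 'none (correction)';
    print "rule: $rule\n";
}
```

The last sample starts with "perfect", so the original /^perfect/i test would have swallowed it, whereas neither rule here accepts it. New quirks in the AI's output can be handled by adding another entry to @rules.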

Re: Regexp match start or end
by Boldra (Deacon) on Jun 05, 2024 at 14:02 UTC

    One approach I've found works well is telling ChatGPT to respond in JSON or YAML, and defining a format where it has space to express itself as "comments". It's pretty reliable if it's a system instruction and you provide some examples. I bet XML would work well too.



    - Boldra
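
A rough sketch of the structured-reply idea using the core JSON::PP module. The field names (correct, correction, comments) and the extract_correction helper are hypothetical; the actual format would be whatever the system prompt defines, and the AI may still return something that does not parse.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical reply format requested in the system prompt:
#   {"correct": true|false, "correction": "...", "comments": "..."}
# Returns (correction or undef, status), where status is 'ok' or 'unparseable'
sub extract_correction {
    my ($reply) = @_;
    my $data = eval { decode_json($reply) };    # decode_json dies on invalid JSON
    return (undef, 'unparseable') unless ref($data) eq 'HASH';
    return (undef, 'ok') if $data->{correct};
    return ($data->{correction}, 'ok');
}

my $chatReply = {};
my ($correction, $status) =
    extract_correction('{"correct": false, "correction": "Ben geliyorum", "comments": ""}');

$chatReply->{'correction'} = $correction if defined $correction;
print "status: $status\n";
```

When the status comes back as 'unparseable', falling back to a regex check on the raw text is one option.
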
Re: Regexp match start or end
by harangzsolt33 (Chaplain) on Jun 04, 2024 at 04:13 UTC
    You could combine the two regexes like this:

        ($reply =~ m/^perfect|perfect$/i) or ...

    and then add whatever should happen if the reply does not start or end with the word "perfect". That said, I like hippo's version better, because it uses the \b word-boundary assertion, which makes sure that "perfect" is matched as a whole word. Otherwise, what if the AI responds with something that ends in "imperfect"? The regex catches that and concludes that the AI said "perfect". I'm no English expert, and there may be other words that begin or end with "perfect", so \b is a really good idea here:

        ($reply =~ m/^perfect\b|\bperfect$/i) or ...