Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?
 
PerlMonks  

Re: Parsing Chess Algebra Notation

by dragonchild (Archbishop)
on May 04, 2004 at 14:13 UTC ( [id://350341]=note: print w/replies, xml ) Need Help??


in reply to Parsing Chess Algebra Notation

Algebraic notation is relative to the current board position. So, you have to parse it backwards.

The last character may or may not be a '+', indicating check. (I'm going to assume that there is no analysis on the game, so I will ignore !, !!, ?, ??, ?!, and !?.) So, we have to see if that exists. If it does, we may or may not want to verify that, after the move, a state of check exists.

After we look for the '+', the last two characters are guaranteed to be the destination square in the form /[a-h][1-8]/.

Any remaining characters are the specification for the piece and/or starting-square coordinates. (Remember, we're still parsing the string in reverse.)

  • If there is nothing there, look for a pawn that could make it to this square.
  • If there is an 'x', that means that there is a capture. (This notation, while legal, isn't common.)
  • If there is a K, Q, R, N, or B, look for the appropriate piece that could make it to this square. There should be only one of that piece type that can legally make this move.
  • If there is [a-h1-8]+, that means that there is either more than one officer of that piece type that can make this move or it is a pawn capture (ed5 is a legal notation). Remember, Qa2b3 means that there are three queens, potentially at a2, a4, and b4. All of them can move to b3.

The key is parsing it backwards.

------
We are the carpenters and bricklayers of the Information Age.

Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

Replies are listed 'Best First'.
Re: Re: Parsing Chess Algebra Notation
by EdwardG (Vicar) on May 04, 2004 at 16:35 UTC
    Remember, Qa2b3 means that there are three queens, potentially at a2, a4, and b4. All of them can move to b3.

    ??

    I would interpret Qa2b3 to only mean that a Queen at a2 is moving to b3 - I don't know how you could derive the existence and position of two other Queens from this.

    And from my playing experience I had thought that when two or more of the same piece could move to a given square, you added the minimal extra information to disambiguate, either the rank number or file letter. For example in the following position there are two rooks that can move Ra4, so to disambiguate I add the rank, thus R8a4.

    +---+---+---+---+---+---+---+---+ 8 | R | | | | | | | | +---+---+---+---+---+---+---+---+ 7 | | | | | | | | | +---+---+---+---+---+---+---+---+ 6 | | | | | | | | | +---+---+---+---+---+---+---+---+ 5 | | | | | | | | | +---+---+---+---+---+---+---+---+ 4 | | | | | | | | | +---+---+---+---+---+---+---+---+ 3 | | | | | | | | | +---+---+---+---+---+---+---+---+ 2 | | | | | | | | | +---+---+---+---+---+---+---+---+ 1 | R | | | | | | | | +---+---+---+---+---+---+---+---+ a b c d e f g h

    There is an excellent description of this notation at wikipedia

     

      I think he meant that with Qa2b3 that there could potentially be up to three other queens, precisely because you should only provide the minimal disambiguation.
      • If Qb3 is insufficient,
        there must be another queen that can move on b3.
      • If Qab3 is insufficient,
        there must be another queen in column a that can move on b3.
      • If Q2b3 is insufficient,
        there must be another queen in rank 2 that can move on b3.
      • Since both Qab3 and Q2b3 are insufficient, the board must be crawling with queens poised to attack b3.

      --
      [ e d @ h a l l e y . c c ]

        As a side note:

        Qb3 should give an error if

        There are more than one Q on the board and they can move to b3.

        That will be an interesting check!

      This discussion seems to assume English Short Algebraic Notation. For a parser this seems unwise.

      People use and generate chess notation and they will enjoy being able to follow their usual habits, especially if they are the sort to fall into english descriptive or long algebraic under stress.

      Any good chess notation parser will need to maintain or reference a board image.

Re: Re: Parsing Chess Algebra Notation
by cyocum (Curate) on May 07, 2004 at 14:05 UTC

    I took your advice and tried it backwards. As you will notice, I did this in a step by step fashion. I am sure that there are much more effient ways to do this. Also, it does not yet support the edge cases that you are talking about (that includes promotion). I just wanted to layout a bit more code for general commments and improvments. Also, thanks for all the help. I really appreciate it!

    use strict; use warnings; my $move = "Qe4++"; #reverse as per dragonchild's suggestion my @parsed = reverse split //, $move; print join('', @parsed) . "\n"; my $check; my $row; my $col; my $piece; #convert the letters into appropriate numbers #there is probably a faster way to to this my %col_convert = (a => 0, b => 1, c => 2, d => 3, e => 4, f => 5, g => 6, h => 7); foreach my $elem (@parsed) { #check for, well, check if($elem eq '+') { #start the process to see if this is a checkmate my $nextElem = shift @parsed; print "nextElem $nextElem" . "\n"; #if it is then print if($nextElem eq '+') { print "Checkmate\n"; next; } else { #otherwise put it back on the stack for the next #iteration unshift @parsed, $nextElem; } #then print check in this instance print "Check\n"; }elsif($elem =~ m/[1-8]/) { $row = $elem; print "Row: $row\n"; }elsif($elem =~ /[a-h]/) { #convert to number $col = $col_convert{$elem}; print "Col: $col\n"; }elsif($elem =~ m/x/) { #this is a capture to that square print "Capture on $col $row\n"; }elsif($elem =~ /[KQRNB]/) { $piece = $elem; print "Piece to move is $piece\n"; } else { die "not a valid move\n"; } }

      Why are you making it so complicated for yourself? All you want to do is build a string that corresponds to the move. This means that you're attempting to build a state representation by parsing another state representation, one character at a time. The following may or may not be complete, but it's a good start. (I didn't do castling, leaving that as an exercise for the reader. I'd probably special-case it.)

      And the test code ...

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://350341]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (4)
As of 2024-03-28 16:43 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found