|laziness, impatience, and hubris|
The application uses Dancer::Session::YAML to create and store persistent sessions. I easily identified the problematic session file (I add the access time to each session) and found the problem at the end of the file:
Ouch! It looks like the file was overwritten without being truncated first. The application runs on Starman, which forks workers to serve requests. It seemed to me that two instances had been trying to write the session file at the same time: a race condition. After discussing the issue with a colleague, I inspected the source code of the Dancer::Session::YAML module. The last subroutine was the only one printing anything:
The problem is the open line: it might overwrite (clobber) the file even before the lock is granted. To demonstrate the problem, I wrote the following script:
The parent opens the file, but while it sleeps, the child writes something to it. The parent then wakes up, gets the lock, and writes to the file, but it just writes over whatever the child has written. If the child's output is longer than the parent's, the tail of the child's output survives after the parent's data, and the invalid session problem turns up.
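The race described above can be reproduced with a short sketch along these lines. It is not the original script: the names `demo_race` and `write_after_lock` and the sample strings are made up for illustration, and `waitpid` stands in for the parent's sleep so that the ordering is deterministic.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use POSIX ();

# Hypothetical helper: take the lock only AFTER the handle is already open.
sub write_after_lock {
    my ($fh, $content) = @_;
    flock $fh, LOCK_EX or die "flock failed: $!";
    print {$fh} $content or die "print failed: $!";
    close $fh or die "close failed: $!";   # closing also releases the lock
}

# Hypothetical demo: returns the file content left behind by the race.
sub demo_race {
    my ($path) = @_;

    # Parent: '>' truncates the file right away, before any lock is held.
    open my $parent_fh, '>', $path or die "open failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: a second writer that opens, locks and writes on its own handle.
        close $parent_fh;
        open my $child_fh, '>', $path or die "open failed: $!";
        write_after_lock($child_fh, "child: a much longer session body\n");
        POSIX::_exit(0);   # leave without running the parent's cleanup code
    }

    # Parent "wakes up" only after the child has finished writing.
    waitpid $pid, 0;

    # The parent's handle still points at offset 0, so it writes over the
    # beginning of the child's data and leaves the rest in place.
    write_after_lock($parent_fh, "parent\n");

    open my $in, '<', $path or die "open failed: $!";
    local $/;              # slurp mode
    my $content = <$in>;
    close $in;
    return $content;
}

my (undef, $path) = tempfile(UNLINK => 1);
print demo_race($path);
```

Running it leaves a file that starts with the parent's line and continues with the leftover tail of the child's longer line, which is the same kind of corruption as in the broken session file.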
I had just read the chapter on file locking in Programming Perl (4th edition). To obtain an exclusive lock, the book recommends using sysopen, but after some googling and experimenting I found a simpler solution (it only works if the file already exists, though; otherwise sysopen is unavoidable, or maybe +>> can help?):
The +< mode does not overwrite the file. truncate deletes the previous content of the file (or shortens the file), so the previous content does not matter.
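A minimal sketch of this lock-then-truncate idea follows. The function name `locked_write` is hypothetical, and the fallback to `sysopen` with `O_CREAT` for a missing file is one possible way to handle the caveat mentioned above, not necessarily what the original snippet did.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek O_RDWR O_CREAT);
use File::Temp qw(tempfile);

# Hypothetical helper: replace the content of $path without clobbering
# the file before the exclusive lock is held.
sub locked_write {
    my ($path, $content) = @_;

    # '+<' opens for reading and writing without truncating,
    # but it fails if the file does not exist yet.
    my $fh;
    unless (open $fh, '+<', $path) {
        # Fallback: create the file if needed, still without truncating.
        sysopen $fh, $path, O_RDWR | O_CREAT
            or die "Cannot open $path: $!";
    }

    flock $fh, LOCK_EX or die "Cannot lock $path: $!";

    # Only now, with the lock held, is it safe to discard the old content.
    seek $fh, 0, SEEK_SET or die "Cannot seek in $path: $!";
    truncate $fh, 0 or die "Cannot truncate $path: $!";

    print {$fh} $content or die "Cannot write to $path: $!";
    close $fh or die "Cannot close $path: $!";   # releases the lock
}

my (undef, $path) = tempfile(UNLINK => 1);
locked_write($path, "a long first version of the session\n");
locked_write($path, "short\n");   # no leftover tail from the first write
```

Because the truncation happens only after `flock` succeeds, a shorter write can no longer leave the tail of a longer one behind.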
Proud of myself, I opened Dancer's bug tracker... only to find that the problem had already been fixed on GitHub in a completely different way: the session is written through the Dancer::FileUtils module, using its atomic_write:
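For comparison, a common way to make a file write atomic is to write the new content to a temporary file in the same directory and then rename it over the target. The sketch below illustrates that general technique only; it is not Dancer's code, and the name `atomic_write_sketch` is hypothetical.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp ();

# Hypothetical helper illustrating write-to-temp-then-rename.
sub atomic_write_sketch {
    my ($path, $content) = @_;

    # Create the temp file next to the target so that rename() stays
    # within one filesystem. UNLINK => 0 keeps it after the object dies.
    my $tmp = File::Temp->new(DIR => dirname($path), UNLINK => 0);

    print {$tmp} $content or die "Cannot write temp file: $!";
    close $tmp or die "Cannot close temp file: $!";

    # rename() swaps the complete new file into place in one step, so
    # readers see either the old content or the new, never a mixture.
    unless (rename $tmp->filename, $path) {
        my $error = $!;
        unlink $tmp->filename;   # do not leave the temp file behind
        die "Cannot rename to $path: $error";
    }
}

my $path = "atomic_write_demo_$$.txt";
atomic_write_sketch($path, "first, longer content\n");
atomic_write_sketch($path, "second\n");
unlink $path;
```

With this approach no lock is needed to avoid a half-overwritten file, although the last writer still wins. Note that File::Temp creates files readable only by their owner, so a real helper may also want to adjust the permissions.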
There is more than one way to... you know what. Anyway, I learned a lot.
Update: Added the note on file existence with +<.