In my video app (also using ffmpeg), I don't write out JPEG frames but raw video frames of a fixed size, which skips (part of) the compression and decompression steps. I'm using OpenGL to draw my content, so I need the raw pixels anyway. This may or may not work well with your Tk widget. You might also be able to reuse the $image you already have instead of constructing a new one for every frame.
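To illustrate the fixed-size raw frame idea, here is a minimal sketch, not my actual code. It assumes ffmpeg is on your PATH and that you ask it for packed rgb24 output at a known size, so every frame is exactly width * height * 3 bytes. The sub names (frame_size, read_frame, open_raw_video) are made up for this example.

```perl
use strict;
use warnings;

# Bytes per frame for packed pixels, e.g. 3 bytes per pixel for rgb24.
sub frame_size {
    my ($width, $height, $bytes_per_pixel) = @_;
    return $width * $height * $bytes_per_pixel;
}

# Read exactly $size bytes from $fh. Returns the frame as a byte string,
# or undef at end of stream (a truncated trailing frame is dropped).
sub read_frame {
    my ($fh, $size) = @_;
    my $buf = '';
    while (length($buf) < $size) {
        # Loop in case read() hands back fewer bytes than requested.
        my $n = read($fh, $buf, $size - length($buf), length($buf));
        die "read failed: $!" unless defined $n;
        return if $n == 0;    # EOF before a full frame arrived
    }
    return $buf;
}

# Start ffmpeg and return a binary read handle on its stdout.
# Assumptions: ffmpeg is on PATH; rgb24 and the -s scaling are just
# illustrative choices. List-form pipe open on Windows needs Perl 5.22+.
sub open_raw_video {
    my ($input, $width, $height) = @_;
    open(my $fh, '-|', 'ffmpeg', '-loglevel', 'error', '-i', $input,
        '-f', 'rawvideo', '-pix_fmt', 'rgb24',
        '-s', "${width}x${height}", '-')
        or die "cannot start ffmpeg: $!";
    binmode $fh;
    return $fh;
}
```

You would call open_raw_video once, compute the size with frame_size($w, $h, 3), and then loop with while (defined(my $frame = read_frame($fh, $size))) { ... }, handing each $frame to whatever draws it. Whether your Tk image can consume raw RGB bytes directly is something you would have to check.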
Also see Windows Webcam access, which also covers using FFmpeg as a video source and reading the frames in Perl.