Pixel scraping flash to extract text
by water (Deacon)
on Oct 22, 2005 at 11:24 UTC
water has asked for the
wisdom of the Perl Monks concerning the following question:
Sorry if this isn't a pure perl question; there might be perl involved in automating the solution....
I need to screen scrape a flash webapp -- yep, I can't get access to HTML.
The webapp presents tables of data I'm authorized to view, but I'd like to put the data in a spreadsheet so I can sort and plot it.
Maybe I can use perl to drive the flash app thru IE (haven't tried, but probably) using samie, but the flash app doesn't offer any way to dump data out of the darned thing...
ok, this is indeed horrible, but if I had to page thru the data screens and save screen shots as jpeg images or something, is there any way to pull text out of a jpeg using OCR or something? Quite horrible, indeed -- screen scraping at the pixel level -- but these data are worth it.
Suggestions / ideas / comments most welcome --
(PS The folks providing this app won't take the time to modify it in any way or talk to me at this point, so the obvious "ask for a clean data dump" doesn't work here.)
(PPS The other fallback is to have an employee type in data from the screen -- that might take a few days of effort -- so there's a reasonable human fallback soln.)