PerlMonks  

extract text from powerpoint

by fionbarr (Friar)
on Jan 05, 2010 at 14:58 UTC ( [id://815771]=perlquestion )

fionbarr has asked for the wisdom of the Perl Monks concerning the following question:

I need to pull the text out of PPT files. This has to run on Linux, so OLE is out. I'm not finding any libraries (or really much of anything at all). Any ideas?

Replies are listed 'Best First'.
Re: extract text from powerpoint
by marto (Cardinal) on Jan 05, 2010 at 15:04 UTC

    You may be able to use OpenOffice Impress and OpenOffice::UNO, though you'd need to do some further research.

    Martin

      Thank you. I hadn't seen that reference, though I have seen mention of using OO Impress to save the file as a DOC file and then looking at that.

        Well, that's one approach. If I had to do this task, I'd probably try using OpenOffice::UNO to automate opening the presentation, selecting the text and extracting it, rather than saving to another file format and parsing that. Let us know how you get on.

        Martin

Re: extract text from powerpoint
by LTjake (Prior) on Jan 06, 2010 at 13:48 UTC

    At $work we've installed the catdoc package (sudo apt-get install catdoc on Ubuntu), which provides a few binary tools, including:

    • catdoc - extract text from Word docs
    • catppt - extract text from PowerPoint presentations

    These have worked well in the absence of any pure-Perl libraries. HTH.
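
    For anyone wanting to call catppt from Perl, here is a minimal sketch. It assumes catppt is installed and on the PATH, and that it writes the extracted text to STDOUT when given a filename. The helper name extract_text and its optional second argument are hypothetical, made up for illustration.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical helper (not from the thread): run an external extractor
    # on a file and return whatever it prints to STDOUT.
    # $cmd defaults to 'catppt', the tool named above.
    sub extract_text {
        my ($file, $cmd) = @_;
        $cmd = 'catppt' unless defined $cmd;

        # List-form pipe open: no shell is involved, so spaces or
        # metacharacters in $file are passed through untouched.
        open(my $fh, '-|', $cmd, $file)
            or die "Cannot run $cmd: $!";

        # Slurp the whole output in one read.
        my $text = do { local $/; <$fh> };
        $text = '' unless defined $text;

        # close() on a pipe returns false if the command exited non-zero.
        close($fh)
            or die "$cmd failed on $file (exit status " . ($? >> 8) . ")";
        return $text;
    }

    # Only run when invoked as a script with a filename argument.
    if (!caller && @ARGV) {
        print extract_text($ARGV[0]);
    }
    ```

    Saved as, say, ppt2txt.pl, this would be run as: perl ppt2txt.pl slides.ppt. The same helper should work with catdoc by passing 'catdoc' as the second argument.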

    --
    "Go up to the next female stranger you see and tell her that her "body is a wonderland."
    My hypothesis is that she’ll be too busy laughing at you to even bother slapping you.
    " (src)
