http://www.perlmonks.org?node_id=11224

GridMonk has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

I seek PowerPoint/Win32/Linux enlightenment (not the X WM...).
Help me, O Wise Ones!

This is a challenge with which I will have to deal in the next couple of months. I'd like to get some ideas and feedback before I get started so that I can plan it well.

Scenario:
Manipulation of MS PowerPoint files from a Linux server.

Environment:
RH Linux 6.1 connected to Internet via cable modem
Win95 routed to outside using ip_masquerading / the Linux box as a router


The Challenges:

1) Retrieve PowerPoint file from remote site.
- At this point, I am not sure if I will be able to use FTP or not. How do I retrieve a PPT file from a remote site? (I have permission to do so.)

2) Be able to manipulate all text in the PPT file.
- This is for a translation project, so I need to get the text and compare it to a dictionary file, etc. I can do this with plain text files, but does anyone know how to do this with PPT? I assume there must be some way to do it with Win32, perhaps by using the 'Save As HTML' feature and then parsing the HTML? A more direct way would be preferred, of course...

3) Allow remote user to be able to use the script
- I will also want a remote user to be able to call the script, have the script get the file, parse the contents, and display a custom lexicon/dictionary for that PPT file. I have no problem building and displaying the custom lexicon. My problem is that the remote user can only see the Linux box, not the Win95 box, due to the ip_masquerading setup. So the script has to either A) run on the Linux box, or B) be called on the Win95 box from the Linux box and return results to the user somehow from there. Since I assume that the program will have to use Win32 to manipulate the PPT file, I assume that something like B) would be in order.

Thanks in advance for any help/input into any area of this challenge!

gM

Replies are listed 'Best First'.
RE: PowerPoint, Linux/Win32
by reisu (Initiate) on May 12, 2000 at 08:41 UTC
    1) Just like retrieving any file from a remote location, you should be able to utilise FTP/scp (hopefully not rsh).
    2) Depending on what version of PowerPoint you're using/is being used, you should be able to import this with Applixware for Linux. I know 5.0 can import; I'm not sure about the earlier versions. You might also want to check with StarOffice (can be found off of Sun's website). I believe SO can import PPT files as well.
    3) If the Win box is behind the IPMASQ, then you might wish to consider implementing NAT to do the address translation. If you can't do it on a software level, then there are relatively inexpensive hardware solutions available (circa $200 or less). Can't go too much into this one, however, since I have only dealt with NAT once.
    Cheers.
Re: PowerPoint
by btrott (Parson) on May 12, 2000 at 06:11 UTC
    I know this may not help you too much, but I thought it interesting, particularly since I just saw it last night.

    Tom Christiansen has a set of Perl tools that he uses for slide presentations instead of PowerPoint. The kit is called PerlPoint. I've never used it or even played around with it, but I thought it was worth pointing out/suggesting that you give it a look.

    So take a look at PerlPoint, if you want.

    Of course, this doesn't help you if you're not the person saying, "use PowerPoint." :)

RE: PowerPoint, Linux/Win32
by Specimen (Acolyte) on May 12, 2000 at 13:52 UTC
    Hi
    I think the simplest solution would be to share the directory with the files on the Win95 machine and mount it on the Linux machine (using smbmount, I think, which is part of the handy Samba package; that is a Linux package, not a Perl module). Then use Net::FTP (or whatever the module is called) to get the file. For your string parsing, try doing a substitution on the binary PPT files and hope that whatever checksumming is doubtless in the file format ignores the length of strings.

    Good luck
RE: PowerPoint, Linux/Win32
by Chrislnx (Initiate) on Sep 07, 2000 at 19:00 UTC
    Not quite sure whether you just want to extract the text from PPT, or whether you want to make changes to it and save it back as a valid PPT. If it's the latter option, I think you're onto a loser!

    Assuming the former, can you get the users to save in "Outline/RTF" format? This will give you just the text, in RTF, which is fairly easy to parse.

    If you have to deal with a PPT file, you might be able to write something using libole2 to open the PPT file and parse it directly, but it will not be easy! (First challenge - find the documentation for the PPT file format on msdn.microsoft.com.) Alternatively you could use something like StarOffice to convert it, but this will be very slow if you're processing lots of big presentations.

    A final thought - putting religion to one side for a moment, why are you using Linux? For the cost of a PC, a Win2K licence and a PowerPoint licence (i.e. a few thousand pounds), you could save yourself lots of coding by just using Perl with Win32::OLE and controlling PowerPoint directly. Your solution would be less likely to break when someone uses a PowerPoint feature that StarOffice doesn't understand, too.

    Chris
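
    For what it's worth, here is a rough sketch of the Win32::OLE route mentioned above. It has to run on the Windows box with PowerPoint installed, and it assumes the usual PowerPoint automation objects (Presentations, Slides, Shapes, TextFrame, TextRange). The sub names extract_ppt_text and normalize_ppt_text are made up for illustration, and the newline handling assumes PowerPoint hands back carriage returns between paragraphs and vertical tabs for manual line breaks. It has not been checked against any particular PowerPoint version, so treat it as a starting point only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: PowerPoint separates paragraphs with CR and manual line
# breaks with vertical tab; turn both into ordinary newlines.
sub normalize_ppt_text {
    my ($text) = @_;
    $text =~ tr/\r\x0B/\n/;
    return $text;
}

# Return a list of [slide_number, text] pairs, one per shape that has text.
sub extract_ppt_text {
    my ($file) = @_;
    require Win32::OLE;                          # only available on Windows

    # The second argument asks Win32::OLE to call Quit when $app goes away.
    my $app = Win32::OLE->new('PowerPoint.Application', 'Quit')
        or die 'Cannot start PowerPoint: ' . Win32::OLE->LastError() . "\n";
    $app->{Visible} = 1;                         # run visibly (assumed to be the safer default)

    my $pres = $app->Presentations->Open($file)
        or die "Cannot open $file: " . Win32::OLE->LastError() . "\n";

    my @found;
    my $slides = $pres->Slides;
    for my $s (1 .. $slides->Count) {            # OLE collections are 1-based
        my $shapes = $slides->Item($s)->Shapes;
        for my $i (1 .. $shapes->Count) {
            my $shape = $shapes->Item($i);
            next unless $shape->HasTextFrame && $shape->TextFrame->HasText;
            my $text = $shape->TextFrame->TextRange->Text;
            push @found, [ $s, normalize_ppt_text($text) ];
        }
    }
    $pres->Close;
    return @found;
}

# Run only when invoked as a script with a file argument.
if (!caller() && @ARGV) {
    for my $pair (extract_ppt_text($ARGV[0])) {
        my ($slide, $text) = @$pair;
        print "Slide $slide: $text\n";
    }
}
```

    Pass a full Windows path (e.g. C:\slides\talk.ppt). This only reads text; it does not try to write anything back into the presentation.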
Re: PowerPoint
by GridMonk (Acolyte) on May 12, 2000 at 05:15 UTC
    Not sure how to edit my original post, so...

    One other item for Environment: both the Linux and Win32 boxes have Perl 5 installed, so I am thinking it might be possible to have the user initiate a script on the Linux box, then have that call a script on the Win32 box, which can then use Win32 to manipulate the PPT file.

    Not at all sure how to set it up though, any Perls of Wisdom appreciated...
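
    One way this could be wired up, sketched under some assumptions: a small listener runs on the Win32 box, and the script on the Linux box connects to it over the internal network, sends the name of the file to process on one line, and reads back whatever the listener prints. Port 7070, the 'serve' and 'ask' arguments, and the sub names are all invented for the example, and the handler here only echoes the request where the real PowerPoint work would go.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Win32 side: accept one connection, read a one-line request, and send
# back whatever $handler returns for it.
sub serve_once {
    my ($listener, $handler) = @_;
    my $client = $listener->accept or die "accept failed: $!\n";
    my $request = <$client>;
    if (defined $request) {
        $request =~ s/\r?\n\z//;                 # strip the line ending
        print {$client} $handler->($request);
    }
    close $client;
}

# Linux side: send a one-line request and return the complete reply.
sub ask_worker {
    my ($host, $port, $request) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
    ) or die "Cannot reach $host:$port: $@\n";
    print {$sock} "$request\n";
    local $/;                                    # slurp until the worker closes the socket
    my $reply = <$sock>;
    close $sock;
    return defined $reply ? $reply : '';
}

# Run only when invoked as a script with arguments.
if (!caller() && @ARGV) {
    my $port = 7070;                             # hypothetical port number
    if ($ARGV[0] eq 'serve') {
        my $listener = IO::Socket::INET->new(
            LocalPort => $port,
            Listen    => 5,
            Proto     => 'tcp',
            ReuseAddr => 1,
        ) or die "Cannot listen on port $port: $@\n";
        # Placeholder handler: the PowerPoint processing would go here.
        serve_once($listener, sub { "Would process $_[0] here\n" }) while 1;
    }
    elsif ($ARGV[0] eq 'ask' && @ARGV == 3) {
        print ask_worker($ARGV[1], $port, $ARGV[2]);
    }
}
```

    Run it with "serve" on the Win32 box and with "ask <win32-address> <file>" on the Linux box. There is no authentication in this sketch, so it is only suitable for the private side of the network.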

    A monk asked his teacher,
    "What did old Masters attain when they entered the ultimate stage?"
    "They were like burglars sneaking into a vacant house," the teacher replied.

Re: PowerPoint
by buzzcutbuddha (Chaplain) on May 12, 2000 at 16:03 UTC
    Reisu has what I think would be the quickest solution, to use Applixware or StarOffice
    because they offer the ability to import .ppt files from Windows. If you really wanted
    to be fancy, you could use VMware on your machine and run PowerPoint that
    way, though that can be tricky too.

    This certainly is a dilly of a pickle GridMonk! I dunno. I'll think about it and if
    any magical solutions come my way, I'll let you know.
Re: PowerPoint
by ZZamboni (Curate) on May 12, 2000 at 17:37 UTC
    For grabbing the file I would use scp (from OpenSSH or ssh). I don't know if there is a Perl module for interfacing with it, though.
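
    There is no need for a dedicated module just to call scp, since Perl's built-in system() can run it. A rough sketch, assuming an scp that supports the -q and -B options (as OpenSSH's does) and key-based login already set up so that no password prompt is needed; scp_command and fetch_with_scp are made-up names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the argument list for scp. -q suppresses the progress meter and
# -B selects batch mode, so scp fails instead of prompting for a password.
sub scp_command {
    my ($user, $host, $remote_path, $local_path) = @_;
    return ('scp', '-q', '-B', "$user\@$host:$remote_path", $local_path);
}

# Copy one remote file and return the local path.
sub fetch_with_scp {
    my ($user, $host, $remote_path, $local_path) = @_;
    my @cmd = scp_command($user, $host, $remote_path, $local_path);
    # The list form of system() runs scp directly, without a shell.
    system(@cmd) == 0
        or die "scp failed (exit status " . ($? >> 8) . ")\n";
    return $local_path;
}

# Run only when invoked as a script: user host remote_path local_path
if (!caller() && @ARGV == 4) {
    print "Saved ", fetch_with_scp(@ARGV), "\n";
}
```

    This only helps if the remote site actually runs an ssh daemon and you have an account on it.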

    As for manipulating the file... well, as others have said, this sounds like a tough problem. I know you may be forced to use PowerPoint, but here goes another alternative: MagicPoint. I have not used it, but the screenshots look good, and they say their format is text-based, so that would be ideal for manipulation with Perl.

    --ZZamboni

      Thanks for the pointer on MagicPoint. I'll have a look at it. Hopefully it supports MS PowerPoint files, because that is the format I will get the files in - I have no choice there...

      I guess what I am looking at now is:
      1) Get X running ( :-/ ) and get StarOffice up and going.
      2) Grab PPT file from remote site. I am still not 100% sure how to do this. I don't think I can use FTP because the remote site is not running an FTP server. The file is just placed in a directory and linked via HTML. The way I do it now is right-click the link and choose "Save Link As..." Not sure if I understand the above suggestions correctly, but the remote site is not running a file server, so... <shrug>
      3) Figure out how to access the text within the .PPT file. From what people are saying, it may be easier to do this using StarOffice than using MS PowerPoint. I have never used StarOffice before, so I'll have to do some reading up on this. Does anyone have some code examples of how to access StarOffice files with Perl?

      Thanks for the pointers so far - they at least have me thinking in the right direction now...

      gM
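
      Since the file is just linked from an HTML page, plain HTTP should be enough to grab it, which is all "Save Link As..." does anyway. A minimal sketch, assuming the LWP bundle is installed on the Linux box; fetch_ppt and local_name_for are hypothetical helper names, and the URL in the usage note is a placeholder:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(getstore is_success);

# Derive a local file name from the last path segment of the URL,
# falling back to a fixed name if the URL ends in a slash.
sub local_name_for {
    my ($url) = @_;
    (my $path = $url) =~ s/[?#].*//s;            # drop query string and fragment
    my ($name) = $path =~ m{([^/]+)$};
    return defined $name ? $name : 'download.ppt';
}

# Download $url into $dir and return the path of the saved file.
sub fetch_ppt {
    my ($url, $dir) = @_;
    my $file   = "$dir/" . local_name_for($url);
    my $status = getstore($url, $file);          # returns the HTTP status code
    die "Could not fetch $url (HTTP status $status)\n"
        unless is_success($status);
    return $file;
}

# Run only when invoked as a script with a URL argument.
if (!caller() && @ARGV) {
    my $saved = fetch_ppt($ARGV[0], '.');
    print "Saved $saved\n";
}
```

      Called as, say, "perl fetch_ppt.pl http://www.example.com/slides/talk.ppt", this would save talk.ppt in the current directory.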

        Hi
        You don't need file serving on the Windows machine. Right click on the folder containing the files, choose sharing and then select "shared as..." and give it a name. It is now available over the local network (i.e. up to the Linux machine).

        Now on the Linux machine try doing "man smbmount" and try to get the shared directory on the Windows machine mounted on the Linux machine (you probably need to be root).
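
        Once the share is mounted, Perl sees it as an ordinary directory. A small sketch of picking up the PowerPoint files from it, assuming the share has been mounted somewhere like /mnt/win95 (a made-up mount point) and using a made-up sub name, copy_ppt_files:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Copy qw(copy);

# Copy every .ppt file from the mounted share into a local work directory
# and return the list of local copies.
sub copy_ppt_files {
    my ($share_dir, $work_dir) = @_;
    opendir my $dh, $share_dir or die "Cannot read $share_dir: $!\n";
    my @names = sort grep { /\.ppt\z/i } readdir $dh;   # match .ppt in any case
    closedir $dh;

    my @copied;
    for my $name (@names) {
        copy("$share_dir/$name", "$work_dir/$name")
            or die "Cannot copy $name: $!\n";
        push @copied, "$work_dir/$name";
    }
    return @copied;
}

# Run only when invoked as a script: share_dir work_dir
if (!caller() && @ARGV == 2) {
    print "$_\n" for copy_ppt_files(@ARGV);
}
```

        For example, "perl copy_ppt.pl /mnt/win95 /tmp/ppt-work" would copy the files and print the new paths.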

        Also you don't need an FTP server on the Linux machine, as you can just as easily write a Perl script to put the PPT file of your choice onto the remote machine of your choice. You could set it in the crontab to try and upload the file every 30 mins or something like that to your own machine. The only thing this requires is that you have access to the Linux machine with IP forwarding...