PerlMonks  

Working with word document

by nick321 (Initiate)
on Aug 27, 2012 at 20:52 UTC (#990060=perlquestion)
nick321 has asked for the wisdom of the Perl Monks concerning the following question:

I have written code to copy contents of one file to another.

It works perfectly fine for a .txt file.

But when I try for .doc file, it doesn't work.

#!/usr/bin/perl
use 5.010;
use strict;

# Open file to read
unless (open(DATA1, "file1.doc")) {
    die "couldn't open file1"
};

# Open new file to write
unless (open(DATA2, ">>file2.doc")) {
    die "couldn't open file2"
};

# Copy data from one file to another.
while(<DATA1>) {
    print DATA2 "$_\n";
}
close( DATA1 );
close( DATA2 );

Replies are listed 'Best First'.
Re: Working with word document
by philiprbrenan (Monk) on Aug 27, 2012 at 22:27 UTC
    use File::Copy;
    copy("file1","file2") or die "Copy failed: $!";
Re: Working with word document
by aaron_baugher (Curate) on Aug 27, 2012 at 22:14 UTC

    I don't know a lot about the DOC format, but I'm pretty sure it's a binary format. It's probably not a good idea to insert extra newlines into it, as you're doing here. Just read and print the lines, without changing anything, if that's your goal. (Of course, if that's your goal, why not just use the filesystem copy command?)

    Also, you're appending to the output file. If it doesn't exist, it'll be created, and that'll be fine. But if it does exist, and you're really appending to it, the DOC format may not appreciate that either.

    Aaron B.
    Available for small or large Perl jobs; see my home node.
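
    The point about extra newlines can be checked without touching any real file. The following is a minimal sketch, not the poster's code: it reads from an in-memory string (the sample text and the variable names $source, $faithful and $padded are made up for illustration) and compares printing each line unchanged with printing "$line\n" as the original loop does.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of a file.
my $source = "line one\nline two\n";

# Open the string as an in-memory filehandle so no real file is needed.
open my $in, '<', \$source or die "can't open in-memory source: $!";

my ($faithful, $padded) = ('', '');
while (my $line = <$in>) {
    $faithful .= $line;       # readline keeps the trailing newline
    $padded   .= "$line\n";   # what the loop in the question does
}
close $in or die "can't close in-memory source: $!";

printf "source: %d bytes, unchanged copy: %d bytes, with extra newlines: %d bytes\n",
    length($source), length($faithful), length($padded);
```

    Each line read with <> already ends in its own newline, so the second variant grows by one byte per line. In a text file that shows up as blank lines; in a binary format it most likely corrupts the file.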

Re: Working with word document
by Kenosis (Priest) on Aug 27, 2012 at 22:33 UTC

    Unless you're experimenting with this script (or it's just for the joy of it), consider using File::Copy for the task:

    use Modern::Perl;
    use File::Copy;
    copy( 'file1.doc', 'file2.doc' ) or die "File copy failed: $!";

    Hope this helps!

    Update: Am just not as fast as philiprbrenan...

Re: Working with word document
by 2teez (Vicar) on Aug 27, 2012 at 21:43 UTC
    Hi,

    But when I try for .doc file, it doesn't work.

    How do you mean? Any error message whatsoever or just a plain .doc file?
    It works for me. Though I would have written the open calls differently (using the 3-argument form of open and lexical variables as filehandles instead of barewords), like so:

    open my $fh,'<','file1.doc' or die "can't open file: $!";
    ## OR
    open my $fh2,'>>','file2.doc' or die "can't open file: $!";
    Then you might also want to close the filehandles in the reverse of the order in which they were opened,
    like so:
    close $fh2 or die "can't close file: $!";
    close $fh or die "can't close file: $!";
    You may also want to look at the modules Win32::OLE and Win32::Word::Writer on http://www.cpan.org

      Hi, there is no error message. The .doc file just ends up blank. I tried both of the modifications you mentioned, but neither of them works for me.

      Here is the code I tried:

      open my $fh,'<','file1.doc' or die "can't open file:";
      open my $fh2,'>>','file2.doc' or die "can't open file:";
      while(<fh>) {
          print fh2 "$_\n";
      }
      close $fh2 or die "can't close file:";
      close $fh or die "can't close file:";
        Hi,

        Of course the .doc file will be blank with this:

        ...
        while(<fh>)              ## note this <fh> Oops
        {
            print fh2 "$_\n";    ## note this fh2 Oops
        }
        ...
        but not with this:
        ...
        while(<$fh>)             ## this is correct $fh
        {
            print $fh2 "$_\n";   ## correct $fh2
        }
        ...
        The filehandles are $fh and $fh2 NOT fh and fh2!!!
        Moreover, if you use warnings and strict you will catch this easily.
        Hope this helps
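
        To see what Perl reports for the mistyped handle, here is a small sketch (the $SIG{__WARN__} handler and the names @warnings and $lines_read are only there for the demonstration; they are not from the thread). It reads from the bareword fh, which was never opened, and collects the run-time warnings instead of letting them scroll past.

```perl
use strict;
use warnings;

# Collect run-time warnings in an array so they can be printed at the end.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $lines_read = 0;
while (<fh>) {        # bareword fh was never opened; the real handle would be $fh
    $lines_read++;
}

print "lines read: $lines_read\n";
print "warning: $_" for @warnings;
```

        The loop body never runs, which is why the output file stays empty. In this sketch it is the warnings pragma, not strict, that reports the read on an unopened filehandle, so it pays to enable both.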

Re: Working with word document
by aitap (Curate) on Aug 28, 2012 at 13:42 UTC
    If you want to copy Word files manually, you'll have to remember that these are binary files. Thus,
    • Use binmode on your filehandles to ensure that Windows won't break the document by "fixing" newline characters.
    • Newline bytes in a binary document don't usually represent line breaks in its contents, so it may be better to use read rather than readline (the function behind <>).
    • Don't modify the data you're copying in any way unless you know what you're doing ("$_\n" is not a good thing in the case of binary files).
    Sorry if my advice was wrong.
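
    The three points above can be put together as follows. This is a minimal sketch under stated assumptions: copy_binary is a hypothetical helper name, the 64 KB chunk size is an arbitrary choice, and the output is opened with '>' (overwrite) rather than '>>' so that nothing is appended to an existing document. For real use, File::Copy as suggested earlier in the thread is probably the simpler option.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $src to $dst byte for byte, without line handling.
sub copy_binary {
    my ($src, $dst) = @_;

    open my $in,  '<', $src or die "can't open $src: $!";
    open my $out, '>', $dst or die "can't open $dst: $!";

    # Switch off newline translation on both handles (matters on Windows).
    binmode $in;
    binmode $out;

    my $buf;
    while (1) {
        # read returns the number of bytes read, 0 at end of file, undef on error.
        my $n = read $in, $buf, 65536;
        die "read failed on $src: $!" unless defined $n;
        last if $n == 0;
        print {$out} $buf or die "write failed on $dst: $!";
    }

    close $out or die "can't close $dst: $!";
    close $in  or die "can't close $src: $!";
    return 1;
}

# Usage: perl copy_binary.pl file1.doc file2.doc
copy_binary(@ARGV) if @ARGV == 2;
```

    The data is written exactly as it was read, so nothing like "$_\n" can creep in, and fixed-size reads avoid treating stray newline bytes as line boundaries.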
