PerlMonks  

OODoc Document Formatting Problems

by Bruce32903 (Scribe)
on Apr 13, 2011 at 13:11 UTC ( [id://899181]=perlquestion )

Bruce32903 has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am trying to use OODoc to combine files into a single OpenOffice Writer document. I am successful in combining files but am rather lost when trying to manipulate the format of the odt file.

My two current problems are:

1) I always use the "old school" method of indenting subroutines and loops (three spaces for the first indent, six for one more level of nesting, then nine, etc.). No matter how many leading spaces a source line has, the output document shows only a single leading space.

2) When I use my full set of files the output document has paragraph numbering turned on. It is not a huge task to turn it off inside OO Writer, but it would be nicer if it did not get turned on.

To illustrate the spaces problem I started with a fake data file named 1.php. (Overall I am working with some perl, some php and some odt files)

    # This is a fake program/data file
    subroutine_1
    {
       # there are three spaces in front of this pound sign
    }

For my second fake program/data file I just changed the "1" in both the file name and subroutine name to "2".

My sample code is here:

    #!/usr/bin/perl
    # FILE: /WIP/perl_try_OODoc/perl_monks_03.pl
    # apt-get install liboffice-oodoc-perl (Ubuntu 9.04) or equivalent required.

    use warnings;
    use strict;
    use OpenOffice::OODoc;

    my ($i, $my_text);
    my @Fnames = ('/var/www/php/1.php', '/var/www/php/2.php');   # files to read into document

    my $x = "cp /home/my_name/util/oo-macro/blank_seed.odt /home/my_name/util/oo-macro/test.odt";
    system($x);   # copy from seed file to output document file

    my $doc_out = odfDocument(file => '/home/my_name/util/oo-macro/test.odt');   # setup output

    # ====== add files into output document ======
    foreach(@Fnames)   # process each php file
    {
       open(FI, "<$_");
       my @code_lines = <FI>;   # read all the php code from a single file
       close FI;

       $my_text = "";   # init
       for($i = 0; $i <= $#code_lines; $i++)
       {
          $my_text .= $code_lines[$i];
          #$my_text =~ s/\r/\r\n/g;   # didn't help, didn't hurt
       }
       $doc_out -> appendParagraph(text => $my_text, style => "Default");   # write a full php file to odt
    }
    $doc_out -> save;   # done

When I run the above sample code with perl or php files the code runs without reported error. When I open up the odt file there is only one space at the beginning of any line that started with one or more spaces.

When I copy the odt file to a temporary directory, unzip it and look inside the content.xml file, I can see that the 3 spaces are there. I can't find any way in OO Writer to turn off the contraction of these spaces, so I am totally stuck. Since I want my script to produce a finished document, I am also stuck on how to make my Perl code correct this.

Overall, based upon current needs, I may be combining perl code, php code, text files or OO document files. Later I might throw spreadsheets into the mix too. My full size code example currently works with php and odt. I have an additional related problem there. The output document (combined odt files) has an additional left margin and all paragraphs have paragraph numbers. I have found the procedure to turn this off in OO Writer, but again I would prefer to have my perl code make "ready to use" files for me. So far I have not been able to find the answer to this one either.

Note that my large code file will, based upon file extension, use different code for odt files. I am using sample code from the internet that seems to combine the odt files properly. If this one posting doesn't solve both problems then I will repost with sample code for odt files. My guess is that this contraction of spaces problem will expose me to a wide variety of formatting power that will solve many style and formatting issues.

Any suggestions would be appreciated.

Thanks,
Bruce

Replies are listed 'Best First'.
Re: OODoc Document Formatting Problems
by duelafn (Parson) on Apr 13, 2011 at 14:48 UTC

    I can work around the issue by making use of nbsp:

        use warnings;
        use strict;
        use OpenOffice::OODoc;
        use File::Copy;
        use Encode;

        my ($i, $my_text);
        my @Fnames = ('test.php');   # files to read into document

        copy "test.odt.tmpl", "test.odt";
        my $doc_out = odfDocument(file => 'test.odt');   # setup output

        # ====== add files into output document ======
        foreach(@Fnames)   # process each php file
        {
            open(my $file, "<", $_) or die "Error reading $_: $!";
            $my_text = "";   # init
            for (<$file>) {
                # \x{A0} is &nbsp;
                s/^( +)/"\x{A0}"x(length($1))/e;
                $my_text .= $_;
            }
            $doc_out -> appendParagraph(text => encode('UTF-8',$my_text), style => "Default");   # write a full php file to odt
        }
        $doc_out -> save;   # done

    I also made a few perly and safety changes. The heart of the solution is the s/// and the encode(). My openoffice draws the non-breaking spaces with a different background color which is a bit distracting. I'm not sure if that can be worked around.

    Good Day,
        Dean
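
For readers following along, here is a minimal standalone sketch of what the substitution in the reply above does to a single line. It uses only core Perl and makes no OODoc calls; `nbsp_indent` is a hypothetical helper name chosen for illustration, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each leading space with U+00A0
# (no-break space) and leave the rest of the line untouched.
sub nbsp_indent {
    my ($line) = @_;
    $line =~ s/^( +)/"\x{A0}" x length($1)/e;
    return $line;
}

my $in  = "   # three leading spaces\n";
my $out = nbsp_indent($in);

# Count the no-break spaces at the start of the converted line.
my ($lead) = $out =~ /^(\x{A0}*)/;
printf "%d leading no-break spaces\n", length $lead;
```

Only the leading run is converted, so spacing inside a line is left as it was.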

      Thank you for the response.

      When I run this my odt file has an upper case "A" with a caret over it followed by a dot (indicating space) with a shaded background (as you indicated it would). This would probably work well except for the A with the caret.

      Looking at the content.xml file, I see that each space from my text file is now the byte sequence 0xC3 0x82 0xC2 0xA0. Thus, three spaces became 12 bytes: that four-byte pattern repeated three times.

      Thank you,
      Bruce
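
The byte pattern reported above is what a no-break space looks like after being UTF-8 encoded twice. One plausible explanation, stated here as an assumption and not as confirmed OODoc behavior, is that the text was already UTF-8 after encode() and was then converted again by a layer that treated it as Latin-1. The sketch below reproduces the bytes using only the core Encode module; `hex_bytes` is a hypothetical helper.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: hex dump of a byte string.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('0x%02X', ord($_)) } split //, $bytes;
}

my $nbsp = "\x{A0}";                   # one character, U+00A0
my $once = encode('UTF-8', $nbsp);     # correct UTF-8: two bytes

# Simulate a second layer that assumes its input is Latin-1 text
# and encodes it to UTF-8 again.
my $twice = encode('UTF-8', decode('ISO-8859-1', $once));

print "encoded once:  ", hex_bytes($once),  "\n";   # 0xC2 0xA0
print "encoded twice: ", hex_bytes($twice), "\n";   # 0xC3 0x82 0xC2 0xA0
```

If that assumption holds, passing the character string to appendParagraph without the explicit encode() call may be worth trying; this has not been tested here.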
Re: OODoc Document Formatting Problems
by atcroft (Abbot) on Apr 13, 2011 at 14:03 UTC

    My first thought on reading this is to copy a section of code (formatted as you desire) into a normal OOWriter document, save it, then look at the content.xml file in it to see how it compares with your script's output. (My other thought was that it sounds like maybe there is a setting for pre-formatted text that you may need to use.)

    Hope that helps.

      I basically have tried that. When I view/edit the odt file I can type a line with 3 leading spaces. After saving, closing and reopening the file, the three leading spaces that I typed are still there, while the 3 spaces I imported are not shown on the screen even though they are present in the content.xml file.

      Thanks,
      Bruce
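
Runs of literal spaces in ODF text content are collapsed on display, much as in HTML, which is consistent with the imported spaces being in content.xml yet not on screen. The format represents repeated spaces with a `<text:s text:c="N"/>` element, and that is probably what the typed line contains; comparing the two paragraphs in content.xml would confirm it. The sketch below only builds such a fragment as a string to show the target markup. `odf_paragraph` and `xml_escape` are hypothetical helpers, the paragraph style attribute is omitted, and nothing here is an OODoc call.

```perl
use strict;
use warnings;

# Escape the characters that are unsafe in XML text content.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Hypothetical helper: render one source line as an ODF paragraph,
# encoding the leading spaces as a single <text:s text:c="N"/> element.
sub odf_paragraph {
    my ($line) = @_;
    chomp $line;
    my ($indent, $rest) = $line =~ /^( *)(.*)$/s;
    my $n      = length $indent;
    my $spaces = $n ? qq{<text:s text:c="$n"/>} : '';
    return '<text:p>' . $spaces . xml_escape($rest) . '</text:p>';
}

print odf_paragraph("   # three leading spaces\n"), "\n";
```

This handles leading indentation only; runs of spaces inside a line would need the same treatment.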
