PerlMonks  


The Data

70 pages of text and images, formatted in an old print publication layout program.

The Task

Turn it all into HTML.

The Feeble Attempts

The program does not export to HTML.
The program claims to export to both PostScript and PDF. Lies. Exporting to either format and then running the result through a *2HTML converter proved worthless.
No documentation is available for the program, so there is no way to tell whether it can be automated through Win32::OLE.

The Horrible Conclusion

A whole lot of cutting & pasting.

The Mantra

Perl makes easy things easy and hard things possible.

The Code

#! perl -w
use Win32::Clipboard;

$clip = Win32::Clipboard::new();
$clip->Empty();
$SIG{"INT"} = sub { $exit = 1 };

while (1) {
    last if $exit;
    next unless $clip->GetText();
    print $clip->GetText;
    print STDERR $clip->GetText;
    $clip->Empty();
}
Explained:
Set up a new Win32::Clipboard object and clear its contents. Then install a $SIG{INT} handler to catch ctrl+c, which serves as the 'exit' command.
Loop until the user presses ctrl+c, skipping any pass in which the clipboard holds no text.
When the clipboard does hold text, print it to STDOUT and STDERR (so it still shows in the console while STDOUT is redirected to a file), then clear the clipboard again.
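
The loop above spins at full speed while the clipboard is empty and reads the clipboard up to three times per pass. The following is a minimal sketch of a gentler variant, not the script that was actually used: it reads once per pass and naps briefly when there is nothing to copy. The `capture` sub, its named arguments, and the 0.1 second pause are all illustrative assumptions; the clipboard calls are the same ones the original uses.

```perl
#! perl
use strict;
use warnings;
use Time::HiRes qw(sleep);    # core module; allows fractional-second sleeps

# Hypothetical helper: poll a clipboard-like source until $arg{stop}->()
# returns true, handing each non-empty chunk to $arg{emit}. The source is
# passed in as code refs so the loop can be tried without a real clipboard.
sub capture {
    my (%arg) = @_;
    until ($arg{stop}->()) {
        my $text = $arg{get}->();    # read once, reuse the value
        if (defined $text && length $text) {
            $arg{emit}->($text);
            $arg{clear}->();
        }
        elsif ($arg{pause}) {
            sleep($arg{pause});      # idle instead of spinning
        }
    }
}

# Only touch the real clipboard on Windows.
if ($^O eq 'MSWin32') {
    require Win32::Clipboard;
    my $clip = Win32::Clipboard::new();
    $clip->Empty();

    my $exit = 0;
    local $SIG{INT} = sub { $exit = 1 };    # ctrl+c ends the session

    capture(
        get   => sub { $clip->GetText() },
        clear => sub { $clip->Empty() },
        emit  => sub { print $_[0]; print STDERR $_[0] },
        stop  => sub { $exit },
        pause => 0.1,                       # assumed value, tune to taste
    );
}
```

Usage would be the same as for the original script: redirect STDOUT to a file, copy each element in turn, then press ctrl+c.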

Usage: perl clipboard.pl >1.txt
I arranged the windows so that the layout program and a DOS window were side by side. Starting with the first text element at the top of page one, select all of the element's text and copy. Move to the next element, copy. Repeat for each text element on the page. Hit ctrl-c.

Move to the next page in the layout software and start perl clipboard.pl >2.txt, and so on for each page. The entire process took a bit over 15 minutes once I got a rhythm down. I then used the resulting files to craft simple XML documents and applied a stylesheet to flesh out the HTML.
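
The post does not show what the XML looked like, so the following is only a sketch of how one captured page might be wrapped in markup. The `<page>` and `<para>` element names, the `xml_escape` and `page_to_xml` subs, and the rule that blank lines separate paragraphs are all assumptions made for illustration. The capture script itself prints elements back to back, so real input might need a different splitting rule.

```perl
#! perl
use strict;
use warnings;

# Escape the characters that would otherwise be read as XML markup.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities survive
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Wrap one page of captured text in hypothetical <page>/<para> elements,
# treating runs of blank lines as paragraph breaks.
sub page_to_xml {
    my ($number, $text) = @_;
    $text =~ s/\r\n?/\n/g;    # clipboard text usually has CRLF line ends
    my @paras = grep { /\S/ } split /\n\s*\n/, $text;
    my $xml = qq{<page number="$number">\n};
    for my $para (@paras) {
        $para =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        $xml .= "  <para>" . xml_escape($para) . "</para>\n";
    }
    return $xml . "</page>\n";
}
```

Calling `print page_to_xml(1, $text)` with the contents of 1.txt in `$text` would emit one `<page>` element, which a stylesheet could then turn into HTML.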

Total Elapsed Time : 2 hours
Thanks, perl!

In reply to Copy & copy... Win32 clipboard utility by boo_radley
