http://www.perlmonks.org?node_id=251182

peppiv has asked for the wisdom of the Perl Monks concerning the following question:

Good day,

I've been using PDF::Create to make a PDF from info I pull from a DB. Works great (thanks tall_man). My only problem is that when info I pull that has a single parenthesis in the body, PDF::Create dumps out and cannot create the page. I'd like to replace the single parenthesis with a blank space (I've tried this and it works). I'd also like to leave the parenthesis if there's the opening and closing one (e.g. "Madrid (Spain)" is OK and works, "Madrid (Spain" is not and does not work).

Here's where I'm at:
#code is stripped to bare bones in effort to save time/space use strict; use DBI; use PDF::Create; my $id = param("id"); my $pdf = new PDF::Create ('filename' => '../html/logs/mypdf.pdf', 'PageMode' => 'UseNone', 'Title' => 'Anything', ); my $root = $pdf->new_page('MediaBox' => [ 0, 0, 612, 792 ]); # Add a page which inherits its attributes from $root my $page = $root->new_page; # Prepare 2 fonts my $f1 = $pdf->font('Subtype' => 'Type1', 'Encoding' => 'WinAnsiEncoding', 'BaseFont' => 'Helvetica'); my $dbh = DBI->connect('DBI:mysql:database=hr_db;host=69.169.169.9','u +srname','password'); my $sth = $dbh->prepare('SELECT * FROM applicant WHERE id = ?'); $sth->execute($id) or die $sth->errstr; while (my @results = $sth->fetchrow_array()) { $results[15] =~ tr/\(/ /g; $results[15] =~ tr/\)/ /g; $page->stringl($f1, 10, 28, 675, "Specific Position(s)"); $page->stringl($f1, 10, 28, 664, "Applied For:"); $page->stringl($f1, 10, 93, 664, "$results[15]"); $page->line(90, 662, 310, 662); $page->stringl($f1, 10, 320, 664, "Date Applied:"); $page->stringl($f1, 10, 393, 664, "$results[20]"); $page->line(390, 662, 587, 662); } $sth->finish(); $dbh->disconnect; # Add the missing PDF objects and a the footer then close the file $pdf->close;
This works in removing parenthesis from $results[15], but how can I catch them in the entire returned array? $results[1..80]? Also, how can I leave the parenthesis if they are placed correctly "(Spain)" not "(Spain"? This problem stems from the fact that some of this data came from user input and it wasn't controlled on the front end to filter them out.

This problem only appears to happen with parenthesis. It doesn't break with brackets or other characters. Since PDF::Create is no longer supported, I can't find a way to escape them in it's print to screen line ($page->stringl($f1, 9, 30, 352, "$results[1]");
It's not like I can use the qq or anything.

Any help would be greatly appreciated! TIA

peppiv

hey I love sports, but if it's such a macho thing, why do all these people punch, dunk and tackle for purses, belts and rings?

Replies are listed 'Best First'.
Re: How do I replace certain character if condition exists
by bart (Canon) on Apr 17, 2003 at 13:25 UTC
    PDF::Create is no longer supported? That's too bad, I recently used it for a project and it worked well. I even added some commands myself, to support color.

    Anyway: PDF is based on Postscript. In PostScript, the string delimiters are parens. That's the cause of the problems, and why it doesn't break with brackets and braces. The way around it, is to escape the parens, any parens, if PDF supports it, with a backslash. The backslash itself must be special, too.

    Let me pull in the PDF docs, from PDFReference.pdf:

    3.2.3 String objects — literal strings
    Any characters may appear in a string except unbalanced parentheses and the backslash,which must be treated specially. Balanced pairs of parentheses within a string require no special treatment.
    The "special treatment" is pretty much as in Perl: preceding each special non-word character with a backslash; newline can be represented by "\n" and a tab with "\t". No surprise there. And finally, you can use octal representation with 3 digits, again, just like in Perl: "\ddd".

    That should be enough to get you on your way, I suppose. You can check the table at page 30 from the above PDF file (warning: huge file: 9MB), that is the 50th page in the document.

    p.s. It seems that quite a few people are rather fond of this module, including myself. If the original author is no longer interested in supporting it, perhaps somebody else should pick it up. How does one go about, in order to do that? (BTW the best place to escape the backslashes and the parens, would be in the module.)

      Great Job Bart.. could you post your additions to add color? Sounds great :)
        could you post your additions to add color? Sounds great :)
        Sure. Eventually, I'd like to get it on CPAN, but I'm not sure how long it can still take, so in the meantime, here's my add-on module for PDF::Create, which I gave the name PDF::Create::Extensions. All you need is this extra file, which adds a few methods to PDF::Create::Page, and they're available as soon as you do
        use PDF::Create; use PDF::Create::Extensions;
        You can create two kinds of colour, "linecolor" (for lines) and "fillcolor" (for area fills). For each, you can set the color in the same way, in RGB ("RGB" or "RG": red, green, blue, each value ranging from 0 to 1, less is darker), grayscale ("GRAY", "GREY" or "G", 1 value from 0 (white) to 1 (black); and CMYK ("CMYK" or "K", 4 values from 0 to 1, more is darker), like this:
        $page->linecolor(RGB => 0.6, 0.2, 0.2);
        The color type is case insensitive.

        The second thing you can set, is line width, line connection style ("join": styles "miter", "round", "bevel"), and line end style ("cap": styles "butt", "round", "square" (="projecting square")), like this:

        $page->linewidth(0.8, 'join' => 'miter');
        See the PDF documentation on what the possibilities represent, or just experiment with these options. Again, the names are case insensitive.

        Save the next code as PDF/Create/Extensions.pm somewhere in your @INC

Re: How do I replace certain character if condition exists
by broquaint (Abbot) on Apr 17, 2003 at 13:48 UTC
    Regexp::Common to the rescue!
    use strict; use Regexp::Common; my @tests = ( 'foo(bar)', 'foo bar', 'foo(bar', 'foo bar)', 'foo((bar)', 'foo(bar))', 'foo((bar)))', ); for(@tests) { printf "%12s", $_; s{ [()] }( )xg and tr/ //s unless m/ (?<! [(]) $RE{balanced}{-parens => '()'} (?! [)] ) /x; printf " - %s\n", $_; } __output__ foo(bar) - foo(bar) foo bar - foo bar foo(bar - foo bar foo bar) - foo bar foo((bar) - foo bar foo(bar)) - foo bar foo((bar))) - foo bar
    See. perlre and Regex::Common for more info on the regex used.
    HTH

    _________
    broquaint

Re: How do I replace certain character if condition exists
by kilinrax (Deacon) on Apr 17, 2003 at 13:14 UTC
    Possibly a simplistic solution, but have you tried using negative lookahead? For instance:
    abowley@krait:~$ (echo 'Madrid(Spain' && echo 'Madrid(Spain)') | perl +-pe 's/\((?![^)]*\))/ /g;' Madrid Spain Madrid(Spain)
    Breaking the regex down:
    s/ \( # an open bracket (?! # _not_ followed by [^)]* # anything except brackets (e.g. 'Spain') \) # followed by a bracket ) / /gz;

      This works if you are sure that you won't have any nested parentheses. If you do then it doesn't work at all, for example:

      a(b(c))           becomes a b(c))
      a (b (c) d (e) f) becomes a  b (c) d (e) f)
      
        Well, you can do that with a regex, but it's a good degree more complicated.....
        abowley@krait:~$ (echo 'Madrid(Spain(Europe)' && echo 'Madrid(Spain(Eu +rope))' && echo 'Madrid(Spain Europe))' && echo 'a(b(c))' && echo 'a +(b (c) d (e) f)') | perl -mstrict -wpe 'BEGIN { $brackets = qr#\([^() +]*(?:(??{$brackets})[^()]*)*\)# }; s#([^)]+?)($brackets)([^(]+)# ( my + $s = $1 ) =~ tr/(/ /; ( my $e = $3 ) =~ tr/)/ /; $s . $2 . $e #ge;' Madrid Spain(Europe) Madrid(Spain(Europe)) Madrid(Spain Europe) a(b(c)) a (b (c) d (e) f)
Re: How do I replace certain character if condition exists
by BrowserUk (Patriarch) on Apr 17, 2003 at 15:02 UTC

    This code will correct the gross imbalance of parens without requiring to delete them all. As is, it works line by line, but would operate on a whole file if you slurped it.

    #! perl -slw use strict; while(<DATA>) { my $count=0; chomp; print; $count += $1 eq '(' ? +1 : -1 while m[([()])]g ; s[(^.*?)\(] [$1] while $count-- > 0; s[(.*)\)(.*?$)][$1$2] while ++$count < 0; print; } __DATA__ test (test) test (test (test (test)) (test (test) (() (()) (((())))) ( () (()) ( ( () ) ) )) (( () (()) ( ( () ) ) ) ((( test test ))) (((test) (test)))

    All it does is count the number of opens and closes and delete enough of whichever is in excess from the beginning or end to correct any imbalance.

    In most cases, that probably means that the resultant parenthesizing would not reflect the original intent, but it is difficult to see how to determine what that intent was without specific knowledge.


    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: How do I replace certain character if condition exists
by tall_man (Parson) on Apr 17, 2003 at 17:46 UTC
    There's a module especially for this kind if thing: Text::Balanced. Here is an example of how you could use it.
    #!/usr/bin/perl -w use strict; use Text::Balanced qw (extract_bracketed); my $text = "In (Madrid) (Spain"; my ($extracted, $remainder, $skipped); my $newtext = ""; for (;;) { ($extracted, $remainder, $skipped) = extract_bracketed($text,'()',' +[^\(]*'); if ($extracted) { $newtext .= $skipped; $newtext .= $extracted; $text = $remainder; } else { if ($@->{error} =~ /Unmatched opening/) { $remainder =~ s/\(/ /; $text = $remainder; } elsif ($@->{error} =~ /Did not find opening bracket/) { $newtext .= $remainder; last; } else { die "Unexpected text balance error: $@"; } } } print "Now text is: *$newtext*\n";