
Fixing bad CSS in EPUB files

by jimhenry (Acolyte)
on Sep 06, 2023 at 01:02 UTC ( [id://11154273]=CUFP )

Many epubs come with unprofessional CSS that will not display correctly on some ebook readers. For instance, the font size may be illegibly small on a mobile device. Or the user may have dark mode turned on while the CSS sets element foreground colors on the assumption of a white background that it never actually specifies, so there is little or no contrast with the actual black background. I recently wrote a script to detect epubs with those problems, then one to detect and fix them.

My first attempt at this used EPUB::Parser, but I soon found that, as far as I could tell, it lacks the functionality I needed to get at the internal CSS files and edit them. So I fell back on Archive::Zip (which EPUB::Parser uses): an epub is a zip file containing CSS, HTML, and XML files (and sometimes JPEGs, etc.).
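As a rough illustration of the Archive::Zip approach described above, here is a minimal sketch of pulling the stylesheets out of an epub and writing a modified copy. The helper names read_epub_css() and replace_epub_css() are hypothetical and are not taken from the full script; the sketch also assumes that stylesheet members can be recognised by a .css suffix.

```perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);

# Hypothetical helper: return a hash ref of { member name => CSS text }
# for every member of the epub whose name ends in .css (an assumption).
sub read_epub_css {
    my ($epub_fn) = @_;
    my $zip = Archive::Zip->new();
    $zip->read($epub_fn) == AZ_OK
        or die "Can't read $epub_fn as a zip archive";

    my %css;
    for my $member ( $zip->membersMatching('\.css$') ) {
        my $text = $zip->contents($member);    # uncompressed member data
        $css{ $member->fileName } = $text;
    }
    return \%css;
}

# Hypothetical helper: replace one member's text and save the result
# under a different file name, leaving the original epub untouched.
sub replace_epub_css {
    my ( $epub_fn, $css_fn, $new_text, $out_fn ) = @_;
    my $zip = Archive::Zip->new();
    $zip->read($epub_fn) == AZ_OK
        or die "Can't read $epub_fn as a zip archive";
    defined $zip->memberNamed($css_fn)
        or die "No member named $css_fn in $epub_fn";

    $zip->contents( $css_fn, $new_text );      # swap in the edited CSS
    $zip->writeToFileNamed($out_fn) == AZ_OK
        or die "Can't write $out_fn";
    return;
}
```

Writing to a separate output file keeps the original around for comparison, which seems prudent while a CSS-rewriting script is still being tested.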

The full code and associated files
The documentation

Here I present two of the trickier functions. inverse_color() is passed a CSS color value, which can be in any of several formats, calculates a complementary color, and returns it. It uses functions from Graphics::ColorUtils to map CSS color names to RGB values. It is called by fix_css_colors() when that function finds a CSS block containing a color: property but no background-color: property.

    # Excerpt from the full script, which is assumed to enable the "say" and
    # "state" features, import name2rgb() and available_names() from
    # Graphics::ColorUtils, and define a file-scoped $verbose flag.

    sub inverse_color {
        my $color = shift;
        die "Missing argument to inverse_color()" unless $color;

        # Cache the table of known color names across calls.
        state $color_names;
        if ( not $color_names ) {
            #set_default_namespace("www");
            $color_names = available_names();
        }

        # Trim leading and trailing whitespace.
        $color =~ s/^\s+//;
        $color =~ s/\s+$//;

        if ( $color =~ /^#[[:xdigit:]]{3}$/ ) {
            # Three-digit hex, e.g. #abc
            $color =~ s/#//;
            my $n = hex $color;
            my $i = 0xFFF - $n;
            my $inverse = sprintf "#%03x", $i;
            return $inverse;
        }
        elsif ( $color =~ /^#[[:xdigit:]]{6}$/ ) {
            # Six-digit hex, e.g. #aabbcc
            $color =~ s/#//;
            my $n = hex $color;
            my $i = 0xFFFFFF - $n;
            my $inverse = sprintf "#%06x", $i;
            return $inverse;
        }
        elsif ( $color =~ /rgb \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+) , \s* ([0-9]+) \s* \) /x ) {
            my ($r, $g, $b) = ($1, $2, $3);
            my $n = $r * 65536 + $g * 256 + $b;
            printf "converted %s to %06x\n", $color, $n if $verbose;
            my $i = 0xFFFFFF - $n;
            my $inverse = sprintf "#%06x", $i;
            return $inverse;
        }
        elsif ( $color =~ /rgba \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+) , \s* ([0-9]+) \s* , \s* ([0-9.]+) \s* \) /x ) {
            my ($r, $g, $b, $alpha) = ($1, $2, $3, $4);
            my $inverse = sprintf "rgba( %d, %d, %d, %0.2f )",
                255 - $r, 255 - $g, 255 - $b, 1 - $alpha;
            return $inverse;
        }
        elsif ( $color =~ /hsl \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+)% , \s* ([0-9]+)% \s* \) /x ) {
            my ( $hue, $saturation, $lightness ) = ($1, $2, $3);
            my $hue2   = ($hue + 180) % 360;
            my $sat2   = 100 - $saturation;
            my $light2 = 100 - $lightness;
            my $inverse = sprintf "hsl( %d, %d%%, %d%% )", $hue2, $sat2, $light2;
            return $inverse;
        }
        elsif ( $color =~ /hsla \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+)% , \s* ([0-9]+)% \s* , \s* ([0-9.]+) \s* \) /x ) {
            my ( $hue, $saturation, $lightness, $alpha ) = ($1, $2, $3, $4);
            my $hue2   = ($hue + 180) % 360;
            my $sat2   = 100 - $saturation;
            my $light2 = 100 - $lightness;
            my $alpha2 = 1 - $alpha;
            # Four values, so emit hsla() to mirror the rgba() branch.
            my $inverse = sprintf "hsla( %d, %d%%, %d%%, %0.2f )",
                $hue2, $sat2, $light2, $alpha2;
            return $inverse;
        }
        elsif ( $color =~ /currentcolor/i ) {
            warn "Should have removed currentcolor in fix_css_colors()";
        }
        elsif ( $color =~ /inherit/i ) {
            return "inherit";
        }
        elsif ( $color_names->{ "www:" . $color } or $color_names->{ $color } ) {
            # Named color: try the bare name, then the "www:" namespace.
            my $hexcolor = name2rgb( $color );
            if ( not $hexcolor ) {
                $hexcolor = name2rgb( "www:" . $color );
                if ( not $hexcolor ) {
                    die "Can't resolve color name $color";
                }
            }
            $hexcolor =~ s/#//;
            my $i = 0xFFFFFF - hex($hexcolor);
            my $inverse = sprintf "#%06x", $i;
            return $inverse;
        }
        else {
            die "Color format not implemented: $color";
        }
    }

    sub fix_css_colors {
        my ($csstext, $css_fn, $epub_fn) = @_;
        return if not $csstext;

        my $errors = 0;
        my $corrections = 0;
        my $printed_filename = 0;

        say "Checking $epub_fn:$css_fn for bad colors\n" if $verbose;

        # this might be a good use of negative lookbehind?
        # The capture group keeps each "}" as its own list element, so
        # joining the list later restores the original text.
        my @css_blocks = split /(})/, $csstext;

        for my $block ( @css_blocks ) {
            if ( $block =~ m/color: \s* ( [^;]+ ) \s* (?:;|$) /x ) {
                my $fgcolor = $1;
                print "found color: $fgcolor\n" if $verbose;

                if ( $fgcolor =~ m/currentcolor/i ) {
                    $block =~ s/(color: \s* currentcolor \s* ;? \s* ) \n* //xi;
                    print "Stripping out $1 as it is a pleonasm\n" if $verbose;
                    $corrections++;
                    next;
                }

                if ( $block !~ m/background-color:/ ) {
                    # Foreground color with no background: add an inverse background.
                    my $bgcolor = inverse_color( $fgcolor );
                    $block =~ s/(color: \s* [^;}]+ \s* (?:;|$) )/background-color: $bgcolor;\n$1/x;
                    print "corrected block:\n$block\n}\n" if $verbose;
                    $corrections++;
                }
            }
        }

        if ( $corrections ) {
            my $new_css_text = join "", @css_blocks;
            return $new_css_text;
        }
        else {
            return undef;
        }
    }
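The block handling in fix_css_colors() relies on two Perl behaviours that are easy to miss: split with a capturing group returns the delimiters as list elements, and the loop variable of a for loop aliases the array element. The following standalone sketch shows both. The helper name add_placeholder_background() and the fixed #cccccc value are hypothetical stand-ins for the real call to inverse_color().

```perl
use strict;
use warnings;

# Hypothetical demo: add a fixed placeholder background to any block that
# sets a color but no background-color, using the same split/join idea.
sub add_placeholder_background {
    my ($css) = @_;

    # Because "}" is captured, it comes back as its own element, so
    # join "", @blocks reproduces the input exactly if nothing is edited.
    my @blocks = split /(})/, $css;

    for my $block (@blocks) {
        # $block is an alias for the array element: edits land in @blocks.
        next unless $block =~ /color:/;
        next if $block =~ /background-color:/;
        $block =~ s/(color:\s*[^;}]+;)/background-color: #cccccc;\n$1/;
    }
    return join "", @blocks;
}

my $css = "p { color: #333; }\nh1 { color: red; background-color: white; }\n";
print add_placeholder_background($css);
```

Only the first rule is changed here, because the second already declares a background-color.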

Replies are listed 'Best First'.
Re: Fixing bad CSS in EPUB files
by Anonymous Monk on Sep 06, 2023 at 09:37 UTC
    inverse_color() ...calculates a complementary color

    "complement" and "invert" are not the same (SO); be specific. I don't think flipping alpha makes sense. Flipping saturation/lightness makes no sense either. hsl(0,100%,50%), i.e. red, should invert to aqua, i.e. hsl(180,100%,50%), not to grey/gray as your function does.

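To make the reply's red-to-aqua example concrete, here is a small sketch comparing per-channel hex inversion (the 0xFFFFFF - n step from the post) with a hue-only rotation for hsl() values. The function names invert_hex() and rotate_hue() are hypothetical, and rotating only the hue is one possible reading of the reply's suggestion, not a change the original author made.

```perl
use strict;
use warnings;

# Per-channel inversion of a six-digit hex color (0xFFFFFF - n).
sub invert_hex {
    my ($hex) = @_;
    $hex =~ s/^#//;
    return sprintf "#%06x", 0xFFFFFF - hex($hex);
}

# Hypothetical alternative for hsl(): rotate the hue by 180 degrees and
# leave saturation and lightness as they are.
sub rotate_hue {
    my ( $hue, $saturation, $lightness ) = @_;
    return sprintf "hsl( %d, %d%%, %d%% )",
        ( $hue + 180 ) % 360, $saturation, $lightness;
}

print invert_hex("#ff0000"), "\n";      # prints #00ffff (aqua)
print rotate_hue( 0, 100, 50 ), "\n";   # prints hsl( 180, 100%, 50% )
```

For pure red the two approaches agree, which is the case the reply points to.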
Node Type: CUFP [id://11154273]
Approved by GrandFather
Front-paged by Corion