PerlMonks  

re-syncing these subtitles

by wazoox (Prior)
on Sep 05, 2010 at 20:19 UTC ( #858942=CUFP )

So your uncle recorded this old Swedish movie for you on his DVR and gave you a copy of it. You transcode it to XviD, then realize that, dammit, you haven't got the subtitles and your Swedish really is rusty. So you get the subtitles from some website, but alas, they're for some DVD version and start earlier (because there are no ads), or something along those lines.

So you need to edit those thousands of time entries in the .srt file to correct them. Here comes my script to the rescue...

Let's say you need to shift the subtitles 3 minutes 42 seconds earlier... Simply run:

subshift /data/subtitle.srt - 3:42 > /data/newsubtitle.srt

Now you may want to make it later by 1 hour, 5 minutes, 32 seconds and bananas :

subshift /data/subtitle.srt + 1:5:32,242 > /data/newsubtitle.srt
Well, that's all folks. Here it is :
#!/usr/bin/perl
use strict;
use warnings;

#############################################
# init

my ( $infile, $oper, $shiftime ) = @ARGV;

usage() if not ( $infile and $oper and $shiftime);
usage() if ( $oper ne "+" and $oper ne "-" );

#############################################
# main

open my $fh, "<", $infile or die "error opening '$infile':$!";

$shiftime = time_to_sec($shiftime);

while (<$fh>) {
    if ( m/^(\d{2}:\d{2}:\d{2},\d+)\s\-\-\>\s(\d{2}:\d{2}:\d{2},\d+)/ ) {
        my ($start, $end) = ( $1, $2);
        my $startsec = time_to_sec($start);
        my $endsec = time_to_sec($end);

        if ( $oper eq "+" ) {
            $startsec += $shiftime;
            $endsec += $shiftime;
        } else {
            # bug : should check that time stays positive
            $startsec -= $shiftime;
            $endsec -= $shiftime;
        }
        $start = sec_to_time($startsec);
        $end= sec_to_time($endsec);
        print "$start --> $end\r\n";
    } else {
        print $_ ;
    }
}
close $fh;

#############################################
# subs

sub usage {
    print "usage : $0 <file> <+|-> <duration>\n\n";
    print "duration is expressed in hours:minutes:secondes,milliseconds.\nLeading zeros are optional.\n";
    print "result output to stdout.\n";
    exit 1;
}

sub time_to_sec {
    my $time = shift;
    my @elems = split(':', $time);
    my $seconds = pop @elems;
    my $minutes = pop @elems;
    my $hours = pop @elems;

    $seconds =~ s/,/./g;

    # force to numerical
    $hours += 0;
    $minutes += 0;
    $seconds += 0;

    # convert to seconds for easier manipulation
    my $time_sec = ( $hours * 3600 ) + ($minutes * 60 ) + $seconds ;
    return $time_sec;
}

sub sec_to_time {
    my $sectime = shift;
    my $hours = sprintf( "%02d", int ( $sectime / 3600 ) );
    my $minutes = sprintf( "%02d", int ( ( $sectime - ( $hours * 3600 ) ) / 60 ) );
    my $seconds = sprintf( "%02.3f", ( ( $sectime - ( $hours * 3600 ) ) - ( $minutes * 60 )) );
    $seconds =~ s/\./,/g;
    return "$hours:$minutes:$seconds";
}

Edit: applied some corrections as suggested by jwkrahn.
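
The script's own comment notes that a negative shift should check that the time stays positive. One possible way to handle that, sketched here under assumptions, is to do the arithmetic in integer milliseconds and clamp at zero. The helpers `time_to_ms` and `ms_to_time` are hypothetical names, not part of the original script, and clamping is only one policy (dropping such subtitles would be another).

```perl
use strict;
use warnings;

# Hypothetical helper: parse "h:m:s,mmm" (leading fields optional)
# into integer milliseconds, so no float ever needs rounding.
sub time_to_ms {
    my ($time) = @_;
    my ( $rest, $ms ) = split /,/, $time;
    $ms = '' unless defined $ms;
    $ms = substr( $ms . '000', 0, 3 );    # pad or truncate to three digits

    my @elems   = split /:/, $rest;
    my $seconds = pop @elems;
    my $minutes = pop @elems;
    my $hours   = pop @elems;
    $_ = defined $_ ? $_ : 0 for $hours, $minutes, $seconds;

    return ( ( $hours * 60 + $minutes ) * 60 + $seconds ) * 1000 + $ms;
}

# Hypothetical helper: format milliseconds as an .srt timestamp.
sub ms_to_time {
    my ($ms) = @_;
    $ms = 0 if $ms < 0;    # assumption: clamp instead of going negative
    my $h = int( $ms / 3_600_000 );
    my $m = int( $ms % 3_600_000 / 60_000 );
    my $s = int( $ms % 60_000 / 1000 );
    return sprintf "%02d:%02d:%02d,%03d", $h, $m, $s, $ms % 1000;
}

# Shifting 00:05:00,500 earlier by 3:42
print ms_to_time( time_to_ms("00:05:00,500") - time_to_ms("3:42") ), "\n";
# Shifting 00:01:00,000 earlier by 3:42 would go negative, so it is clamped
print ms_to_time( time_to_ms("00:01:00,000") - time_to_ms("3:42") ), "\n";
```

With clamping, a subtitle that falls entirely before zero ends up with identical start and end times, which may or may not suit a given player.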

Re: re-syncing these subtitles
by CountZero (Bishop) on Sep 05, 2010 at 21:28 UTC
    "Old swedish movies" recorded by your "uncle" and you need subtitles?

    Say no more ... nudge nudge ... wink wink ... he said knowingly ...

    Still ++! Perhaps turn it into a module and put it on CPAN?

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      It can't be what you think. Else he wouldn't need the subtitles :)

        Yeah, well, it was more an allusion to Ingmar Bergman than anything else, actually :)
      Perhaps turn it into a module and put it on CPAN?

      I dunno... I've got quite a lot of similar small scripts or libraries, but I never made them available to CPAN because they're not much, really. That, and it would force me to work them into real complete toolkits :)

        Clean up the code, turn it into a module. Even if your tool does only one thing, if it does it well, it deserves to be made available to the world at large.

        Fame and Fortune(*) await you on CPAN!

        CountZero

        "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        (*) Fortune is optional and at the CPAN users' discretion.
Re: re-syncing these subtitles
by jwkrahn (Monsignor) on Sep 05, 2010 at 23:53 UTC
    open my $fh, "<", $infile or die "error opening '$infile':$?";

    The $? variable is not relevant in this context.    You should be using either $! or $^E.
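
To illustrate the point: `$!` carries the operating system's error for a failed call such as `open`, while `$?` holds the exit status of the last child process (`system`, backticks, a closed pipe). A minimal sketch follows; the helper `open_error` and the path are made up for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the OS error text if the file cannot be
# opened for reading, or an empty string on success.
sub open_error {
    my ($path) = @_;
    open( my $fh, '<', $path ) and return '';
    return "$!";    # copy at once: later system calls may overwrite $!
}

# Assumed not to exist on the machine running this.
my $missing = '/nonexistent/dir/subtitle.srt';
my $err     = open_error($missing);
print "error opening '$missing': $err\n" if length $err;
```

The exact wording of the message depends on the platform and locale, so it is printed rather than compared against a fixed string.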


    if ( m/^(\d{2}:\d{2}:\d{2},\d+)\s\-\-\>\s(\d{2}:\d{2}:\d{2},\d+)/g ) {

    You are using the /g global option in a scalar context, which would make sense in a while loop, but the value of $_ changes each time you test it so it makes no sense to use it there.
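
A small demonstration of what /g does in scalar context, for readers who have not run into it: the match position is remembered on the variable, so testing the same string a second time resumes from where the first match ended. The sample line is invented for the example.

```perl
use strict;
use warnings;

my $line = "00:00:01,000 --> 00:00:04,000";

# Scalar-context /g records where the match ended; see pos().
my $first = ( $line =~ /^\d{2}:/g ) ? 1 : 0;
my $pos   = pos($line);

# The second attempt starts at that position, where ^ can no longer match.
my $second = ( $line =~ /^\d{2}:/g ) ? 1 : 0;

print "first=$first pos=$pos second=$second\n";    # first=1 pos=3 second=0
```

In the script's loop each line is a fresh string, so the flag did no harm there, but it did nothing useful either.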


    my @elems = split(':', $time);
    my $seconds = pop @elems;
    my $minutes = pop @elems;
    my $hours = pop @elems;

    Why not just:

    my ( $hours, $minutes, $seconds ) = split /:/, $time;

    # force to numerical
    $hours += 0;
    $minutes += 0;
    $seconds += 0;

    An unnecessary step as the numbers are already numerical.


    my $seconds = sprintf( "%02.3f", ( ( $sectime - ( $hours * 3600 ) ) - ( $minutes * 60 )) );

    The number before the period in the format string for floating point is the total width.    So if you want the result to look like '12.345' then you need a format string of '%06.3f' because it has a total width of six characters.
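
The difference is easy to see side by side. In this sketch `fmt_seconds` is a hypothetical wrapper around the '%06.3f' format, and the sample values are arbitrary.

```perl
use strict;
use warnings;

# Hypothetical wrapper: seconds as NN.NNN, zero-padded to six characters.
sub fmt_seconds {
    my ($sec) = @_;
    return sprintf "%06.3f", $sec;
}

for my $sec ( 5.2, 12.345 ) {
    # A minimum width of 2 is already exceeded, so nothing gets padded.
    my $narrow = sprintf "%02.3f", $sec;
    my $padded = fmt_seconds($sec);
    print "$narrow  $padded\n";
}
# prints:
# 5.200  05.200
# 12.345  12.345
```

So with '%02.3f' a timestamp whose seconds are below ten would come out as, for instance, 00:00:5,200 rather than 00:00:05,200.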


      The $? variable is not relevant in this context. You should be using either $! or $^E.

      Oh yes, it's always quite easy to mix these up :)

      You are using the /g global option in a scalar context, which would make sense in a while loop, but the value of $_ changes each time you test it so it makes no sense to use it there.

      Well, same as above : quick and dirty script hastily thrown together, not too much testing...

      Why not just: my ( $hours, $minutes, $seconds ) = split /:/, $time;

      Because I want a single value to default to seconds, and two values to be read as minutes:seconds.

      An unnecessary step as the numbers are already numerical.

      It's a cautionary cleanup in case you're passing garbage as parameters. It will at least avoid hopelessly mangling the output :)

      The number before the period in the format string for floating point is the total width.

      I think you missed the "d" there :). Edit: Well, I don't know what the right form is, but "%02.3f" apparently works as intended... "%05.3f" adds 4 leading zeros.

        It's a cautionary cleanup in case you're passing garbage as parameters.

        Perhaps you should use Scalar::Util::looks_like_number to confirm a numerical value.
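
A rough sketch of how that check could look for the duration argument. `valid_duration` is a hypothetical helper, and the rules it enforces (one to three colon-separated fields, comma as decimal separator, no negative values) are assumptions drawn from the script's usage text.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: true if $duration looks like [[h:]m:]s[,mmm].
sub valid_duration {
    my ($duration) = @_;
    return 0 unless defined $duration and length $duration;

    # Limit -1 keeps trailing empty fields, so "1:" is rejected below.
    my @fields = split /:/, $duration, -1;
    return 0 if @fields < 1 or @fields > 3;

    $fields[-1] =~ s/,/./;    # "32,242" -> "32.242"
    for my $field (@fields) {
        return 0 unless looks_like_number($field) and $field >= 0;
    }
    return 1;
}

for my $arg ( "3:42", "1:5:32,242", "bananas", "1:" ) {
    printf "%-12s %s\n", $arg, valid_duration($arg) ? "ok" : "rejected";
}
```

Note that looks_like_number is fairly permissive (it accepts forms such as "1e3"), so a stricter regex may be preferable if only plain digits should pass.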


        I think you missed the "d" there :)

        There is no letter 'd' in the line:

        my $seconds = sprintf( "%02.3f", ( ( $sectime - ( $hours * 3600 ) ) - ( $minutes * 60 )) );


        Update


        Why not just:
         my ( $hours, $minutes, $seconds ) = split /:/, $time;
        Because I want a single value to default to seconds, and two values to be read as minutes:seconds.

        Then try it like this:

        my ( $hours, $minutes, $seconds ) = ( split /:/, $time )[ -3, -2, -1 ];
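
To show how the negative-index slice fills in the fields from the right, here is a small sketch. `split_duration` is a hypothetical wrapper, and defaulting the missing fields to 0 is an addition that keeps `use warnings` quiet when they are used in arithmetic (it relies on `//=`, so Perl 5.10 or later).

```perl
use strict;
use warnings;

# Hypothetical wrapper: always returns (hours, minutes, seconds).
sub split_duration {
    my ($time) = @_;

    # Indices -3, -2, -1 count from the end, so a lone value lands in
    # $seconds and two values land in $minutes and $seconds.
    my ( $hours, $minutes, $seconds ) = ( split /:/, $time )[ -3, -2, -1 ];

    # Missing leading fields come back undef; treat them as zero.
    $_ //= 0 for $hours, $minutes, $seconds;
    return ( $hours, $minutes, $seconds );
}

for my $arg ( "42", "3:42", "1:5:32,242" ) {
    print join( "|", split_duration($arg) ), "\n";
}
# prints:
# 0|0|42
# 0|3|42
# 1|5|32,242
```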
Re: re-syncing these subtitles
by Anonymous Monk on Sep 06, 2010 at 08:06 UTC
    I'd recommend Time::Duration and Time::Duration::Parse for this to save effort, assuming you can get it to give you the right output (for example, it interprets strings in the form 'dd:dd' as hh:mm instead of mm:ss, but you can add ':00' to the end to fix that).

    Also, I'd probably write the main logic differently:

    my $timespec = qr/\d{2}:\d{2}:\d{2},\d+/;
    my ($infile, $offset) = @ARGV;
    usage() unless defined $offset and $offset =~ /^[+-]?\d+$/;    # integer
    ...
    if (my ($start, $end) = /^($timespec)\s-->\s($timespec)/) {
        # time_to_sec() returns a plain value, so it cannot be the target of +=
        $_ = sec_to_time( time_to_sec($_) + $offset ) for $start, $end;
        print "$start --> $end\r\n";
    }
    ...
Re: re-syncing these subtitles
by nyamned (Sexton) on Sep 13, 2010 at 21:48 UTC
    Quite useless, unless you are going to keep that movie for a long time. For a collection? I believe almost all modern video players can adjust subtitle delay (e.g. mplayer, vlc).
      My set-top box happily plays mkv, avi, and srt files, but can't adjust subtitle delay. Neither can my DVD player.
