 
PerlMonks  

Re^5: command line perl reading from STDIN (strace)

by tye (Cardinal)
on Jan 31, 2013 at 19:11 UTC (#1016377)


in reply to Re^4: command line perl reading from STDIN
in thread command line perl reading from STDIN

That is a dramatic difference and is worth investigating further. One result might be making Perl input processing 60 times faster in some cases.

I'd probably run strace on those cases and see what Perl is doing differently.

- tye        


Re^6: command line perl reading from STDIN (strace)
by perlhappy (Novice) on Feb 01, 2013 at 14:54 UTC
    OK, a probable reason for the difference.
    It's not actually as huge a difference in time anymore. Let me explain: I'm working on a server that lots of other people also use, so it sometimes has many more jobs to run than at other times. I reviewed some of the job submission logs, and the server was extremely busy yesterday when I was doing the checks. I ran the two commands again today and used strace to track what's going on.

    cat A_1_1.fq

    write(1, "12555:2368#0/1\nffafWgggWgagcggff"..., 1048576) = 1048576
    read(3, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576
    write(1, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576
    read(3, "\nhhhhhcghghhhhhhhhfhhhhhgfghhhhh"..., 1048576) = 1048576
    write(1, "\nhhhhhcghghhhhhhhhfhhhhhgfghhhhh"..., 1048576) = 1048576

    perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";'

    read(0, "ehhhfhhh]\n@HWI-EAS283_0004_FC:1:"..., 4096) = 4096
    read(0, "_0004_FC:1:52:6965:11034#0/1\nggg"..., 4096) = 4096
    read(0, "0004_FC:1:52:10518:11036#0/1\nhhg"..., 4096) = 4096
    read(0, "1:52:14559:11038#0/1\nffffffdfdf["..., 4096) = 4096
    write(1, "GCCCCCAGAGCANCGTCTCTGGGGGCAGCCAG"..., 4096) = 4096
    read(0, "fgggfcbfffcffdcdfcfaffffaa^fff"..., 4096) = 4096



    >time perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' < A_1_1.fq | wc -l

    read(3, "-EAS283_0004_FC:1:21:15451:12331"..., 4096) = 4096
    read(3, ":18706:12324#0/1\ncaYYcaaaVTaaZ"..., 4096) = 4096
    read(3, "hhhfgahhhhh\n@HWI-EAS283_0004_FC:"..., 4096) = 4096
    read(3, "ATCCTCCAGGCGATTCAACGCCTTGGTTCTCT"..., 4096) = 4096
    write(1, "TTTCTGTTCACTCTCAACTTCTCCTTCCAGTT"..., 4096) = 4096
    read(3, "8646:12340#0/1\ngegcgaKaaaffff_gg"..., 4096) = 4096
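
The visible difference in the excerpts above is the request size: cat issues 1048576-byte reads and writes, while perl issues 4096-byte ones in both invocations. A minimal Perl sketch for tallying that from saved strace output follows. The helper name `summarize_strace` is hypothetical, the sample lines are copied from the excerpts above, and the regex assumes the simple `call(fd, "..."..., size) = result` layout shown there, not every form strace can print.

```perl
use strict;
use warnings;

# Tally read/write syscalls from strace lines such as:
#   read(3, "..."..., 4096) = 4096
# Returns a hashref keyed by "call(fd)" holding the number of calls,
# the bytes actually transferred, and the largest requested size.
sub summarize_strace {
    my (@lines) = @_;
    my %stats;
    for my $line (@lines) {
        # Captures: syscall name, fd, requested size, return value.
        # The greedy .* skips over the quoted data, whatever it contains.
        next unless $line =~ /^\s*(read|write)\((\d+),.*,\s*(\d+)\)\s*=\s*(-?\d+)/;
        my ($call, $fd, $requested, $returned) = ($1, $2, $3, $4);
        my $entry = $stats{"$call($fd)"}
            //= { calls => 0, bytes => 0, max_request => 0 };
        $entry->{calls}++;
        $entry->{bytes} += $returned if $returned > 0;   # ignore errors (-1)
        $entry->{max_request} = $requested
            if $requested > $entry->{max_request};
    }
    return \%stats;
}

# Sample lines taken from the traces in this post.
my @cat_trace = (
    'read(3, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576',
    'write(1, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576',
);
my @perl_trace = (
    'read(0, "ehhhfhhh]\n@HWI-EAS283_0004_FC:1:"..., 4096) = 4096',
    'read(0, "_0004_FC:1:52:6965:11034#0/1\nggg"..., 4096) = 4096',
    'write(1, "GCCCCCAGAGCANCGTCTCTGGGGGCAGCCAG"..., 4096) = 4096',
);

for my $pair ([ cat => \@cat_trace ], [ perl => \@perl_trace ]) {
    my ($name, $trace) = @$pair;
    my $stats = summarize_strace(@$trace);
    for my $key (sort keys %$stats) {
        printf "%-5s %-9s calls=%d bytes=%d max_request=%d\n",
            $name, $key, @{ $stats->{$key} }{qw(calls bytes max_request)};
    }
}
```

Run over a complete trace rather than these few lines, the call counts would show how many more syscalls the 4096-byte buffer needs to move the same file.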

    There is still a time difference, but nowhere near as large as before (see below). The first is, however, consistently faster.

    >time cat A_1_1.fq | perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' | wc -l
    26814958

    real 0m41.711s
    user 0m38.406s
    sys 0m5.096s


    >time perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' < A_1_1.fq | wc -l
    26814958

    real 3m39.382s
    user 0m52.811s
    sys 0m23.169s
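
To put the two timings side by side, here is a small Perl sketch that converts the `time` figures above to seconds and compares wall-clock time with CPU time (user + sys). The helper `to_seconds` and the run labels are hypothetical names. Reading "real minus CPU" as time spent waiting is an assumption: it presumes `time` covers the whole pipeline, and CPU time can exceed real time when the pipeline's processes run concurrently.

```perl
use strict;
use warnings;

# Convert a `time`-style duration such as "3m39.382s" to seconds.
sub to_seconds {
    my ($t) = @_;
    $t =~ /^(\d+)m([\d.]+)s$/ or die "unrecognised duration: $t";
    return $1 * 60 + $2;
}

# Figures copied from the two runs reported above.
my %piped      = (real => '0m41.711s', user => '0m38.406s', sys => '0m5.096s');
my %redirected = (real => '3m39.382s', user => '0m52.811s', sys => '0m23.169s');

for my $run ([ 'cat | perl' => \%piped ], [ 'perl < file' => \%redirected ]) {
    my ($label, $t) = @$run;
    my $real = to_seconds($t->{real});
    my $cpu  = to_seconds($t->{user}) + to_seconds($t->{sys});
    # A large positive real-minus-cpu gap suggests the run was mostly waiting
    # (I/O or contention on a shared machine), not computing.
    printf "%-12s real %8.3fs  cpu %7.3fs  real-cpu %8.3fs\n",
        $label, $real, $cpu, $real - $cpu;
}

printf "real-time ratio (redirected / piped): %.2f\n",
    to_seconds($redirected{real}) / to_seconds($piped{real});
```

On these numbers the redirected run takes roughly five times as long in wall-clock terms, and most of its extra time is not CPU time, which would fit either slower I/O or a busy server.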

