 
PerlMonks  

Re^5: command line perl reading from STDIN (strace)

by tye (Cardinal)
on Jan 31, 2013 at 19:11 UTC ( #1016377=note )


in reply to Re^4: command line perl reading from STDIN
in thread command line perl reading from STDIN

That is a dramatic difference and is worth investigating further. It might even lead to making Perl's input processing 60 times faster in some cases.

I'd probably run strace on those cases and see what Perl is doing differently.

- tye        
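
For anyone wanting to try the strace suggestion, here is a minimal sketch of one way to do it. The helper names `trace_io` and `count_reads` and the trace file names are made up for illustration; `-e trace=read,write` and `-o` are standard strace options. It assumes strace is installed and that you are allowed to trace your own processes.

```shell
# Hypothetical helper: run a command under strace, recording only its
# read() and write() calls in the file named by the first argument.
trace_io() {
    out=$1; shift
    strace -e trace=read,write -o "$out" "$@"
}

# Hypothetical helper: count the read() calls recorded in a trace file.
count_reads() {
    grep -c '^read(' "$1"
}
```

It could then be used on both cases, for example `cat A_1_1.fq | trace_io pipe.trace perl -ne '...' > /dev/null` and `trace_io redirect.trace perl -ne '...' < A_1_1.fq > /dev/null`, followed by `count_reads` on each trace file to compare how many reads each case needed.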


Re^6: command line perl reading from STDIN (strace)
by perlhappy (Novice) on Feb 01, 2013 at 14:54 UTC
    OK, here is a probable reason for the difference.
    It's not actually as huge a difference in time anymore. Let me explain: I'm working on a server that lots of other people also use, so it sometimes has many more jobs to run than at other times. I reviewed some of the job submission logs, and the server was extremely busy yesterday when I was doing the checks. I ran the two commands again today and used strace to track what's going on.

    cat A_1_1.fq

    write(1, "12555:2368#0/1\nffafWgggWgagcggff"..., 1048576) = 1048576
    read(3, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576
    write(1, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576
    read(3, "\nhhhhhcghghhhhhhhhfhhhhhgfghhhhh"..., 1048576) = 1048576
    write(1, "\nhhhhhcghghhhhhhhhfhhhhhgfghhhhh"..., 1048576) = 1048576

    perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";'

    read(0, "ehhhfhhh]\n@HWI-EAS283_0004_FC:1:"..., 4096) = 4096
    read(0, "_0004_FC:1:52:6965:11034#0/1\nggg"..., 4096) = 4096
    read(0, "0004_FC:1:52:10518:11036#0/1\nhhg"..., 4096) = 4096
    read(0, "1:52:14559:11038#0/1\nffffffdfdf["..., 4096) = 4096
    write(1, "GCCCCCAGAGCANCGTCTCTGGGGGCAGCCAG"..., 4096) = 4096
    read(0, "fgggfcbfffcffdcdfcfaffffaa^fff"..., 4096) = 4096



    >time perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' < A_1_1.fq | wc -l

    read(3, "-EAS283_0004_FC:1:21:15451:12331"..., 4096) = 4096
    read(3, ":18706:12324#0/1\ncaYYcaaaVTaaZ"..., 4096) = 4096
    read(3, "hhhfgahhhhh\n@HWI-EAS283_0004_FC:"..., 4096) = 4096
    read(3, "ATCCTCCAGGCGATTCAACGCCTTGGTTCTCT"..., 4096) = 4096
    write(1, "TTTCTGTTCACTCTCAACTTCTCCTTCCAGTT"..., 4096) = 4096
    read(3, "8646:12340#0/1\ngegcgaKaaaffff_gg"..., 4096) = 4096

    There is still a time difference, but nowhere near as large as before (see below). The first is, however, consistently faster.

    >time cat A_1_1.fq | perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' | wc -l
    26814958

    real 0m41.711s
    user 0m38.406s
    sys 0m5.096s


    >time perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' < A_1_1.fq | wc -l
    26814958

    real 3m39.382s
    user 0m52.811s
    sys 0m23.169s

