time perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' A_1_1.fq | wc -l
Both commands produce exactly the same output, with exactly the same number of lines. The results are identical, but the run times are very different. Weird.
That is a dramatic difference and is worth investigating further. Making Perl input processing 60 times faster in some cases might be the result.
I'd probably run strace on those cases and see what Perl is doing differently.
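A rough sketch of how such a trace could be captured. `trace_io` is a hypothetical helper name and the log file names are assumptions; `-e trace=read,write` and `-o` are standard strace options.

```shell
# Hypothetical helper: log only the read/write syscalls of a command.
# $1 is the log file; the remaining arguments are the command to trace.
trace_io() {
    log=$1
    shift
    strace -e trace=read,write -o "$log" "$@"
}

# Example usage (A_1_1.fq is the file from this thread):
#   trace_io redirect.trace perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' < A_1_1.fq > /dev/null
#   cat A_1_1.fq | trace_io pipe.trace perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' > /dev/null
```

Comparing the file descriptor and byte count in the `read(...)` lines of the two logs should show whether Perl is fetching its input differently in each case.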
Ok, a probable reason for the difference.
The time difference is actually not as large as it first looked. Let me explain: I'm working on a server that many other people also use, so its load varies a lot. I reviewed the job submission logs, and the server was extremely busy yesterday when I ran the checks. I ran the two commands again today and used strace to track what's going on.
cat A_1_1.fq
write(1, "12555:2368#0/1\nffafWgggWgagcggff"..., 1048576) = 1048576
read(3, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576
write(1, "0004_FC:1:76:5896:2982#0/1\naQXaa"..., 1048576) = 1048576
read(3, "\nhhhhhcghghhhhhhhhfhhhhhgfghhhhh"..., 1048576) = 1048576
write(1, "\nhhhhhcghghhhhhhhhfhhhhhgfghhhhh"..., 1048576) = 1048576
perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";'
read(0, "ehhhfhhh]\n@HWI-EAS283_0004_FC:1:"..., 4096) = 4096
read(0, "_0004_FC:1:52:6965:11034#0/1\nggg"..., 4096) = 4096
read(0, "0004_FC:1:52:10518:11036#0/1\nhhg"..., 4096) = 4096
read(0, "1:52:14559:11038#0/1\nffffffdfdf["..., 4096) = 4096
write(1, "GCCCCCAGAGCANCGTCTCTGGGGGCAGCCAG"..., 4096) = 4096
read(0, "fgggfcbfffcffdcdfcfaffffaa^fff"..., 4096) = 4096
>time perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' < A_1_1.fq | wc -l
read(3, "-EAS283_0004_FC:1:21:15451:12331"..., 4096) = 4096
read(3, ":18706:12324#0/1\ncaYYcaaaVTaaZ"..., 4096) = 4096
read(3, "hhhfgahhhhh\n@HWI-EAS283_0004_FC:"..., 4096) = 4096
read(3, "ATCCTCCAGGCGATTCAACGCCTTGGTTCTCT"..., 4096) = 4096
write(1, "TTTCTGTTCACTCTCAACTTCTCCTTCCAGTT"..., 4096) = 4096
read(3, "8646:12340#0/1\ngegcgaKaaaffff_gg"..., 4096) = 4096
There is still a time difference, but nowhere near as large as before (see below). The first is, however, consistently faster.
>time cat A_1_1.fq | perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' | wc -l
26814958
real 0m41.711s
user 0m38.406s
sys 0m5.096s
>time perl -ne '$s=<>;<>;<>; chomp $s; print "$s\n";' < A_1_1.fq | wc -l
26814958
real 3m39.382s
user 0m52.811s
sys 0m23.169s