PerlMonks  

Re^2: Deconvolutinng FastQ files

by Anonymous Monk
on Aug 07, 2012 at 13:37 UTC ( #985986=note )


in reply to Re: Deconvolutinng FastQ files
in thread Deconvolutinng FastQ files

Thank you frozenwithjoy. With FastQ on Galaxy, I need to trim the first three letters from my records to be able to use the barcode splitting function. I haven't tried the stand-alone version yet; I will give that a go once I can get it to work on my computer. These three letters are important for my analysis, so I am not entirely sure I can use FastX's barcode splitter tool. I am playing around with the Galaxy version of it at present.

Re^3: Deconvolutinng FastQ files
by frozenwithjoy (Priest) on Aug 07, 2012 at 15:35 UTC
    I took a look at fastx_barcode_splitter.pl and I think I've figured out a solution. I haven't tested it, but if you change line 161 from:
    unless $barcode =~ m/^[AGCT]+$/;
    to:
    unless $barcode =~ m/^[AGCTN]+$/;
    then you should be able to prefix your barcodes with three Ns, as long as you set --mismatches to at least 3 on the command line when running the script (each N in the barcode will presumably count as a mismatch against the read).
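    To make the idea concrete, here is a minimal, untested-against-the-real-script sketch of how the padded barcodes could be generated. The helper name pad_barcode and the example barcodes are made up for illustration; the only part taken from the post above is the relaxed [AGCTN] check. The output assumes the barcode file wants one identifier and one barcode per line, separated by a tab.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of fastx_barcode_splitter.pl):
# prefix a barcode with $pad_len placeholder Ns and check it against
# the relaxed character class suggested above.
sub pad_barcode {
    my ($barcode, $pad_len) = @_;
    my $padded = ('N' x $pad_len) . uc $barcode;
    die "Invalid barcode '$padded'\n"
        unless $padded =~ m/^[AGCTN]+$/;
    return $padded;
}

# Made-up example barcodes; replace with your own.
my %barcodes = (BC1 => 'GATCT', BC2 => 'ATCGT');

# Print identifier and padded barcode, tab-separated, one per line.
for my $id (sort keys %barcodes) {
    print join("\t", $id, pad_barcode($barcodes{$id}, 3)), "\n";
}
```

    The padded file would then be passed to the splitter together with --mismatches set to 3 or more, as described above.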

    One caveat is that you will want to toss out any reads that have any Ns in the first X bases (where X = 3 + barcode length). Have you run FastQC? If so, it will tell you the per-base N content. This probably won't be an issue if you've already done preliminary filtering based on Illumina's Y/N flags (assuming Illumina sequencing, of course).
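    If you do need to drop those reads yourself, something along these lines might work. This is only a sketch: the function names has_leading_n and filter_fastq are invented here, the two reads are made-up test data, and it assumes plain four-line FASTQ records (no wrapped sequence lines) and a barcode length of 5.

```perl
use strict;
use warnings;

# Hypothetical helper: true if any of the first $window bases is an N.
sub has_leading_n {
    my ($seq, $window) = @_;
    return substr($seq, 0, $window) =~ /N/i ? 1 : 0;
}

# Copy four-line FASTQ records from $in to $out, skipping reads with
# an N in the first $window bases. Returns the number of reads kept.
sub filter_fastq {
    my ($in, $out, $window) = @_;
    my $kept = 0;
    while (defined(my $header = <$in>)) {
        my $seq  = <$in>;
        my $plus = <$in>;
        my $qual = <$in>;
        last unless defined $qual;          # incomplete record at EOF
        (my $bases = $seq) =~ s/\s+\z//;    # strip the line ending
        next if has_leading_n($bases, $window);
        print {$out} $header, $seq, $plus, $qual;
        $kept++;
    }
    return $kept;
}

# Made-up demo data: the second read has an N at position 2.
my $fastq = "\@r1\nACGGATCTTTGA\n+\nIIIIIIIIIIII\n"
          . "\@r2\nANGGATCTTTGA\n+\nIIIIIIIIIIII\n";
open my $in, '<', \$fastq or die "Cannot open input: $!";
my $result = '';
open my $out, '>', \$result or die "Cannot open output: $!";

# X = 3 prefix letters + assumed barcode length of 5
my $kept = filter_fastq($in, $out, 3 + 5);
close $out;
print "kept $kept read(s)\n";
```

    For real data you would open the FASTQ file and an output file instead of the in-memory strings used for the demo.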

    Also, depending on your computer, I suspect fastx_barcode_splitter.pl will run a lot faster at the CLI than on Galaxy (at least if you are using the public Galaxy server).

    Edit: to avoid the potential problem with Ns, just use some other non-nucleotide character as the prefix (and allow it in the regex on line 161 instead of N)!

      Thank you frozenwithjoy. I should give this a go too. We are thinking about running our own Galaxy server on EC2, so the revised fastx_barcode_splitter might come in handy. BrowserUk's script below works super fast; I am not sure how fastx_barcode_splitter.pl would compare.
