Re: get full hit description from blast output (xml) by graff (Chancellor)
on Nov 30, 2012 at 02:34 UTC
I want to help, but based on what you've posted, it's hard. First, you didn't post a complete XML file -- I have to wonder if there's anything else missing besides a bunch of close tags at the end of your sample data.
Then, your command line doesn't really give us enough info. What would be reasonable values for the "-n number_of_hits_to_keep" and "-b bit_score_cutoff" in order to get relevant results? Also, your command line uses a "-d" where I think the script is expecting a "-t".
And I'm sorry to seem picky, but you should be able to find an easy way to get your indentation right (emacs? vi? some decent IDE or other programmer-savvy editor? perltidy?). Trust me, it really helps.
Anyway, I did manage to get the output you reported, but I'm a stranger to Bio::SearchIO, and I really can't tell what line or portion of the code actually has something to do with the "Hit_def" parameter in the xml file.
Just using a straightforward XPath extraction for "//Hit_def" on your (fixed) xml file does indeed return the full string you want - "43989.cce_0262 (Cyanothece ATCC 51142)".
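For anyone who wants to repeat that XPath check, here is a minimal sketch. It assumes XML::LibXML is installed; the post does not say which XPath tool was actually used. The embedded XML is a heavily trimmed, hypothetical stand-in for a real BLAST report, and hit_defs is just an illustrative helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical, heavily trimmed stand-in for a BLAST XML report;
# a real report has many more elements around each <Hit>.
my $xml = <<'XML';
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_num>1</Hit_num>
          <Hit_def>43989.cce_0262 (Cyanothece ATCC 51142)</Hit_def>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
XML

# Return the text content of every Hit_def element in an XML string.
sub hit_defs {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml_string );
    return map { $_->textContent } $doc->findnodes('//Hit_def');
}

# Print one Hit_def per line.
print "$_\n" for hit_defs($xml);
```

To run it against a real report, pass `location => 'your_file.xml'` to `load_xml` instead of `string => ...` (the file name here is a placeholder).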
In order to figure it out, I had to add this at the top:
Then step through it with the debugger until I got inside this block:
Then my next debugger command was:
Looking through the resulting output, I found the missing string -- see if you can find it too... Once you do, you should be able to figure out how to print it to your output file as desired: