<?xml version="1.0" encoding="windows-1252"?>
<node id="1006386" title="Re: get full hit descrption from blast output (xml)" created="2012-11-29 21:34:52" updated="2012-11-29 21:34:52">
<type id="11">
note</type>
<author id="44715">
graff</author>
<data>
<field name="doctext">
I want to help, but based on what you've posted, it's hard.  First, you didn't post a complete XML file -- I have to wonder if there's anything else missing besides a bunch of close tags at the end of your sample data.
&lt;P&gt;
Then, your command line doesn't really give us enough info.  What would be reasonable values for the "-n number_of_hits_to_keep" and "-b bit_score_cutoff" in order to get relevant results?  Also, your command line uses a "-d" where I think the script is expecting a "-t".
&lt;P&gt;
And I'm sorry to seem picky, but you should be able to find an easy way to get your indentation right (emacs? vi? some decent IDE or other programmer-savvy editor? perltidy?).  Trust me, it really helps.
&lt;P&gt;
Anyway, I did manage to get the output you reported, but I'm a stranger to [cpan://Bio::SearchIO], and I really can't tell what line or portion of the code actually has something to do with the "Hit_def" parameter in the xml file.
&lt;P&gt;
Just using a straightforward XPath extraction for "//Hit_def" on your (fixed) xml file does indeed return the full string you want - "43989.cce_0262 (Cyanothece ATCC 51142)".
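&lt;P&gt;
For reference, that XPath check can be sketched roughly as follows. This is only a minimal sketch: it assumes [cpan://XML::LibXML] is installed, the helper name hit_defs is made up for illustration, and the (fixed) xml file is passed as the first command-line argument:
&lt;c&gt;
use strict;
use warnings;
use XML::LibXML;

# Return the text of every Hit_def element in a parsed document.
# (hit_defs is a hypothetical helper name, not part of any library.)
sub hit_defs {
    my ($doc) = @_;
    return map { $_-&gt;textContent } $doc-&gt;findnodes('//Hit_def');
}

# Only parse a file when one is given on the command line.
if ( my $file = shift @ARGV ) {
    # BLAST xml usually carries a DOCTYPE pointing at a remote DTD;
    # these options keep the parser from trying to fetch it.
    my $doc = XML::LibXML-&gt;load_xml(
        location     =&gt; $file,
        load_ext_dtd =&gt; 0,
        no_network   =&gt; 1,
    );
    print "$_\n" for hit_defs($doc);
}
&lt;/c&gt;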
&lt;P&gt;
To figure out where that string ends up inside the Bio::SearchIO objects, I had to add this at the top of your script:
&lt;c&gt;
use Data::Dumper 'Dumper';
&lt;/c&gt;
Then I stepped through the script with the debugger until I got inside this block:
&lt;c&gt;
    while (my $hit = $result-&gt;next_hit) {
&lt;/c&gt;
Then my next debugger command was:
&lt;c&gt;
p Data::Dumper::Dumper($hit)
&lt;/c&gt;
Looking through the resulting output, I found the missing string -- see if you can find it too... Once you do, you should be able to figure out how to print it to your output file as desired:
&lt;c&gt;

$VAR1 = bless( {
                 '_hsps' =&gt; [
                              {
                                '-query_start' =&gt; 253,
                                '-algorithm' =&gt; 'BLASTX',
                                '-gaps' =&gt; '0',
                                '-hit_seq' =&gt; 'ITGAVCLMDYLEKVLEKLRELAQKLIETLLGPQ',
                                '-hit_length' =&gt; '65',
                                '-query_length' =&gt; '508',
                                '-query_desc' =&gt; 'HKUN3Y301D9XQX',
                                '-query_frame' =&gt; -1,
                                '-rank' =&gt; 1,
                                '-hit_desc' =&gt; '43989.cce_0262 (Cyanothece ATCC 51142)',
                                '-query_end' =&gt; 155,
                                '-hit_name' =&gt; 'gnl|BL_ORD_ID|1515029',
                                '-identical' =&gt; '17',
                                '-query_name' =&gt; 'Query_1',
                                '-evalue' =&gt; '0.00664016',
                                '-score' =&gt; '92',
                                '-conserved' =&gt; '27',
                                '-hit_frame' =&gt; 0,
                                '-hsp_length' =&gt; '33',
                                '-query_seq' =&gt; 'LRGAICSMEHIEEALGKLKDWARKLIELLLGPR',
                                '-hit_start' =&gt; '12',
                                '-homology_seq' =&gt; '+ GA+C M+++E+ L KL++ A+KLIE LLGP+',
                                '-hit_end' =&gt; '44',
                                '-bits' =&gt; '40.0466'
                              }
                            ],
                 '_iterator' =&gt; 0,
                 '_description' =&gt; '(Cyanothece ATCC 51142)',
                 '_significance' =&gt; '0.00664016',
                 '_query_length' =&gt; '508',
                 '_accession' =&gt; '1515029',
                 '_length' =&gt; '65',
                 '_psiblast_iteration' =&gt; '1',
                 '_name' =&gt; '43989.cce_0262',
                 '_rank' =&gt; 1,
                 '_algorithm' =&gt; 'BLASTX',
                 '_root_verbose' =&gt; 0,
                 '_hashes' =&gt; {
                                '0' =&gt; 1
                              },
                 '_hsp_factory' =&gt; bless( {
                                            'interface' =&gt; 'Bio::Search::HSP::HSPI',
                                            'type' =&gt; 'Bio::Search::HSP::GenericHSP',
                                            '_loaded_types' =&gt; {
                                                                 'Bio::Search::HSP::GenericHSP' =&gt; 1
                                                               },
                                            '_root_verbose' =&gt; 0
                                          }, 'Bio::Factory::ObjectFactory' )
               }, 'Bio::Search::Hit::GenericHit' );
&lt;/c&gt;</field>
<field name="root_node">
1006367</field>
<field name="parent_node">
1006367</field>
</data>
</node>
