in reply to Matching complementary base pairs from 2 different files
Hello Meetali16, and welcome to the Monastery!
I assume that the “Result” you give is the output you want to receive? If so, it’s difficult to see why some matches are included and others excluded. Looking at the rsid fields doesn’t help, since all lines in both input files have the same rsid. Matching base pairs between the two files would then result in 6 matches:
rs492602 CC GG Vitamin B12 deficiency FUT2 Higher levels of vitamin B12
rs492602 CC CC Vitamin B12 deficiency FUT2 Higher levels of vitamin B12
rs492602 CT GG Vitamin B12 deficiency FUT2 Normal levels of vitamin B12
rs492602 CT CC Vitamin B12 deficiency FUT2 Normal levels of vitamin B12
rs492602 TT GG Vitamin B12 deficiency FUT2 Normal levels of vitamin B12
rs492602 TT CC Vitamin B12 deficiency FUT2 Normal levels of vitamin B12
— unless the asterisks are significant? Please clarify.
In the meantime, I will propose a general strategy: Decide which of the two input files is likely to be shorter, and read the contents of that file into a hash. (The format for the hash will depend on the type of matching you require.) Then read the larger file, line by line, extracting its rsid and base pair fields and matching against the appropriate fields in the hash. Hash lookup is one of the areas where Perl really shines.
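Here is a minimal sketch of that strategy, not a finished solution. Since I don't know your real file layout, everything specific is an assumption: the sample data (held in strings so the script runs as-is), the whitespace-separated "rsid, then genotype" field order, the names %seen and @matches, and above all the matching rule, which here accepts either an identical genotype or its base-wise complement (A<->T, C<->G).

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two input files. The field
# layout (whitespace-separated: rsid, genotype, ...) is an assumption.
my $small_data = "rs492602 GG\nrs492602 CC\n";
my $large_data = "rs492602 CC Higher\nrs492602 CT Normal\nrs0000001 TT Normal\n";

# Step 1: read the (presumably) shorter file into a hash keyed on rsid,
# whose value is the set of genotypes seen for that rsid.
my %seen;
open my $small, '<', \$small_data or die "Cannot open small data: $!";
while (my $line = <$small>) {
    chomp $line;
    my ($rsid, $genotype) = split ' ', $line;
    next unless defined $genotype;          # skip blank or malformed lines
    $seen{$rsid}{$genotype} = 1;
}
close $small;

# Step 2: read the larger file line by line and look each record up.
my @matches;
open my $large, '<', \$large_data or die "Cannot open large data: $!";
while (my $line = <$large>) {
    chomp $line;
    my ($rsid, $genotype) = split ' ', $line;
    next unless defined $genotype;

    # Skip rsids that never appeared in the smaller file.
    my $known = $seen{$rsid} or next;

    # Assumed matching rule: the genotype itself or its complement.
    (my $complement = $genotype) =~ tr/ACGT/TGCA/;
    push @matches, $line
        if $known->{$genotype} || $known->{$complement};
}
close $large;

print "$_\n" for @matches;
printf "Found %d matches\n", scalar @matches;
```

With real files you would replace the \$small_data and \$large_data references with the file names. Only the smaller file is held in memory, and each line of the larger file costs a couple of hash lookups, so the whole thing runs in a single pass over each file.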
I notice you increment $i on each match, but never use it. I’m guessing you want to:
print "Found $i matches\n";
at the end of the script?
Please clarify what you are trying to achieve, to make it easier for the Monks to help you — and please remember that most of us are not biologists!
Cheers,
Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,