It would be very easy to add a test to verify that the character at the position to be edited is what the edit expects. The additional effort would be paid back the first time it finds an example of inconsistent data.
Whilst it would be easy to add, it is almost certainly unnecessary.
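A minimal sketch of such a test, in Python purely for illustration. The edit record layout (a 1-based position, the character the edit expects to find, and its replacement) and the name `apply_edit` are assumptions, not the format or code from the thread:

```python
def apply_edit(seq, pos, expected, replacement):
    """Replace the character at 1-based `pos`, but only if it is `expected`.

    Hypothetical helper: the parameters are an assumed edit layout.
    """
    idx = pos - 1  # edit positions assumed 1-based; Python strings are 0-based
    if not 0 <= idx < len(seq):
        raise IndexError(f"position {pos} is outside a sequence of length {len(seq)}")
    if seq[idx] != expected:
        # Inconsistent data: the sequence does not hold what the edit expects.
        raise ValueError(f"expected {expected!r} at position {pos}, found {seq[idx]!r}")
    return seq[:idx] + replacement + seq[idx + 1:]
```

Calling `apply_edit("ACGT", 2, "C", "G")` returns `"AGGT"`, while `apply_edit("ACGT", 2, "T", "G")` raises a `ValueError` instead of silently editing the wrong base.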
When you know how these edit lists are produced, you realise that the sequences being edited were (half of) the input to the processing that produced the edit list. Thus with real data, if the sequence name/id -- which tends to look like uc002yje.1 chr21:13973492-13976330 or 32_Illumina_Multiplexing_PCR_Primer_1.01 or ceti albus: chrom 1 or SVN001-12|RMNH.ARA.14133|ANA0001|CP|M etc. -- is found in the hash, then the likelihood that the edit file will contain a different initial character at the specified position is very small indeed.
Perhaps the worst that could happen is that the post-edited sequence file could be (re)paired with the same edit file and re-run. The result would be that the entire file would be "edited" again, and the resulting output file would be identical to the input file. I.e. no harm done.
What my code did lack was a check that each sequence named in the edit file actually exists in the hash built from the sequence file, and some handling for when it does not. A missing sequence would almost certainly indicate that the wrong edit file was being paired with the sequence file (or vice versa).
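A rough sketch of that missing check, again in Python for illustration only. The `sequences` dict (id to sequence string), the `(seq_id, pos, replacement)` edit tuples with 1-based positions, and the name `apply_edits` are all assumptions rather than the original code:

```python
def apply_edits(sequences, edits):
    """Apply edits in place; return the ids named in edits but absent from sequences.

    Hypothetical layout: `sequences` maps id -> sequence string, and each
    edit is (seq_id, 1-based position, replacement character).
    """
    missing = []
    for seq_id, pos, replacement in edits:
        if seq_id not in sequences:
            # Probably the wrong edit file for this sequence file (or vice versa).
            missing.append(seq_id)
            continue
        seq = sequences[seq_id]
        idx = pos - 1  # convert the 1-based position to a 0-based index
        sequences[seq_id] = seq[:idx] + replacement + seq[idx + 1:]
    return missing
```

Whether to skip and report the unknown ids, as here, or to abort on the first one is a design choice; with mismatched files nearly every id will be missing, so stopping early may be the kinder behaviour.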
But then, my purpose was (as always) to provide the OP with the minimum demonstration that would explain the problem he was asking about -- in this case his misconception regarding 0-based and 1-based indexing -- and not production level, ready-to-run code.
That said, the addition of the consistency check would do no harm either :)
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
In the absence of evidence, opinion is indistinguishable from prejudice.