Re^2: Fast Identification Of String Difference

in reply to Re: Fast Identification Of String Difference
in thread Fast Identification Of String Difference

Dear BrowserUK,
Store in memory for further processing.

---
neversaint and everlastingly indebted.......

Replies are listed 'Best First'.

Re^3: Fast Identification Of String Difference
by BrowserUk (Patriarch) on Jan 18, 2011 at 05:47 UTC

Could you give us more information about what processing you are going to do?

For example:

Will you use each triplet of data (c1,c2,p) in isolation?
Or do you need all the triplets from a given pair of strings all together?
Or do you need the triplets from different pairs of strings together?

The reason for these questions is that, whether done in Perl or C, allocating the memory in which to build the list of positions is a substantial part of the overall cost. If you only need each position in isolation, then an iterator interface might be more efficient.
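To make the list-versus-iterator distinction concrete, here is a rough Perl sketch of both shapes. It is only an illustration, not code from this thread: the names diff_list and diff_iter are made up, and it assumes two byte strings of equal length (the string XOR trick does not suit wide characters or the 'bitwise' feature).

```perl
use strict;
use warnings;

# Hypothetical helper: build the whole list of [c1, c2, p] triplets at once.
# Assumes $s1 and $s2 are byte strings of equal length.
sub diff_list {
    my ( $s1, $s2 ) = @_;
    my $mask = $s1 ^ $s2;    # "\0" wherever the two characters match
    my @triplets;
    while ( $mask =~ /[^\0]/g ) {
        my $p = $-[0];       # offset of this mismatch
        push @triplets, [ substr( $s1, $p, 1 ), substr( $s2, $p, 1 ), $p ];
    }
    return \@triplets;
}

# Hypothetical helper: return a closure that yields one (c1, c2, p) triplet
# per call and an empty list when there are no more differences.
sub diff_iter {
    my ( $s1, $s2 ) = @_;
    my $mask = $s1 ^ $s2;
    return sub {
        return unless $mask =~ /[^\0]/g;    # pos($mask) persists between calls
        my $p = $-[0];
        return ( substr( $s1, $p, 1 ), substr( $s2, $p, 1 ), $p );
    };
}

# Usage: consume the differences one at a time, with no list allocated.
my $next = diff_iter( 'ATGCATGC', 'ATGGATGA' );
while ( my ( $c1, $c2, $p ) = $next->() ) {
    print "$p: $c1 -> $c2\n";
}
```

The iterator never holds more than one triplet, so if each position is used in isolation the array of arrays in diff_list is pure overhead. If all the triplets for a pair are needed together, the list version is the simpler choice.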

Equally, are the pairs of strings you are comparing the same length? Or are you comparing short strings with (every?) substring of a large string? Are you comparing many short strings against (every) substring of larger strings?
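For the "short string against every substring" case, the shape of the loop changes. The following is a hedged sketch of what that might look like, not a claim about the actual application: mismatches_per_offset is a hypothetical name, and it again assumes byte strings.

```perl
use strict;
use warnings;

# Hypothetical helper: count the mismatches between $short and every
# same-length window of $long, one count per starting offset.
sub mismatches_per_offset {
    my ( $short, $long ) = @_;
    my $n = length $short;
    my @counts;
    for my $off ( 0 .. length($long) - $n ) {
        my $mask = $short ^ substr( $long, $off, $n );
        push @counts, ( $mask =~ tr/\0//c );    # count the non-NUL bytes
    }
    return \@counts;
}

# Usage: windows ABC, BCA, CAB, ABD compared against 'ABC'.
my $counts = mismatches_per_offset( 'ABC', 'ABCABD' );
print "@$counts\n";    # prints "0 3 3 1"
```

Here the character comparison itself is a single XOR per window. The substr copies and the growth of @counts are the surrounding costs, which is why the sizes and numbers of strings matter.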

The problem is that the basic mechanics of comparing the characters in two strings are very fast, especially in C. But the details of the code that surrounds that can have a big impact on the overall application time.

Rather than an extended to'n'fro of questions, it would be easier if you posted code or pseudo-code of the actual application, along with numbers and sizes of the strings involved.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

In Section Seekers of Perl Wisdom