This is strictly an I/O-bound process, so threading is unlikely to help you much here. The ruling constraint is how fast the (one?) disk drive can spin and move its read/write head assembly around ... and, particularly, how often the read/write head is obliged to move from one place to another. This is why you sometimes encounter the at-first counter-intuitive finding that “adding threads makes it slower”: the extra threads essentially randomize the pattern of back-and-forth movements the read/write heads must make, and greatly increase the “churn” of the operating system’s buffers.
What I think you really, really want to do here is to use, say, a SQLite database file (or files), indexing the data so that you do not in fact have to read every line to find what might be a match. Indexes do not have to be perfect in order to be useful. Any strategy that reduces the number of records that must actually be examined, by any means whatever, is going to be worthwhile: in this case, you simply want to separate the data into meaningful clumps so that you only need to iterate through one of them.
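To make the indexing idea concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table name `records`, the columns `key` and `payload`, and the sample rows are all hypothetical; substitute whatever field you actually match on. It uses an in-memory database only so that it runs as-is; in practice you would pass a file name to `sqlite3.connect`.

```python
import sqlite3

def build_index(conn, rows):
    """Load (key, payload) rows into a table and index the key column.

    Table and column names are illustrative assumptions, not a required schema.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS records (key TEXT, payload TEXT)")
    conn.executemany("INSERT INTO records (key, payload) VALUES (?, ?)", rows)
    # The index lets SQLite jump to matching rows instead of scanning them all.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_records_key ON records (key)")
    conn.commit()

def find_matches(conn, key):
    """Return the payloads whose key equals `key`, in insertion order."""
    cur = conn.execute(
        "SELECT payload FROM records WHERE key = ? ORDER BY rowid", (key,)
    )
    return [row[0] for row in cur]

# Hypothetical sample data standing in for lines of the big file.
conn = sqlite3.connect(":memory:")
build_index(conn, [("apple", "line 1"), ("pear", "line 2"), ("apple", "line 3")])
matches = find_matches(conn, "apple")
print(matches)
```

Even a coarse key (say, the first few characters of the field) would serve the same purpose: it narrows the search to one clump, and you do the exact comparison only within that clump.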
Another useful strategy, especially if you are “clumping,” is to grab a big handful of records-to-be-looked-for into a memory structure such as a list, small enough to fit in real memory (i.e. without paging ...), so that you can make each I/O against the other file do more work for you. Having spent the I/O time to retrieve the record (and its neighbors), you can compare it against the entire handful without incurring more I/O cost. But if the structure is so large, given all the other processes that may be running, that it does cause paging, then you have just incurred a hidden I/O cost that can be quite debilitating: a too-large-to-fit yet frequently accessed group of pages, which causes thrashing.
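A rough sketch of the “handful” approach follows, under some assumptions of mine: each record is one line, a match means exact equality, and `open_haystack` is a hypothetical callable that returns a fresh iterator over the big file's lines (for a real file, something like `lambda: open(path)`). I have used a `set` rather than a list for the handful, since membership tests against a set are much cheaper; the function names and `batch_size` are illustrative only.

```python
from itertools import islice

def batches(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def find_in_batches(needles, open_haystack, batch_size):
    """For each handful of needles, make one sequential pass over the haystack.

    Every line read is checked against the whole handful at once, so the
    number of passes is roughly len(needles) / batch_size instead of
    one pass per needle.
    """
    found = []
    for chunk in batches(needles, batch_size):
        wanted = set(chunk)  # the in-memory handful
        for line in open_haystack():
            record = line.rstrip("\n")
            if record in wanted:
                found.append(record)
    return found

# Hypothetical data: a list of lines stands in for the big file.
haystack_lines = ["a\n", "b\n", "c\n", "d\n"]
result = find_in_batches(
    ["d", "a", "x", "b"], lambda: iter(haystack_lines), batch_size=2
)
print(result)
```

Pick `batch_size` so that the handful stays comfortably inside real memory; the larger it can be without paging, the fewer passes over the big file you pay for.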