It sounds like you know what algorithms you want to use, but you're fighting the infrastructure -- sometimes very simple ideas end up surrounded by huge amounts of I/O and user interface code.
During development, it helps to forget about I/O and just use something like Data::Dumper or Storable. Those modules can save whatever data structures you use -- including nested and recursive hashes. You don't have to fight the file I/O until you've solved your application problems. Your data structures remain fluid.
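As a minimal sketch of that idea, the snippet below saves a nested (and self-referencing) hash with Storable and reads it back. The `%state` layout, its keys, and the temp file are made up for illustration; only `store` and `retrieve` come from Storable itself.

```perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Hypothetical working data; the keys and layout are assumptions.
my %state = (
    sentences => [ [qw(I am unhappy)], [qw(my mother hates me)] ],
    keywords  => { mother => { rank => 3, topic => 'family' } },
);
$state{self} = \%state;    # a recursive reference, just to show it survives

# Scratch file for the demo; a real program would pick its own path.
my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

store(\%state, $file);         # write the whole structure in one call
my $copy = retrieve($file);    # read it back as a hash reference

print $copy->{sentences}[1][1], "\n";          # mother
print $copy->{keywords}{mother}{topic}, "\n";  # family
```

If you'd rather have a human-readable dump while debugging, `print Dumper(\%state)` from Data::Dumper does a similar job.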
In this case, I'd suggest you keep the annotations in separate data structures from the original input. For example, store each input sentence as an array of words, and use a second, parallel array of hashes to store the annotations (or one array per attribute if you're uncomfortable with hash references). Generic word dictionaries should be hashes keyed by word. Annotations for a specific input sentence can be indexed by word offset within that sentence. If you change the sentences, you might find splice() useful. Just remember to splice the corresponding annotation arrays in the same way so that the word offsets remain correct.
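Here's a rough sketch of that layout: one array of words, a parallel array of annotation hashes, and a helper that splices both together. The name `splice_annotated`, the annotation keys (`pos`, `topic`), and the sample sentence are all invented for the example.

```perl
use strict;
use warnings;

# One input sentence, stored as an array of words.
my @words = qw(my mother hates me);

# Parallel array of annotations, indexed by the same word offsets.
my @notes = (
    { pos => 'pronoun' },
    { pos => 'noun', topic => 'family' },
    { pos => 'verb' },
    { pos => 'pronoun' },
);

# Generic dictionary, keyed by word rather than by offset.
my %dictionary = (
    mother => { topic => 'family' },
    hates  => { mood  => 'negative' },
);

# Hypothetical helper: splice the words and their annotations together
# so the offsets stay in step. New words start with empty annotations.
sub splice_annotated {
    my ($words, $notes, $offset, $length, @new) = @_;
    my @removed = splice(@$words, $offset, $length, @new);
    splice(@$notes, $offset, $length, map { +{} } @new);
    return @removed;
}

# Replace "hates" (offset 2) with two words.
splice_annotated(\@words, \@notes, 2, 1, qw(really hates));

print "@words\n";              # my mother really hates me
print scalar(@notes), "\n";    # 5
```

Words before the splice point keep their offsets; everything after it shifts, which is exactly why both arrays have to be spliced with the same offset and length.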
BTW, you might want to check out some other Eliza programs in Perl:
merlyn's IRC Eliza article and