In case you weren't aware, HTML 4 and later let you add a 'lang="xx"' attribute to both block-level and inline elements, which may make parsing a bit easier.
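To give a rough idea of what I mean, here's a minimal sketch in Python (standard library only) that walks a fragment and tags each piece of text with the nearest enclosing lang value. The names `LangCollector` and `collect` and the "und" default are just my own placeholders, and it assumes reasonably well-formed markup where every non-void tag is closed, so treat it as an illustration rather than a robust parser.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not be pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LangCollector(HTMLParser):
    """Collects (lang, text) pairs using the nearest enclosing lang attribute."""

    def __init__(self, default_lang="und"):
        super().__init__()
        self.stack = [default_lang]  # language in effect at each nesting level
        self.chunks = []             # list of (lang, text) tuples

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        # Inherit the parent's language when the element has no lang of its own.
        lang = dict(attrs).get("lang") or self.stack[-1]
        self.stack.append(lang)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append((self.stack[-1], text))


def collect(html, default_lang="und"):
    """Parse an HTML fragment and return its text grouped by language."""
    parser = LangCollector(default_lang)
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    sample = '<p lang="it">Ciao <span lang="en">hello</span> mondo</p>'
    for lang, text in collect(sample):
        print(lang, text)
```

The stack is what makes inline overrides work: an inner span with its own lang wins for its own text, and the outer language resumes once the span closes.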
One question for clarification: what should the system do if your user requests, for example, French, but the source document is Italian in origin, and has more translations for some 'chunks' (for want of a better term) in EN than FR?
For example: chunk 1 has IT and EN translations, chunk 2 has IT, EN and FR, and chunk 3 has IT only. Chunk 2 would obviously return the FR version, and chunk 3 the IT (as it's the only one available), but what about chunk 1? What would the user expect to see there?
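To make the question concrete, here's a small Python sketch of the three chunks above with the fallback order left as a parameter. Everything in it (the `pick_translation` name, the dict-per-chunk layout, the sample strings) is hypothetical and only there to illustrate the two possible answers for chunk 1; I'm not suggesting this is how your system stores things.

```python
def pick_translation(chunk, requested, fallbacks, source_lang):
    """Return (lang, text) for the first available language.

    Languages are tried in this order: the requested one, then each
    fallback in turn, then the document's source language.
    """
    for lang in [requested, *fallbacks, source_lang]:
        if lang in chunk:
            return lang, chunk[lang]
    raise KeyError("chunk has no text in any candidate language")


# Hypothetical data mirroring the scenario: the source document is Italian.
chunks = [
    {"it": "uno", "en": "one"},                # chunk 1: IT and EN
    {"it": "due", "en": "two", "fr": "deux"},  # chunk 2: IT, EN and FR
    {"it": "tre"},                             # chunk 3: IT only
]

if __name__ == "__main__":
    # Policy A: no intermediate fallback, go straight to the source language.
    print([pick_translation(c, "fr", [], "it")[0] for c in chunks])
    # Policy B: try English before falling back to the source language.
    print([pick_translation(c, "fr", ["en"], "it")[0] for c in chunks])
```

Chunks 2 and 3 come out the same under either policy; only chunk 1 changes (IT under policy A, EN under policy B), which is exactly the case I'm asking about. If the fallback order ends up being a per-user preference, passing it in like this keeps the lookup itself unchanged.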