Is there a general rule you might recommend for how to estimate how much to abstract, and how much db duplication to allow?
I recommend avoiding duplication. If you allow duplication, you also allow two or more copies of the same fact to disagree with each other, which is a form of data corruption (often called an update anomaly). There are ways to mitigate this risk, but none are as effective as disallowing duplication altogether.
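To make the risk concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (orders, customers, email) are made up for illustration and are not from your schema. The first function stores the email on every order row and updates only one of them; the second keeps the email in one place.

```python
import sqlite3


def duplicated_schema_conflict():
    """Email is copied onto every order row (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, email) VALUES (?, ?)",
        [("alice", "alice@old.example"), ("alice", "alice@old.example")],
    )
    # A partial update: nothing in the schema stops the copies from diverging.
    conn.execute("UPDATE orders SET email = 'alice@new.example' WHERE id = 1")
    rows = conn.execute(
        "SELECT DISTINCT email FROM orders WHERE customer = 'alice' ORDER BY email"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


def normalized_schema_single_source():
    """Email is stored once; orders only reference the customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT UNIQUE, email TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id))"
    )
    conn.execute(
        "INSERT INTO customers (id, name, email) VALUES (1, 'alice', 'alice@old.example')"
    )
    conn.executemany("INSERT INTO orders (customer_id) VALUES (?)", [(1,), (1,)])
    # One update changes the only copy, so every order sees the same value.
    conn.execute("UPDATE customers SET email = 'alice@new.example' WHERE id = 1")
    rows = conn.execute(
        "SELECT DISTINCT c.email FROM orders o "
        "JOIN customers c ON c.id = o.customer_id ORDER BY c.email"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]
```

In the first version alice ends up with two different emails and the database cannot tell you which one is right. In the second there is only one value to read, whichever order you start from.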
There's no single correct answer for the right level of abstraction. It depends on how the system will be used, taking into account performance, scalability, maintenance, and quality requirements. Compare, for example, OLTP and OLAP: a transactional system is usually kept highly normalised, while an analytical one often denormalises deliberately to make reads faster. It might help to think of your object model as a representation of your data suited to the functional requirements of your application or interface, whereas your database model should be designed to suit non-functional requirements such as data integrity and maximising query flexibility.
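As a rough sketch of that split between object model and database model, again with sqlite3 and invented names (customers, orders, total_cents, CustomerView), the tables below stay normalised while the application assembles whatever shape it needs at read time. This is one possible arrangement, not the only way to do it.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass
class CustomerView:
    """Object model: shaped for one screen of a hypothetical application."""
    name: str
    order_totals_cents: List[int]


def build_demo_db():
    """Database model: normalised, each fact stored once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customers(id), "
        "total_cents INTEGER)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
    conn.executemany(
        "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
        [(1, 1000), (1, 2550)],
    )
    return conn


def load_customer_view(conn, customer_id):
    """Assemble the application-facing object from the normalised tables."""
    name = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()[0]
    totals = [
        row[0]
        for row in conn.execute(
            "SELECT total_cents FROM orders WHERE customer_id = ? ORDER BY id",
            (customer_id,),
        )
    ]
    return CustomerView(name=name, order_totals_cents=totals)
```

The nested list in CustomerView is convenient for the application, but it is rebuilt from the tables on each read rather than stored, so there is no second copy to drift out of sync.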