Add document type support for everything supported so far
This is largely existing code that previously lived either in the generic extractors or in ExtractorEngine, refactored so that everything related to a specific document type is in a single place.