Commit 919a4f29 authored by Volker Krause

Consider the PDF to plain text fallback when determining extractors

Otherwise the external extractor would miss extractors that rely on this
fallback and match against outside context, such as MIME headers.
parent 7427125d
@@ -52,6 +52,13 @@ void ExternalProcessor::preExtract(ExtractorDocumentNode &node, const ExtractorE
{
    std::vector<const AbstractExtractor*> extractors;
    engine->extractorRepository()->extractorsForNode(node, extractors);
    // consider the implicit conversion to text/plain the PDF processor can do
    if (node.mimeType() == QLatin1String("application/pdf")) {
        node.setMimeType(QStringLiteral("text/plain"));
        engine->extractorRepository()->extractorsForNode(node, extractors);
        node.setMimeType(QStringLiteral("application/pdf"));
    }
    QStringList extNames;
    extNames.reserve(extractors.size());
    std::transform(extractors.begin(), extractors.end(), std::back_inserter(extNames), [](auto ext) { return ext->name(); });
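
For illustration only, the lookup pattern in this hunk can be sketched with plain standard-library types. Everything below is hypothetical: `Node`, `Extractor`, `extractorsForNode`, `MimeTypeOverride` and `collectExtractorNames` are stand-ins invented for this sketch, not the KItinerary classes, and matching is reduced to a plain MIME type comparison. The sketch also swaps the manual set/reset of the MIME type for a small RAII guard, which is an alternative and not what the commit does.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a document node; only the MIME type matters here.
struct Node {
    std::string mimeType;
};

// Hypothetical stand-in for an extractor with a single MIME type filter.
struct Extractor {
    std::string name;
    std::string mimeType;
};

// Appends every extractor whose filter matches the node's current MIME type.
void extractorsForNode(const Node &node, const std::vector<Extractor> &all,
                       std::vector<const Extractor*> &out)
{
    for (const auto &ext : all) {
        if (ext.mimeType == node.mimeType) {
            out.push_back(&ext);
        }
    }
}

// Sets a temporary MIME type and restores the original one on destruction.
class MimeTypeOverride {
public:
    MimeTypeOverride(Node &node, std::string tmp)
        : m_node(node)
        , m_saved(std::exchange(node.mimeType, std::move(tmp)))
    {
    }
    ~MimeTypeOverride() { m_node.mimeType = std::move(m_saved); }
    MimeTypeOverride(const MimeTypeOverride&) = delete;
    MimeTypeOverride &operator=(const MimeTypeOverride&) = delete;

private:
    Node &m_node;
    std::string m_saved;
};

// Collects extractor names for the node, also considering the text/plain
// fallback when the node is a PDF.
std::vector<std::string> collectExtractorNames(Node &node, const std::vector<Extractor> &all)
{
    std::vector<const Extractor*> extractors;
    extractorsForNode(node, all, extractors);
    if (node.mimeType == "application/pdf") {
        MimeTypeOverride guard(node, "text/plain");
        extractorsForNode(node, all, extractors);
    } // guard restores "application/pdf" here

    std::vector<std::string> names;
    names.reserve(extractors.size());
    for (const auto *ext : extractors) {
        names.push_back(ext->name);
    }
    return names;
}
```

With a PDF node and extractors registered for `application/pdf`, `text/plain` and `text/html`, this sketch returns the first two names and leaves the node's MIME type unchanged afterwards. The guard mainly helps if the second lookup can return early or throw; for the straight-line code in the commit, the explicit set/reset is equivalent.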