1. 19 Feb, 2020 1 commit
  2. 15 Jan, 2020 1 commit
  3. 17 Nov, 2019 2 commits
  4. 11 Oct, 2019 1 commit
  5. 06 Oct, 2019 2 commits
  6. 05 Jul, 2019 6 commits
  7. 05 May, 2019 1 commit
  8. 10 Dec, 2018 1 commit
  9. 24 Nov, 2018 1 commit
  10. 29 Oct, 2018 1 commit
    • GBool -> bool · 22175dc4
      Albert Astals Cid authored
      GBool is just a typedef in current Poppler and is going away in
      upcoming releases, so this change makes the code future-proof.
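      A minimal sketch of why such a rename is safe, assuming the typedef
      really is a plain alias for bool as the message states. The names
      GBoolStandIn and isEven are hypothetical and exist only for this
      illustration; they are not Poppler or extractor identifiers.

```cpp
#include <cassert>
#include <type_traits>

// Local stand-in for the typedef described in the commit message.
// Deliberately named differently so it cannot clash with a real header.
typedef bool GBoolStandIn;

// A typedef introduces no new type, so both spellings name the same type.
static_assert(std::is_same<GBoolStandIn, bool>::value,
              "the alias and bool are the same type");

// Hypothetical function declared with the old spelling. Callers that
// store the result in a plain bool need no conversion at all.
inline GBoolStandIn isEven(int n)
{
    return n % 2 == 0;
}
```

      Since the alias and bool are the same type, switching call sites to
      bool changes no behavior and keeps them compiling once the alias is
      removed upstream.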
  11. 07 Oct, 2018 1 commit
  12. 30 Sep, 2018 1 commit
  13. 21 Sep, 2018 5 commits
    • Expose file size and image object ids · 4eff93a7
      Volker Krause authored
      Enables optimizations in the generic PDF extractor.
    • Load PDF pages only when needed · f23a1fcc
      Volker Krause authored
    • Rework how we handle multi-occurrence images · 25d1bbef
      Volker Krause authored
      Multiple occurrences, possibly with different transformations, are now
      listed as separate images. The expensive image decoding is still done
      only once, though, by using a document-wide image data map. This fixes
      a few correctness issues (e.g. when listing images for a page sub-rect)
      and, more importantly, brings this a big step closer to the discussed
      Qt image access API in Poppler.
      
      This may duplicate barcode decoding effort, but that is something to be
      handled one layer up in the extractor, by using an object-id-based
      mapping like the one we use for image decoding.
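      The decode-once idea described in this commit could be sketched roughly
      as follows. This is an illustration under assumptions, not the actual
      implementation: ImageOccurrence, ImageDataCache, the scale field and
      the string payload are all hypothetical stand-ins, and the decoder is
      injected so the sketch does not depend on Poppler.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical: one entry per place an image is drawn on a page.
struct ImageOccurrence {
    int objectId;   // id of the underlying PDF image object
    double scale;   // stand-in for the per-occurrence transformation
};

// Hypothetical document-wide map from object id to decoded image data.
class ImageDataCache {
public:
    using Decoder = std::function<std::string(int)>;

    explicit ImageDataCache(Decoder decoder)
        : m_decoder(std::move(decoder)) {}

    // Returns the decoded data for objectId, decoding on first access only.
    const std::string &data(int objectId)
    {
        auto it = m_data.find(objectId);
        if (it == m_data.end()) {
            ++m_decodeCount;
            it = m_data.emplace(objectId, m_decoder(objectId)).first;
        }
        return it->second;
    }

    // Number of times the (expensive) decoder was actually invoked.
    int decodeCount() const { return m_decodeCount; }

private:
    Decoder m_decoder;
    std::unordered_map<int, std::string> m_data;
    int m_decodeCount = 0;
};
```

      With a page listing three occurrences that refer to two distinct
      object ids, all three appear as separate images while the decoder runs
      only twice. The same keying by object id is what the message suggests
      for avoiding duplicate barcode decoding one layer up.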
    • Read page count directly from Poppler · 59ebf041
      Volker Krause authored
      This allows us to fill our page vector on demand.
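      One way to picture filling a page vector on demand is shown below. It
      is only a sketch: LazyPages, its loader callback and the string page
      payload are hypothetical, and the page count is assumed to be cheap to
      obtain from the PDF library without parsing any page content.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container that knows its size up front but loads lazily.
class LazyPages {
public:
    using Loader = std::function<std::string(std::size_t)>;

    // Only the count is needed here; every slot starts out empty.
    LazyPages(std::size_t pageCount, Loader loader)
        : m_pages(pageCount), m_loader(std::move(loader)) {}

    std::size_t pageCount() const { return m_pages.size(); }

    // Loads the page on first access and reuses the result afterwards.
    const std::string &page(std::size_t index)
    {
        assert(index < m_pages.size());
        std::unique_ptr<std::string> &slot = m_pages[index];
        if (!slot) {
            slot.reset(new std::string(m_loader(index)));
            ++m_loadCount;
        }
        return *slot;
    }

    // Number of pages that have actually been loaded so far.
    std::size_t loadCount() const { return m_loadCount; }

private:
    std::vector<std::unique_ptr<std::string>> m_pages;
    Loader m_loader;
    std::size_t m_loadCount = 0;
};
```

      Callers can iterate over pageCount() without triggering any loads,
      and a page that is never requested is never parsed.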
    • Remove the document-wide image access API · bf950350
      Volker Krause authored
      We only use the per-page image access API nowadays. This maps better to
      how Poppler does this, and it paves the way for full on-demand parsing
      of the PDF content.
  14. 13 Sep, 2018 1 commit
  15. 04 Sep, 2018 1 commit
  16. 24 Aug, 2018 1 commit
  17. 19 Aug, 2018 1 commit
    • Implement direct PDF image loading · 81b3d444
      Volker Krause authored
      This avoids re-parsing the document for each image retrieval. It is
      still disabled by default, as we first need Poppler patch 107617
      integrated to fix memory issues with the GfxImageColorMap. It's worth
      it though: it speeds up running the full extractor test suite by ~15%.
  18. 18 Aug, 2018 1 commit
  19. 10 Jul, 2018 1 commit
  20. 25 Jun, 2018 1 commit
  21. 06 May, 2018 1 commit
  22. 05 May, 2018 3 commits
  23. 30 Apr, 2018 5 commits