How to count color pages in a PDF/Word doc using Java
I am looking to develop a desktop application using Java to count the number of colored pages in a PDF or Word file. This will be used as part of an overall system to calculate the cost of printing a document in terms of how many pages there are (color/B&W).
Ideally, the user of the application would use a file dialog to select the desired PDF/Word file, and the application would count and output the number of colored pages, allowing the system to automatically calculate the document cost accordingly.
For example, if A4 colored pages cost 50c per page to print and B&W pages cost 10c per page, the system should calculate the total cost of the document from its colored and B&W page counts.
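To make the arithmetic concrete, the costing step I have in mind is roughly the sketch below. The class and method names are placeholders, and the prices are just the example figures above (in cents); the real system would read them from configuration.

```java
class PrintCost {
    // Example prices in cents per A4 page (assumed values from the example above).
    static final int COLOR_CENTS_PER_PAGE = 50;
    static final int BW_CENTS_PER_PAGE = 10;

    /** Total printing cost in cents for the given page counts. */
    static int totalCents(int colorPages, int bwPages) {
        if (colorPages < 0 || bwPages < 0) {
            throw new IllegalArgumentException("page counts must not be negative");
        }
        return colorPages * COLOR_CENTS_PER_PAGE + bwPages * BW_CENTS_PER_PAGE;
    }

    public static void main(String[] args) {
        // A 100-page document with 12 colored pages: 12 * 50 + 88 * 10 cents.
        System.out.println(totalCents(12, 88) + " cents");
    }
}
```

The missing piece is the colored page count that feeds into `totalCents`.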
I am aware of existing software such as Rapid PDF Count http://www.traction-software.co.uk/rapidpdfcount/, but it is unsuitable for integration into the new system. I have also tried using Ghostscript/Python as per this solution: http://root42.blogspot.de/2012/10/counting-color-pages-in-pdf-files.html, but it takes too long (5 minutes to count a 100-page PDF) and would be difficult to implement as a desktop app.
Is there a method of counting the number of colored pages in a PDF or Word file using Java (or an alternative language)?
thanks
Although it might sound easy, this task is actually rather complicated.
One option is to use a program such as iText to walk every single token in the PDF, find the tokens that support color, and compare them to your definition of "black". However, that only covers the basic text and drawing commands. Images are a completely different beast, and you'll need to find an image parser or grab a copy of each spec and walk each of those.
One of the downsides of token walking is that you need to handle tokens that reference other things and then walk those tokens, too.
Another downside is that things can overlap each other, so you'd want to be aware of coordinates, z-index, transparency and such.
There are many more bumps in the road, but that's a start. What's interesting is that if you accomplish this, you'll find that you've partially built a PDF renderer!
Next, you'll need to define "black". Off the top of my head there's RGB black, CMYK black, grey black and maybe Lab black, along with the Pantones. That shouldn't be too hard, but if I were to build this I'd also want to know the "black ink usage" for shades of grey. There's also "rich black" that you might need to deal with, too!
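As a rough illustration of what such a definition could look like, here is a minimal Java sketch for RGB and CMYK values only. The class name, method names and tolerance parameters are all made up for this example, and the rules are simplifying assumptions: RGB counts as neutral when the three channels are nearly equal, and CMYK counts as neutral when C, M and Y are nearly equal or K is at full coverage (which is how this sketch lets a rich black pass as black). Lab and Pantone/spot colors are not handled, and a real print workflow may judge neutrality differently.

```java
class NeutralColor {
    /** RGB (0..255 per channel): neutral when the channels differ by at most the tolerance. */
    static boolean isNeutralRgb(int r, int g, int b, int tolerance) {
        int max = Math.max(r, Math.max(g, b));
        int min = Math.min(r, Math.min(g, b));
        return max - min <= tolerance;
    }

    /**
     * CMYK (0.0..1.0 per channel). Assumption: full K coverage counts as black
     * regardless of C, M, Y (covers rich black); otherwise C, M and Y must be
     * nearly equal so that they add no visible hue.
     */
    static boolean isNeutralCmyk(double c, double m, double y, double k, double tolerance) {
        if (k >= 1.0) {
            return true;
        }
        double max = Math.max(c, Math.max(m, y));
        double min = Math.min(c, Math.min(m, y));
        return max - min <= tolerance;
    }
}
```

With these rules, `isNeutralCmyk(0.6, 0.4, 0.4, 1.0, 0.01)` treats a rich black as black, while a pure cyan such as `isNeutralCmyk(1.0, 0.0, 0.0, 0.0, 0.01)` counts as color.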
So, all that said, I think the Ghostscript option that you found is your best bet. It literally renders the PDF and calculates the ink coverage from an RGB standpoint. It still should handle greys, too, but that shouldn't be too hard, and here's a starting point.
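If you would rather stay in Java, the same "render, then inspect" idea could be sketched as below. This is only an illustration under assumptions: it presumes each page has already been rendered to a `BufferedImage` by some other step (for example Ghostscript writing one PNG per page that you load with `ImageIO.read`, or a Java PDF rendering library), and that step is not shown. The class name, method names and tolerance value are invented for the example.

```java
import java.awt.image.BufferedImage;
import java.util.List;

class ColorPageCheck {
    /** Returns true if any pixel's R, G and B values differ by more than the tolerance. */
    static boolean hasColor(BufferedImage page, int tolerance) {
        for (int y = 0; y < page.getHeight(); y++) {
            for (int x = 0; x < page.getWidth(); x++) {
                int rgb = page.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int max = Math.max(r, Math.max(g, b));
                int min = Math.min(r, Math.min(g, b));
                if (max - min > tolerance) {
                    return true; // one colored pixel is enough, so stop early
                }
            }
        }
        return false; // only black, white and greys were found
    }

    /** Counts how many of the already-rendered page images contain color. */
    static int countColorPages(List<BufferedImage> pages, int tolerance) {
        int count = 0;
        for (BufferedImage page : pages) {
            if (hasColor(page, tolerance)) {
                count++;
            }
        }
        return count;
    }
}
```

A small non-zero tolerance (something like 8 out of 255) helps ignore anti-aliasing and compression noise around grey text and images. Rendering at a low resolution and stopping at the first colored pixel should also keep the scan itself cheap, although the rendering step will still dominate the run time.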