That functionality isn't directly available from the scripts API, but we can write a script that automates the ImageMagick tool to get that info.
The idea is to render each PDF page and analyze the resulting bitmaps for color content.
I ran some tests, and here is a sample script that creates a .csv file with "Color Pages Count" and "BW/Gray Pages Count" columns. It renders each PDF page, converts the resulting bitmap to the HSI colorspace and computes the mean value of the saturation channel. A page is considered color if this value is higher than 0, and BW/Gray otherwise. You can adjust this threshold to suit your needs.
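As a side note on that threshold: a page that is gray to the eye (a scan, for example) may still have a small non-zero mean saturation, so a value slightly above 0 may work better for some files. Here is a minimal sketch of how the test expression could be built for a chosen threshold. buildColorTestFormat is a hypothetical helper, not part of the script or of ImageMagick, and the 0.02 value is only an illustrative assumption.

```javascript
// Hypothetical helper (not part of the script or of ImageMagick):
// builds the -format expression for a given saturation threshold.
// threshold is the mean saturation (0..1) above which a page counts as color.
function buildColorTestFormat(threshold) {
    if (typeof threshold !== 'number' || isNaN(threshold) ||
        threshold < 0 || threshold >= 1) {
        throw new Error('threshold must be a number in the range [0, 1)');
    }
    // After "-colorspace HSI" the second channel ("g" in fx terms) holds
    // the saturation. toFixed(4) avoids exponent notation for small values,
    // so thresholds finer than 0.0001 are rounded.
    return '%[fx:mean.g>' + threshold.toFixed(4) + '?1:0]';
}

// Illustrative value only; tune it against your own files.
var formatArg = buildColorTestFormat(0.02);
```

In the script below, formatArg would take the place of the "%[fx:mean.g>0?1:0]" argument passed to imo.convert.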
// Add format function to the String prototype
// First, checks if it isn't implemented yet.
if (!String.prototype.format) {
    String.prototype.format = function() {
        var args = arguments;
        return this.replace(/{(\d+)}/g, function(match, number) {
            return typeof args[number] != 'undefined' ? args[number] : match;
        });
    };
}
var imo = new ActiveXObject("ImageMagickObject.MagickImage.1");
var fso = new ActiveXObject("Scripting.FileSystemObject");
var tmpfolder = fso.GetSpecialFolder(2 /*TemporaryFolder*/ );
var InfoFilename = tmpfolder + '\\PagesInfo.txt';
var CSVOutputFileName = tmpfolder + '\\' + fso.GetTempName();
var CSVOutputFile = fso.CreateTextFile(CSVOutputFileName, true, true);
//write header line to the csv file
CSVOutputFile.WriteLine('Filename,Status,"Pages Count","Color Pages Count","BW/Gray Pages Count"');
for (var i = 0; i < pdfe.SelectedFiles.Count; i++) {
    var file = pdfe.SelectedFiles(i);
    pdfe.echo('Processing ' + file.filename);
    try {
        //use imagemagick to render each pdf page, convert the result image colorspace
        //to HSI and output "1" if the mean of the saturation values is higher
        //than 0 (the page has color), and "0" if 0 (no color in the page)
        imo.convert(file.filename, "-colorspace", "HSI", "-format", "%[fx:mean.g>0?1:0]", "info:" + InfoFilename);
        //read the result info file, which contains a "0" or "1" for each page
        //in the PDF. E.g. 0110, for a 4-page PDF with pages 1 and 4 being bw/gray
        //and pages 2 and 3 having color.
        var f = fso.GetFile(InfoFilename);
        var fts = f.OpenAsTextStream();
        var info = fts.ReadAll();
        fts.Close();
        f.Delete();
        //Count the number of "1"
        var ColorPagesCount = info.split('1').length - 1;
        //Count the number of "0"
        var BWPagesCount = info.split('0').length - 1;
        pdfe.echo(file.filename + ': Color Pages Count = ' + ColorPagesCount + ', BW/Gray Pages Count = ' + BWPagesCount, 0, 2);
        CSVOutputFile.WriteLine('"{0}",OK,{1},{2},{3}'.format(file.filename, file.NumPages, ColorPagesCount, BWPagesCount));
    } catch (e) {
        pdfe.echo(file.filename + ': Error (' + e.message + ')', 0xff0000, 2);
        CSVOutputFile.WriteLine('"{0}",Failed'.format(file.filename));
    }
}
CSVOutputFile.Close();
var dialog = pdfe.SaveDialog;
dialog.DefaultExt = '.csv';
dialog.filter = 'CSV (*.csv)|*.csv';
dialog.Options = '[ofOverwritePrompt]';
//"file" still refers to the last processed PDF; suggest saving next to it
dialog.Filename = fso.GetParentFolderName(file.filename) + '\\PDFsInfo.csv';
if (dialog.execute) {
    if (fso.FileExists(dialog.Filename)) fso.DeleteFile(dialog.Filename);
    fso.MoveFile(CSVOutputFileName, dialog.Filename);
    var WshShell = WScript.CreateObject("WScript.Shell");
    //quote the path so that folders or names with spaces open correctly
    WshShell.Run('"' + dialog.Filename + '"');
} else {
    fso.DeleteFile(CSVOutputFileName);
}
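To make the counting step concrete, here is a small worked example of how the "0"/"1" flags from the info file translate into the two counts. countPageFlags is a hypothetical helper written for illustration only. The script above gets the same numbers with split(), and the assumption that other characters (a stray line break, for instance) can simply be ignored is mine.

```javascript
// Hypothetical helper: counts color and BW/gray pages from the string of
// "0"/"1" flags that ImageMagick writes to the info file (one flag per page).
function countPageFlags(info) {
    var color = 0;
    var bw = 0;
    for (var i = 0; i < info.length; i++) {
        var ch = info.charAt(i);
        if (ch === '1') {
            color++;
        } else if (ch === '0') {
            bw++;
        }
        // Any other character is ignored (assumption: it is not a page flag).
    }
    return { color: color, bw: bw };
}

// The 4-page example from the script comments: pages 2 and 3 have color.
var counts = countPageFlags('0110');
// counts.color is 2 and counts.bw is 2
```

Comparing counts.color + counts.bw with file.NumPages would be one way to spot a PDF where the rendering did not return a flag for every page.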
To test it, import the attached .myscript file into the PDF-ShellTools My Scripts. You will get a script named "Number of Color and BW/Gray pages", which you can invoke for all the selected PDF files from the Windows shell context menu for PDF files, under the PDF-ShellTools > My Scripts submenu.
The script needs the 32-bit version of the ImageMagick tool installed. I tested with the ImageMagick-7.0.5-5-Q16-x86-dll.exe installer. While installing, make sure you select the "Install ImageMagickObject OLE Control for VBScript,..." option, on the "additional tasks" page of the installer.
ImageMagick also needs the Ghostscript tool installed in order to handle the PDF format.
If the script performs as needed, we can change it to put the info into custom metadata properties, as you suggested.