First, the PDFs need to be OCRed, and the data you want to extract must show correctly in the text only PDF View (this is only to verify visually that the PDFE text extractor works with your OCRed PDFs).
If that's the case, you can use the Task Automation Folders tool to set a folder monitor on a work folder. Every time a new PDF is saved to that monitored folder, the monitor triggers a script that extracts the data, commits it to PDF metadata fields, and then moves the edited PDF to a processed files folder. How the script extracts the data depends on the specifics of your PDFs, but generally this can be accomplished with regular expressions and/or the scripts API Page.TextEx object, which provides font information and text position on the PDF page (check the "Page TextEx example" script in the My Scripts batch tool for more info about this object).
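To give an idea of the regular expression part, here is a minimal sketch in Python. It is not a PDFE script and does not use the PDFE scripts API; it only illustrates the matching logic. The sample text, the field names and the patterns are all hypothetical and would need to be adapted to your own PDFs.

```python
import re

# Hypothetical OCR text of one page; the real layout depends on your PDFs.
SAMPLE_TEXT = """ACME Supplies Ltd
Invoice No: INV-2023-0042
Date: 15/03/2023
Total: 1,234.56 EUR
"""

# Hypothetical field names and patterns; adjust them to the labels in your documents.
PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s+No[.:]?\s*([A-Z0-9-]+)", re.IGNORECASE),
    "date": re.compile(r"Date[.:]?\s*(\d{2}/\d{2}/\d{4})", re.IGNORECASE),
    "total": re.compile(r"Total[.:]?\s*([\d.,]+)", re.IGNORECASE),
}


def extract_fields(text):
    """Return a dict of field name -> first captured value, or "" when not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else ""
    return fields


if __name__ == "__main__":
    for name, value in extract_fields(SAMPLE_TEXT).items():
        print(f"{name}: {value}")
```

Fields that are not found come back as empty strings, so a missing value is easy to spot later in the exported data.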
In the Task Automation Folders tool, the script is called through the "rename file" task, by referencing the script name in the rename formula. The script must return the new full path file name, so that the rename file task moves the file to the processed files folder.
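The value the script has to return is simply the destination folder joined with the original file name. A small Python sketch of that step follows, again only as an illustration and not as PDFE script code; the folder and file names are hypothetical.

```python
from pathlib import PureWindowsPath


def processed_path(source_path, processed_folder):
    """Build the full path the file should have after being moved.

    Keeps the original file name and swaps only the folder part.
    """
    file_name = PureWindowsPath(source_path).name
    return str(PureWindowsPath(processed_folder) / file_name)


if __name__ == "__main__":
    # Hypothetical monitored and processed folders.
    print(processed_path(r"C:\Work\Inbox\invoice_0042.pdf", r"C:\Work\Processed"))
```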
If you want, attach some samples of these PDFs, already OCRed, to a forum reply (or send them directly to me), and let me know which data you want to extract and into which PDF metadata fields. I will then try to develop a script showing how all this can be accomplished.
After the data is extracted to metadata fields, you just need to provide your tax adviser with a .csv file created with the Export grid fields tool, along with the PDFs. Each .csv line will reference the PDF file name and its relevant extracted data fields.
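For a rough idea of what the recipient can do with such a file, here is a short Python sketch that reads one row per PDF. The column names, the comma delimiter and the sample values are assumptions made for illustration; the actual layout depends on the fields you choose to export.

```python
import csv
import io

# Hypothetical export: one header line, then one line per PDF.
SAMPLE_CSV = """FileName,InvoiceNumber,Date,Total
invoice_0042.pdf,INV-2023-0042,15/03/2023,1234.56
invoice_0043.pdf,INV-2023-0043,18/03/2023,98.00
"""


def load_rows(csv_text):
    """Parse the CSV text into a list of dicts keyed by the header names."""
    return list(csv.DictReader(io.StringIO(csv_text)))


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    for row in rows:
        print(row["FileName"], row["InvoiceNumber"], row["Total"])
    # Sum the hypothetical Total column across all listed PDFs.
    print(round(sum(float(row["Total"]) for row in rows), 2))
```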