Topic: Splitting by bookmarks (undo merge -BookmarkAll)

conrad.drake

  • Newbie
  • *
  • Posts: 35
Splitting by bookmarks (undo merge -BookmarkAll)
« on: November 05, 2015, 08:11:58 AM »
I've just been handed several large PDFs that I need to split into individual files. Fortunately, it looks like someone used PDF Shell Tools to merge them, as each sub-file has its own bookmark.

But as there's no "unmerge -BookmarkAll", I'm going to have to reach for PDFtk to extract the bookmark information and write an MS-DOS batch script that calls PDF Shell Tools with the right start/end pages and titles.
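Something like this (typed at the command prompt; inside a batch file the loop variables are written %%i, %%j and %%k):
for /F "eol=; tokens=1,2*" %i in (bookmarks.txt) do PDFShellTools Split -s SplitRules=%i-%j "OutputFilename=%k" BigMergedFile.pdf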

If there isn't a way to do this with the API (which I haven't checked exhaustively), could you please consider adding it at some point in the future?
Thanks!
[Edit: fixed a bug in the command above. %k is created automatically by the tokens=1,2* option.]
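
For anyone unsure how the FOR /F line above picks the file apart, here is a rough Python sketch of the same tokenising. It is only an illustration: the function name build_commands is made up for this example, and the PDFShellTools command line is copied from the post as-is, not checked against the PDF Shell Tools documentation. Lines starting with ";" are skipped (eol=;), the first two whitespace-separated tokens become the page range (%i and %j), and the rest of the line becomes the title (%k).

```python
def build_commands(rules_text, pdf="BigMergedFile.pdf"):
    """Mimic: for /F "eol=; tokens=1,2*" %i in (bookmarks.txt) do ...

    build_commands is a hypothetical helper name; the command string it
    produces is the one quoted in the post, unverified.
    """
    commands = []
    for raw in rules_text.splitlines():
        line = raw.strip()
        # FOR /F skips blank lines and lines that start with the eol character.
        if not line or line.startswith(";"):
            continue
        # tokens=1,2* : token 1, token 2, then the remainder of the line.
        parts = line.split(None, 2)
        if len(parts) < 3:
            # Deviation from cmd.exe: skip incomplete lines instead of
            # running the command with an empty title.
            continue
        first, last, title = parts
        commands.append(
            f'PDFShellTools Split -s SplitRules={first}-{last} '
            f'"OutputFilename={title}" {pdf}'
        )
    return commands


# Small made-up input in the same layout as bookmarks.txt.
sample = "; 1st page   Last   Title\n1   2   Table Of Contents\n3   24   Some Title\n"
for command in build_commands(sample):
    print(command)
```

Printing the commands first, rather than running them, is a cheap way to check the page ranges and titles before splitting a large file.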

RTT

  • Administrator
  • *****
  • Posts: 918
Re: Splitting by bookmarks (undo merge -BookmarkAll)
« Reply #1 on: November 05, 2015, 10:42:08 PM »
Not possible with the current version, because the bookmarks script API object still lacks access to the target page reference. This is something I will try to add to the split tool and the scripting API in an upcoming release. ;)

Thanks for pointing out the need for this functionality.

conrad.drake

  • Newbie
  • *
  • Posts: 35
Re: Splitting by bookmarks (undo merge -BookmarkAll)
« Reply #2 on: November 06, 2015, 01:31:16 AM »
Thanks. PDF Labs have managed to work this out in PDFtk. It's GPL, so you may get some hints from how they've implemented it (I have no idea!)

I ran

pdftk "BigMergeFile.pdf" dump_data > info.txt

This generates a whole bunch of data like the following:

BookmarkBegin
BookmarkTitle: TCDL-801198_A
BookmarkLevel: 1
BookmarkPageNumber: 84

plus the all-important total number of pages:

NumberOfPages: 1472

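A bit of mucking about in Excel (or your favourite scripting language) can turn this into text like the following, where each section's last page is the page before the next bookmark's first page, and the final section ends at NumberOfPages: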
; 1st page   Last   Title
1   2   Table Of Contents
3   24   BE847-AK-BOM-100_1_Reviewed_AC
25   77   B8F47-AK-SPC-100_1_Reviewed_AC
78   83   B84G7-TK-BLD-101.001_1_IPX Reviewed_AC
84   89   TCDL-871198_A

which can be fed into the FOR loop above
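
For the "favourite script language" route, here is a minimal Python sketch of that conversion. It assumes the dump_data layout shown above (BookmarkBegin records followed by BookmarkTitle, BookmarkLevel and BookmarkPageNumber lines, plus one NumberOfPages line) and that each top-level bookmark starts on a different page. The function name bookmark_ranges and the sample titles are made up for illustration.

```python
def bookmark_ranges(dump_text, level=1):
    """Return (first_page, last_page, title) for each bookmark at `level`.

    bookmark_ranges is a hypothetical helper; it only relies on the
    dump_data fields quoted in this thread.
    """
    total_pages = None
    bookmarks = []   # one dict per BookmarkBegin record
    current = None
    for raw in dump_text.splitlines():
        line = raw.strip()
        if line == "BookmarkBegin":
            current = {}
            bookmarks.append(current)
            continue
        # Split on the first colon only, so titles may contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if key == "NumberOfPages":
            total_pages = int(value)
        elif current is not None and key in (
            "BookmarkTitle", "BookmarkLevel", "BookmarkPageNumber"
        ):
            current[key] = value
    if total_pages is None:
        raise ValueError("NumberOfPages not found in dump_data output")

    # Keep bookmarks at the requested level that point at a real page.
    starts = [
        (int(b["BookmarkPageNumber"]), b["BookmarkTitle"])
        for b in bookmarks
        if {"BookmarkTitle", "BookmarkLevel", "BookmarkPageNumber"} <= b.keys()
        and int(b["BookmarkLevel"]) == level
        and int(b["BookmarkPageNumber"]) >= 1
    ]
    starts.sort(key=lambda item: item[0])

    ranges = []
    for i, (first, title) in enumerate(starts):
        # A section ends just before the next one starts; the last one
        # runs to the end of the document.
        last = starts[i + 1][0] - 1 if i + 1 < len(starts) else total_pages
        ranges.append((first, last, title))
    return ranges


# Made-up sample in the dump_data layout; real input would be info.txt.
sample = """NumberOfPages: 30
BookmarkBegin
BookmarkTitle: Table Of Contents
BookmarkLevel: 1
BookmarkPageNumber: 1
BookmarkBegin
BookmarkTitle: First Document
BookmarkLevel: 1
BookmarkPageNumber: 3
"""
print("; 1st page\tLast\tTitle")
for first, last, title in bookmark_ranges(sample):
    print(f"{first}\t{last}\t{title}")
```

To use it on the real file, read info.txt into a string, pass it to bookmark_ranges, and write the printed lines to bookmarks.txt. Tab-separated output works with the FOR loop because FOR /F treats both spaces and tabs as delimiters by default.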

RTT

  • Administrator
  • *****
  • Posts: 918
Re: Splitting by bookmarks (undo merge -BookmarkAll)
« Reply #3 on: December 23, 2015, 02:43:26 AM »
With the 2.6.3 minor version release, it is now possible to split by top-level bookmarks directly from the split/extract pages tool. Alternatively, the new DestPageIndex property, added to the scripting API Bookmark object, can be used from any operation that needs a bookmark's destination page index.

conrad.drake

  • Newbie
  • *
  • Posts: 35
Re: Splitting by bookmarks (undo merge -BookmarkAll)
« Reply #4 on: February 29, 2016, 05:12:44 AM »
Thanks!  Top level split works as advertised.