
law

Legal analysis and commentary.

Better document review in DEVONthink with Bates deeplinks and AppleScript.

Hey folks, in this post I'm sharing and explaining a custom script I wrote for DEVONthink that greatly assists me when I'm reviewing large batches of Bates-stamped documents (e.g., administrative records or discovery dumps).

Specifically, this script will:

  1. copy selected text in a PDF to the clipboard;
  2. determine the Bates number of the page the selected text is on;
  3. generate a Markdown-formatted Bates cite for that page linking back to the document, page, and specific passage of text that was selected; and
  4. append this Bates 'deeplink' to the clipboard.

ⓘ About DEVONthink

DEVONthink is comprehensive document management and productivity software exclusive to macOS that has seen continuous updates, improvements, and new features added for over two decades. The developer calls it "your paperless office." That certainly rings true for me—it's been my go-to work and study app since I first started using it in law school 7 years ago.

Example usage.

Suppose I'm reviewing the file FWS 065382–065397.pdf near the end of a 75,000-page administrative record (true story).

Suppose I find something really incriminating on the sixth page of that document—a candid email exchange where a staff biologist wrote: "If we do this, Franklin's Bumble Bee will go extinct." (fictitious example).

If I select that sentence in the PDF and use the keyboard shortcut I've assigned to my script, ⇧ + ⌥ + C, the following is placed in my system clipboard:

"If we do this, Franklin's Bumble Bee will go extinct." [FWS 65387](x-devonthink-item://883C7BE0-A328-4818-A4B5-3AF7E5504135?page=6&start=534&length=53&search=If%20we%20do%20this%2C%20Franklin%27s%20Bumble%20Bee%20will%20go%20extinct.). 

With Markdown rendering, that becomes, "If we do this, Franklin's Bumble Bee will go extinct." FWS 65387.

Now, if I open my Markdown editor of choice—Ulysses, Obsidian, or DEVONthink itself depending on the task at hand (more on that in a future post)—and paste (i.e., ⌘ v), the text I selected, plus the correct Bates cite with a deeplink back to the source sentence in the PDF, is inserted.

I can then at any time simply click the Bates cite and get right back to the exact point in that specific document where the incriminating statement is found.
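For the curious, here is roughly how a link like the one above can be assembled. This is a minimal Python sketch rather than the AppleScript itself; the function name `bates_deeplink` and its parameters are hypothetical, and the `page`, `start`, and `length` values are simply passed through in the same form as the example above.

```python
from urllib.parse import quote


def bates_deeplink(passage, label, item_uuid, page, start):
    """Return the quoted passage followed by a Markdown link back to it."""
    query = (
        f"page={page}"
        f"&start={start}"
        f"&length={len(passage)}"
        # Percent-encode everything, including spaces, commas, and apostrophes.
        f"&search={quote(passage, safe='')}"
    )
    url = f"x-devonthink-item://{item_uuid}?{query}"
    return f'"{passage}" [{label}]({url}).'


print(bates_deeplink(
    "If we do this, Franklin's Bumble Bee will go extinct.",
    "FWS 65387",
    "883C7BE0-A328-4818-A4B5-3AF7E5504135",
    page=6,
    start=534,
))
```

Called with the values from the example, this reproduces the clipboard text shown above, including `length=53` and the percent-encoded search string.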

Deeplinks are neat, right? Let's set it up.

Pre-requisites.

  1. This tutorial assumes you have a macOS computer with Perl and DEVONthink installed. The standard version of DEVONthink will work but I do recommend buying the Pro version, among other reasons, for easier OCR. The agencies I sue often transmit non-OCR'd documents and DEVONthink Pro is a godsend when that occurs.
  2. The documents you're looking at should be Bates-numbered (though my script has a fall-back mode for non-Bates numbered documents—more on that below).
  3. The documents should be named according to their Bates starting number or their Bates range. The script I wrote can handle any of these filename conventions:

FWS 000533.pdf
FWS-000533.pdf
FWS_000533.pdf
NMFS 002561-002985.pdf
BLM 45.pdf
AR_45-62.pdf

As you can see:

  • the agency prefix doesn't matter;
  • the separator between the agency prefix and the page number(s) can be a space, -, or _; and
  • the filename can include either the Bates number of the document's first page, or the Bates range of the complete document separated by -.
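To make those rules concrete, here is a minimal Python sketch of how a filename and a page position can be turned into a Bates cite. It is an illustration under assumptions, not the AppleScript's actual logic: the pattern `BATES_NAME` and the function `bates_cite` are hypothetical names, pages are counted from 1, and an en dash is tolerated in the range in addition to a hyphen.

```python
import re

# Hypothetical pattern: prefix, separator (space, "-", or "_"), first number,
# then an optional "-last" range (an en dash is accepted as well).
BATES_NAME = re.compile(
    r"^(?P<prefix>.+?)[ _-](?P<first>\d+)(?:[-\u2013](?P<last>\d+))?\.pdf$",
    re.IGNORECASE,
)


def bates_cite(filename, page):
    """Return e.g. 'FWS 65387' for a 1-based page, or None if the name is unsupported."""
    match = BATES_NAME.match(filename)
    if match is None:
        return None
    # The first page carries the starting number, so page n is first + n - 1.
    number = int(match["first"]) + page - 1
    return f"{match['prefix']} {number}"


print(bates_cite("FWS 065382-065397.pdf", 6))       # FWS 65387
print(bates_cite("AR_45-62.pdf", 1))                # AR 45
print(bates_cite("20170912 1813 Redacted.pdf", 1))  # None
```

Note that a pattern like this cannot tell a Bates number from any other trailing number: in this sketch a name such as Cayan et al 2008.pdf also matches, with 2008 read as the starting number.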

These rules make the script flexible enough to cover the file naming conventions I see most often in my legal practice. But I have seen others that my script couldn't feasibly be made to accommodate, for example:

20170912 1813 Redacted.pdf
20170922 0000 CSERC Scoping comment letter transmittal email.pdf
20200828 County of Tuolumne transmittal submission.pdf
Butler and Wooster 2003.pdf
California Resources 2020.pdf
Cayan et al 2008.pdf
Crozier et al 2006.pdf

If you're working with documents that are Bates-stamped but use a different file naming convention, like the examples above, the script won't work until you rename the files to a supported naming convention. But when you're wrangling a 2,215-document AR comprising over 75,000 pages (true story), it's infeasible to rename them manually. That's why I wrote:

A nifty helper script to handle Bates documents with nonconforming filenames.

This helper script, bates, will batch-rename folders full of documents while preserving their original filenames as metadata. The helper script, written in Python, is quite complex in its own right: it can handle PDFs with multiple text layers, non-OCR'd PDFs, DRM-protected PDFs, PDFs where Bates stamps are inserted as annotations, and a variety of Bates stamp formatting conventions.

Find it here →

I've written complete documentation on installing and using bates here, but the basic usage goes like this:

bates "~/Cases/My Big Case/Administrative Record" \
    --prefix "BLM AR " \
    --digits 6 \
    --name-prefix "BLM " \
    --log INFO

In this example, the script will take every PDF file in the folder ~/Cases/My Big Case/Administrative Record, and search the first and last page for Bates stamps formatted like BLM AR ######. Once it finds them, it will stash the original file name in the file's Finder comment metadata field, then rename the file BLM {first page Bates number}-{last page Bates number}.

With this naming convention the DEVONthink script will work.
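The renaming step itself can be pictured with a short Python sketch. This is not the bates source code: the function `bates_filename` is hypothetical, it takes page text that has already been extracted from the PDF, and it assumes the digits are kept exactly as stamped and that the .pdf extension is retained.

```python
import re


def bates_filename(first_page_text, last_page_text, prefix, digits, name_prefix):
    """Build a 'NAME first-last.pdf' filename from Bates stamps found in page text."""
    # Match the stamp prefix followed by exactly `digits` digits.
    stamp = re.compile(re.escape(prefix) + r"(\d{" + str(digits) + r"})(?!\d)")
    first = stamp.search(first_page_text)
    last = stamp.search(last_page_text)
    if first is None or last is None:
        return None  # no stamp found, so leave the file alone
    return f"{name_prefix}{first.group(1)}-{last.group(1)}.pdf"


print(bates_filename(
    "Scoping notice ... BLM AR 004512",
    "Signature page ... BLM AR 004530",
    prefix="BLM AR ",
    digits=6,
    name_prefix="BLM ",
))
```

With the made-up page text above, the sketch yields BLM 004512-004530.pdf, a name the DEVONthink script's range convention can handle.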

👍
I recently applied this helper script to the 2,215-document / 75,000-page AR I already mentioned, and it successfully named every single document, even ones where Bates stamps were illegible because of overlapping text or dark backgrounds. It's quite robust, if I may say so myself.
⚠️
This helper script stores the original file name in the Finder comment metadata field, to preserve that information and have it still readily accessible within DEVONthink, but you should nevertheless always work on copies of files as the script may result in destructive changes in edge cases.

Alright, with the prerequisites in place and our documents following a compatible filename convention, the actual script this post is about can work. I've written detailed documentation for it here, but it's actually quite simple to set up:

  1. Copy the script from here
  2. Open /Applications/Script Editor.app
  3. Paste in the script and save it (e.g. on your Desktop) as Bates Source Link.scpt
  4. Open DEVONthink 3.app and click the script icon in the menu bar (it looks sorta like §) → Open Scripts Folder, or alternatively open Finder and click Go in the menu bar → Go to folder... and enter ~/Library/Application Scripts/com.devon-technologies.think3
  5. Move the Bates Source Link.scpt to the Menu subfolder you should now see in the Finder window

That's it.

You can use the script now by clicking the script icon (§) in the menu bar and then Bates Source Link.

Assign a keyboard shortcut.

... but clicking the script menu and finding the right script each time becomes a bit cumbersome, right? I thought so, too, so let's configure a keyboard shortcut for it:

  1. In your menu bar, click the Apple menu → System Settings → Keyboard → Keyboard Shortcuts → App Shortcuts
  2. Select DEVONthink 3.app and click +
  3. Enter the exact script name as it appears in the script menu in DEVONthink, i.e., Bates Source Link
  4. Assign your desired shortcut. A good option that doesn't conflict with default shortcuts is ⇧ ⌥ b (Option + Shift + b).
💡
I use the free utility Hyperkey.app to expand my available keyboard shortcuts. Using Hyperkey I've assigned ⇪ b (Caps Lock + b) to this script.

Handling non-Bates documents.

Not all documents I work with are Bates stamped, so I made the script handle other documents too.

When a document doesn't follow one of the Bates naming conventions detailed above, the script will still work. But instead of determining the Bates number for the active page and formatting a deeplinked Bates cite, it will format a deeplinked generic cite like this:

"If we do this, Franklin's Bumble Bee will go extinct." [FWS email thread at 6](x-devonthink-item://883C7BE0-A328-4818-A4B5-3AF7E5504135?page=6&start=534&length=53&search=If%20we%20do%20this%2C%20Franklin%27s%20Bumble%20Bee%20will%20go%20extinct.).

With Markdown rendering, that becomes, "If we do this, Franklin's Bumble Bee will go extinct." FWS email thread at 6.
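The choice between the two cite styles can be sketched in Python as follows. As before, this is an assumption-laden illustration rather than the AppleScript's real logic: `cite_label` is a hypothetical name, pages are counted from 1, and the generic label is assumed to be the filename without its extension, so the file in this example would be named FWS email thread.pdf.

```python
import re
from pathlib import PurePosixPath

# Hypothetical pattern for the supported names: prefix, separator, number or range.
BATES_NAME = re.compile(
    r"^(?P<prefix>.+?)[ _-](?P<first>\d+)(?:-\d+)?\.pdf$", re.IGNORECASE
)


def cite_label(filename, page):
    """Return a Bates cite when the name allows it, else a generic 'name at page' cite."""
    match = BATES_NAME.match(filename)
    if match:
        return f"{match['prefix']} {int(match['first']) + page - 1}"
    # Fallback: cite the document by name and page position.
    return f"{PurePosixPath(filename).stem} at {page}"


print(cite_label("FWS 065382-065397.pdf", 6))  # FWS 65387
print(cite_label("FWS email thread.pdf", 6))   # FWS email thread at 6
```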

Conclusion

That's it for this tip, folks. Let me know in the comments if you use DEVONthink and decide to give my script(s) a whirl. And as always, feel free to expand their functionality on my repo at sij.ai. I'd be particularly interested if anyone knows how to handle rich text links in AppleScript, to make this compatible with other apps besides Markdown editors.

Cheers!

Hello, world!

Welcome to my blog!

You likely already know me if you’re reading this, but in case these blogs reach new readers, please forgive a brief introduction. I’m Sangye Ince-Johannsen. I'm a staff attorney at the Western Environmental Law Center, where for five years I've litigated environmental cases in federal court on behalf of various local, regional, national, and international conservation-oriented nonprofit organizations. My docket largely focuses on defending spotted owls, anadromous fish, grizzlies, wolves, and their respective habitats from federal and federally-licensed activities in cases brought under the Endangered Species Act, National Environmental Policy Act, Administrative Procedure Act, and Clean Water Act.

Before pursuing my legal career, I worked as a documentary filmmaker and videographer for four years in southern Oregon. During that time I had the privilege of working on documentaries including One Billion Rising (2013) by Eve Ensler and Robert Redford, When Giants Fall (2015) by Leslie Griffith, narrative films including Redwood Highway (2013) and Wild (2014), outreach videos for the Neighborhood Food Project among other regional nonprofits, and numerous music and event videos.

Going back even further, before my undergraduate studies in Anthropology, Videography, and International Relations at Southern Oregon University, my first interest—my first love, even—was computer science and programming. After a long hiatus, my passion for programming, especially AI/ML research and development, has been rekindled. Recent advancements in AI have me particularly excited about leveraging large language models and automatic speech recognition to become a more productive and effective advocate.

Vision for this blog

That’s what I hope to do with these blogs: share AI/ML scripts, workflows, and apps, along with key caveats and considerations, with other litigators and knowledge workers. I also intend to share some legal analysis and other fun stuff—AI image and speech generation, drone videography, self-hosting tools and tricks—along the way.

So, what’s with the two URLs, you may be wondering? For most of my posts, especially at the outset, you’ll get the same content whether you visit sij.law or sij.ai. However, if you’re a litigator and not necessarily an AI/ML enthusiast, there will likely be posts that aren’t interesting to you in the slightest, which I’ll save for sij.ai. If you’re passionate about AI/ML productivity hacks but not necessarily a litigator or interested in deep legal analysis, you’ll likely have a better time over on sij.ai. Either way, thanks for dropping by, feel free to explore some of my current projects in the site navigation, and I can’t wait to engage with you in the comments or on social media.

Cheers!

Edit on 9/25/24: I've decided to devote sij.ai to a dev and code hub rather than a parallel topical blog. That means all posts, whether related to law or ai/ml, will be consolidated here, and you can use the tags law and ai/ml to isolate one topic or the other.