James Truitt is a digital archivist with a deep knowledge of
regular expressions. His professional interests include the
intersection of digital preservation and efficient archival
processing, and more generally the application of aggregate
description techniques to born-digital materials.
James has worked in archives for nearly a decade. He
holds an MLIS from San José State University, and his writing
has been published in the American Archivist.
James holds the position of digital archivst at the
Friends Historical Library of Swarthmore College.
May 26, 2026
Sometimes I want to save a web page as a PDF for some reason or another. However, this can be problematic, because most (all?) browsers’ HTML-to-PDF conversion is an offshoot of the process for printing a web page onto paper. This means that the default behavior is for the page to get broken up into 8.5-by-11-inch chunks, which often wreaks havoc on the layout of pages that have large images or are much taller than they are wide (i.e., basically all modern web pages).
April 29, 2026
Let’s try this
This is Python:
def hw():
print('hello world!')
And this is regex:
\d\d?-\d\d?-\d{4}
April 16, 2026
Hello! Welcome to my new blog! I am excited to finally have this thing up.