John Andrew Kunze – Biography
CV | ORCID: 0000-0001-7604-8041 | LinkedIn: linkedin.com/in/jakkbl | GitHub: jkunze | Mastodon: fosstodon.org/@jakkbl | Bluesky: jakkbl.bsky.social | jkunze.net | ARK Alliance
John Kunze is a pioneer in the theory and practice of digital libraries whose passion for creating and sharing free, open, pragmatic digital solutions has guided his long public sector career. With a background in computer science and mathematics, he wrote software that comes pre-installed in every Mac and Linux system. He had leading roles in establishing identifier standards (URL, ARK), metadata standards (Dublin Core), archiving standards (BagIt, WARC), the Z39.50 library protocol, UC Berkeley’s first Campus Wide Information System, and repository microservices used in HathiTrust and OCFL. He is currently a senior research associate with Drexel University.
Participating in open source 20 years before that term was coined, Kunze began fixing BSD Unix bugs in 1978 at the University of California (UC) Berkeley and writing tools that come pre-installed in today’s Mac and Linux systems (jot, lam, rs). During that time he maintained the global terminal capability database (termcap), created an online Unix help system, brought the Bell Labs Unix “learn” program and its computer-aided instruction scripts to life, and became principal author of the book Common Lisp: the Reference.
In 1989 he began creating UC Berkeley’s first Campus Wide Information System (CWIS).* Into that system, called Infocal,† he built pre-web hypertext navigation (inspired by “learn” scripts), a custom search engine, and integration with (a) library catalog search via the then-new Z39.50 protocol, (b) major campus datasets (course catalog, schedule of classes, phone book, job vacancies), and (c) the World Wide Web. The Z39.50 protocol had never been implemented, which required him to invest heavily in standards development, software development (releasing the first complete open source client-server codebase), and the first Z39.50 interoperability demonstration (with Penn State University and the UC Division of Library Automation).
As Infocal became web-aware, Kunze began to work with identifier standards. In 1994 he declined principal editorship of the URL specification, and instead agreed to write the functional requirements in an attempt to unblock the URL standard, at an impasse because URLs were breaking so often. His document permitted them to break and was published as RFC1736, resulting in the immediate approval of the first URL standard as RFC1738. As part of a 3-year fellowship at the US National Library of Medicine (NLM), he analyzed the persistent identifier landscape and in 2000 defined the framework for the NLM multi-dimensional permanence levels.
Believing broken URLs to be his fault, Kunze created the ARK (Archival Resource Key) persistent identifier scheme in 2001 at the California Digital Library (CDL). With the idea of addressing broken links flexibly and affordably while leveraging the NLM permanence levels, he evolved the ARK specification, created the ARK resolver and registration infrastructure, and registered the first 600 ARK organizations. In 2018, with help from DuraSpace, he led the creation and growth of the ARK Alliance (arks.org). By the end of 2025, there were over 1720 ARK organizations, including 12 national libraries, 215 universities, 254 archives, 144 museums, 124 journals, and 59 scientific centers. He also co-authored RFC1625 (WAIS) and RFC2056 (Z39.50 URLs).
Motivated by the Unix philosophy favoring simple, extensible tools that combine easily and by a distaste for siloed solutions, Kunze developed open source tools for ARKs that also work for non-ARK identifiers. The Name-to-Thing (N2T.net) resolver supports hundreds of compact identifier schemes, the EZID identifier service supports ARKs and DOIs (URNs, PURLs, and IGSNs were planned), the Noid (Nice Opaque Identifier) Tool mints billions of ARK and Handle identifiers, and THUMP specifies inflections for ARKs that work with any URL-based identifiers.
A key takeaway from Z39.50 interoperability testing, especially for nonbibliographic applications, was the notion of shared attributes such as title, author, and publisher. This prompted him to propose the URC in 1992 and to jump on board the Dublin Core in 1995. There he led publication of the world’s first metadata standards (RFC2413, RFC2731, ANSI/NISO Z39.85), upon which most metadata schemas are based: Schema.org, OAI-PMH, MODS, METS, EPUB, DataCite, Darwin Core, etc. Considering metadata to be far from finished, he created the minimalist Dublin Kernel based on his Z39.50 work, the TEMPER date format, and a vision (1996) of a kind of “Dublin Mantle” that he would later implement as the YAMZ.net vocabulary builder.
Under a grant from the Library of Congress (LC) for harvesting and preserving at-risk websites, Kunze published the first draft of the WARC standard, which is now used in all large-scale web archiving (e.g., Internet Archive). To move files between archives, he worked with the LC team as principal author of the BagIt standard (RFC8493), which is also widely used (e.g., Library of Congress, Stanford, Cornell, Dryad). He also created repository microservice specifications used in HathiTrust (Pairtree), BagIt (Oxum), and OCFL (Namaste).
* Not everyone knows that a few years before the web appeared, dozens of custom-built state-of-the-art networked information systems were emerging at universities for the purpose of sharing diverse types of information with students, faculty, staff, and the general public. In this brief era of the Campus Wide Information System (CWIS), institutions of higher education effectively piloted the WWW insofar as they worked out the presentation and maintenance of heterogeneous online data while using advanced networking protocols such as FTP, NNTP, Z39.50, and Gopher. Gopher (from the University of Minnesota) was the first CWIS software packaged for easy installation, and just as thousands of non-campus organizations were adopting it, the WWW software appeared and, with its winning hypertext capability, began to edge it out.
† Pronunciation (IPA en-us): Kunze /ˈkʊnziː/, Infocal /ˈɪnfəʊkæl/, Noid /nɔɪd/, EZID /iːˌziːaɪˌdiː/, URL /juːɑːɹɛl/
