Why Build the DPLA API?

The Digital Public Library of America built and maintains an open API to encourage the independent development of applications, tools, and resources that make use of data contained in the DPLA platform in new and innovative ways, from anywhere, at any time. For inspiration, consider developing an application that visualizes metadata in compelling ways, or a recommendation engine that suggests similar cultural heritage content based on user preferences or criteria, or a lightweight front-end interface for mobile devices. Or pursue something entirely different! The possibilities are endless.

Levels of Representation

(or: Why the Deep Data Structure?)

To help organize and structure metadata, the DPLA falls back on some pretty philosophical concepts. Perhaps the most important of these is the idea that things in the physical world can be represented at a number of different levels. For example, if you request a URL and receive an image of a female yodeler, you could conceivably have metadata about three things: the woman in the photograph, the physical photograph that represents the woman, and the digital resource that represents the photograph. Each of these levels of representation carries its own metadata, and each is represented within the DPLA’s metadata schema. Let’s work our way backwards to see how this plays out.

The digital representation of this wonderful photograph is an image in JPEG format. Digital representations of physical objects can come in all sorts of formats (JPEG, MP3, MOV, etc.), but what we’re really interested in is the metadata. When you receive an item from the DPLA API, the top-level fields have names like “hasView,” “@context,” and “dataProvider,” which aren’t all that familiar as metadata. That’s because they’re talking about the digital representation of the object, not the object itself. For those familiar fields, we turn to the “source resource.”

The second level of representation, the physical photograph of the female yodeler, is known as the “source resource” in DPLA parlance. The source resource is the physical object as it exists in the world—a photograph, a pamphlet, a map, etc. The fields most people are familiar with (title, creator, date) are all within this sourceResource field, as they’re metadata about the physical object, rather than metadata about the digitization.
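To make the nesting concrete, here is a minimal sketch in Python. The record is hypothetical: the field names (hasView, @context, dataProvider, sourceResource, title, creator, date) come from the text above, but every value, and the shape of each value, is an invented placeholder rather than real DPLA data. The describe helper is likewise just an illustration, not part of any DPLA library.

```python
# A hypothetical item record. Field names come from the text above;
# all values are illustrative placeholders, not real DPLA data.
item = {
    "@context": "https://example.org/context",      # placeholder value
    "dataProvider": "Example Historical Society",  # placeholder value
    "hasView": {"format": "image/jpeg"},            # placeholder value
    "sourceResource": {
        "title": "Portrait of a yodeler",
        "creator": "Unknown photographer",
        "date": "circa 1900",
    },
}


def describe(record):
    """Return the familiar descriptive fields from the nested sourceResource."""
    source = record.get("sourceResource", {})
    return {key: source.get(key) for key in ("title", "creator", "date")}


# Top-level keys describe the digital representation.
print(sorted(item))
# The familiar fields sit one level down, inside sourceResource.
print(describe(item))
```

Looking up item["title"] directly would fail here, which is exactly the "extra level" this section describes: the title belongs to the physical object, so it sits under sourceResource.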

The first of these levels, the yodeler herself, alas, doesn’t get much metadata in this example. In fact, this level is rarely represented in DPLA data, as it’s not often represented in original records in any structured way (if at all).

So if you find yourself frustrated with the fact that all the familiar fields are buried an extra level down in the metadata, remember to think philosophically. We all have levels. We all have depth. And if you just get confused, refer to our overview of metadata structures.

Proudly Found Elsewhere

(or: Why So Many Namespaces?)

An issue that plagues a good many technical projects is Not Invented Here syndrome. Many well-intentioned projects create their entire technical and intellectual infrastructure from scratch, either because adopting previously developed solutions seems too costly or because those solutions are poorly understood. This not only creates more work for the initiator, it also excludes a lot of intellectual capital and community support from existing projects.

The DPLA is decidedly against all that. We’ve worked to get input from a great number of interested parties, and we’ve gratefully incorporated many of the hard-won standards created by other communities, as well as lots of open source technologies. So when you see lots of confusing namespaces, take heart: you’re in lots and lots of good hands.

Presumption of Openness

The DPLA API is built with the same principles of openness in mind that underlie the broader project it supports. We wish to make the barrier to public access as low as possible, in order to facilitate broader engagement and model accessibility for other institutions similarly situated. All requests are presumed to be ‘friendly’ requests, that is, requests that respect the API’s operating parameters and capacity to deliver data. To that end, the API has a few features that embody this presumption:

Simple HTTP request/response model
By using the RESTful properties of HTTP combined with the ubiquity of HTTP clients (that is, web browsers), anyone can read this documentation and start querying the DPLA API right away.
No rate limiting
As detailed in our section on use and abuse, the DPLA API does not implement rate limiting of requests. The DPLA reserves the right to impose rate limiting on a case-by-case basis to ensure that the API remains available to others, but this is not the default.
Free, easy access to API keys
This API requires clients to pass an API key to our servers in order to correlate requests with requesters. The information generated by registration and requests is held in the strictest confidence and will only be used for legitimate purposes, such as abuse complaints and schema update notifications.
Open access to “meta-metadata”
In addition to offering metadata from an ever-expanding group of library catalogues and databases, API responses contain information about the aggregated metadata requested. While the DPLA reserves the right to modify this meta-metadata for clarity or to correct misleading data, it is offered to the general public to give you a sense of the items and collections you search.
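
As a rough illustration of the request model and API key described above, the following Python sketch builds a request URL without sending it. The endpoint URL and the q and api_key parameter names are assumptions that do not appear in this section, and build_request_url is a hypothetical helper; confirm the actual values against the API reference before relying on them.

```python
from urllib.parse import urlencode

# Assumed values: neither the endpoint nor the parameter names appear in
# this section, so confirm them against the API reference before use.
BASE_URL = "https://api.dp.la/v2/items"


def build_request_url(query, api_key, base_url=BASE_URL):
    """Build a GET URL that carries a search term and the caller's API key."""
    params = {"q": query, "api_key": api_key}
    return f"{base_url}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder for the key issued at registration.
url = build_request_url("yodeling", "YOUR_API_KEY")
print(url)
```

Because the API uses a plain HTTP request/response model, a URL built this way (with a real key in place of the placeholder) could be pasted into a web browser or handed to any HTTP client.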

It is the DPLA’s hope that users of the API will find its principles of openness and accessibility useful and make use of it in a way that nourishes and supports those values.