Discussion: what constitutes an AIP?

=Relationship between SIP and AIP=


 * OAIS does not dictate how a SIP is to correspond to an AIP, explicitly allowing for one-to-one, one-to-many, many-to-one and many-to-many relationships. See OAIS, sec. 4.3.2, Data Transformations in the Ingest Functional Area.
 * In the Vancouver Digital Archives, a SIP is a folder (or container) that is based on a specific classification and its associated retention schedule and disposition trigger. A folder does not contain any subfolders.
 * The most likely options for constructing the AIP are 1 SIP = 1 AIP or 1 SIP = multiple AIPs. In the latter case, an AIP would generally consist of one logical file, which in some cases would consist of more than one physical object (examples: documents with separate appendices, e-mails with attachments).
 * What are the pros and cons of these two possibilities?
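To make the two options concrete, here is a minimal sketch of how the objects in one SIP folder could be grouped either way. Everything in it is an assumption for illustration: the `PhysicalObject` class, the `logical_file` key, the identifier format and the sample file names are hypothetical and do not reflect how TRIM or Archivematica actually model these things.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalObject:
    path: str          # path of the object inside the SIP folder
    logical_file: str  # hypothetical key tying an object to its logical file


def one_aip_per_sip(sip_id, objects):
    """1 SIP = 1 AIP: every object in the folder goes into a single package."""
    return {sip_id: [obj.path for obj in objects]}


def one_aip_per_logical_file(sip_id, objects):
    """1 SIP = multiple AIPs: one package per logical file."""
    groups = defaultdict(list)
    for obj in objects:
        groups[obj.logical_file].append(obj.path)
    # Hypothetical identifier scheme: SIP id plus logical file key.
    return {f"{sip_id}/{key}": paths for key, paths in groups.items()}


# Hypothetical SIP: a document with an appendix and an e-mail with an attachment.
sip = [
    PhysicalObject("report.doc", "report"),
    PhysicalObject("report-appendix-a.doc", "report"),
    PhysicalObject("message.msg", "message"),
    PhysicalObject("attachment.pdf", "message"),
]

print(one_aip_per_sip("SIP-001", sip))
print(one_aip_per_logical_file("SIP-001", sip))
```

In the second option the link between a logical file and its physical objects has to come from somewhere, most plausibly the export metadata, so the grouping key is the part most likely to change in practice.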

1 SIP = multiple AIPs
=Creating and storing PDI=

OAIS defines Preservation Description Information (PDI) as "[t]he information which is necessary for adequate preservation of the Content Information and which can be categorized as Provenance, Reference, Fixity, and Context information." (OAIS definitions, p. 1-12)

Provenance information
OAIS defines Provenance Information as "[t]he information that documents the history of the Content Information. This information tells the origin or source of the Content Information, any changes that may have taken place since it was originated, and who has had custody of it since it was originated. Examples of Provenance Information are the principal investigator who recorded the data, and the information concerning its storage, handling, and migration." (OAIS definitions, p. 1-12)

Most of the provenance information for objects transferred from TRIM will be contained in the TRIM export metadata file. Some of this information will be generic to the entire SIP and some will be specific to individual files. Since the TRIM export file is an XML document, it can be parsed and its contents stored in a database (possibly in ICA-AtoM). Alternatively, or in addition, some or all of it can be packaged as part of the AIP. Can some of the data be parsed and added to the BagIt package? There can and probably should be some redundancy, i.e. some of the metadata are added to the BagIt package, some or all are bundled into the AIP (for example, the entire TRIM export metadata file could be added to the AIP) and some are extracted and stored in Qubit.
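As a rough sketch of the parsing step, the example below splits an export into SIP-level and file-level metadata using Python's standard `xml.etree.ElementTree`. The XML structure shown (`export`, `container`, `record` and their attributes) is entirely hypothetical, since the actual TRIM export schema is not described here; only the general approach is intended to carry over.

```python
import xml.etree.ElementTree as ET

# Hypothetical export; the real TRIM export schema will differ.
SAMPLE_EXPORT = """\
<export>
  <container classification="hypothetical-class" retention="hypothetical-schedule">
    <record number="REC-1" title="Report" file="report.doc"/>
    <record number="REC-2" title="Message" file="message.msg"/>
  </container>
</export>
"""


def split_metadata(xml_text):
    """Return (sip_level, file_level) metadata parsed from the export text."""
    root = ET.fromstring(xml_text)
    container = root.find("container")
    # Metadata that applies to the whole SIP (folder).
    sip_level = dict(container.attrib)
    # Metadata that applies to individual files, keyed by file name.
    file_level = {
        record.get("file"): dict(record.attrib)
        for record in container.findall("record")
    }
    return sip_level, file_level


sip_level, file_level = split_metadata(SAMPLE_EXPORT)
print(sip_level)
print(file_level)
```

The two dictionaries could then be routed separately: SIP-level values to the database or the bag's metadata, file-level values to whichever AIP holds the file, with the untouched export file kept in the AIP as well.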

Reference information
OAIS defines Reference information as "[t]he information that identifies, and if necessary describes, one or more mechanisms used to provide assigned identifiers for the Content Information. It also provides identifiers that allow outside systems to refer, unambiguously, to a particular Content Information. An example of Reference Information is an ISBN."

The definition seems to imply that Reference Information is information about a unique identifier, but the example given is the unique identifier itself. CASPAR uses the term Persistent Identifier (as do other projects). See D1201:CONCEPTUAL MODEL - pp. 48-49 http://www.casparpreserves.eu/publications/deliverables.

We haven't really discussed what this identifier will be. Could we use the TRIM record number? Or should Archivematica generate a unique (persistent) identifier? The identifier would need to be stored as part of the AIP and in a database.
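One possible answer to the question above, sketched under assumptions: generate a random UUID for each AIP and record the TRIM record number alongside it, so that neither identifier is lost. The function and field names are hypothetical, and using UUIDs is a suggestion for discussion, not a decision that has been made.

```python
import uuid


def new_aip_identifier():
    """Generate a random (version 4) UUID as a candidate persistent identifier."""
    return str(uuid.uuid4())


def build_reference_info(trim_record_number):
    """Pair the generated identifier with the identifier assigned by TRIM."""
    return {
        "aip_uuid": new_aip_identifier(),          # hypothetical field name
        "trim_record_number": trim_record_number,  # hypothetical field name
    }


reference_info = build_reference_info("REC-1")
print(reference_info)
```

The same dictionary could be written into the AIP and into the database, which would satisfy the requirement that the identifier be stored in both places.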

Fixity information
A typical example of fixity information is a checksum or hash value. According to CASPAR's conceptual model, PDI also includes information about how the checksum or hash value was produced: "In a broad sense the tools for fixity used by the repositories (and by the creator) have to be documented and this documentation (specifically related to the process and to the responsibilities) will be part of the PDI component and would play a relevant role for ensuring the trustworthiness (integrity as a part of it) of the preserved resources." (D1201:CONCEPTUAL MODEL, p. 48)

In Archivematica the BagIt script creates MD5 checksums for all files packaged into an AIP. JHOVE can also be used to generate three different types of checksums for a file. These could be stored separately from the AIP or as part of the AIP. The information about the checksums and the process used to generate them would be generic to all AIPs and could be documented in policies and procedures, system documentation, etc.
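For illustration, the sketch below computes MD5 checksums with Python's standard `hashlib` and formats them the way a BagIt `manifest-md5.txt` lists payload files (checksum, whitespace, path relative to the bag, with payload under `data/`). It is not the script Archivematica uses; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_lines(bag_dir):
    """Return manifest-style lines for every file under <bag_dir>/data."""
    bag_dir = Path(bag_dir)
    lines = []
    for path in sorted((bag_dir / "data").rglob("*")):
        if path.is_file():
            relative = path.relative_to(bag_dir).as_posix()
            lines.append(f"{md5_of(path)}  {relative}")
    return lines
```

Calling `manifest_lines` again at a later date and comparing the result with the stored manifest is the simplest form of fixity check; a changed or missing line indicates an altered or lost file.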

Context information
OAIS defines Context Information as "[t]he information that documents the relationships of the Content Information to its environment. This includes why the Content Information was created and how it relates to other Content Information objects." (OAIS definitions, p. 1-8)

According to CASPAR, "Context covers an extremely broad range of topics and it is difficult to define a precise boundary. In fact Provenance Information...can be viewed as a special type of Context Information." (D1201:CONCEPTUAL MODEL, p. 50). See notes for Provenance Information, above. Other context information could be provided as part of archival descriptions in ICA-AtoM.