Internet Archive Items
======================

What Is an Item?
----------------

Archive.org is made up of "items". An item is a logical "thing" that we represent on one web page on archive.org. An item can be thought of as a group of files that deserve their own metadata. If the files in an item have separate metadata, the files should probably be in different items. An item can be a book, a song, an album, a dataset, a movie, an image or set of images, etc. Every item has an identifier that is unique across archive.org.

How Items Are Structured
------------------------

An item is just a directory of files and possibly subdirectories. Every item has at least two files named in the following format (see the metadata page for more context on what an identifier is):

- ``<identifier>_files.xml``
- ``<identifier>_meta.xml``

The ``<identifier>_meta.xml`` file is an XML file containing all of the metadata describing the item. The ``<identifier>_files.xml`` file is an XML file containing all of the file-level metadata. There can only be one ``<identifier>_meta.xml`` file and one ``<identifier>_files.xml`` file per item.

Alongside these metadata files and the original files uploaded to the item, the item may also contain derivative files automatically generated by archive.org.

Item Limitations
----------------

As a rule of thumb, items should:

- **not** be over 100 GB
- **not** contain more than 10,000 files

Collections
-----------

All items must be part of a collection. A collection is simply an item with special characteristics. Besides an image file for the collection logo, files should **never** be uploaded directly to a collection item.

Items can be assigned to a collection at the time of creation, or after the item has been created, by modifying the ``collection`` element in the item's metadata to contain the identifier of the given collection (e.g. ``ia metadata <identifier> -m collection:<collection>``).

Currently, collections can only be created by archive.org staff. Please contact info@archive.org if you need a collection.

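
The naming and assignment conventions above can be sketched in a few lines of Python. This is only an illustration, not part of any archive.org tooling: the function names ``metadata_file_names`` and ``assign_collection_command`` are hypothetical, and the sketch assumes that each metadata file name is the item identifier followed by the suffix shown earlier, and that ``ia metadata`` takes the item identifier before ``-m`` and the collection identifier after ``collection:``.

```python
def metadata_file_names(identifier):
    """Return the names of the two metadata files every item has.

    Assumption: each name is the item identifier followed by the
    "_files.xml" or "_meta.xml" suffix.
    """
    return (f"{identifier}_files.xml", f"{identifier}_meta.xml")


def assign_collection_command(identifier, collection):
    """Build the argument list for an ``ia metadata`` call that sets
    the ``collection`` element of an existing item.

    Assumption: the item identifier comes before ``-m`` and the
    collection identifier comes after ``collection:``.
    The command is only built here, never executed.
    """
    return ["ia", "metadata", identifier, "-m", f"collection:{collection}"]


if __name__ == "__main__":
    # "my-item" and "my-collection" are made-up identifiers.
    print(metadata_file_names("my-item"))
    print(" ".join(assign_collection_command("my-item", "my-collection")))
```

Returning the command as a list keeps the sketch free of side effects. If the ``ia`` tool is installed and configured, the list could be passed to ``subprocess.run``.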

Archival URLs
-------------

An item's "details" page will always be available at::

    https://archive.org/details/<identifier>

The item directory is always available at::

    https://archive.org/download/<identifier>

A particular file can always be downloaded from::

    https://archive.org/download/<identifier>/<filename>

**Note**: Archival URLs may redirect to an actual server that contains the content. The resultant URL is **not** a permalink. For example, the archival URL::

    https://archive.org/download/popeye_taxi-turvey/popeye_taxi-turvey_meta.xml

currently redirects to::

    https://ia802304.us.archive.org/30/items/popeye_taxi-turvey/popeye_taxi-turvey_meta.xml

**DO NOT LINK** to any archive.org URL whose hostname contains a server number like this (here, ``ia802304.us.archive.org``). Such a URL refers to the particular machine that we're serving the file from right now, but we move items to new servers all the time. If you link to this sort of URL instead of the archival URL, your link **WILL** break at some point.
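
As a rough sketch of the URL rules above, the following Python builds archival URLs from an identifier and flags URLs that point at a specific server. The function names are hypothetical. The host pattern is an assumption generalized from the single ``ia802304.us.archive.org`` example, so other server naming schemes may not be caught.

```python
import re
from urllib.parse import quote, urlsplit

ARCHIVE_HOST = "archive.org"

# Assumption: server-specific hosts look like "ia802304.us.archive.org",
# i.e. "ia" plus digits, followed by a subdomain of archive.org.
_SERVER_HOST = re.compile(r"^ia\d+\.[a-z0-9.-]*archive\.org$")


def details_url(identifier):
    """Return the archival URL of an item's "details" page."""
    return f"https://{ARCHIVE_HOST}/details/{quote(identifier, safe='')}"


def download_url(identifier, filename=None):
    """Return the archival URL of the item directory, or of one file."""
    base = f"https://{ARCHIVE_HOST}/download/{quote(identifier, safe='')}"
    if filename is None:
        return base
    # quote() leaves "/" unescaped by default, so files inside
    # subdirectories of the item remain addressable.
    return f"{base}/{quote(filename)}"


def is_server_specific(url):
    """Return True if the URL points at a numbered server host,
    which should not be used for permanent links."""
    host = urlsplit(url).hostname or ""
    return bool(_SERVER_HOST.match(host))


if __name__ == "__main__":
    item = "popeye_taxi-turvey"
    url = download_url(item, f"{item}_meta.xml")
    print(url, is_server_specific(url))
```

A check like ``is_server_specific`` could be run over outgoing links before publishing, so that only archival URLs are stored.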