Internetarchive is a Python interface to archive.org.
Usage:
>>> from internetarchive import get_item
>>> item = get_item('govlawgacode20071')
>>> item.exists
True
license: AGPL 3, see LICENSE for more details.
Bases: internetarchive.item.BaseItem
This class represents an archive.org item. Generally this class should not be used directly, but rather via the internetarchive.get_item() function:
>>> from internetarchive import get_item
>>> item = get_item('stairs')
>>> print(item.metadata)
Or to modify the metadata for an item:
>>> metadata = dict(title='The Stairs')
>>> item.modify_metadata(metadata)
>>> print(item.metadata['title'])
The Stairs
This class also uses IA’s S3-like interface to upload files to an item. You need to supply your IAS3 credentials in environment variables in order to upload:
>>> item.upload('myfile.tar', access_key='Y6oUrAcCEs4sK8ey',
... secret_key='youRSECRETKEYzZzZ')
True
You can retrieve S3 keys here: https://archive.org/account/s3.php
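The keyword arguments above can also be filled in from the environment at call time. The following is a minimal sketch under stated assumptions: the variable names IA_ACCESS_KEY and IA_SECRET_KEY and the helper upload_with_env_credentials are hypothetical, not names defined by internetarchive. The helper accepts any object whose upload() takes the access_key and secret_key keywords shown above.

```python
import os


def upload_with_env_credentials(item, path, environ=os.environ):
    """Upload `path` to `item`, reading the IAS3 key pair from `environ`."""
    # Hypothetical variable names; substitute whatever your setup defines.
    access_key = environ.get("IA_ACCESS_KEY")
    secret_key = environ.get("IA_SECRET_KEY")
    if not access_key or not secret_key:
        raise RuntimeError("IAS3 credentials are not set in the environment")
    # access_key and secret_key are the keyword arguments used in the
    # upload example above.
    return item.upload(path, access_key=access_key, secret_key=secret_key)
```

Failing early when a key is missing keeps credentials out of source code and avoids sending a request that could only be rejected.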
Download files from an item.
Return type: bool
Returns: True if all files have been downloaded successfully.
Get a File object for the named file.
Parameters: file_metadata (dict) – (optional) a dict of metadata for the given file.
Return type: internetarchive.File
Returns: An internetarchive.File object.
Modify the metadata of an existing item on Archive.org.
Note: The Metadata Write API does not yet comply with the latest JSON Patch standard. It currently complies with version 02.
Usage:
>>> import internetarchive
>>> item = internetarchive.Item('mapi_test_item1')
>>> md = dict(new_key='new_value', foo=['bar', 'bar2'])
>>> item.modify_metadata(md)
Return type: dict
Returns: A dictionary containing the status_code and response returned from the Metadata API.
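To make the idea of a metadata patch concrete, the sketch below compares an item's current metadata with the desired values and lists the changes as patch operations. The operation shape follows RFC 6902 purely for illustration. As the note above says, the Metadata Write API follows an older version, so the format it actually accepts may differ. metadata_changes is a hypothetical helper, not part of internetarchive.

```python
def metadata_changes(current, desired):
    """List RFC 6902-style operations that apply `desired` on top of `current`."""
    ops = []
    for key, value in desired.items():
        # JSON Pointer escaping: "~" becomes "~0" and "/" becomes "~1".
        path = "/" + key.replace("~", "~0").replace("/", "~1")
        if key not in current:
            ops.append({"op": "add", "path": path, "value": value})
        elif current[key] != value:
            ops.append({"op": "replace", "path": path, "value": value})
        # Keys whose values already match produce no operation.
    return ops


current = {"title": "Stairs", "foo": ["bar"]}
desired = {"new_key": "new_value", "foo": ["bar", "bar2"], "title": "Stairs"}
changes = metadata_changes(current, desired)
```

Here changes holds one add operation for new_key and one replace operation for foo. The unchanged title produces nothing.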
Upload files to an item. The item will be created if it does not exist.
Usage:
>>> import internetarchive
>>> item = internetarchive.Item('identifier')
>>> md = dict(mediatype='image', creator='Jake Johnson')
>>> item.upload('/path/to/image.jpg', metadata=md, queue_derive=False)
True
Return type: list
Returns: A list of requests.Response objects.
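Since the documented return value is a list of requests.Response objects, a caller may want to check each one. A minimal sketch, assuming that HTTP status 200 signals a successful upload; failed_uploads is a hypothetical helper and relies only on the status_code attribute of each response.

```python
def failed_uploads(responses):
    """Return the responses whose HTTP status code is not 200."""
    # Assumption: a successful upload answers with status 200.
    return [r for r in responses if r.status_code != 200]
```

An empty result means every request in the list reported success under that assumption.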
Upload a single file to an item. The item will be created if it does not exist.
Usage:
>>> import internetarchive
>>> item = internetarchive.Item('identifier')
>>> item.upload_file('/path/to/image.jpg',
... key='photos/image1.jpg')
True
Bases: internetarchive.files.BaseFile
This class represents a file in an archive.org item. You can use this class to access the file metadata:
>>> import internetarchive
>>> item = internetarchive.Item('stairs')
>>> file = internetarchive.File(item, 'stairs.avi')
>>> print(file.format, file.size)
Cinepack 3786730
Or to download a file:
>>> file.download()
>>> file.download('fabulous_movie_of_stairs.avi')
This class also uses IA’s S3-like interface to delete a file from an item. You need to supply your IAS3 credentials in environment variables in order to delete:
>>> file.delete(access_key='Y6oUrAcCEs4sK8ey',
... secret_key='youRSECRETKEYzZzZ')
You can retrieve S3 keys here: https://archive.org/account/s3.php
Delete a file from the Archive. Note: Some files – such as <itemname>_meta.xml – cannot be deleted.
Download the file into the current working directory.
Return type: bool
Returns: True if file was successfully downloaded.
Bases: object
This class represents an archive.org item search. You can use this class to search for Archive.org items using the advanced search engine.
Usage:
>>> from internetarchive.session import ArchiveSession
>>> from internetarchive.search import Search
>>> s = ArchiveSession()
>>> search = Search(s, '(uploader:jake@archive.org)')
>>> for result in search:
... print(result['identifier'])
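A search can match a very large number of items, so it is often useful to stop after the first few results. A minimal sketch, assuming only that the search can be iterated and that each result is a dict with an 'identifier' key, as in the loop above; first_identifiers is a hypothetical helper.

```python
from itertools import islice


def first_identifiers(results, limit=10):
    """Collect up to `limit` identifiers from an iterable of result dicts."""
    # islice stops pulling from `results` once `limit` items have been read.
    return [result["identifier"] for result in islice(results, limit)]
```

Called as first_identifiers(search, limit=5), it reads at most five results from the iterable.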
Bases: object
This class represents the Archive.org catalog. You can use this class to access tasks from the catalog.
>>> import internetarchive
>>> c = internetarchive.Catalog(internetarchive.session.ArchiveSession(),
... identifier='jstor_ejc')
>>> c.tasks[-1].task_id
143919540
Bases: requests.sessions.Session
The ArchiveSession object collects together useful functionality from internetarchive as well as important data such as configuration information and credentials. It is subclassed from requests.Session.
Usage:
>>> from internetarchive import ArchiveSession
>>> s = ArchiveSession()
>>> item = s.get_item('nasa')
>>> item
Collection(identifier='nasa', exists=True)
A method for creating internetarchive.Item and internetarchive.Collection objects.
Get an item’s metadata from the Metadata API.
Parameters: identifier (str) – Globally unique Archive.org identifier.
Return type: dict
Returns: Metadata API response.
Get tasks from the Archive.org catalog. internetarchive must be configured with your logged-in-* cookies to use this function. If no arguments are provided, all queued tasks for the user will be returned.
Returns: A set of CatalogTask objects.
Mount an HTTP adapter to the ArchiveSession object.
Search for items on Archive.org.
Returns: A Search object, yielding search results.
Convenience function to quickly configure any level of logging to a file.
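For readers unfamiliar with file logging in Python, the sketch below shows what such a convenience function typically does, using only the standard logging module. It is not the library's implementation. log_to_file, its parameters, and the message format are assumptions made for illustration.

```python
import logging


def log_to_file(name, path, level=logging.INFO):
    """Attach a file handler to the logger `name` and return the logger."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.FileHandler(path)
    handler.setLevel(level)
    # The format string is an arbitrary choice for this sketch.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s - %(name)s - %(levelname)s - %(message)s"))
    logger.addHandler(handler)
    return logger
```

Messages below the chosen level are dropped by the logger before they reach the file.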
This module implements the Internetarchive API.
license: AGPL 3, see LICENSE for more details.
Configure internetarchive with your Archive.org credentials.
>>> from internetarchive import configure
>>> configure('user@example.com', 'password')
Delete files from an item. Note: Some system files, such as <itemname>_meta.xml, cannot be deleted.
Download files from an item.
Return type: bool
Returns: True if all files were downloaded successfully.
Get File objects from an item.
>>> from internetarchive import get_files
>>> fnames = [f.name for f in get_files('nasa', glob_pattern='*xml')]
>>> print(fnames)
['nasa_reviews.xml', 'nasa_meta.xml', 'nasa_files.xml']
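The glob_pattern argument selects files by name. The sketch below shows how shell-style matching behaves on a list of names, using fnmatch from the standard library. Whether get_files matches names in exactly this way is an assumption. match_names and the example.jpg entry are made up for illustration.

```python
from fnmatch import fnmatchcase


def match_names(names, glob_pattern):
    """Return the names that match a shell-style glob pattern."""
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in names if fnmatchcase(name, glob_pattern)]


names = ["nasa_reviews.xml", "nasa_meta.xml", "nasa_files.xml", "example.jpg"]
xml_names = match_names(names, "*xml")
```

With the pattern '*xml', xml_names keeps the three names ending in xml and drops example.jpg.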
Get an Item object.
>>> from internetarchive import get_item
>>> item = get_item('nasa')
>>> item.item_size
121084
Return a new ArchiveSession object. The ArchiveSession object is the main interface to the internetarchive lib. It allows you to persist certain parameters across tasks.
Returns: ArchiveSession object.
Usage:
>>> from internetarchive import get_session
>>> config = dict(s3=dict(access='foo', secret='bar'))
>>> s = get_session(config)
>>> s.access_key
'foo'
From the session object, you can access all of the functionality of the internetarchive lib:
>>> item = s.get_item('nasa')
>>> item.download()
nasa: ddddddd - success
>>> s.get_tasks(task_ids=31643513)[0].server
'ia311234'
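One reason to keep a session around is to reuse it across several lookups. A minimal sketch, assuming only the get_item() method and the exists attribute used in the examples above; existing_items is a hypothetical helper.

```python
def existing_items(session, identifiers):
    """Return the identifiers whose items exist, using a single session."""
    # Each lookup goes through the same session, so its configuration
    # and credentials apply to every call.
    return [i for i in identifiers if session.get_item(i).exists]
```

Called as existing_items(s, ['nasa', 'stairs']), it would return the subset of identifiers for which the session reports an existing item.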
Get tasks from the Archive.org catalog. internetarchive must be configured with your logged-in-* cookies to use this function. If no arguments are provided, all queued tasks for the user will be returned.
Returns: A set of CatalogTask objects.
Returns details about an Archive.org user given an IA-S3 key pair.
Returns an Archive.org username given an IA-S3 key pair.
Modify the metadata of an existing item on Archive.org.
Returns: requests.Response object or requests.Request object if debug is True.
Search for items on Archive.org.
Returns: A Search object, yielding search results.
Upload files to an item. The item will be created if it does not exist.
Returns: A list of requests.Response objects.