(DRS) User Manual for Data Loading
Contents
1. Office for Information Systems, Harvard University Library. DRS User Manual for Data Loading, Version 5.004.

The Digital Repository Service (DRS) provides Harvard-affiliated owners of digital material with a storage and retrieval system for their collections. This manual is designed to assist deposit agents with the batch deposit of materials to DRS. Included are procedures for batch depositing, descriptions of the data loading process, and descriptions of the base elements of the DRS batch DTD. Supporting information about the batch DTD (sample batch.xml files, supplemental metadata dictionaries, etc.) is available in the DRS Documentation Center section of the OIS web site. For more detailed descriptions of DRS services and policies regarding DRS use, consult the DRS section of the OIS web site.

Need help with DRS deposits?
• Consult support information on the Depositor iSite: http://isites.harvard.edu/icb/icb.do?keyword=k26186 (Harvard-only access).
• To report a problem or ask a question about DRS deposits, please use the DRS feedback form: http://nrs.harvard.edu/urn-3:hul.ois:drshelp. If reporting a problem, please describe the activity leading up to the problem and any error messages you receive.

Table of contents (start):
Versions and Revision history
1.0 Registration of Owners and Depositors
2.0 The Batch Deposit Process
2. • For production drop boxes: http://drs.harvard.edu:8080/drs/servlet/WebAdminService?page=view_queue

Although these batch queue status pages are part of the DRS Web Admin system, access to them requires only a valid Harvard ID. Web Admin registration is not required.

3.2 Batch loader reports
The DRS batch loader sends out an email message that reports on the success or failure of a processed batch. Recipients of this message are identified within the batch.xml file. For more information, see:
• Successful load report
• Failed load report

3.2.1 Successful load report
When a batch has been successfully processed, the DRS loader sends a confirmation report to the email addresses specified in the <emailSuccess> element of the batch.xml file. Email contacts for a successful load should include the depositor and the digitizing project leader or manager. It is important to save these reports, since the information included (DRS object IDs, owner-supplied identifiers, URNs) provides a link between deposited objects in DRS and your local information about these objects. The successful load report has the following format:

Version 5.004, Revised January 27, 2010. DRS User Manual for Data Loading.

Subject: DRS LOAD REPORT owner <owner> batch dir <batch_directory_name> batch <batch_name> DB <batch_id>
Batch Summary: Digital Objects Added, Relationships Added, URNs Re
3. DRS consists of an Oracle database and a UNIX file system. Digital objects are stored in the DRS using an OIS load program that reads instructions from a batch control file formatted in XML and named batch.xml. The file system provides physical storage for the objects, while the database tracks file locations on disk, management information about the objects, and object-specific metadata. Before deposits can begin, both the object owner and the depositing agent must be registered with the DRS. To make deposits, the depositing agent transfers objects to a batch directory within a DRS SFTP drop box account. Accompanying these objects is a batch control file (batch.xml) formatted according to the DRS batch DTD.

For more information about the batch deposit process, see:
• What is a deposit to the DRS?
• About the SFTP drop box
• About batch directories
• Performing a batch deposit
• Controlling loading order of batches
• Best practices for depositors and owners
• DRS deposit tools and resources

See also information about the Data Loading Process:
• Checking batch queue status
• Batch loader reports

What is a deposit to the DRS?
Digital objects come in two varieties: objects originally in digital form (born digital) and reformatted objects (changed from analog to digital). Both kinds of electronic objects are loaded into the DRS in the same way. A deposit includes a batch directory containing one or more digital
4. a comma-delimited list of file patterns that should be deleted by the DRS loader after a successful load, even if these files were not loaded into the DRS. Four types of file patterns are supported: (1) specifying the exact file title, e.g. data.out; (2) specifying a file ending, e.g. *.bak; (3) specifying a file start, e.g. temp*; (4) specifying all files, e.g. *.
Elements contained: contactInfo, transaction
Sample use:
<batch name="run17" userval="shooting with new hasselblad digital unit">
<batch name="run17" remove="*.bak,temp*">
<batch name="run17" remove="*">
<batch name="run17" remove="data.copy">

<contactInfo>
Purpose: Area where email addresses are listed to report batch load successes and failures.
Mandatory: yes
Attribute(s): none
Elements contained: emailSuccess, emailFailure

<emailSuccess>
Purpose: Declare a single email address, or a comma-delimited list of addresses, to which the result report is sent.
Mandatory: yes
Attribute(s): none
Sample use: <emailSuccess>diglib@lab.harvard.edu,jane_doe@harvard.edu</emailSuccess>

<emailFailure>
Purpose: Declare a single email address, or a comma-delimited list of addresses, to which errors about the load are reported.
Mandatory: yes
Attribute(s): none
Sample use: <emailFailure>diglib@lab.ha
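The four pattern types of the remove attribute can be sketched in Python. This is only an illustration of the matching idea, assuming the patterns use shell-style * wildcards; the function name files_to_remove and the sample file names are hypothetical, and the DRS loader's actual matching code is not shown in this manual.

```python
from fnmatch import fnmatchcase


def files_to_remove(remove_attr, filenames):
    """Return the file names matched by a comma-delimited pattern list.

    Assumption: patterns use shell-style '*' wildcards, so 'data.out' is an
    exact title, '*.bak' a file ending, 'temp*' a file start, '*' all files.
    """
    patterns = [p.strip() for p in remove_attr.split(",") if p.strip()]
    return [
        name
        for name in filenames
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical leftovers in a batch directory after a successful load.
leftovers = ["data.out", "scan01.bak", "temp_notes.txt", "readme.txt"]
print(files_to_remove("*.bak,temp*", leftovers))
```

Running a pattern list against a local copy of the batch directory before depositing is one way to confirm that only the intended files would match.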
5. The DRS batch DTD is available from the OIS web site: http://hul.harvard.edu/ois/xml/xsd/drs/drs_batch.dtd

This section defines the base elements within the batch DTD. For definitions of type-specific elements in the DTD (e.g. still images, audio), consult the type-specific metadata supplements available in the DRS Documentation Center of the OIS public web site.

The base elements defined in this section are:
<batch>, <contactInfo>, <emailSuccess>, <emailFailure>, <successMethod>, <transaction>, <owner>
<add>, <relationshipMap>, <objectID>, <id>, <relationship>, <urnMap>, <urnmask>, <urn>
<object>, <file>, <objectData>, <ownerSuppliedName>, <billingCode>, <role>, <purpose>, <quality>
<fileFormat>, <createDate>, <access>, <usageClass>, <signature>, <mimetype>, <metadata>

<batch>
Purpose: Root node of the DRS batch loading document.
Mandatory: yes
Required attribute(s):
name: DRS emails a report back to the depositor about successful loads. This report references the load by this name attribute.
Optional attribute(s):
userval: this attribute allows the depositor to associate text information of their choosing with that particular loading session.
directive: used to request special processing (see appendix).
remove
6. objects or request URNs for digital objects in the DRS.
Mandatory: yes
Attribute(s): none
Elements contained: object, relationshipMap, urnMap

<relationshipMap>
Purpose: Defines a DRS relationship. A relationship map in the DRS consists of three items:
• reference to a DRS object
• relationship type
• reference to a DRS object
The references can identify an object previously loaded in the DRS or in the current batch. The relationships are read left to right.
Mandatory: no
Required attribute(s): none
Optional attribute(s): constrained. A relationship can be constraining or unconstrained. A DRS object cannot be removed if any constraining relationships to it exist. Typically, any relationship that exists between two objects with the same DRS owner is constrained. Unconstrained relationships are only used for specifying inter-owner relationships.
Elements contained: objectID, relationship, objectID
Sample use:
<transaction>
<owner>DRS TEST</owner>
<add>
<relationshipMap>
<file>vase.jpg</file>
<relationship value="IS_DERIVATIVE_OF"/>
<file>vase.tif</file>
</relationshipMap>
</add>
</transaction>

<objectID>
Purpose: XML entity used to refer to a DRS object in the current batch or already in the DRS. There are four ways you can identify a DRS digital object during the loading process:
• file: fi
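The relationshipMap sample above can also be generated programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name relationship_transaction is hypothetical, the element and attribute names are the ones shown in the sample, and the owner string is copied from the sample as it appears here. The result is a transaction fragment only, not a complete batch.xml file, and it has not been checked against the DRS batch DTD.

```python
import xml.etree.ElementTree as ET


def relationship_transaction(owner, subject_file, relationship, object_file):
    """Build a <transaction> that adds one relationship between two files.

    The relationship reads left to right:
    subject_file <relationship> object_file.
    """
    transaction = ET.Element("transaction")
    ET.SubElement(transaction, "owner").text = owner
    add = ET.SubElement(transaction, "add")
    rel_map = ET.SubElement(add, "relationshipMap")
    ET.SubElement(rel_map, "file").text = subject_file
    ET.SubElement(rel_map, "relationship", {"value": relationship})
    ET.SubElement(rel_map, "file").text = object_file
    return transaction


xml_text = ET.tostring(
    relationship_transaction("DRS TEST", "vase.jpg", "IS_DERIVATIVE_OF", "vase.tif"),
    encoding="unicode",
)
print(xml_text)
```

Building the fragment from a tree rather than by string concatenation keeps the three-part order (object reference, relationship, object reference) explicit and guarantees well-formed output.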
7. objects and a batch transaction file. The batch control file, written in XML and called batch.xml, describes the objects and defines these actions:
• Add an object
• Create the URN (Uniform Resource Name) for this object
• Create a relationship between objects

About the SFTP drop box
Batch deposits are sent to OIS over the Harvard campus network. Each depositing unit will be issued an SFTP drop box on an OIS secure server. The depositor transfers the material by SFTP to one or more batch directories within his or her drop box. Depositing units may be issued multiple drop boxes to improve workflow.

The drop box issued to a depositing unit will contain the following directories: usr, lib, incoming, etc, and bin. Deposits are made by transferring batch directories and their contents to the incoming directory.

Note: Do not put deposit files directly into the incoming directory. The batch.xml file and digital object files should be within a sub-directory that is under the incoming directory. See Batch directories for more information.

Closing a connection to an SFTP drop box will trigger the queuing of a batch if the drop box contains any batch directories containing a batch control file (batch.xml). A file named LOADING is automatically placed in a batch directory when an SFTP connection is closed and the batch directory contains a batch
8. the LOADING file are deleted. To retry the batch, update any corrupt or missing digital objects and upload the new batch.xml file. After you disconnect from the SFTP session, your batch will be queued for reloading.

4.0 Identifying and Relating Objects
This section describes the options supported by DRS for identifying deposited objects and defining relationships between deposited objects. The process of assigning identifiers and defining relationships occurs when the batch is processed, based on instructions in the batch.xml file. For more information, see:
• Identifying deposited objects
• Assigning a Uniform Resource Name (URN)
• Defining relationships between objects

4.1 Identifying deposited objects
There are three ways to identify an object in the DRS: the DRS object ID, a URN, and owner-supplied metadata. All of these identifiers should be tracked from the successful load report sent to the depositor.

DRS object ID
The DRS object ID is a numerical value generated automatically for every digital object deposited. Every object in the DRS has a unique object ID.

URN
A URN (Uniform Resource Name) can be requested for any object in the DRS. URNs are persistent, location-independent identifiers for network-accessible resources. The key value of a URN is its persistence: an object can be found by its URN even if its file na
9. this document for explanation. The owner or depositor decides the metric for quality. It could be size, resolution, length, or any other measurement that is meaningful. In many cases there may be only one version of a particular digital object. In this case ownerSuppliedName is unique and this value is not needed.
Mandatory: yes
Required attribute(s): value: value of the quality. Should be NA if not applicable. These values range from 1 to 10, where 1 is the lowest quality and 10 is the greatest.
Sample use: <quality value="NA"/>

<fileFormat>
Purpose: The basic nature of the binary material on disk. It is more specific than the DRS metadata type in many cases.
Mandatory: yes
Valid values: ICC, GIF, JPEG, TIFF, TDF, TEXT, PCD, AIFF, RealAudio, APP, WAV, WER, JP2, ZIP, GZIP, PDF

<createDate>
Purpose: The date this digital object was created. This field has the following valid date formats:
YYYYMMDD
YYYY MM DD
YYYY MM DD HH MM SS
Leading zeros should be prepended if necessary. The hours are on 24-hour time.
Mandatory: no
Attribute(s): none
Sample use: <createDate>2001 04 14 15 15 20</createDate>

<access>
Purpose: This element specifies what bodies can access this digital object if it is deliverable. T
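The rule for the quality value (NA, or a whole number from 1 to 10) is simple enough to check before a batch is deposited. The sketch below is an illustration only; the function name is_valid_quality is hypothetical, and it assumes the value is written as plain decimal digits with no leading zeros or surrounding whitespace, which this manual does not state either way.

```python
import re

# 'NA' when not applicable, otherwise an integer from 1 (lowest) to 10 (greatest).
# Assumption: no leading zeros and no surrounding whitespace are allowed.
_QUALITY = re.compile(r"NA|10|[1-9]")


def is_valid_quality(value):
    """Return True if value is 'NA' or a whole number between 1 and 10."""
    return _QUALITY.fullmatch(value) is not None


for candidate in ["NA", "1", "10", "0", "11", "na"]:
    print(candidate, is_valid_quality(candidate))
```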
10. tools and resources
Consult the DRS Documentation Center page on the OIS web site for information and links on topics related to DRS batch deposits, including:
• the DRS batch DTD,
• sample batch.xml files,
• type-specific metadata supplements, and
• suggestions for XML validators, MD5 checksum utilities, and SFTP sources.

The Data Loading Process
Once a batch is deposited and the DRS loader detects the presence of a batch.xml file, the data loading process starts.

Note: While batch deposits can occur at any time, the DRS batch loading service processes batches only during business hours, Monday to Saturday, 8am to 8pm. Batches deposited after these hours will be processed the next business day.

The DRS loading process polls the SFTP drop boxes every few minutes for queued batch directories during times that the loader is running. The data loading process includes these steps:
• All queued batch directories are sorted according to when they were put on the queue (when the LOADING file was put in the batch directory, corresponding to when the SFTP connection was closed) and the sort order of batch directory names.
• Batches are processed and digital objects are deposited according to instructions provided in the batch.xml file. To check the status of your batch, consult the Batch queue status page.
• When a batch is completed, the loader sends an email report of the results. See Batch loader reports for more information.
•
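The queue ordering described in the first step above can be sketched as a two-part sort key. This is an illustration, not the loader's code: the QueuedBatch type and processing_order function are hypothetical names, and the sketch assumes that queue time is compared first and the directory name is used to order batches queued at the same moment.

```python
from datetime import datetime
from typing import NamedTuple


class QueuedBatch(NamedTuple):
    directory: str       # batch directory name
    queued_at: datetime  # when the LOADING file was placed in the directory


def processing_order(batches):
    """Sort by queue time first, then by batch directory name."""
    return sorted(batches, key=lambda b: (b.queued_at, b.directory))


same_session = datetime(2010, 1, 27, 9, 30)
queue = [
    QueuedBatch("batch_2", same_session),
    QueuedBatch("batch_1", same_session),
    QueuedBatch("batch_0", datetime(2010, 1, 27, 9, 45)),
]
print([b.directory for b in processing_order(queue)])
```

Under this assumption, several batch directories dropped off in one SFTP session are queued together when the connection closes, so their names decide the order among them.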
11. xml file. The presence of the LOADING file is what triggers a batch to be put into the loading queue. Please do not transfer the batch.xml file to a batch directory until the batch is ready to be loaded, and do not try to create or delete the LOADING file yourself.

SFTP connections can still be made to drop boxes while batches are processing, so that additional batches can be dropped off. Please refrain from doing anything with batch directories that have already been fully transferred to a drop box, to avoid conflicts with the DRS loading application.

About batch directories
All batch deposits to the DRS must be made within batch directories. A batch directory is a properly named sub-directory within the incoming directory of a drop box. All batch contents, including the batch control file (batch.xml), must be located in a batch directory. The figure below illustrates three batch directories within the dropboxX drop box.

dropboxX
  incoming
    arbitrarydir1 (BATCH1)
      LOADING
      batch.xml
      batch contents
    arbitrarydir2 (BATCH2)
      LOADING
      batch.xml
      batch contents
    arbitrarydir3 (BATCH3)
      LOADING
      batch.xml
      batch contents

In this example, a depositor has transferred 3 batch directories (arbitrarydir1, arbitrarydir2, and arbitrarydir3) to a drop box called dropboxX. The presence of a LOADING file indicates that these batches are already in the loading queue.
12. After a successful load, all files that have been loaded into the DRS will be deleted from the batch directory, along with the batch.xml file, the LOADING file, and any empty directories. Objects in the drop box that were not part of the load will not be removed. If the batch directory becomes empty after deleting these files, the batch directory will be deleted.

Note that depositors can specify file names and file name patterns in the batch.xml file that the DRS loader should delete after a successful load, even though these files were not loaded into the DRS. See the remove attribute of the <batch> element for details.

After a failed load, all the digital objects associated with an error batch are left in the depositor's batch directory, while the batch.xml file and the LOADING file are deleted.

The depositor is responsible for all data until the DRS has sent a confirmation report. Best practice requires depositors to keep a copy of all data until the project quality control is complete.

Related topics:
• Checking batch queue status
• Batch loader reports

3.1 Checking batch queue status
Depending on the size of the deposited data, the load may take multiple hours to process and report. The status of batches in the queue can be seen at:
• For test drop boxes: http://drstest.harvard.edu:9011/drs/servlet/WebAdminService?page=view_queue
13. CONTAINER   containerMetadata

Mandatory: yes
Required attribute(s): type: metadata type. See table above for appropriate values.
Sample use:
<metadata type="IMAGE">
<imageMetadata> (consult Image appendix) </imageMetadata>
</metadata>
or
<metadata type="TEXT">
<textMetadata> (consult text appendix, not yet available) </textMetadata>
</metadata>

7.0 Requesting Assistance with Batch Loading
The HUL Office for Information Systems (OIS) provides administrative and technical support for DRS and its related systems.
• For questions about registration and setup to use DRS, please contact the Digital Projects Team.
• To report a problem or ask a question about batch deposits or other DRS technical issues, use the OIS Support Center.
If you need to talk with someone in person, consult the OIS Support Center for the current DRS contact in OIS.
14. Here is a sample failure report:

To: xxxxx@fas.harvard.edu
To: drs-support@xxxxx.harvard.edu
Subject: DRS: Error processing current batch

There was an input data error while processing your DRS batch. The data files are still in your batch directory and the batch.xml file has been removed. Please see the error text below. Correct and upload the new batch.xml (and problematic object files, if any exist) to restart the loading process. For more information, contact drs-support@xxxxx.harvard.edu.

Error Text
Drop Box: fal6ftp
Batch Directory: NewacqAD309
Batch Name: NewacqAD309_3_10_2004
Context: validate
Transaction: unknown
Top Level Message: Following files not found: U556253_1_smdl.jpg, U556254_1_smdl.jpg, U556253_1_prdwork.jpg, U531317_1.tif, U531315_1_smdl.jpg, U531315_1_prdwork.jpg, U556254_1_lgdl.jpg, U556254_1.tif
Embedded Exception Type: none

Note: If the failure is caused by errors in the header of the batch.xml file, processing may fail before the loader can read the failure email addresses. In this case, the loader will report the error to DRS staff in OIS only. If you submit a batch that does not go through and no email report is received, contact DRS staff by submitting a DRS support request at http://nrs.harvard.edu/urn-3:hul.ois:drshelp.

All the digital objects associated with an error batch are left in the depositor's batch directory, while the batch.xml file and
15. arantee that the string created will be unique. Submitting URN masks that generate non-unique values will result in an error and the rejection of the request to generate a name. To guarantee the generation of a unique name, include the unique integer value component n in the mask.

Examples:
URN MASK                      CREATED VALUE
urn 3 dig n                   urn 3 dig 75
urn 3 FAL yyyy n              urn 3 FAL 1999 76
urn 3 HCL DIG yyyy mo dd      urn 3 HCL DIG 20000103 (unique only for one object on any given day)
urn 3 HCL DIG yyyy mo dd n    urn 3 HCL DIG 200001032 (always unique)

Defining relationships between objects
The DRS provides a flexible and powerful mechanism for defining relationships between objects that have been placed into repository storage. The relationships may be one-to-one, many-to-one, or many-to-many. The syntax for defining a relationship follows the pattern of a simple English sentence: <subject> <verb> <object>. The following pseudo-code constructions represent some possible object relationships:
• File 123 is derived from File 345
• Object id 20 is target for File 678
• Object id 25 is derived from Object id 27

Relationships can only exist between objects in the DRS. These relationships can be added through the batch.xml transaction file during deposit of the related objects, or any time after. Consul
16. are depositing. It has base metadata about the digital object that is used for billing, identification, validation, and access. Type-specific data about this object is stored in the metadata element.
Attributes: none
Elements contained: ownerSuppliedName, billingCode, role, purpose, quality, fileFormat, signature, createDate, mimetype, access

<ownerSuppliedName>
Purpose: Depositors must provide a name for each deposited object. Most often this is a tracking number used by the depositor. This name, in combination with role/purpose/quality, must be unique within the owner's set of objects in the DRS. Please see the Identifiers part of this document, which explains this.
Mandatory: yes
Attribute(s): none
Sample use: <ownerSuppliedName>music123 abc</ownerSuppliedName>

<billingCode>
Purpose: These codes are supplied to the owning organizations by the Office for Information Systems (OIS). They should be received upon owner registration.
Mandatory: yes
Attribute(s): none
Sample use: <billingCode>HUL OWNER XYZ_0001</billingCode>

<role>
Purpose: Defines the role of the object as compared to other digital objects that are the same logically but in a different format. See the Identifiers part of this document for explanation. In many cases the
17. ask specifies a format for generating a URN. Please see the URN Masks part of this document, which describes this.
Mandatory: no
Attribute(s): none
Sample use: <urnmask>urn 3 DRS Guest n</urnmask>

<urn>
Purpose: To request a specific URN for a digital object. This requested URN must have the correct authority path and format.
Mandatory: no
Attribute(s): none
Sample use: <urn>urn 3 DRS Guest 12345</urn>

<object>
Purpose: This element brackets base object data that is to be added to DRS. An object during the load process consists of a file name, base object data, and type-specific metadata. Each of these items is captured as a sub-element of object.
Mandatory: no
Attribute(s): none
Elements contained: file, objectData, metadata

<file>
Purpose: The depositor provides the name of the physical file that has been dropped into the SFTP drop box. The file names must be unique within the batch.xml driver file. This file name is retained in the DRS but is not searchable (see ownerSuppliedName for defining a local identifier).
Mandatory: yes
Attribute(s): none
Sample use: <file>vase_aquatic.tif</file>

<objectData>
Purpose: Specifies crucial data about the digital object that you
18. ch as IS_PART_OF).
Mandatory: yes
Required attribute(s): value: the relationship text. The relationships supported by the DRS are listed in the table below.
Sample use: <relationship value="IS_DERIVATIVE_OF"/>

Name                              Metadata Supplement
HAS_DECLARATION                   Text
HAS_DTD                           Text
HAS_ENTITIES                      Text
IS_AUXILIARY_OF                   Audio
IS_DERIVATIVE_OF                  Image
IS_ICC_FOR                        Image
IS_ICC_OF                         Image
IS_IMAGE_FOR                      Image
IS_INDEXED_BY                     Text
IS_OCR_FOR                        Text
IS_OCR_OF (deprecated)            Text
IS_PART_OF                        Audio, Text
IS_PRESERVATION_REPLACEMENT_OF
IS_RELATED_CHANNEL_OF             Audio
IS_TARGET_OF                      Image
IS_TDF_FOR                        Image
IS_WAVEFORM_FOR                   Audio
IS_WORLD_FILE_OF                  Image

<urnMap>
Purpose: This element allows you to name a digital object in the current batch or one that is already in the DRS. You may request a specific URN using the <urn> element, or you may request that one be generated for you using the <urnmask> element.
Mandatory: no
Attribute(s): none
Elements contained: objectID, urnmask, urn
Sample use:
<transaction>
<owner>DRS TEST</owner>
<add>
<urnMap>
<file>vase.jpg</file>
<urnmask>urn 3 DRS Guest n</urnmask>
</urnMap>
</add>
</transaction>

<urnmask>
Purpose: To request URN generation for a particular digital object. A URN m
19. Batch directories can contain any number of sub-directories, which in turn can contain any number of sub-directories. Everything within a single batch directory is considered part of the same batch. Batch directories can be named according to depositor preference, as long as:
• There is no other directory with the same name in the incoming directory of that drop box.
• The batch directory name is less than 101 characters.
• The batch directory name contains only letters, digits, underscores (_), and hyphens (-).

The table below lists examples of valid and invalid batch directory names.

Examples of VALID batch directory names: batch, 1batch, batch1, _batch, batch_1, batch_2005_06_01, 20050601_150502
Examples of INVALID batch directory names: batch directory; batch batch; batch 2005; batch_2005_06_01 1; a batch directory name which exceeds the character length (this length can be no longer than 100 characters)

A batch directory name must be less than 101 characters and contain only letters, digits, underscores (_), and hyphens (-). Note that the name can start with any of these valid characters, but it is best practice not to start with a hyphen because of the impact on file name sorting.

Performing a batch deposit
Batch directory deposits to DRS can occur at any time to a specific SFTP drop box, even when a different batch directory load is already in progress for that drop box. Because there is n
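The naming rules above lend themselves to a quick local check before a batch directory is transferred. The sketch below is a minimal illustration, assuming "letters" means the ASCII letters A-Z and a-z (the manual does not say whether other alphabets are accepted); the function name is_valid_batch_directory_name is hypothetical. It does not check the first rule, uniqueness within the incoming directory, which depends on the drop box contents.

```python
import re

# Letters, digits, underscores and hyphens only; between 1 and 100 characters.
# Assumption: "letters" means ASCII letters.
_VALID_NAME = re.compile(r"[A-Za-z0-9_-]{1,100}")


def is_valid_batch_directory_name(name):
    """Return True if name satisfies the character and length rules."""
    return _VALID_NAME.fullmatch(name) is not None


for name in ["batch_2005_06_01", "20050601_150502", "batch directory", "b" * 101]:
    print(repr(name[:20]), is_valid_batch_directory_name(name))
```

A name that passes this check may still be a poor choice: as noted above, starting with a hyphen is allowed but discouraged because of its effect on sorting.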
20. enhancement.
1/11/05: Corrected error in <access> element Required Attribute(s) section (p23).
10/12/04: Corrected error in drop box address (p8). Changes to DRS batch report format (p9). Changes to the DRS DTD: local file name now retained in DRS but not searchable; updates to these elements: <relationship>, <fileFormat>, <access>, <mimetype>, and <metadata>.
7/1/2004: Revised all FTP references to SFTP. DRS batch deposit now requires a secure FTP client. Changed the SFTP drop box address to ldidropbox.hul.harvard.edu.
3/23/2004: Updated JPEG2000 mimetypes. Updated imagemetadata supplement address to http://preserve.harvard.edu/resources/imagemetadata.pdf. Updated Appendix MD5 and XML resources links.
10/07/2002: Updated DRS XML validator address to http://drstest.harvard.edu/cgi-bin/drs_validate.pl.

1.0 Registration of Owners and Depositors
In order to deposit digital materials to the DRS, the Harvard organizational owner of the materials must be registered as a DRS object owner, and the agent responsible for depositing these materials must be registered as a DRS batch depositor.

Registration for DRS owners
The Harvard organization with financial and curatorial control over objects being prepared for deposit is called the object owner. Object owners using the DRS for the first time must register a
21. es the mime type of the digital object you are depositing.
Mandatory: yes
Attribute(s): none
Valid mimetypes:
application/x-esri-pyramid-file
application/x-esri-statistics-file
application/x-icc
application/x-sonic-waveform-reduction
application/x-wavelab-waveform-reduction
application/zip
application/pdf
audio/x-aiff
audio/x-wave
audio/x-pn-realaudio
image/jp2
image/gif
image/jpx
image/jpeg
image/x-photo-cd
image/tiff
text/sgml
text/plain
text/xml
Sample use: <mimetype>text/plain</mimetype>

<metadata>
Purpose: This element holds the type-specific metadata for the digital object. As opposed to the base object metadata above, different metadata types (images, text, audio) have completely different sets of attributes required for sufficient description and archiving. This tag also specifies the DRS metadata type. Specifying a metadata type (IMAGE) and a metadata sub-element (imageMetadata) may seem redundant. However, in some cases more than one metadata sub-element may be used for the same metadata type element. Here is a list of valid DRS metadata types and the metadata sub-elements they should use:

Metadata type   Sub-element      Supplement
APP             appMetadata      Application
AUDIO           audioMetadata    Audio
IMAGE           imageMetadata    Image
TARGET          imageMetadata    Image
TEXT            textMetadata     Text
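The pairing of metadata type and sub-element in the table above can be captured in a small lookup. The following sketch uses Python's standard xml.etree.ElementTree module; the names SUB_ELEMENT_FOR_TYPE and metadata_element are hypothetical, only the five rows listed above are included, and the sub-element is left empty because its contents are defined in the type-specific supplements rather than in this manual.

```python
import xml.etree.ElementTree as ET

# Metadata type -> sub-element name, copied from the table above.
SUB_ELEMENT_FOR_TYPE = {
    "APP": "appMetadata",
    "AUDIO": "audioMetadata",
    "IMAGE": "imageMetadata",
    "TARGET": "imageMetadata",
    "TEXT": "textMetadata",
}


def metadata_element(metadata_type):
    """Build <metadata type="..."> with the matching, still empty, sub-element."""
    try:
        sub_name = SUB_ELEMENT_FOR_TYPE[metadata_type]
    except KeyError:
        raise ValueError(f"unknown DRS metadata type: {metadata_type!r}") from None
    metadata = ET.Element("metadata", {"type": metadata_type})
    ET.SubElement(metadata, sub_name)
    return metadata


target = metadata_element("TARGET")
print(target.get("type"), target[0].tag)
```

The TARGET row shows why the type attribute is not redundant: it shares the imageMetadata sub-element with IMAGE, so the sub-element alone would not identify the metadata type.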
22. Table of contents (end):
<signature>
<mimetype>
<metadata>
7.0 Requesting Assistance with Batch Loading

Versions and Revision history
12/1/09, Version 5.004: Replaced links to old documentation; corrected some links.
7/16/09, Version 5.003: Added PDF to the list of accepted DRS formats and mimetypes.
3/6/09, Version 5.002: Removed references to unacceptable JPEG 2000 filename extensions (.jpx and .jpf).
4/8/08, Version 5.0: Added description of new usageClass and new successMethod attribute, and updated links to correspond with new OIS website release and new DRS storage architecture changes.
3/21/07, Version 4.008: Changed hours that batch processing occurs.
8/1/06, Version 4.007: Changed max characters in batch directory name from 32 to 100.
7/11/05, Version 4.006: Republished in HTML and PDF versions. Minor reorganization of contents for delivery as HTML. Removed Chapter 4 section on space planning. Added details to Chapter 3 section on correcting data already deposited.
6/16/05: Added IS_PRESERVATION_REPLACEMENT_OF relationship.
6/10/05: Added mime type application/zip.
6/08/05: Added new batch report summary.
6/01/05: Revised for multiple batches per drop box loading
23. f the report lists information for each object deposited. The data in this section is tab-delimited, with the following descriptors listed at the top:
• Filename
• Owner supplied name
• URN (if no URN was requested in the current batch, a null value is returned here)
• Object ID
• Mimetype
• File Size
• Insertion Date
• Role
• Purpose
• Quality
• Owner
• Access Flag
• Usage Class
• MD5 Signature

The relationship section of the report lists the following information for each relationship added:
• DRS object ID for 1st object
• Relationship
• DRS object ID for 2nd object
• Owner

The URNs requested section of the report lists any URNs requested for objects outside the current batch. The batch loader allows you to request a URN for an object already in the DRS. In most cases, objects will be named during the deposit of that digital object. In that case, the URN is listed above in the digital object section. For URNs requested for objects outside the current batch, the report provides this information:
• DRS object ID
• URN
• Owner

3.2.2 Failed load report
If any errors are detected during a load, the entire batch is rejected and an error report is sent to the email addresses in the <emailFailure> element of the batch.xml document. The error email will contain the name of the batch directory that failed, along with a message describing the cause of the failure.
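Because the digital object section of the success report is tab-delimited with a header row, it can be read with Python's standard csv module to keep local records in step with DRS object IDs. This is a sketch under assumptions: the header strings are taken from the descriptor list above and may not match the report character for character, the sample rows are invented for illustration, and only four of the columns are shown.

```python
import csv
import io


def parse_object_section(section_text):
    """Parse the tab-delimited object section into a list of dicts."""
    return list(csv.DictReader(io.StringIO(section_text), delimiter="\t"))


# Hypothetical excerpt; a real report carries all fourteen columns.
sample_section = (
    "Filename\tOwner supplied name\tURN\tObject ID\n"
    "vase.tif\tvase001\t\t12345\n"
    "vase.jpg\tvase001\t\t12346\n"
)

rows = parse_object_section(sample_section)
file_to_object_id = {row["Filename"]: row["Object ID"] for row in rows}
print(file_to_object_id)
```

The two sample rows share an owner supplied name, so the lookup is keyed on file name instead; the empty URN field is read as an empty string.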
24. here are three options for this: P for public, R for restricted, and N for no access. A public object is available to the world. A restricted object is available only to Harvard. An object marked as having no access is only available to administrators via the DRS Web Admin.
Mandatory: yes
Required attribute(s): value: P (public), R (restricted), N (no access). Only capital letters are accepted.
Sample use: <access value="P"/>

<usageClass>
Purpose: Each digital file is stored according to its usage classification as either high use or low use. Deliverables used for public access should be classified as HIGHUSE. Preservation and archival versions (masters that are not being delivered) and other dark objects should be classified as LOWUSE.
Mandatory: yes
Required attribute(s): value: HIGHUSE, LOWUSE
Sample use: <usageClass value="HIGHUSE"/> or <usageClass value="LOWUSE"/>

<signature>
Purpose: The signature validates the integrity of the digital object during transfer between the depositor's system and the DRS. It is also used to check file validity within the DRS after the object has been deposited. MD5 signatures are always character strings of length 32 specifying a hexadecimal checksum. All letters (a-f) should be lowercase.
Mandatory: yes
Required attribute(s): type: the type of the signature. Currently the DRS only supports MD5 signatures.
Sample use: <signature type="MD5">7c9b35da4f2ebd436f1cf88e5a39b3a2</signature>

<mimetype>
Purpose: Specifi
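A signature value in the form described above (32 lowercase hexadecimal characters) can be produced with Python's standard hashlib module. The sketch below is one way to do it, not a tool the DRS provides; the function names md5_hex, md5_signature and looks_like_md5_signature are hypothetical.

```python
import hashlib
import re

_MD5_HEX = re.compile(r"[0-9a-f]{32}")


def md5_hex(data):
    """MD5 checksum of a bytes object as 32 lowercase hex characters."""
    return hashlib.md5(data).hexdigest()


def md5_signature(path, chunk_size=1 << 20):
    """MD5 checksum of a file, read in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_md5_signature(text):
    """True if text is exactly 32 characters drawn from 0-9 and lowercase a-f."""
    return _MD5_HEX.fullmatch(text) is not None


print(looks_like_md5_signature("7c9b35da4f2ebd436f1cf88e5a39b3a2"))
```

hexdigest() already returns lowercase letters, so its output needs no further conversion; the pattern check is mainly useful for signatures produced by other utilities, some of which print uppercase.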
25. lename of the object in the current batch. This identifier can only be used locally within a batch.
• urn: the URN of an object already in the DRS. This identifier cannot be used for referring to objects in the current batch. Note that only objects that have had specific add URN transactions are named in the DRS. Not all objects have URNs.
• id: the DRS object identifier. Each object in the DRS has a numerical identifier associated with it upon deposit. This is the key for digital objects in the DRS. Every object already in the DRS must have an object id. This identifier cannot be used for referring to objects in the current batch.
• ownerSuppliedName, role, purpose, quality, mimetype: the combination of these can be used to identify an object either in the current batch or already in the DRS. Note that these must be unique within the owner's space.

<id>
Purpose: The DRS object id is a numerical value that uniquely identifies a digital object in the DRS. Note that this is different from a URN, which identifies a digital object in a globally unique namespace.
Mandatory: no
Attributes: none
Sample use: <id>123</id>

<relationship>
Purpose: To specify a relationship between two objects in the DRS. Relationships are used to describe how separate digital objects were produced, such as IS_DERIVATIVE_OF, or how they form a complex object su
26. me or physical location changes. A URN is used just like a URL. A URN is required for objects that will be delivered out of the DRS, for example image files delivered to users of the VIA union catalog. A URN is optional for objects intended for storage only, for example archival versions of objects. Consult the Assigning a Uniform Resource Name (URN) section for information about URN assignment options.

Owner Supplied Name and Role/Purpose/Quality

Each object stored in the DRS must be accompanied by an owner supplied name. This name serves as a unique identifier that links deposited objects with local information about those objects. The depositor must specify this name in the <ownerSuppliedName> element of the batch.xml file for each digital object being deposited. The name must be unique within that owner's collection in the DRS, with one exception: more than one digital object may have the same owner supplied name if the role/purpose/quality values are different.

It is common for many versions of the same logical object to exist in the DRS. For example, there may be a high resolution archival version and a low quality deliverable of the same image. One method for retaining this information is to give both images the same owner supplied name but use different role/purpose/quality values to capture the differences between the physical objects. This use of role/purpose/quality is optional for owners deposit
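The uniqueness rule for owner supplied names can be checked locally before a batch is sent. The sketch below assumes a hypothetical in-memory list of objects for one owner, each a dictionary with the four identifying values; it illustrates the rule as stated in the manual and does not show how the DRS loader itself performs the check.

```python
from collections import Counter


def find_name_collisions(objects):
    """Return identifying tuples that occur more than once for one owner.

    Each object is a dict with the keys ownerSuppliedName, role, purpose
    and quality (a hypothetical local representation of batch entries).
    """
    keys = [
        (o["ownerSuppliedName"], o["role"], o["purpose"], o["quality"])
        for o in objects
    ]
    return sorted(key for key, count in Counter(keys).items() if count > 1)


batch = [
    # Same name, different role: allowed under the exception.
    {"ownerSuppliedName": "acc-1001", "role": "ARCHIVAL_MASTER", "purpose": "NA", "quality": "NA"},
    {"ownerSuppliedName": "acc-1001", "role": "DELIVERABLE", "purpose": "NA", "quality": "NA"},
    # Same name and same role/purpose/quality as the first entry: a collision.
    {"ownerSuppliedName": "acc-1001", "role": "ARCHIVAL_MASTER", "purpose": "NA", "quality": "NA"},
]
print(find_name_collisions(batch))
```

A check like this only covers the objects in the local list; names already used in the owner's collection in the DRS would have to be compared separately.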
27. 5.1 Correcting data already in DRS
5.2 Adding values to controlled vocabulary lists

January 27, 2010. President and Fellows of Harvard College. http hul harvard edu ois systems drs load_manual

6.0 DTD Base Element Descriptions
<emailSuccess>
<emailFailure>
<successMethod>
<transaction>
<owner>
<add>
<relationshipMap>
<id>
<relationship>
<ownerSuppliedName>
<billingCode>
<role>
<purpose>
<quality>
<usageClass>
28. o locking mechanism on drop boxes or batch directories, care should be taken not to interfere with batches already in the queue. If you see a LOADING file in the batch directory, loading of the batch is in progress.

In addition to the steps below, depositing agents should also consider the information in the Best practices section of this manual.

DRS batch deposit requires a secure FTP (SFTP) client. Information about SFTP client options is available on the DRS Documentation Center page. The SFTP drop box address for DRS Production deposits is drsrop hul harvard edu. The SFTP drop box address for DRS QA deposits is drsrop qa hul harvard edu.

To deposit one or more batches, follow these steps:
1. Open an SFTP connection to drsrop hul harvard edu for production deposits, or to drsrop qa hul harvard edu for QA deposits.
2. Transfer a uniquely named batch directory of data to the incoming directory. The batch directory can be named according to depositor preference, as long as it meets the naming constraints described in About batch directories.
3. If you are ready to queue the batch, transfer the batch.xml file to the batch directory. If you are not ready to queue the batch, skip this step.
4. Close the SFTP connection. The act of closing the connection queues any batch directories that contain a batch.xml file. Batches will not be queued until the SFTP sessi
29. on is closed. Opening the SFTP session again is fine, as is maintaining multiple SFTP sessions to the same drop box.
5. If you want to send another batch, go to step 1 and repeat these steps.
6. Once a batch is processed, the DRS loader will send an email message that reports the success or failure of the batch. See The data loading process for more information.

2.5 Controlling loading order of batches

Sometimes a batch depends on the contents of another batch, and you want to make sure one loads before the other. There are two ways to control the queue order of your batches:
• Using the SFTP client. Follow deposit steps 1-4 described in Performing a batch deposit for the batch you want loaded first. Make sure that you close the SFTP connection to queue the first batch. Then follow the same four steps again for the batch you want loaded next. Closing the SFTP connection between transfers ensures that the batch transferred first is loaded first.
• Using batch directory names. To use this method, transfer multiple batches in the same SFTP session. Name your batch directories so that the names of any batches you want loaded earlier come earlier in ASCII sort order. That is: hyphens (-), then digits, then capital letters, then underscore (_), then lower case letters. The table below shows the sorting
30. order of 14 batch directories that had been transferred to a single drop box in a single SFTP session. When you close the SFTP connection, batches are queued in the sort order of the batch directory names.

Queue order sequence    Batch directory name
1                       -1test
2                       -_
3                       -adir
4                       123
5                       1234
6                       124
7                       Adir
8                       Bdir
9                       _1test
10                      _1test1
11                      _1test1_
12                      _test
13                      adirectory
14                      bdirectory

Note that batches from other drop boxes might be queued between these batches, but this order sequence would be maintained.

2.6 Best practices for depositors and owners

• An MD5 signature is required for each object to ensure that the object has been successfully transmitted to DRS. MD5 tool information is available on the DRS Documentation Center page.
• Object owners are strongly encouraged to retain the object on local servers until they have received notification from the DRS that the object has been successfully deposited and quality assurance procedures have been completed.
• Object owners are strongly encouraged to maintain a link between the local system and the DRS. DRS permits users to associate local information, such as an accession number, with an object via the owner supplied identifier field. Please see the Identifying deposited objects section of this manual.

DRS deposit
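The ASCII sort order used for queueing batch directory names can be previewed locally before naming directories. This is a small sketch with made-up directory names; it relies on the fact that Python compares strings by code point, which coincides with ASCII order for these characters. It shows the ordering rule only and makes no claim about how the DRS loader is implemented.

```python
# Hypothetical batch directory names transferred in a single SFTP session.
batch_dirs = ["adirectory", "_test", "Bdir", "124", "-adir", "123", "Adir", "-1test"]

# Code point order for these characters:
# hyphen < digits < capital letters < underscore < lower case letters.
queue_order = sorted(batch_dirs)

for position, name in enumerate(queue_order, start=1):
    print(position, name)
```

Note that the comparison is character by character, so 123 sorts before 124 and 1234 would fall between them; zero-padded numeric prefixes are one simple way to get a predictable sequence.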
31. ors. If the owner supplied name is unique, there is no need to use role/purpose/quality at all. If you are going to use these descriptors, please consult the appropriate metadata supplement for best practice. Metadata supplements are available from the DRS Documentation Center on the OIS web site.

4.2 Assigning a Uniform Resource Name (URN)

A Uniform Resource Name (URN) is a persistent, location-independent identifier for a network-accessible resource. The key value of a URN is its persistence: an object can be found by its URN even if its file name or physical location changes. A URN is used just like a URL.

A URN can be specified for any object in the DRS. A URN is required for objects that will be delivered out of the DRS, for example image files delivered to users of the VIA union catalog. A URN is optional for objects intended for storage only, for example archival versions of objects.

For more information, see either of these topics: URN syntax, URN assignment options.

4.2.1 URN syntax

In the DRS, a URN has the form:
urn-3:<authority path>:<resource name>
Example: urn-3:FHCL:123456
where urn-3 is the namespace identifier, which indicates that the name is part of Harvard's NRS namespace; <authority path> is the authority path, which identifies the Harvard organizational unit responsible for the name; and <resou
32. quested
Attachment: drsbatch_<batch_id>.txt
Digital Objects Added
Relationships Added
URNs Requested

The Subject line of the email message contains basic information about the batch. The body of the report contains a batch summary followed by three sections that describe the digital objects added, relationships added, and URNs requested. These same three sections are also included in a tab-delimited text file attached to the email message.

Within the Subject line of the report:
• owner is the DRS owner code specified in the <owner> element of the batch.xml file.
• batch directory name is the name of the directory to which the batch was deposited.
• batch name is the name of the batch provided by the depositor in the <batch> element of the batch.xml file.
• batch id is an internal identifier generated by the DRS for each deposited batch.

The Batch Summary section, which appears only in the body of the email, includes the following information about the batch:
• Batch directory name
• Batch name
• Batch id
• Owner(s)
• Batch drop off time
• Time waiting to start load
• Loading start time
• Loading end time
• Total load time
• Number of transactions
• Number of files deposited
• Batch size
• Number of files per mime type

The digital object section o
33. rce name> identifies the named object. The <resource name> portion must be unique relative to the specified <authority path>.

A URN in this form (urn-3:FHCL:123456) is not actionable unless it is embedded within a URL. Here is an example of an actionable URN, as found in a catalog:
http://nrs.harvard.edu/urn-3:FHCL:123456
The domain name nrs.harvard.edu refers to the name resolution server for Harvard's NRS namespace.

URN assignments are supplied by the <urnMap> element in the batch.xml file that accompanies a deposit.

4.2.2 URN assignment options

To assign a URN to deposited objects, the depositing agent must have the appropriate authority path and a decision about the style of resource name assignment. The object owner is responsible for providing the authority path. The style of resource name is determined by the owner in consultation with the depositing agent. There are two options for resource name style: request a specific URN, or request that a URN be generated by DRS.

Requesting a specific URN

Requesting a specific URN means fully specifying the URN as a literal string that will be assigned to the object in the deposit process. In the DRS DTD, use the <urn> element to specify the URN. The literal string includes the urn-3 namespace identifier, followed by the appropriate authority path and a unique local identifier (often an accession number) serving as the resource name. The resource name must be unique rela
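The URN syntax and the actionable form described above can be illustrated with a short sketch. It assumes the punctuation urn-3:<authority path>:<resource name>, with colons separating the three parts, and uses the nrs.harvard.edu resolver named in the manual; the function names are hypothetical, and the validation is deliberately minimal.

```python
RESOLVER = "http://nrs.harvard.edu/"  # name resolution server named in the manual


def split_urn(urn):
    """Split a URN into (authority path, resource name).

    Assumes colon separators; any further colons stay in the resource name.
    """
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0] != "urn-3" or not parts[1] or not parts[2]:
        raise ValueError(f"not a urn-3 name: {urn!r}")
    return parts[1], parts[2]


def actionable_url(urn):
    """Embed a URN in a resolver URL so that it can be followed like a link."""
    split_urn(urn)  # raises ValueError if the URN is malformed
    return RESOLVER + urn


print(split_urn("urn-3:FHCL:123456"))
print(actionable_url("urn-3:FHCL:123456"))
```

Splitting on the first two colons only is a design choice for the sketch; whether resource names may themselves contain colons is not stated in this part of the manual.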
34. re may only be one version of a particular digital object. In this case the ownerSuppliedName should be unique and this value is not needed.
Mandatory: yes
Required attribute(s): value. The value of the role. Should be NA if not applicable.
Sample use: <role value="NA"/>

Name                 Metadata Supplement
ARCHIVAL_MASTER      Image
DELIVERABLE          Image
PRODUCTION_MASTER    Image
CONTAINER
NA

<purpose>
Purpose: Defines the purpose of the object as compared to other digital objects that are the same logically but in a different format. See the Identifiers part of this document for explanation. In many cases there may be only one version of a particular digital object. In this case ownerSuppliedName is unique and this value is not needed.
Mandatory: yes
Required attribute(s): value. The value of the purpose. Should be NA if not applicable.
Sample use: <purpose value="NA"/>

Name         Metadata Supplement
BITONAL      Image
COLOR        Image
CROPPED      Image
GRAYSCALE    Image
PRINT        Image
PROCESSED    Image
RAW          Image
VIEW         Image
WHOLE        Image
NA

<quality>
Purpose: Defines the quality of the object as compared to other digital objects that are the same logically but in a different format. See the Identifiers part of
35. rvard.edu jane_doe@harvard.edu</emailFailure>

<successMethod>
Purpose: Request a method of delivery for DRS success reports.
Mandatory: yes. Note that if this element is missing from batch.xml, the DRS loader will add it at deposit with the default value EMAIL.
Required attribute(s): value. Possible values are EMAIL, DROPBOX, ALL.
Sample use: <successMethod value="EMAIL"/>

<transaction>
Purpose: Marks the start and end of an individual operation in the DRS.
Mandatory: yes
Required attribute(s): none
Optional attribute(s): userval. This attribute is similar to the userval attribute provided in the opening <batch> element. A unique comment can be provided for each transaction.
Elements contained: owner, add. Currently add is the only operation enabled through the batch loader.

<owner>
Purpose: DRS owner code that specifies who owns the digital object being deposited. This owner code should be given to you during the owner registration process. Every transaction in the DRS must be owned.
Mandatory: yes
Attribute(s): none
Elements contained: none
Sample use: <owner>HUL OWNER</owner>

<add>
Purpose: Marks the start and end of an add operation within a transaction. Currently you can add digital objects relationships between
36. s a digital object owner. To register, submit the online registration form available from the OIS web site: http hul harvard edu ois systems drs f drs owner reg html. Owner registration is a one-time process. More information about the registration process appears in the Digital object owner FAQ on the OIS web site. Consult the List of Registered Owners for a current list of Harvard organizations that have registered as DRS object owners.

Registration for deposit agents

A deposit agent is an individual or organization authorized to deposit batches of digital objects into the DRS. An agent may be a reformatting or digitization vendor that deposits objects on behalf of the Harvard organization that owns the objects, or an individual within the owning organization. New deposit agents must register and prepare for their first deposit by following the steps outlined in How to become a DRS deposit agent. Upon registration, the deposit agent will receive an SFTP DRS drop box account and DRS loading instructions.

Related topic: Maintenance of DRS Data

2.0 The Batch Deposit Process

This section describes the DRS batch deposit process, including SFTP drop boxes and batch directories, as well as the actual deposit procedure. To find out more about how the DRS processes batch deposits, see The Data Loading Process.
37. 2.1 What is a deposit to the DRS
2.2 About the SFTP drop box
2.3 About batch directories
2.4 Performing a batch deposit
2.5 Controlling loading order of batches
2.6 Best practices for depositors and owners
2.7 DRS deposit tools and resources
3.0 The Data Loading Process
3.1 Checking batch queue status
3.2 Batch loader reports
4.0 Identifying and Relating Objects
4.1 Identifying deposited objects
4.2 Assigning a Uniform Resource Name (URN)
4.3 Defining relationships between objects
5.0 Maintenance of DRS Data
38. t the <relationshipMap> element in the DTD element descriptions for a list of currently defined relationships.

If you plan to relate objects between batches, you must track the DRS object IDs returned in loading reports. If digital object A is deposited in batch 1 and digital object B is deposited in batch 2, the relationship transaction in batch 2 must specify the object ID or URN of object A, because object A is not in the current batch. The object id is the preferred identifier for referencing something already in the DRS. The second example above demonstrates this relationship transaction, where object A has id 20 and object B has the file name 678 in batch 2.

5.0 Maintenance of DRS Data

This section covers topics related to the maintenance of DRS data. For more information, see: Correcting data already in DRS, Adding values to controlled vocabulary lists.

5.1 Correcting data already in DRS

The DRS batch deposit process supports the addition of data only. Options at batch deposit include:
• Add digital objects
• Create a URN for an object (an object in the batch or an object already deposited)
• Create a relationship between objects (objects within the batch and also objects already deposited)

Once objects are deposited, you can use the DRS administrative system, called DRS Web Admin, to further manipulate
39. the objects and related metadata. DRS Web Admin allows authorized object owners and deposit agents to view, add, update, and delete their data in DRS using only a web browser. Using DRS Web Admin, authorized users can:
• Deposit individual objects and related metadata
• View and download deposited objects
• Update existing objects and related metadata (replace an object, change its metadata, add or update relationships, create URNs)
• Delete deposited objects

What a user can do from DRS Web Admin depends on the security role assigned to the user and the DRS owner code associated with that role. A user representing a single organization will usually be authorized to manipulate only objects under that organization's DRS owner code. A user working for more than one organization, such as a deposit agent, will be authorized to manipulate objects for multiple owners. Consult the DRS Web Admin section of the OIS web site for more information about functions, security roles, registration, and access.

5.2 Adding values to controlled vocabulary lists

Some elements in the batch loading DTD have a set of controlled terms to select from; examples include relationships, role, purpose, and quality. If the list of terms for a particular element does not reflect your archival needs, contact OIS and ask for the OIS Metadata Analyst.

6.0 DTD Base Element Descriptions
40. tive to the chosen authority path. In the following sample URN:
urn-3:FHCL:ms12345
FHCL is the authority path for Harvard College Library, and ms12345 is the resource name (in this case, the object's accession number).

Requesting that a URN be generated

Requesting that a URN be generated for an object means specifying a URN that is a combination of literal values and auto-generated values. In the DRS DTD, use the <urnmask> element to request that a URN be generated. The URN mask has components that are expanded by the DRS load program when a name is created, at which time the components are replaced in the URN with date and time stamps. These components may be added anywhere in the string and may be repeated if desired. They are enclosed in braces to distinguish them from other parts of the URN string. This means that brace characters may not be part of a name (brace characters are not part of the valid NRS name character set in any event). The components are:

Component    Replacement values    Meaning
dd           01-31                 Current day of the month
hh24         01-24                 Current hour (24-hour clock)
mm           00 to 59              Current minute of hour
mo           01-12                 Current month
n            0 1027                Unique integer value
ss           00-59                 Current second
yyyy         1999-9999             Current year

Note that using the URN mask does not gu
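To make the URN mask idea concrete, here is a minimal local sketch of how brace-enclosed components could be expanded. It is an illustration under stated assumptions, not the DRS load program: the function name is hypothetical, the unique integer is passed in by the caller, zero padding is assumed from the ranges in the table, and Python reports hours as 00-23, which may differ from the 01-24 range listed for hh24.

```python
import re
from datetime import datetime


def expand_urn_mask(mask, now, unique_n):
    """Replace {component} placeholders in a URN mask with concrete values."""
    values = {
        "dd": f"{now.day:02d}",
        "hh24": f"{now.hour:02d}",  # Python hours run 00-23 (see lead-in)
        "mm": f"{now.minute:02d}",
        "mo": f"{now.month:02d}",
        "n": str(unique_n),  # uniqueness is the caller's responsibility here
        "ss": f"{now.second:02d}",
        "yyyy": f"{now.year:04d}",
    }

    def replace(match):
        name = match.group(1)
        if name not in values:
            raise ValueError(f"unknown mask component: {name}")
        return values[name]

    # Components may appear anywhere in the string and may repeat.
    return re.sub(r"\{([^{}]*)\}", replace, mask)


stamp = datetime(2010, 1, 27, 14, 5, 9)
print(expand_urn_mask("urn-3:FHCL:{yyyy}{mo}{dd}-{n}", stamp, 7))
```

Passing the timestamp and the unique integer as arguments keeps the function deterministic, which makes it easy to preview what a given mask would produce.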