Functional Requirements

These requirements have been compiled from many sources: GDFR, the format registry working group, and the UDFR road map.

Scope and Services

  • Manage Identifiers
    • Use a controlled namespace for the unambiguous persistent public identification of digital formats
    • Identifiers should support machine actionability as that is their primary use
    • UDFR identifiers should be amenable to human use and resilient against transcription error
  • Provide public interfaces
    • Provide end-user service interfaces (standardized web service interfaces) for discovery and delivery of managed representation information to human and automated agents
      • These services should be usable by format identification tools external to the registry.
      • These services should be usable by other services external to the registry that support preservation planning and activities.
    • Distributed discovery and delivery of representation information, providing high availability at any time from any place
    • Guaranteed public access to the representation information
      • Exception: some representation information may be suppressed from public view for a limited embargo period in order to protect proprietary, trade secret, or other legally encumbered information.
  • Provide storage
    • Store format representation information for as many different formats as possible
    • For each format, store sample files created by different applications (e.g. a PDF produced by Adobe PDF Library 4.8).
    • Store format specifications
    • The UDFR will follow evolving best practices for the secure, sustainable management of format representation information
    • Representation information should be replicated at one or more geographically dispersed locations
  • Support distributed input of registry information
    • The input process should be easy to use but should result in machine-actionable output
  • Support editorial oversight or some form of quality vetting of contributed registry information
  • Support local copies of the registry
    • Ability to manage, in the local copy of the registry, local information that is not to be shared with other registry instances. This requirement has two goals: to manage in the local instance information that the institution has no right to share (information on software, for example), and to manage information on local preferences (format preferences, local policies…).
    • For local registry instances: ability to express value assessments and priorities for using formats. For example: prefer .rtf over .doc, prefer TIFF over JPEG.
    • Provide a mechanism for the distribution of the registry data to the local instances
  • Be maintained and sustained by a permanent governance body
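
The identifier requirements above (machine actionable, amenable to human use, resilient against transcription error) do not prescribe a scheme. One way to meet them, shown here only as a minimal sketch, is to append a check character computed from a position-weighted sum over a small alphabet. The alphabet, the function names `check_char`, `mint`, and `is_valid`, and the identifier body used in the example are all assumptions made for illustration and are not part of the UDFR requirements.

```python
# Hypothetical identifier alphabet: digits plus consonants, with vowels and
# the easily misread letters "l" and "y" left out. It has 29 symbols, and a
# prime alphabet size is what makes the weighted checksum below effective.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def check_char(body: str) -> str:
    """Return the check character for an identifier body.

    Each character's alphabet index is multiplied by its 1-based position,
    and the total is reduced modulo the alphabet size. For bodies shorter
    than the alphabet size, any single wrong character or any swap of two
    different adjacent characters changes the result.
    """
    total = sum(pos * ALPHABET.index(ch) for pos, ch in enumerate(body, start=1))
    return ALPHABET[total % len(ALPHABET)]


def mint(body: str) -> str:
    """Append the check character to a body (assumed to use ALPHABET only)."""
    return body + check_char(body)


def is_valid(identifier: str) -> bool:
    """Check that an identifier uses the alphabet and ends in the right check character."""
    if len(identifier) < 2 or any(ch not in ALPHABET for ch in identifier):
        return False
    return check_char(identifier[:-1]) == identifier[-1]


if __name__ == "__main__":
    ident = mint("f4k2")          # "f4k2" is an arbitrary example body
    print(ident, is_valid(ident))
    print(is_valid("f4k3r"))      # one character mistyped
    print(is_valid("f42kr"))      # two adjacent characters swapped
```

A check character of this kind lets both a web form and an automated client reject a mistyped identifier before any registry lookup is made. It detects errors but does not correct them.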

Data Model

  • General
    • The UDFR representation information should be capable of expressing the descriptive, administrative, syntactic, semantic, and behavioral properties of formats pertinent to curation and preservation analysis, decision making, and activities
    • All representation information will be tagged to indicate the level of centrally-coordinated review by the UDFR governing authority or its designees.
    • All representation information will be tagged with provenance information sufficient to provide a complete audit trail of changes over time.
    • Support embargo metadata for representation information to protect proprietary, trade secret, or other legally-encumbered information.
  • Identifiers
    • Support the binding of various typed representation information [ISO 14721] to format identifiers
  • Descriptive representation information
    • Descriptive representation information will include a canonical UDFR identifier
    • Descriptive representation information should include an arbitrary number of namespaced identifiers publicly associated with the format, such as MIME type, PUID, GDFR identifier, Apple UTI, Library of Congress FDD identifier, standard identifier (ANSI, ECMA, AFNOR, ISO, ITU, NISO, etc.), IETF RFC identifier, IANA identifier, W3C recommendation identifier, etc.
    • Descriptive representation information should include an arbitrary number of common names publicly associated with the format
    • Descriptive representation information should include a format version identifier, as issued by the legitimate maintenance agency
    • Descriptive representation information should include a short discursive description of the salient properties of the format
    • Descriptive representation information should include a format classification (using terminology drawn from the GDFR Format Classification) indicating the format's genre, role, composition, encoding form, constraint, basis, domain, and transformative nature
    • Descriptive representation information should include relationships to other formats (using terminology drawn from the GDFR Format Model and Relationships): affinity, containment, definition, extension, modification, requisition, restriction, semantic equivalence, syntactic equivalence, version
    • Descriptive representation information should include an arbitrary number of informative notes
  • Administrative representation information
    • Administrative representation information should include the corporate or individual agent(s) that created the format, the corporate or individual agent(s) that hold the intellectual property rights to the format, the corporate or individual agent(s) responsible for format maintenance, the format creation or release date, the format withdrawal date, an indication of the IPR status of the format, an arbitrary number of informative notes documenting format administrative properties.
      • Agents can be either corporate or individual.
  • Syntactic representation information
    • Syntactic representation information should include an arbitrary number of typed external signatures; an arbitrary number of internal signatures; an indication of byte ordering: big-endian, little-endian, either, both, or unknown; an arbitrary number of format grammars expressed in some formal notation, such as ABNF, BSDL, DFDL, EAST, XCEL, etc.; an arbitrary number of example files; an arbitrary number of informative notes documenting format syntactic properties.
  • Semantic representation information
    • Semantic representation information should include an arbitrary number of specification documents; an arbitrary number of format assessments expressed in some formal notation, such as Library of Congress FDD, etc.; an arbitrary number of informative notes documenting format semantic properties.
  • Behavioral representation information
    • Behavioral representation information should include an arbitrary number of format software dependencies, an arbitrary number of format hardware dependencies, an arbitrary number of format media dependencies, an arbitrary number of informative notes documenting format behavioral properties
      • Behavioral information should also include an arbitrary number of software processes that accept a given format as an input or output but are not necessarily dependencies (that is, they are not necessarily required in order to use the format). These processes should be typed with regard to their supported operation, e.g. validator, transformer, renderer, etc. --Abrams 21:26, 19 May 2010 (UTC)
  • Documents
    • Documents can be described in terms of Title; Edition; Authoring agent(s); Publishing agent(s); Date of publication; an arbitrary number of formal identifiers, such as DDC, ICC, IEC, IETF BCP, IETF RFC, IETF STD, ISBN, ISO, ITU, LCCN, OCLC number, SICI, etc.; Document language; Document type; an indication of the IPR status of the document; an arbitrary number of informative notes documenting document properties; an arbitrary number of files that contain manifestations of the document content.
  • Files
    • Files can be described in terms of Name; File type, such as data, executable, object code, source code, etc.; an arbitrary number of typed message digest values, such as CRC-32, MD5, SHA-1, SHA-256, etc.; an indication of the IPR status of the file; Agent(s) who hold a copy of the file; an arbitrary number of informative notes documenting file properties.
    • Local holdings are described in terms of a locally-meaningful identifier and an indication of public accessibility of the file.
  • IPR status
    • IPR status can be described in terms of the agent holding the rights; the effective date of the rights claim; the expiry date of the rights claim; the legal jurisdiction in which the rights claim is made; the type of rights claim, such as copyright, patent, trade secret, etc.; license terms of use for the items covered under the rights claim; an arbitrary number of informative notes documenting the rights claim.
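
To make the shape of the data model above more concrete, the following is a minimal sketch of a few of its entities as Python dataclasses: a format record with namespaced identifiers, byte ordering, review level and embargo date, a file record with typed message digests, and an IPR status. Every class and field name here is an assumption chosen for illustration. The requirements do not define a schema or a serialization, and most of the listed properties (agents, signatures, grammars, dependencies, documents) are left out for brevity.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class ByteOrder(Enum):
    """The byte-ordering values listed under syntactic representation information."""
    BIG_ENDIAN = "big-endian"
    LITTLE_ENDIAN = "little-endian"
    EITHER = "either"
    BOTH = "both"
    UNKNOWN = "unknown"


@dataclass
class IPRStatus:
    """A rights claim over a format, document, or file (hypothetical field names)."""
    rights_holder: str
    claim_type: str                          # e.g. "copyright", "patent", "trade secret"
    effective_date: Optional[date] = None
    expiry_date: Optional[date] = None
    jurisdiction: Optional[str] = None
    license_terms: Optional[str] = None
    notes: list[str] = field(default_factory=list)


@dataclass
class FileRecord:
    """A file described by name, type, and typed message digests."""
    name: str
    file_type: str                           # e.g. "data", "executable", "source code"
    digests: dict[str, str] = field(default_factory=dict)   # algorithm -> hex digest
    ipr_status: Optional[IPRStatus] = None
    notes: list[str] = field(default_factory=list)

    @classmethod
    def from_bytes(cls, name: str, file_type: str, content: bytes,
                   algorithms: tuple[str, ...] = ("sha1", "sha256")) -> FileRecord:
        """Build a record and compute one digest per requested algorithm."""
        digests = {alg: hashlib.new(alg, content).hexdigest() for alg in algorithms}
        return cls(name=name, file_type=file_type, digests=digests)


@dataclass
class FormatRecord:
    """A small subset of the descriptive, syntactic, and general properties."""
    udfr_id: str                             # canonical identifier (placeholder value below)
    description: str
    identifiers: list[tuple[str, str]] = field(default_factory=list)  # (namespace, value)
    names: list[str] = field(default_factory=list)
    version: Optional[str] = None
    byte_order: ByteOrder = ByteOrder.UNKNOWN
    example_files: list[FileRecord] = field(default_factory=list)
    review_level: str = "unreviewed"         # hypothetical review-level tag
    embargo_until: Optional[date] = None
    notes: list[str] = field(default_factory=list)

    def is_public(self, today: date) -> bool:
        """A record is public unless an embargo date is set and not yet reached."""
        return self.embargo_until is None or today >= self.embargo_until


if __name__ == "__main__":
    sample = FileRecord.from_bytes("sample.pdf", "data", b"%PDF-1.4\n")
    record = FormatRecord(
        udfr_id="example-id",                # placeholder, not a real UDFR identifier
        description="Portable Document Format",
        identifiers=[("MIME", "application/pdf")],
        names=["PDF"],
        version="1.4",
        example_files=[sample],
    )
    print(record.is_public(date.today()), sorted(sample.digests))
```

Modelling identifiers as (namespace, value) pairs keeps the "arbitrary number of namespaced identifiers" requirement open-ended: a new identifier scheme needs no schema change. A real registry would also need the provenance audit trail, which this sketch omits.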
