Catalog Structure & File Naming

PARADISEC’s catalog is organised hierarchically: a collection contains items, which contain files. PARADISEC will assign a collection identifier (usually the depositor’s initials followed by a digit, e.g. ‘XP1’).

Each item identifier must be unique within its collection, and each deposited item (e.g. each tape) should have its own identifier.

Related files can share the same base name. For example, an .eaf transcription file has the same base name as its corresponding audio or video file, and both sit inside the same item. Using collection ‘EG1’, item ‘Fred’, and file ‘20150108’ as an example: the audio file would be named EG1-Fred-20150108.wav and its transcript EG1-Fred-20150108.eaf. The item-level description is recorded against EG1-Fred. You do not need to list filenames in the metadata; only item-level information is required. Files simply need to be named correctly so that they are associated with the right items when archived.
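As an illustration of how the three-part name ties files to items, the following is a minimal sketch in Python. It is not a PARADISEC tool: the function names split_name and group_by_item are hypothetical, and the sketch assumes every filename has exactly the form COLLECTION-ITEM-FILE.ext.

```python
from collections import defaultdict
from pathlib import PurePath


def split_name(filename):
    """Split 'EG1-Fred-20150108.wav' into (collection, item, file part, extension)."""
    path = PurePath(filename)
    parts = path.stem.split("-")
    # Assumption: exactly three non-empty, hyphen-separated components.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected COLLECTION-ITEM-FILE.ext, got {filename!r}")
    collection, item, file_part = parts
    return collection, item, file_part, path.suffix


def group_by_item(filenames):
    """Group filenames under the item they belong to, e.g. 'EG1-Fred'."""
    groups = defaultdict(list)
    for name in filenames:
        collection, item, _, _ = split_name(name)
        groups[f"{collection}-{item}"].append(name)
    return dict(groups)


if __name__ == "__main__":
    files = ["EG1-Fred-20150108.wav", "EG1-Fred-20150108.eaf"]
    for item, members in group_by_item(files).items():
        print(item, members)
```

Both example files end up under the single key EG1-Fred, which is the level at which the description is recorded.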

Naming conventions to follow carefully:

  • No delimiters within names. Hyphens are reserved for separating collection, item, and file identifiers, so do not use hyphens or spaces within any individual component of a name. Ideal names are strings of alphanumerics.
  • Keep names short. Aim for fewer than 10 characters per component where possible. There is an absolute limit of 30 characters on item names, and long filenames may not survive transfer between operating systems.
  • Names are case-sensitive. Use uppercase for collection and item names; use lowercase for file extensions (e.g. .wav, .tif, .pdf). Do not mix upper and lowercase, as this can cause problems locating files later.
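The conventions above can be checked before deposit with a short script. The following is a rough sketch only, not an official validator. The function name check_name is hypothetical, "alphanumerics" is assumed to mean ASCII letters and digits, and the 30-character limit is assumed to apply to the item component on its own. Hard rules are reported as errors; the length and uppercase recommendations are reported as warnings.

```python
import re
from pathlib import PurePath

ALNUM = re.compile(r"[A-Za-z0-9]+")  # assumption: ASCII letters and digits only
MAX_ITEM_LEN = 30                    # absolute limit on item names
SHORT_LEN = 10                       # recommendation: fewer than 10 characters


def check_name(filename):
    """Return (errors, warnings) for a name of the form COLLECTION-ITEM-FILE.ext."""
    errors, warnings = [], []
    path = PurePath(filename)
    parts = path.stem.split("-")
    if len(parts) != 3:
        errors.append("expected exactly three hyphen-separated components")
        return errors, warnings

    for label, part in zip(("collection", "item", "file"), parts):
        # Spaces, underscores, and empty components all fail this test.
        if not ALNUM.fullmatch(part):
            errors.append(f"{label} component {part!r} is not purely alphanumeric")
        if len(part) >= SHORT_LEN:
            warnings.append(f"{label} component {part!r} has {len(part)} characters")

    collection, item, _ = parts
    if len(item) > MAX_ITEM_LEN:
        errors.append(f"item name exceeds {MAX_ITEM_LEN} characters")
    for label, part in (("collection", collection), ("item", item)):
        if part != part.upper():
            warnings.append(f"{label} name {part!r} is not uppercase")
    if path.suffix != path.suffix.lower():
        errors.append(f"extension {path.suffix!r} is not lowercase")
    return errors, warnings


if __name__ == "__main__":
    for name in ["XP1-TAPE01-20150108.wav", "XP1-TAPE01-my file.WAV"]:
        print(name, check_name(name))
```

A name with a hyphen inside a component (for example a date written as 2015-01-08) is rejected at the first check, because it no longer splits into exactly three parts.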

Avoid Data Duplication

To help us preserve digital heritage sustainably and continue providing high-quality services, we ask all depositors to review and rationalise their files before submission. Duplication creates unnecessary storage demands and increases long-term preservation costs.

Before depositing, please:

  • Consolidate versions of the same material
  • Remove duplicate or repeated files
  • Identify final or authoritative versions clearly
  • Use consistent naming to avoid accidental duplication
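
One possible way to find repeated files before depositing is to compare checksums. The sketch below is an illustration under stated assumptions, not a PARADISEC requirement: the function names file_digest and find_duplicates are hypothetical, and SHA-256 is simply one reasonable choice of hash. It reports only byte-identical copies; a re-encoded or edited version of the same recording has a different checksum and still needs to be reviewed by hand.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    # "." is a placeholder; point this at the folder you plan to deposit.
    for group in find_duplicates("."):
        print("Identical content:", ", ".join(str(p) for p in group))
```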