File naming & Data duplication

PARADISEC’s catalog is organised hierarchically: a ‘collection’ contains ‘items’, and each item contains ‘files’. Item identifiers must be unique within a collection. PARADISEC will assign a collection identifier, usually the depositor’s initials followed by a digit, e.g. ‘XP1’. The identifier is in upper case, and case matters. Please be very careful not to mix upper and lower case in names, as this can make files hard to find later. Related files can share a name: for example, an .eaf transcription file has the same name as its media file (audio/video), differing only in extension, and both belong to the same item.

For example, within collection ‘EG1’ there is an item ‘Fred’, and within that item is the audio file ‘20150108’. Its full file name is therefore EG1-Fred-20150108.wav, and its transcript is EG1-Fred-20150108.eaf. The item description is at the level of EG1-Fred. You don’t need to list file names in the metadata, only item-level information. Files must be named correctly so that, when they are archived, they are associated with the items listed in the metadata.

Note:

  • Each deposited item (e.g. each tape) should have a unique identifier.
  • Do not use delimiters (like hyphens or spaces) within the collection, item or file part of a name. Hyphens are reserved characters for our archiving purposes (they separate the parts, as in EG1-Fred-20150108.wav), so ideal names are strings of alphanumeric characters, fewer than 10 characters long if possible (there is an absolute limit of 30 characters on item names). Long file names may not survive transfer between different operating systems, so please keep them as short as you can.
  • Names are case-sensitive, so it is preferable for names to be entirely upper case, while extensions (e.g. .wav, .tif, .pdf) should be lower case.
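
The naming rules above can be checked before deposit with a short script. The following is a minimal sketch in Python, not a PARADISEC tool: the function name check_filename and the exact messages are hypothetical, and it assumes every file name has the form COLLECTION-ITEM-FILE.ext as in the EG1-Fred-20150108.wav example. Only the 30-character item limit, the reserved hyphen, the upper-case collection identifier and the lower-case extension are taken from the guidelines.

```python
import re

# Assumption: each part of a name is plain ASCII letters and digits only.
ALNUM = re.compile(r"[A-Za-z0-9]+")


def check_filename(filename):
    """Return a list of problems found in a file name (empty if none)."""
    problems = []

    # Split off the extension at the last dot.
    stem, dot, ext = filename.rpartition(".")
    if not dot or not stem:
        return ["missing extension"]
    if ext != ext.lower():
        problems.append("extension should be lower case")

    # Hyphens are reserved as separators: COLLECTION-ITEM-FILE.
    parts = stem.split("-")
    if len(parts) != 3:
        problems.append("expected exactly three hyphen-separated parts")
        return problems

    collection, item, name = parts
    for label, part in (("collection", collection), ("item", item), ("file", name)):
        if not ALNUM.fullmatch(part):
            problems.append(f"{label} part must be alphanumeric only")

    if collection != collection.upper():
        problems.append("collection identifier should be upper case")
    if len(item) > 30:
        problems.append("item name exceeds the 30-character limit")

    return problems


if __name__ == "__main__":
    for candidate in ("EG1-Fred-20150108.wav", "EG1-Fred-2015-01-08.WAV"):
        print(candidate, check_filename(candidate))
```

The check reports problems rather than renaming anything, so the depositor stays in control of the final names.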

Downloads available:
Filenaming for PARADISEC (pdf or docx)

Avoid Data Duplication
To help us preserve digital heritage sustainably and continue providing high-quality services, we ask all depositors to review and rationalise their files before submission.
Duplication creates unnecessary storage demands and increases long-term preservation costs.
Before depositing, please:
  • Consolidate versions of the same material
  • Remove duplicate or repeated files
  • Identify final or authoritative versions clearly
  • Use consistent naming to avoid accidental duplication
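
As one possible aid to the review steps above, exact duplicates can be found by comparing file checksums. The following is a minimal sketch in Python, not a PARADISEC tool: the function names file_digest and find_duplicates are hypothetical, and the folder name in the usage example is a placeholder. It only detects byte-identical copies; different versions or re-encoded copies of the same recording still need to be reviewed by hand.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root that have identical contents.

    Returns a list of lists; each inner list holds the paths of
    two or more byte-identical files.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    # "my_deposit" is a placeholder folder name.
    for group in find_duplicates("my_deposit"):
        print("Identical files:", ", ".join(str(p) for p in group))
```

The script only lists candidates; deciding which copy is the final or authoritative version remains a manual step.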