PARADISEC’s catalog is organised hierarchically: a collection contains items, which contain files. PARADISEC will assign a collection identifier (usually the depositor’s initials followed by a digit, e.g. ‘XP1’).
Each item identifier must be unique within its collection, and each deposited item (e.g. each tape) should have its own identifier.
Related files can share the same base name. For example, an .eaf transcription file would share the same name as its corresponding audio or video file, and both would sit inside the same item. Using collection ‘EG1’, item ‘Fred’, and file ‘20150108’ as an example: the audio file would be named EG1-Fred-20150108.wav and its transcript EG1-Fred-20150108.eaf. Item-level description is recorded at the level of EG1-Fred. You do not need to list filenames in the metadata — only item-level information is required. Files simply need to be named correctly so they are associated with the right items when archived.
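As a rough illustration of how the collection-item-file structure ties files to items, the following Python sketch splits a filename into its components and derives the item identifier. The function names split_archive_name and item_identifier are hypothetical, chosen for this example only; they are not part of any PARADISEC tool.

```python
from pathlib import PurePath


def split_archive_name(filename):
    """Split 'EG1-Fred-20150108.wav' into (collection, item, file, extension).

    Hypothetical helper: assumes exactly two hyphens in the base name,
    as the naming scheme described above requires.
    """
    path = PurePath(filename)
    parts = path.stem.split("-")
    if len(parts) != 3:
        raise ValueError(f"expected collection-item-file, got {filename!r}")
    collection, item, file_part = parts
    return collection, item, file_part, path.suffix


def item_identifier(filename):
    """Return the item-level identifier, e.g. 'EG1-Fred'."""
    collection, item, _, _ = split_archive_name(filename)
    return f"{collection}-{item}"


# The audio file and its transcript resolve to the same item.
print(item_identifier("EG1-Fred-20150108.wav"))
print(item_identifier("EG1-Fred-20150108.eaf"))
```

Because both files map to the same item identifier, metadata written once at the item level covers both of them.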
Naming conventions to follow carefully:
- No delimiters within names. Hyphens are reserved for separating collection, item, and file identifiers, so do not use hyphens or spaces within any individual component of a name. Ideal names are strings of alphanumerics.
- Keep names short. Aim for fewer than 10 characters per component where possible. There is an absolute limit of 30 characters on item names, and long filenames may not survive transfer between operating systems.
- Names are case-sensitive. Use uppercase for collection and item names; use lowercase for file extensions (e.g. .wav, .tif, .pdf). Do not mix upper and lowercase, as this can cause problems locating files later.
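The conventions above could be checked before deposit with a small script. The sketch below is one possible approach, not an official PARADISEC validator: check_name is a hypothetical function, "alphanumerics" is assumed to mean ASCII letters and digits, and the 10-character guideline is reported as a warning-style message rather than a hard failure. It does not check the case of collection or item names.

```python
import re

ALNUM = re.compile(r"[A-Za-z0-9]+")  # assumption: ASCII letters and digits only
MAX_ITEM_LENGTH = 30    # absolute limit on item names stated above
SUGGESTED_LENGTH = 10   # guideline: fewer than 10 characters per component


def check_name(filename):
    """Return a list of problems found in a collection-item-file.ext name."""
    problems = []
    stem, dot, extension = filename.rpartition(".")
    if not dot:
        return ["missing file extension"]
    if extension != extension.lower():
        problems.append("extension should be lowercase")
    parts = stem.split("-")
    if len(parts) != 3:
        problems.append(
            "expected exactly two hyphens separating collection, item and file"
        )
        return problems
    for label, value in zip(("collection", "item", "file"), parts):
        if not ALNUM.fullmatch(value):
            problems.append(f"{label} component must be alphanumeric only")
        if len(value) >= SUGGESTED_LENGTH:
            problems.append(f"{label} component is longer than suggested")
    if len(parts[1]) > MAX_ITEM_LENGTH:
        problems.append("item name exceeds 30 characters")
    return problems


for name in ("EG1-Fred-20150108.wav", "EG1-Fred_1-20150108.WAV"):
    print(name, check_name(name))
```

An empty list means no problems were detected under these assumptions; it does not guarantee the archive will accept the name.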
Avoid Data Duplication
- Remove duplicate or repeated files.
- Identify final or authoritative versions clearly.
- Use consistent naming to avoid accidental duplication.
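One way to find duplicate files before deposit is to compare content hashes, since identical files can hide behind different names. The sketch below is a minimal example under that assumption; find_duplicates and file_digest are hypothetical names, and SHA-256 is simply a convenient choice of hash. It only detects byte-for-byte copies, so re-encoded or trimmed versions of a recording will not be flagged.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root that have identical content.

    Returns a list of lists; each inner list holds two or more paths
    whose bytes are the same.
    """
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling find_duplicates on a folder of material to be deposited returns groups of identical files; deciding which copy is the authoritative version remains a manual step.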