This event has passed.
DPTP: What’s the magic number? The identification, characterisation, and validation of File Formats | £300
18th Jul 2017
If you have a digital preservation strategy that involves digital files, you’ll know how important it is to understand the file formats in which your data is encoded. To do this comprehensively involves at least three main operations: identifying the format, characterising the format, and validating the format. To put it another way, you’re asking three important questions: what is it, what properties does it have, and how is it going to behave? Is the file in front of you really a PDF, or not?
This course will demonstrate some of the tools that are available for carrying out these actions, show you the outputs, and help you in interpreting what they mean and why they are important for long-term preservation – particularly for any future migration operations.
While you may think technical metadata and file format signatures are a bit out of your line, we make them easy to grasp, and at the end of the course you will see how they apply to your collections and your content, and how they impact directly on the meaning and authenticity of your digital objects. You are not expected to carry out any command-line actions on this course, but you will see how easy they are to perform, and feel empowered to “try them at home”.
This course is a detailed examination of some of the operations that are likely to take place in a standard Ingest routine. As such, the course is part of a projected series of learning offerings that, taken together, will help you understand the process of Archival Information Package (AIP) assembly. Since these processes can often be automated in a preservation system, it is useful to see them at work and expose their mechanisms, to better understand the operations and their consequences.
On this course, you will:
- Understand the difference between identification, characterisation and validation of file formats
- Gain awareness of the tools that can be used for helping with these discrete operations
- See these tools in action, via screencasts. The tools include Siegfried, DROID, FITS, JHOVE, and veraPDF
- Have a chance to examine the outputs of the tools in some detail, seeing technical metadata from files and learning what it means
- Be equipped to interpret the outputs of the tools
- Learn the meaning of MIME types, extensions, signatures and pattern matching (the “magic numbers”)
- Gain a basic understanding of Extensible Markup Language (XML)
- Learn about the flavours of the PDF/A format
- Understand how to act on the results of identification, characterisation and validation
- Understand the limitations of the tools, and why they sometimes fail
- Be equipped to begin or enhance a strategy for dealing with file formats, as part of your digital preservation plan
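To give a flavour of what a “magic number” is, here is a minimal Python sketch of signature-based identification. It is not taken from the course materials and it is not how Siegfried or DROID work internally; those tools rely on much richer signature registries and matching rules. The signature table, and the function names `identify_bytes` and `identify_file`, are illustrative assumptions, and only a handful of well-known leading byte sequences are included.

```python
from typing import Optional

# A small, hand-picked table of leading byte signatures ("magic numbers").
# Illustrative only: real identification tools use far larger registries
# and also match patterns at other offsets within the file.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_bytes(header: bytes) -> Optional[str]:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None  # unknown: no signature in our small table matched


def identify_file(path: str) -> Optional[str]:
    """Read the first few bytes of a file and identify them by signature."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

Calling `identify_file` on a file named `report.pdf` whose first bytes are actually `PK\x03\x04` would report a ZIP container rather than a PDF: the extension says one thing and the signature says another. A match on the leading bytes only identifies the format; it says nothing about whether the file is valid, which is why characterisation and validation are treated as separate operations.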
Ed Pinsent wrote the recent iterations of the Digital Preservation Training Programme, building on the work of Kevin Ashley, Jen Mitcham, William Kilbride, Patricia Sleeman and others. Ed is a senior archivist based within the Digital Preservation Team at CoSector, and has been involved in all aspects of digital preservation since 2004. He has a traditional archivist and records manager background, and brings to his teaching a wide range of skills and experience from numerous digital preservation projects.
Who should attend?
- Digital Librarians
- Information Technology managers
- Records managers
- Repository managers
- Collection managers
- Information management professionals
Booking and more information about the course: http://bit.ly/2sY8NbD
More information about our consultancy services and other courses on the Digital Preservation Training Programme (DPTP): http://www.cosector.com/
For enquiries about the course content, please email: firstname.lastname@example.org
For enquiries about bookings and payment please email: email@example.com