Festivity Amongst the File Formats

The turn of the year is a festive period for many of us, and a time that invites reflection on what we have achieved as well as where we are going. So, borrowing the structure of the famous Christmas tale ‘A Christmas Carol’, it is time to look at the past, present and future of PRONOM for this year and the year to come.

PRONOM in the Past

PRONOM owes its success to a dedicated community of file format researchers, digital preservation experts and people both within and beyond the information management field. In the past year we have added 153 new PUIDs and 159 new signatures, and made 213 updates. These may seem like big numbers, but for all those updates, improvements and new entries to our database there are still many more suggestions that we are working on entering. We are incredibly grateful not only to all the people whose work has made it into the database, but also to all of those who have messaged and emailed us and are yet to be recognised in our release notes!

This year, as a team, we were lucky enough to be invited to give a virtual workshop at the BitCurator Forum (available online if anyone wishes to recap), an in-person workshop at the ARA Conference in Belfast and a talk at the ICA Congress in Abu Dhabi. We also followed with great interest the talks that some of our contributors gave, such as those on AV formats at No Time to Wait and on Mac formats at iPres, to name only a few!

As well as all of this amazing work, special mention must go to our two file format analysts, Andrea Hricíková and Andrey Kotov, who moved on to new roles a few months ago. They may have only been working with us for a short time, but they made a huge impact on PRONOM.

PRONOM in the Present

Currently the PRONOM Team is entering the new suggestions created as a result of our PRONOM Research Week. This is a wonderful event, first held in 2019, that we now run each year. Taking place in the week around Digital Preservation Day, it is dedicated time to focus on file formats and research, should you wish to do so. During the week, the community collaborated to tackle the long-standing problem that many file formats in the database have no description. We allowed contributions with no account or sign-up (and optionally anonymously) to make it even easier and more accessible to submit research to PRONOM, and we hope to make this open-access submission process permanent so that we can do the same next year.

Another key area we are working on is taking stock of some of the data discrepancies within PRONOM. As you may see from the large number of updates, one of our key focuses this year was to tackle just a few of these. The work ranged from strengthening our identification process (by tightening signature offsets) to ensuring that the data PRONOM outputs meets our own standards (we are soon to release our updated XML schema). While we are proud of what was achieved, we also recognise there is still a lot more to do. A task for past, present and future!

PRONOM in the Future

In the New Year, we will be publishing the aforementioned XML schema with some new tags. One tag will enable tools such as DROID and Siegfried, should they wish to, to pull out information on the type of format being identified. For example, it could be a type of audio file, video file or text file. For some organisations this could help assess what types of further analysis their records need or how to prioritise their research into format types.
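To make the idea a little more concrete, here is a rough sketch in Python of how a format-type tag might be used to summarise identification results. The schema has not been published yet, so the element and attribute names below (`Formats`, `Format`, `puid`, `FormatType`) and the type labels are purely illustrative assumptions, not the real PRONOM tags, and this is not how DROID or Siegfried will necessarily work.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment: these element and attribute names are made up
# for illustration and are NOT the published PRONOM schema.
SAMPLE_XML = """
<Formats>
  <Format puid="fmt/134"><FormatType>Audio</FormatType></Format>
  <Format puid="fmt/199"><FormatType>Video</FormatType></Format>
  <Format puid="x-fmt/111"><FormatType>Text</FormatType></Format>
</Formats>
"""


def load_format_types(xml_text):
    """Build a lookup from PUID to its (assumed) format type label."""
    root = ET.fromstring(xml_text.strip())
    return {
        fmt.get("puid"): fmt.findtext("FormatType", default="Unknown")
        for fmt in root.iter("Format")
    }


def count_by_type(identified_puids, puid_to_type):
    """Count identified files per format type.

    PUIDs missing from the lookup are counted as 'Unknown'.
    """
    return Counter(puid_to_type.get(p, "Unknown") for p in identified_puids)


if __name__ == "__main__":
    types = load_format_types(SAMPLE_XML)
    # Pretend these PUIDs came back from an identification run.
    results = ["fmt/134", "fmt/134", "x-fmt/111", "fmt/999"]
    for format_type, total in count_by_type(results, types).items():
        print(format_type, total)
```

A summary like this is one way an organisation could see at a glance how much audio, video or text material a collection holds before deciding where to focus further analysis.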

Additionally, we are hoping to upload all of our guidance to our GitHub Research page in markdown format, so it can be edited by anyone. Whilst our team are very knowledgeable, we are the first to admit we do not know everything! We would love for the community to add their own tricks, tips and expertise when it comes to file format research. There is a lot of knowledge that could be pulled together to create comprehensible guidance.

As an extension of this, another goal is to start creating resources for PRONOM and file format research that can be used by those teaching courses relevant to our work, such as digital preservation. This would involve short teaching games and quizzes, such as those used in our workshops, and could extend to lesson plans for potential use by universities or schools. These would be uploaded to our GitHub site for free access by all and iteratively improved depending on the feedback received.

There are a number of other ideas we wish to investigate further in the coming year. How can we communicate more clearly with the community? Is there space for expert groups in areas such as AV file formats or scientific file formats within the current crowdsourcing structure that PRONOM maintains? How can we improve our resources and drop-in sessions? How do we increase our accessibility? As always, we are open to any ideas you have about ways that PRONOM can be improved and how we can do better. So if any of these ideas have resonated with you, or if you have other ideas and opinions on our work, then please do get in touch.

At the end of the year there is time to reflect on all that we have achieved, but we are aware that there is always so much more to do, whether that is dealing with queries, submissions or plans to improve our service. We are a small team who work on PRONOM alongside our core digital archiving responsibilities, so sometimes responses can be slow and sometimes submissions take a while to appear in our releases. Even so, we are looking forward to continuing our work on PRONOM and seeing what the new year will bring!

Wishing everyone a happy festive season, and a huge thank you to the digital preservation community for all the support that you continue to give our team.

This is our PRONOM Contributor Map. We would like to take a moment to thank everyone for their efforts - far and wide.

