ross-spencer's Blog

Digital Preservation Analyst, Archives New Zealand, formerly Digital Preservation Researcher at The National Archives, UK. My interests are format identification, migration, and the creation of tools for the characterization of file formats. I'm a coder. I began in industry developing VoIP solutions in C++. More recently I code in Python. You will find the majority of my recent work on GitHub.com. In the past I have blogged about digital preservation for The National Archives, UK, and I maintain a personal blog at exponentialdecay.co.uk. I am interested in any correspondence relating to digital preservation, either via the form here, or via Twitter: @beet_keeper. Please note, with the exception of matter-of-fact commentary on the work I may be conducting at Archives New Zealand, all opinions and thoughts are my own.

Much of the inspiration for this blog came from this source here. According to UNESCO, the authenticity of a record can be jeopardized by: Threats to integrity. Changes to the content of the object itself also potentially damage authenticity. Most such changes stem from threats to the object at a data level. A hyperlink is data. […]

By ross-spencer, posted in ross-spencer's Blog

19th May 2017  3:41 AM  1118 Reads  No comments

At Archives New Zealand we were finding ‘WAVE’ files becoming a bottleneck in one of our ingest processes. The result initially looked odd to me, as I had understood in the past that file format identification would not take longer to divine than a checksum. My rationale being that to identify a […]

By ross-spencer, posted in ross-spencer's Blog

22nd Aug 2016  8:15 AM  1144 Reads  2 Comments

Jenny Mitcham, Digital Archivist at the University of York, started a nice snowball rolling last week when she asked “Research data – what does it really look like?” Paul Young at the National Archives, UK, was one of those to respond, to show that perhaps the snowball had been generating momentum for a number of […]

By ross-spencer, posted in ross-spencer's Blog

14th Jun 2016  6:43 AM  1726 Reads  No comments

As promised yesterday, this is the follow-up blog to the refactor of my original DROID SQLite Analysis work. The new version now allows you to produce reports from the format identification tool Siegfried. In this blog I wanted to talk about a small number of other details that can be a bit harder to […]

By ross-spencer, posted in ross-spencer's Blog

24th May 2016  9:59 AM  1382 Reads  No comments

With the release of the latest Siegfried there was added motivation for me to provide an analysis output for the format identification tool. With ‘double the magic’ there was a lot more for us to explore as analysts, and fingers crossed this release (a refactor) of my SQLite based analysis tool will help with that exploration. Previous […]

By ross-spencer, posted in ross-spencer's Blog

23rd May 2016  6:56 AM  1361 Reads  No comments

For anyone dealing with a relatively small number of records, compared to say an internet or data archive, a reasonable process for ingest of material into your digital preservation system might be: 1. Process files with a file format identification tool 2. Per 1. process files with a file format validation tool 3. Per 1. […]

By ross-spencer, posted in ross-spencer's Blog

13th Mar 2016  5:27 AM  1935 Reads  No comments

This is the second blog inspired by my visit to colleagues at the National Library of Australia last August. The first discusses a federated approach to better incorporating custom signatures into the PRONOM signature base without modifying PRONOM. The essence of the blog, however, still centers around how the community can create signatures for itself, and […]

By ross-spencer, posted in ross-spencer's Blog

7th Jan 2016  7:15 AM  2131 Reads  No comments

Abstract: Downloading an object over the internet through a standard web-browser is a mechanism that is ‘less-than-optimal’ for the delivery of archival objects. Download of objects will not preserve the file-system metadata of the object. Tools like Wget can do this, but do we want the same behavior of the browser? On answering that, do […]

By ross-spencer, posted in ross-spencer's Blog

2nd Nov 2015  6:16 AM  2719 Reads  No comments

Presented here is a tool that will create a 'rogues gallery' out of any digital collection for which you have a DROID report (or, soon, a Siegfried report). The tool was presented at a recent OPF Webinar, Preservation in Practice: Archives New Zealand; slides here. It was created by myself and Andrea K. […]

By ross-spencer, posted in ross-spencer's Blog

25th Aug 2015  9:44 AM  2625 Reads  No comments

Abstract: This blog discusses what we have available in our toolkit for contributing more signatures to PRONOM for the benefit of the digital preservation community. It also discusses the potential issues we need to work around in the short time we have between controlled PRONOM releases. The blog outlines an idea for a temporary, federated […]

By ross-spencer, posted in ross-spencer's Blog

11th Aug 2015  10:59 AM  2859 Reads  3 Comments