The OPF Archives Interest Group (AIG) returned to Copenhagen earlier this month for a face-to-face meeting. Luckily, Storm Ciara did not disrupt travel from Estonia, the Netherlands, and the UK, and we all made it safely to our hosts at the Danish National Archives (where we were greeted with some tasty pastries and coffee).
Since July 2016 the AIG have been busy investigating significant properties of spreadsheets, so one of the main goals of the meeting was to do some ‘sprints’ together to finalise this work. We were joined by Lotte, an intern at the National Archives of the Netherlands, who will be helping the AIG with the stakeholder analysis.
After a quick round of introductions, we dived into some practical work. We had already built a knowledge base of spreadsheet functions and behaviours, based on information extracted from spreadsheet specifications and tool analysis. Recently, we decided to adopt the slightly different naming conventions used in the compatibility tables for Apple Numbers and Microsoft Excel. Frederik created a new spreadsheet to map the terms we had used in the knowledge base to those used in the compatibility tables. The task was to work through the lists and agree on the final names we should use. Where possible, we agreed to use the names in the compatibility tables so that they are more recognisable to others. However, several properties in our knowledge base are not part of the tables but are relevant to archiving and preservation. These were also reviewed to make sure the names are understandable.
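To give a feel for the renaming exercise, here is a minimal Python sketch of the rule we agreed on: use the compatibility-table name where one exists, otherwise keep our own term and flag it for review. The term pairs and the `resolve_names` helper are hypothetical and purely illustrative; they are not taken from our actual mapping spreadsheet.

```python
# Hypothetical example data: illustrative term pairs only,
# not the AIG's actual knowledge base or mapping.
COMPAT_NAMES = {
    "cond_format": "Conditional formatting",
    "pivot": "Pivot tables",
}


def resolve_names(kb_terms, compat_names):
    """Pick a final name for each knowledge-base term.

    Terms found in the compatibility tables take the table's name.
    Terms without a match keep their own name and are collected
    separately so that their wording can be reviewed by hand.
    """
    resolved = {}
    needs_review = []
    for term in kb_terms:
        if term in compat_names:
            resolved[term] = compat_names[term]
        else:
            resolved[term] = term
            needs_review.append(term)
    return resolved, needs_review


kb_terms = ["cond_format", "pivot", "embedded_metadata"]
resolved, needs_review = resolve_names(kb_terms, COMPAT_NAMES)
print(resolved)
print(needs_review)
```

In this sketch, `embedded_metadata` stands in for a preservation-specific property that has no counterpart in the compatibility tables and so ends up on the review list.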
Establishing the significance of properties is not an exact science. Significance depends heavily on factors such as the context and the stakeholder. As the AIG, we have adopted the InSPECT methodology for assessing significant properties, which has an object analysis phase and a stakeholder analysis phase. The former establishes the (technical) properties, functions and behaviours of the type of object, in our case spreadsheets. The latter gathers the opinions of stakeholders on which properties, functions and behaviours are significant. Combining the two yields the properties of spreadsheets that are important to preserve.
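The combination step can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the 1 to 5 rating scale, the averaging, the threshold and the example data are all hypothetical, and none of them is prescribed by the methodology or drawn from our interviews.

```python
from statistics import mean


def significant_properties(object_properties, stakeholder_ratings, threshold=3.0):
    """Combine the two phases into a list of significant properties.

    object_properties: names produced by the object analysis.
    stakeholder_ratings: maps each stakeholder to their ratings
        (assumed scale: 1 = unimportant, 5 = essential).
    A property is kept when its mean rating reaches the threshold.
    """
    selected = []
    for prop in object_properties:
        ratings = [r[prop] for r in stakeholder_ratings.values() if prop in r]
        if ratings and mean(ratings) >= threshold:
            selected.append(prop)
    return selected


# Hypothetical example data, for illustration only.
object_properties = ["formulas", "cell formatting", "macros"]
stakeholder_ratings = {
    "archivist": {"formulas": 5, "cell formatting": 2, "macros": 3},
    "heavy user": {"formulas": 4, "cell formatting": 3, "macros": 1},
}

print(significant_properties(object_properties, stakeholder_ratings))
```

In practice significance is far less mechanical than a single threshold suggests, since it varies by context and stakeholder, but the sketch shows how the two phases feed into each other.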
In parallel to finalising the lists of spreadsheet properties, functions and behaviours, we have therefore been carrying out stakeholder interviews. The Danish National Archives created a questionnaire to find out:
- which spreadsheet formats are most commonly used, by how many people and how often,
- the task or purpose for using a particular spreadsheet format,
- which features and functions they use,
- the volume of spreadsheets they have, and
- what steps they are taking to preserve them.
So far, our interviewees have included records management officers, archivists and “heavy users” of spreadsheets in different funding agencies, a national bank, and regional archives. We are very interested to hear from anyone who uses spreadsheets heavily and would be willing to be interviewed for this work – please get in touch!
The day concluded with a tour of the archives, which hold over 400 km of paper records and 300 TB of digital records. Afterwards, we went for dinner. It was Copenhagen Dining Week, a restaurant festival that started in 2011 and offers special menus for a week during the winter months. After dinner, we decided to spend some of our iPRES 2019 Best Poster Audience Award prize money on beers. (More specifically: one beer.) We plan to donate the remaining amount, about €200, to a charity.
Early the next morning we got straight to work on developing the report structure for this work. This involved reviewing all of the research we have done to date, from our initial reading list to the development of a prototype tool to analyse spreadsheet properties. Now that we have a solid template, we will be continuing this work over the next couple of months and hope to present it at the OPF AGM in June.
Remco van Veenendaal, National Archives of the Netherlands, commented: “Collaborative work on our draft report was a new experience. Everyone was in the same room working on the same document, but I didn’t hear a sound other than typing.”
The final task of the meeting was to brainstorm what we will work on once the report is complete. We created a list of priorities back in 2016, but we have since welcomed more members to the group and wanted to see whether our priorities have changed. There were a lot of ideas for collaborative work! Just some of the suggestions: practical work with PAR (Preservation Action Registries); understanding PDF/A flavours, tools for creation and conversion, and testing with veraPDF; file format validation (comparing results, speed, etc.); quality management, looking at the impact of error messages; digital signatures; and the carbon footprint of preservation. We also talked about how many countries are currently writing or updating their archival acts. Would it be possible to harmonise them in some way, at least within Europe?
Our next step for this brainstorming session is to write up this long list of ideas and share it with OPF members to see what their priorities are. We hope to keep expanding this group, as we have all seen the benefits of working collaboratively to address practical preservation issues.