Harmony in the field of Digital Preservation:
First look at the survey results!
By Ross Spencer and Bernhard Hampel-Waffenthal
Cover photo by Bailey Zindel on Unsplash
Welcome back everyone!
Bernhard and I wanted to share some of our first thoughts from the survey we posted back in December 2022 in the blog: Harmony in the Field of Digital Preservation.
Before we get too deep into the results here, we wanted to share a link to an upcoming OPF session on May 4 where we will discuss the results in more detail.
First of all, thank you to everyone who participated!
We received 48 survey responses. We are incredibly grateful to the team at OPF for promoting this work.
The responses we saw were telling two different but complementary stories – in some questions we saw alignment, in others greater diversity. Bernhard and I both found responses that intrigued us, though there’s an incredible amount of detail to pore over and we’re still doing that.
We’ve cherry-picked a handful below and reflect on these from both of our perspectives. We’d like to open up those points as prompts for further discussion, but also as a teaser for more detailed discussion at the upcoming OPF meeting. We consciously left the discussion of the coding-related parts of the survey to that meeting, but of course feel free to ask about specifics ahead of time if you want to.
Anonymized results
We have compiled the results and further anonymized those in a Google Sheet here:
Please remember, although the responses are anonymized, people can still identify themselves in the results, so please be respectful in your own analyses of them, and we look forward to hearing more.
Analysing the results
Documentation for the win!!
We asked what skills people had that they’d like to contribute to a shared project. Documentation was the clear winner, with providing feedback and end-user testing in second and third place.
Ross: I’m super stoked to see documentation so high on the list here. There’s always scope for more documentation when we look beyond the manual too. User guides are so important. We saw the JHOVE “A guide for beginners” resource created back in 2017 and recently a huge effort was put into documenting JHOVE error messages. Both new and existing projects can learn from the amount of care and attention that was put into these.
In our “anything else” answers we saw requirements analysis, business analysis, offers to submit examples of obsolete file formats, and writing unit tests. I am pleased to see the depth of thought people put into answering this question in particular and it bodes well for those who wish to continue this into a project.
Bernhard: It’s definitely very nice to see such high interest in documentation, which is important on multiple levels: from comments in the code itself, via technical documentation of interfaces, configuration options and error messages, to high-level descriptions of context, purpose and usage guides. Keeping all of that in sync takes constant attention and effort, good workflows and motivated writers, and amazingly it looks like we have quite a few of those.
Even if not quite as prominent in relation, the number of votes for translation is also notable. I wonder what languages in particular this could cover, given the people expressing interest. At the very least, keeping translatability in mind from the start of any potential coding project will be valuable.
Taken together, the answers for the closely related points of general feedback, end-user testing and logging issues would make that group the clear overall winner. Combined with the interest in different kinds of ahead-of-development analysis, discussion and review, this makes it promising that whatever ultimately comes out of this doesn’t miss actual needs. I’m delighted that such a broad array of activities and roles is being covered!
Pre-ingest needs more tooling
We asked what areas of preservation needed more tooling. The question was loosely designed around the core components of the OAIS model and so we saw a pretty even distribution of votes across those areas – but pre-ingest came out top with 64.6% of respondents asking for more tooling there.
Bernhard: While I somewhat expected that pre-ingest might come out on top, I didn’t think it would be with such a respectable margin. My general impression is that the definitions of pre-ingest vs. ingest tend not to be overly precise and may depend on the particular preservation system in use, and I see quite a few overlaps in useful functionality between pre-ingest and preservation watch, so I am looking forward to further discussion on what this turns out to encompass.
As one response pointed out, areas after pre-ingest are often bound up with larger commercial systems and may be hard to extend. I wonder if that played into the outcome we got. I feel this could also be an interesting discussion to have.
Overall, relating this to the preferences and skills expressed in our coding-related questions, I think these fit together very well and I’m excited about the potential!
Ross: I feel confident that if we wanted to turn these results into a project that we can work on in the community, then pre-ingest would be a good place to start. Pre-ingest is a favourite by far. I expected more votes for preservation watch and preservation activities and fewer for ingest and access. I’d like to look into this in more detail, see what specific needs people have around pre-ingest, and understand where pre-ingest stops and ingest begins.
Tooling as pedagogy
In the “anything else” section for “What areas of digital preservation would you like more tooling for?” we received one response that was a little more out of left field, but resonated with both of us:
“Pedagogical. Perhaps this is more “expansion of current tooling” rather than “more tooling” per se, but we need either more self-describing tools and/or pedagogical tooling to teach digital preservation principles, and not just in the “learn by doing” method that dominates current professional development (e.g. tutorials/workshops that are highly specialised to a particular piece of software or task).”
Bernhard: This is a great point and a cross-cutting answer that can touch any of the areas we listed and be a fruitful ground for exploration. It could mean relatively small things like well-chosen explanatory error messages, tooltips and descriptions giving helpful context, and generally carefully thought-out interaction design – whether for existing tools or a new project – or it may point to a need for a new, explicitly pedagogically-oriented tool. Lots to think about!
Ross: What else is there to say?! THIS!
Building a community project from scratch
One of the more burning questions we had in the survey was whether there was interest in building a community project from scratch. 80% of people (36 out of 45) responded yes.
Bernhard: It’s lovely to see such a clear positive response, further underlined by responses to the open-ended questions further down. I really hope something can come out of this. The time and resources available individually may be modest and the time zones involved diverse, but working together persistently over time can always grow great things. 🙂
Ross: This is awesome! Of course we don’t yet know the breakdown of individuals exactly, but this is one result I was hoping for because of what we can get out of working together, and the potential that brings, not just for a project in the immediate future, but for working more closely as a community in the future too. Combined with some of the survey’s other results and the diversity of what people may bring, it’s really exciting.
What wasn’t in the survey
One insight the survey couldn’t elicit as we had hoped was why folks hadn’t contributed to, or collaborated more with, projects previously. The hope was to gather more perspectives about hindrances or existing shortcomings.
Bernhard: Reading between the lines, some reasons can be teased out. But the question was probably not pointed enough to give the context needed to know whether contributing was considered in the first place and then discarded (the reasons for which I would have been most interested in), or whether doing a solo project was the natural starting point. Maybe more can be found out in further sessions?
I’m particularly interested in looking a bit behind the sentiment of “If I wasn’t going to do it, nobody else was”. Is there a lack of opportunities or spaces to openly discuss unmet needs and missing features, or does it stem from the experience that there always seem to be too few resources available elsewhere?
Ross: There are clues in some of the answers, such as the lack of politics when working alone. In hindsight, asking this question more directly would have been useful. There’s a chance we may learn more about this when we talk about it with people in person.
Further elaborations…
We included a more general question in the survey:
Do you have any other thoughts on the subjects raised here? Please elaborate as much as you’d like below.
Some people used the opportunity to share their support for these questions, while others used it to make clear some of their fears and to ask questions that help them understand the potential scope of what we’re asking. Four answers at the intersection of these issues are:
“I really feel like there are significant gaps in my technical understanding because I’m self-trained (for a large part, despite some CS coursework in college). It’s hard to communicate unknown unknowns to people. I want to write better code, that has good test coverage, that better anticipates edge cases, and is informed by best practice.”
“I’m really looking forward to seeing the results of this survey! I hear so often at work that it’s really difficult to contribute to projects without coding experience or extensive knowledge of the sector overall. It would be great to know tangible ways to rectify this and bridge these gaps.”
“The community is still too “R&D” focused. Start storing data and look after it later. Cheaper and more efficient!”
“Tooling landscaping/benchmarking conducted ? How to include the “industry” into this ? DP is not only OS, but also COTS (commercial off the shelf) products!”
Ross: The positive responses to the survey are great to see and I am reasonably sure from these that the survey hit the right notes – we learned something! Other responses, such as the concerns around open source and commercial products, are understandable, and so is the idea that the community is (potentially) too R&D focused. I have a sense we have answers for these as a group and hope these concerns can be addressed in person and over the course of whatever comes next.
Bernhard: The many thoughtful comments and elaborations are really valuable, coming from a wide range of angles. Personally, I of course very much loved to see the very appreciative notes, especially that the attempt to start an initiative from a group of motivated people, before fixing a particular project in place, was well received. The many concerns, cautions and priorities that need to be gotten right are always good to be reminded of. In the end, the need for transparency, openness and the willingness to learn from each other are the foundations from which I’m positive something can be made. And also: thanks Ross for initiating this together!
Conclusion
Did we see harmony? Did you see harmony? We think so, but we want to explore the idea more. The number of results here and the engagement we saw have been really wonderful. The clues that might lead to us all working more closely together are there.
From the original blog we wanted to follow up with:
- An opportunity to use the survey results to discuss synergies in the community, and where there are areas that we could bring closer together.
- An opportunity to start a community-owned project, to allow contributors to work towards a common outcome, e.g. plugging one of the gaps in our tooling capabilities.
We’d like to lead a walk-through of the results and a discussion about them to see if we can find alignment on what happens next.
The idea is to see how we can build even more harmony across the sector. Perhaps there are lessons from existing projects that we can codify? Perhaps there are clear gaps in our tooling that people can use to identify their next project? And perhaps there’s an opportunity for a number of us to work together on something that offers opportunities for new contributions and for learning about developing an ecosystem as a community.
But first… or last but not least!
There was a really important question at the end of the survey… “What is your favourite animal and why?”
The respondents didn’t disappoint! But what was it?!!
You’ll have to come to the event in person to find out!! ;D
- May 4, 2023: Harmony in the field of Digital Preservation: A Review of the Landscape; and a Landscape to the Future