Preservation in Practice: Too broken to be archived?
20th Jul 2015 : 1:00 PM - 2:00 PM
Practical experiences with archiving PDF files
This webinar deals with archiving PDF files. Because the PDF files in our repository come from a myriad of data producers, they are overwhelmingly heterogeneous. Unfortunately, this also means that many of them contain errors. The original data producers usually cannot be contacted any more, so we have to do the best we can with the PDF material we have.
We ingest PDF files and almost no PDF/A files, and validation is done with JHOVE. JHOVE flags issues with about 20% of our PDF files. We estimate that most of these issues are not serious, as the affected files can be repaired or even migrated to PDF/A without any problems.
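A validation pass of the kind described above could be scripted roughly as follows. This is only a minimal sketch, not the workflow presented in the webinar: it assumes a `jhove` launcher is on the PATH, that the PDF module is invoked as `-m PDF-hul`, and that the plain-text report contains a `Status:` line. The helper names `extract_status`, `run_jhove` and `flagged_share` are made up for illustration.

```python
import re
import subprocess
from collections import Counter

# Assumption: JHOVE's plain-text report has a line such as
# "  Status: Well-Formed and valid" or "  Status: Not well-formed".
STATUS_RE = re.compile(r"^\s*Status:\s*(.+?)\s*$", re.MULTILINE)


def extract_status(report: str) -> str:
    """Return the text of the first 'Status:' line, or 'Unknown' if none is found."""
    match = STATUS_RE.search(report)
    return match.group(1) if match else "Unknown"


def run_jhove(pdf_path: str, jhove_cmd: str = "jhove") -> str:
    """Run JHOVE's PDF module on one file and return its text report.

    Assumption: `jhove_cmd` points at a working JHOVE installation.
    """
    result = subprocess.run(
        [jhove_cmd, "-m", "PDF-hul", pdf_path],
        capture_output=True,
        text=True,
        check=False,  # keep whatever output there is, even on a non-zero exit code
    )
    return result.stdout


def flagged_share(statuses) -> float:
    """Fraction of files whose status is anything other than 'Well-Formed and valid'."""
    counts = Counter(statuses)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts["Well-Formed and valid"]) / total
```

Calling `flagged_share(extract_status(run_jhove(p)) for p in paths)` over a list of file paths would then give the share of flagged files for that batch.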
For this webinar, we have picked some of the truly broken PDF files: those that cannot be repaired or converted, or that look strange after migration. As non-archiving is not an option, we have to think of creative ways to rescue what we can. Expect lots of screenshots of real-world PDF problems and some nice tools to automate as many steps as possible.
Yvonne Friese, Deutsche Zentralbibliothek für Wirtschaftswissenschaften