Blogs: Tools

Blog posts filtered by the Tools subject tag.

Building a Debian package from a program written in Ruby is not a straightforward task. This post is intended as a step-by-step practical guide to packaging Ruby programs, based on the lessons we learned during the debianization process. In this guide we will use a sample program, Pagelyzer. This program is an […]

By jordi.creus, posted in jordi.creus's Blog

18th Feb 2013  2:25 PM  18427 Reads  No comments

As part of our work on test-beds for the SCAPE project we have been investigating the various ways in which a large-scale file format migration workflow could be implemented. The underlying technologies chosen for the platform are Hadoop and Taverna. One of the aims of the SCAPE project is to allow the automatic generation […]

By willp-bl, posted in willp-bl's Blog

14th Feb 2013  1:48 PM  16021 Reads  No comments

Last week I had the honour of hosting the OPF Webinar "Digital Preservation at your command, part II". During the Webinar, attendees were shown the differences and similarities between the command line interfaces of MS-DOS, Linux and Apple. Here is a short summary of the Webinar: * Comparison of command line interfaces (MS-DOS, Linux, […]

By TechMaurice, posted in TechMaurice's Blog

4th Feb 2013  6:05 PM  12083 Reads  No comments

The most important new feature of the recently released PDF/A-3 standard is that, unlike PDF/A-2 and PDF/A-1, it allows you to embed any file you like. Whether this is a good thing or not is the subject of some heated on-line discussions. But what do we actually mean by embedded files? As it turns out, […]

By johan, posted in johan's Blog

9th Jan 2013  1:42 PM  134965 Reads  16 Comments

The PDF format contains various features that may make it difficult to access content that is stored in this format in the long term. Examples include (but are not limited to): Encryption features, which may either restrict some functionality (copying, printing) or make files inaccessible altogether. Multimedia features (embedded multimedia objects may be subject to […]

By johan, posted in johan's Blog

19th Dec 2012  3:15 PM  16819 Reads  1 Comment

In the middle of November 2012, the first OPF Hackathon on Emulation took place in Freiburg, Germany. It brought together practitioners from different national libraries and library information services, as well as a couple of researchers in the domain. The aim of the three-day Hackathon was to work on practical use-cases and real-life challenges stemming from […]

By Dirk von Suchodoletz, posted in Dirk von Suchodoletz's Blog

4th Dec 2012  3:15 PM  12672 Reads  No comments

Several of us at The British Library took part in the CURATEcamp file id hackathon on Friday. We decided that one issue we could make a useful impact on was identification of various ebook formats. eBooks are an important content type for the British Library, especially with the expected implementation of non-print legal deposit legislation […]

By willp-bl, posted in willp-bl's Blog

19th Nov 2012  3:53 PM  16650 Reads  1 Comment

Over the last few months, I have been researching the problem of large-scale content profiling for preservation analysis. I do this for a number of reasons. For one, I support the opinion that formats are just another property: undoubtedly a very important one, but knowing which formats you have is not sufficient for good preservation planning […]

By peshkira, posted in peshkira's Blog

19th Nov 2012  11:03 AM  18396 Reads  No comments

As many of you may know, Cal Lee, Andi Rauber and I recently attempted to facilitate a broad discussion on emerging research challenges within the DP community at a workshop at IPRES 2012. We solicited – and received (thanks again to all contributors!) – wide-ranging contributions from Europe, North America, and New Zealand. The invitation […]

By cbecker, posted in cbecker's Blog

13th Nov 2012  8:08 AM  13540 Reads  No comments

I've already written a number of blog posts on format validation of JP2 files. Format validation is only one aspect of a quality assessment workflow. Digitisation guidelines typically impose various constraints on the technical characteristics of preservation and access images. For example, they may state that a preservation master must be losslessly compressed, and […]

By johan, posted in johan's Blog

4th Sep 2012  11:04 AM  16922 Reads  2 Comments