File formats

Preferred formats are file formats that DANS is confident will offer the best long-term guarantees in terms of usability, accessibility and sustainability. Deposits of research data in preferred formats will always be accepted by DANS.
Non-preferred formats are file formats that are widely used in addition to the preferred formats, and which will be moderately to reasonably usable, accessible and robust in the long term. DANS favours preferred formats and recommends that depositors use them wherever possible.

As a general guideline, DANS believes that the file formats best suited for long-term sustainability and accessibility:
• Are frequently used
• Have open specifications
• Are independent of specific software, developers or vendors

In practice, it is not always possible to use formats which satisfy all of these criteria.

If your data are stored in formats other than those mentioned below, please contact DANS.

• RDF/XML (.rdf)
• TriG (.trig)
• Turtle (.ttl)
• N-Triples (.nt)
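
As a rough illustration of how the extensions in the list above might be used to screen a set of files before deposit, here is a minimal Python sketch. The names RDF_EXTENSIONS, rdf_format_for and unlisted_files are hypothetical and not part of any DANS tooling. The check looks only at the file extension, so it says nothing about whether the file contents are valid.

```python
from pathlib import PurePath

# Extensions copied from the list above; the mapping itself is an
# illustrative assumption, not an official DANS resource.
RDF_EXTENSIONS = {
    ".rdf": "RDF/XML",
    ".trig": "TriG",
    ".ttl": "Turtle",
    ".nt": "N-Triples",
}


def rdf_format_for(filename):
    """Return the listed format name for a file, or None if unlisted."""
    suffix = PurePath(filename).suffix.lower()  # compare case-insensitively
    return RDF_EXTENSIONS.get(suffix)


def unlisted_files(filenames):
    """Return the files whose extension does not appear in the list."""
    return [name for name in filenames if rdf_format_for(name) is None]
```

Files returned by unlisted_files would be the ones to raise with DANS before depositing.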

Digital data are stored in file formats, which are often the standard formats of a particular software program. The software and file format selected will usually depend on the user’s primary purpose.

To create a table, for instance, spreadsheet software will be used more often than a word processor. This is because data tables have specific properties that are better supported by specialized software, such as the ability to sort data, to use formulas, to set up a filter, and so on. If such properties are created in a spreadsheet application, the user may expect the file format to preserve them as ‘significant characteristics’. If the table is created using a word processor, the software is less likely to support these properties. The word processor, on the other hand, will be more suitable for formatting an article, for instance by generating a functional table of contents and adding page numbers. Such features are not supported by spreadsheet software.

When information is saved from a software program, it is usually stored in that program’s standard file format. This is, however, no guarantee that the contents of the data file can be used or displayed in the future in the way that was intended when the file was created. Formats may, for instance, be dependent on particular software. Software can become obsolete or support only specific versions of a format. It is also possible that specific format properties are only available in the software used, or even only in one specific version of this software. Files may also depend on expensive or exclusive software.

To reduce the risk of obsolescence and ensure the accessibility and sustainability of important file properties, a number of measures can be taken. One of these measures is to use file formats that have a high probability of remaining usable for many years.