Preferred formats are file formats that DANS, based on international agreements, is confident will offer the best long-term guarantees of usability, accessibility and sustainability. Deposits of research data in preferred formats will always be accepted by DANS.

Non-preferred formats are file formats that are widely used in addition to the preferred formats, and which will be moderately to reasonably usable, accessible and robust in the long term.

As a general guideline, DANS believes that the file formats best suited for long-term sustainability and accessibility:

  • Are frequently used.
  • Have open specifications.
  • Are independent of specific software, developers or vendors.

In practice, it is not always possible to use formats that satisfy all of these criteria. It may be desirable to make certain original data available in ‘Non-preferred format(s)’ because these are the formats in current use. Examples include Esri Shapefiles, Microsoft Access databases and SPSS .sav files. In that case, DANS asks you to deposit your data in these original formats as well as in Preferred formats aimed at long-term sustainability.
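As an illustration of the dual-deposit request above, the following minimal Python sketch flags files in a deposit that are in one of the current usage formats named in the examples, so that a copy in a preferred format can be added. The extension table, the function name `needs_preferred_copy` and the sample file names are assumptions made for this example; they are not an official DANS list or tool.

```python
from pathlib import PurePath

# Illustrative only: extensions commonly used by the three examples named
# in the text. This mapping is an assumption, not DANS's official list.
CURRENT_USAGE_EXTENSIONS = {
    ".shp": "Esri Shapefile",
    ".mdb": "Microsoft Access database",
    ".accdb": "Microsoft Access database",
    ".sav": "SPSS data file",
}


def needs_preferred_copy(filenames):
    """Return {filename: format label} for files in a current usage format."""
    flagged = {}
    for name in filenames:
        # Compare extensions case-insensitively (e.g. "sites.SHP").
        suffix = PurePath(name).suffix.lower()
        if suffix in CURRENT_USAGE_EXTENSIONS:
            flagged[name] = CURRENT_USAGE_EXTENSIONS[suffix]
    return flagged


if __name__ == "__main__":
    # Hypothetical deposit contents.
    deposit = ["survey.sav", "survey.csv", "sites.SHP", "notes.txt"]
    for name, label in needs_preferred_copy(deposit).items():
        print(f"{name}: {label}; also deposit a copy in a preferred format")
```

A file extension is only a hint about the actual format, so a check like this can support, but not replace, a review of the files against the table below.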

If your data are stored in formats other than those mentioned below, please contact DANS.

Type Preferred format(s) Non-preferred format(s)
Text documents
Plain text
Markup language
Programming languages
Spreadsheets
Databases
Statistical data
Raster images
Vector images
Audio
Video
Computer Aided Design (CAD)
Geographical Information Systems (GIS)
Georeferenced images 
Raster GIS
3D
RDF
  • RDF/XML (.rdf)
  • TriG (.trig)
  • Turtle (.ttl)
  • N-Triples (.nt)
  • JSON-LD
Computer Assisted Qualitative Data Analysis (CAQDAS)

Digital data are stored in file formats, which are often the standard formats of a software package. The software and file format selected will usually depend on the user’s primary purpose. To create a table, for instance, spreadsheet software will be used more often than a word processor, because data tables have specific properties that are better supported by specialized software, such as the ability to sort data, use formulas and set up filters. If such properties are stored in a spreadsheet application, the user may expect the file format to preserve these properties, or ‘significant characteristics’. If the table is created using a word processor, the software is less likely to support these properties. The word processor, on the other hand, is more suitable for formatting an article, for instance with a functional table of contents and page numbers. Such features are not supported by spreadsheet software.

When information is saved from a software program, it is usually stored in that program’s standard file format. This is, however, no guarantee that the file contents can be used or displayed in the future in the way that was intended when the file was created. Formats may, for instance, depend on particular software. Software can become obsolete or support only specific versions of a format. It is also possible that specific format properties are only available in the software used, or even in only one specific version of that software. Files may also depend on expensive or exclusive software.

To reduce the risk of obsolescence and ensure that important file properties remain accessible and sustainable, a number of measures can be taken. One of these measures is to use file formats that have a high probability of remaining usable for many years.

© DANS. R.5.5. Version 1.1, March 30, 2023