File formats

Preferred formats are file formats that DANS is confident will offer the best long-term guarantees in terms of usability, accessibility and sustainability. Research data deposited in preferred formats will always be accepted by DANS.
Non-preferred formats are file formats that are widely used in addition to the preferred formats, and which will be moderately to reasonably usable, accessible and robust in the long term. DANS favours the use of preferred formats and recommends that depositors deposit their data in preferred formats as much as possible.

As a general guideline, DANS believes that the file formats best suited for long-term sustainability and accessibility:
• Are frequently used
• Have open specifications
• Are independent of specific software, developers or vendors

In practice, it is not always possible to use formats which satisfy all of these criteria.

If your data are stored in formats other than those mentioned below, please contact DANS.

  • Preferred format(s)
  • Non-preferred format(s)
Text documents
Plain text
  • Unicode text (.txt)
  • Non-Unicode text (.txt)
Markup language
Programming languages
Statistical data
  • SPSS (.sav)
  • SAS (.7dat; .sd2; .tpt)
Raster images
Vector images
Computer Aided Design (CAD)
Geographical Information (GIS)
Georeferenced images
Raster GIS
  • RDF/XML (.rdf)
  • Trig (.trig)
  • Turtle (.ttl)
  • NTriples (.nt)
Computer Assisted Qualitative Data Analysis (CAQDAS)

Digital data are stored in file formats, which are often standard software formats. The software and file format selected will usually depend on the user’s primary purpose.

To create a table, for instance, spreadsheet software will be used more often than a word processor. This is because data tables require specific properties which are better supported by specialized software, such as the ability to sort data, to use formulas, to set up a filter, and so on. When such information is saved from a spreadsheet application, the user may expect the file format to preserve these properties, or ‘significant characteristics’. If the table is created using a word processor, the software is less likely to support these properties. The word processing application, on the other hand, will be more suitable for formatting an article, for instance with a functional table of contents and page numbers. Such features will not be supported by the spreadsheet application.

When information is saved from a software program, it is usually stored in that program’s standard file format. This is, however, no guarantee that the file contents can be used or displayed in the future in the way that was intended when the file was created. Formats may, for instance, be dependent on particular software. Software can become obsolete or support only certain versions of a format. It is also possible that specific format properties only work in the software used, or even in only one specific version of that software. Files may also depend on expensive or exclusive software that is not accessible to everyone.

To preclude the risk of obsolescence and ensure the accessibility and sustainability of important file properties, a number of measures can be taken. One of these measures is to use file formats that have a high probability of remaining useful for many years.
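
As a small illustration of the plain-text distinction listed above (Unicode versus non-Unicode .txt), the following Python sketch checks whether a text file decodes cleanly as UTF-8. It is not a DANS tool: the function name is_utf8_text is hypothetical, and the sketch assumes UTF-8 as the Unicode encoding of interest, so a file in another Unicode encoding such as UTF-16 would be reported as not UTF-8.

```python
from pathlib import Path


def is_utf8_text(path):
    """Return True if the file's bytes decode cleanly as UTF-8.

    Assumption for this sketch: "Unicode text" means UTF-8.
    Pure ASCII files also pass, because ASCII is a subset of UTF-8.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # At least one byte sequence is not valid UTF-8,
        # e.g. a file saved in Latin-1 or Windows-1252.
        return False
    return True
```

A depositor could run such a check over the .txt files in a dataset before depositing and re-save any file that fails in a Unicode encoding.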