EDF and EDF+ file format

Compiled by Paul Bourke
May 2020

The EDF format (European Data Format) was designed to store medical time series data and is most commonly used for EEG data. A backward compatible extension called EDF+ was added in 2003. It is an open, documented standard that is agnostic to any recording system or hardware/software supplier. The initial document describing the format, published in 1992, is here: document.pdf.

The original EDF files are particularly easy to read: they consist of two header blocks followed by all the data. The first header block contains general information about the recording, namely the patient and recording identification and the start date and time. Pertaining to the data, it contains the number of data records, the duration of one record and the number of "signals" (for example, electrodes for EEG). Per-signal details such as the transducer, the ADC range and the filters are held in the second header. All of the fields in the two headers are plain human readable ASCII characters, padded with spaces to their fixed width. The first header is 256 bytes long; its entries are summarised below in the order they appear in the file.

Bytes		Description
8		version of this data format, usually 0 for original EDF
80		patient identification
80		local recording identification
8		startdate of recording (dd.mm.yy)
8		starttime of recording (hh.mm.ss)
8		number of bytes in header record
44		not used in original EDF specification
8		number of data records
8		duration of a single data record in seconds
4		number of signals in data record
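
As a rough sketch of how the table translates into code, the following C fragment pulls the numeric fields out of a 256 byte buffer that has already been read from the start of the file. The struct and function names (EdfMainHeader, edf_field, edf_parse_main_header) are invented for this illustration, and the byte offsets are simply running sums of the field widths listed above.

```c
#include <stdlib.h>
#include <string.h>

/* Numeric fields of the 256 byte first header (names are illustrative). */
typedef struct {
    long   header_bytes;  /* total size of both headers, in bytes */
    long   nrecords;      /* number of data records */
    double duration;      /* duration of one data record, in seconds */
    int    nsignals;      /* number of signals in each data record */
} EdfMainHeader;

/* Copy a fixed width ASCII field into a NUL terminated string.
   out must have room for width + 1 bytes. */
void edf_field(const unsigned char *hdr, size_t offset, size_t width, char *out)
{
    memcpy(out, hdr + offset, width);
    out[width] = '\0';
}

/* Offsets: 8 + 80 + 80 + 8 + 8 = 184 for the header size field,
   then 8 + 44 more = 236 for the number of records, and so on. */
void edf_parse_main_header(const unsigned char *hdr, EdfMainHeader *h)
{
    char buf[81];

    edf_field(hdr, 184, 8, buf);  h->header_bytes = strtol(buf, NULL, 10);
    edf_field(hdr, 236, 8, buf);  h->nrecords     = strtol(buf, NULL, 10);
    edf_field(hdr, 244, 8, buf);  h->duration     = strtod(buf, NULL);
    edf_field(hdr, 252, 4, buf);  h->nsignals     = (int)strtol(buf, NULL, 10);
}
```

The text fields (patient, recording, date and time) can be extracted with the same edf_field helper if they are needed. No validation is done here; a real reader would want to check that the fields actually contain numbers.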

The second header has 256 bytes for each signal, but these bytes are not stored as one contiguous 256 byte block per signal. Instead, each field is stored for all the signals in turn. For example, the first entry is a label for the signal and is 16 bytes long, so the second header starts with the labels for every signal (16 * nsignals bytes). This is followed by all the transducer types (80 * nsignals bytes), and so on. The second header entries are as follows.

Bytes		Description
16		label for the signal
80		transducer type
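8		units (physical dimension)
8		physical minimum, the minimum possible value in units
8		physical maximum, the maximum possible value in units
8		digital minimum, the minimum value numerically
8		digital maximum, the maximum value numerically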
80		type of any prefiltering
8		number of samples in each data record
32		reserved
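
Because the fields are grouped by field rather than by signal, finding a particular entry takes a small offset calculation. The sketch below is one way to do it, assuming the second header has been read into a buffer of nsignals * 256 bytes. The names EDF_SIGNAL_FIELD_WIDTH, edf_signal_field_offset and edf_signal_field are hypothetical, and fields are numbered from 0 in the order of the table above.

```c
#include <stddef.h>
#include <string.h>

/* Widths of the ten per-signal fields, in the order of the table. */
const size_t EDF_SIGNAL_FIELD_WIDTH[10] = {16, 80, 8, 8, 8, 8, 8, 80, 8, 32};

/* Byte offset, within the second header, of field `field` for signal `sig`.
   All earlier fields occupy width * nsignals bytes each. */
size_t edf_signal_field_offset(int field, int sig, int nsignals)
{
    size_t offset = 0;
    for (int f = 0; f < field; f++)
        offset += EDF_SIGNAL_FIELD_WIDTH[f] * (size_t)nsignals;
    return offset + EDF_SIGNAL_FIELD_WIDTH[field] * (size_t)sig;
}

/* Copy one field into a NUL terminated string.
   out must have room for at least 81 bytes (the widest field plus one). */
void edf_signal_field(const unsigned char *hdr2, int field, int sig,
                      int nsignals, char *out)
{
    size_t width = EDF_SIGNAL_FIELD_WIDTH[field];
    memcpy(out, hdr2 + edf_signal_field_offset(field, sig, nsignals), width);
    out[width] = '\0';
}
```

For instance, the number of samples per data record is field 8, so for signal 2 of a 21 signal file it starts at byte 216 * 21 + 8 * 2 = 4552 of the second header.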

The data follows. In the original EDF format the data is stored as two byte signed integers, little endian. See EDF+ for a relaxing of this for other data formats such as floating point. If every signal has the same number of samples per data record, the total number of samples is the number of records times the number of samples in a data record times the number of signals; otherwise, sum the per-signal sample counts and multiply by the number of records.
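
A minimal sketch of decoding one stored value is given below. The first function assembles the two bytes explicitly so that it works regardless of the byte order of the host machine. The second applies the usual linear mapping between the numerical (digital) range and the range in units (physical) given in the second header. Both function names are made up for this example, and the scaling step is optional if only the raw integers are wanted.

```c
/* Decode a two byte little endian signed integer from p[0], p[1]. */
int edf_decode_sample(const unsigned char *p)
{
    int v = p[0] | (p[1] << 8);   /* low byte first */
    if (v >= 0x8000)
        v -= 0x10000;             /* sign extend from 16 bits */
    return v;
}

/* Map a stored integer onto physical units using the four min/max
   header fields of its signal. Assumes dmax != dmin. */
double edf_to_physical(int digital, double pmin, double pmax,
                       double dmin, double dmax)
{
    double gain = (pmax - pmin) / (dmax - dmin);
    return pmin + (digital - dmin) * gain;
}
```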

To parse an EDF file the minimum might be as follows.

Read the 256 byte first header. Extract the number of records (nrecords), the number of signals (nsignals) and, optionally, the duration of a record if you want to compute the sampling frequency.
Read the second header, which is nsignals * 256 bytes long. Extract the number of samples in each data record (nsamples) for each signal.
Read nrecords * nsamples * nsignals data values, assuming nsamples is the same for every signal.

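Explicitly, the ordering of the samples is as follows. Within each data record the samples are stored signal by signal: all nsamples of the first signal, then all nsamples of the second, and so on.

   for (i=0;i<nrecords;i++) {
      for (ns=0;ns<nsignals;ns++) {
         for (j=0;j<nsamples;j++) {
            /* read 2 byte integer: sample j of signal ns in record i */
         }
      }
   }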
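
The following sketch splits one data record into separate arrays, one per signal. It assumes the layout defined by the EDF specification, in which a record holds all the samples of the first signal, followed by all the samples of the second, and so on, and it allows each signal its own sample count. The function name edf_unpack_record and its parameters are hypothetical.

```c
/*
 * Split one raw data record into per-signal sample arrays.
 *   record   : raw bytes of the record, 2 bytes per sample
 *   nsignals : number of signals in the record
 *   nsamples : nsamples[s] is the samples per record for signal s
 *   out      : out[s] must have room for nsamples[s] values
 */
void edf_unpack_record(const unsigned char *record, int nsignals,
                       const int *nsamples, int **out)
{
    const unsigned char *p = record;

    for (int s = 0; s < nsignals; s++) {
        for (int j = 0; j < nsamples[s]; j++) {
            int v = p[0] | (p[1] << 8);   /* little endian */
            if (v >= 0x8000)
                v -= 0x10000;             /* sign extend from 16 bits */
            out[s][j] = v;
            p += 2;
        }
    }
}
```

Calling this once per record, and appending to the output arrays each time, yields one continuous time series per signal.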

As an example see: test.edf. It consists of 350 records, each 1 second long, and there are 21 signals (electrodes). Every signal has the same number of samples per record, namely 256, which corresponds to a sampling frequency of 256 Hz.
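
A quick sanity check when writing a reader is to compare the actual file size with the size implied by the header. The helper below (edf_expected_size is a name chosen for this sketch) assumes the plain EDF layout described here, with every signal having the same number of samples per record.

```c
/* Expected size in bytes of an EDF file whose signals all share nsamples. */
long edf_expected_size(long nrecords, long nsamples, long nsignals)
{
    long header = 256 + 256 * nsignals;                /* first + second header */
    long data   = nrecords * nsamples * nsignals * 2;  /* 2 bytes per sample */
    return header + data;
}
```

For the figures quoted for test.edf this gives 5632 bytes of header plus 350 * 256 * 21 * 2 = 3763200 bytes of data, so a file of 3768832 bytes would be expected if it follows this layout exactly.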