CLIWA-NET & 4D-CLOUDS file format

This is the data format that will be used for the BBC campaign, which is organized by the 4D-Clouds project and the CLIWA-NET project, amongst others. All data are stored as ASCII files or NET-CDF files, (gnu-)zipped if possible, with at most one file per hour or flight leg. The data format is described below in two sections, one on the file names and one on the file format. Data providers, please provide quickviews together with the data.

File naming standard

The filename convention is SS_IIIIIIII_YYMMDDNN.DAT, with SS for the station (see table 1), IIIIIIII for the instrument, variable or model (see table 2 for an alphabetic list), YY for the year, MM for the month, DD for the day, and NN for the hour or number. For example, GE_MRADMICC_00080100.DAT is the MICCY microwave radiometer at the Geesthacht site on August 1, 2000, starting at 0 hours (midnight). If you want to propose a new variable or abbreviation for this list, please send a mail to Victor.Venema@uni-bonn.de, so that they can all be listed here.
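As an illustration of the naming convention, a file name can be split into its parts as sketched below. This is only a sketch and not part of the standard: the function name parse_filename is made up here, and it assumes that the instrument code is always exactly eight letters or digits (as in the template) and that upper and lower case may be treated alike. The two-digit year is returned as written, without guessing the century.

```python
import re
from typing import NamedTuple


class DataFileName(NamedTuple):
    station: str      # SS, two-letter station code (table 1)
    instrument: str   # IIIIIIII, instrument, variable or model code
    year: int         # YY, two-digit year exactly as written in the name
    month: int        # MM
    day: int          # DD
    number: int       # NN, hour or sequence number


# Assumption: exactly 8 alphanumeric characters for the instrument code.
_PATTERN = re.compile(
    r"^(?P<ss>[A-Z]{2})_(?P<inst>[A-Z0-9]{8})_"
    r"(?P<yy>\d{2})(?P<mm>\d{2})(?P<dd>\d{2})(?P<nn>\d{2})\.DAT$"
)


def parse_filename(name: str) -> DataFileName:
    """Split SS_IIIIIIII_YYMMDDNN.DAT into its fields."""
    match = _PATTERN.match(name.upper())
    if match is None:
        raise ValueError(f"not a SS_IIIIIIII_YYMMDDNN.DAT name: {name!r}")
    return DataFileName(
        station=match["ss"],
        instrument=match["inst"],
        year=int(match["yy"]),
        month=int(match["mm"]),
        day=int(match["dd"]),
        number=int(match["nn"]),
    )
```

For the example above, parse_filename("GE_MRADMICC_00080100.DAT") gives station GE, instrument MRADMICC, year 0, month 8, day 1 and number 0.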

Table 1: Stations
BE Bern, sw
CA Cabauw, nl
CE Cessna, nl
CH Chilbolton, uk
DB De Bilt, nl
DE Deelen, nl
EI Eindhoven, nl
GE Geesthacht, de
GR Gilze Rijen, nl
HE Helsinki, fi
KI Kiruna, se
LI Lindenberg, de
ME Merlin, nl
ON Onsala, se
PA Paris, fr
PR Partenavia, nl
PE Petersburg, ru
PO Potsdam, de
VO Volkel, nl

File format standard

We have two file formats: a new NET-CDF format and the ASCII format, which dates back to CLARA and is now used in CLIWA-NET and 4D-Clouds. The ASCII format is described below. If possible, please use NET-CDF for new work.

For the details of the NET-CDF format, please stay as close as possible to the CloudNET format, which is mainly intended for radar and lidar data. For other data types you can use the Climate and Forecast metadata convention.

ASCII File format

A data file consists of three parts: a header, an optional line with y-axis or variables, and a data segment. The header gives basic information on the variables and some comments. A line with a y-axis or variables is included to make processing in some software packages easier. After the header comes the data section. The measurements are stored one row per time step. Columns contain time (in decimal hours), variable 1, variable 2, ...

At the moment there are two file formats in use: Format 2.3 and 3.0. Format 2.3 was made for 1 or 2 dimensional data. Because 4D-Clouds needed more dimensions, a new format was made: 3.0. If possible, use the newest format, 3.0. Data providers, please adhere to the standard as closely as possible, so that other people do not have to rewrite their reading routines when using your data. There is a lot of redundancy in the format. This can make the files bigger, but it helps to find errors and can make using the data easier for new users.

File format 2.3

- The number of header lines, including this line
- Header format version: 2.3
- Data version
- Instrument name
- Latitude and longitude of the instrument
- Elevation of the instrument above the ground
- Starting date and time of the data
- Ending date and time of the data
- Info about the time axis of the data
- Info about the other axis, or for 2D data: 0,0,'null','null'
- The number of variables used in the data
- x lines with the description of the variables: 'name', 'unit', number of columns
- Line marking the beginning of the comment lines
- x lines with comments
- Line marking the end of the header
- Optional y-axis or variables
- Many lines with data

File format 3.0

- The number of header lines, including this line
- Header format version: 3.0
- Data version
- Instrument name
- Latitude and longitude of the instrument
- Elevation of the instrument above the ground
- Starting date and time of the data
- Ending date and time of the data
- Number of dimensions
- Number of data lines
- D lines with information about the axes of the D dimensions
- The number of variables used in the data
- x lines with the description of the variables: 'name', 'unit', number of columns
- Line marking the beginning of the comment lines
- x lines with comments
- Line marking the end of the header
- Optional y-axis or variables
- Many lines with data

Explanations

'# HD LINES',x
The number of header lines: The total number of lines of the header (x=integer), including this line itself, but excluding the line with the y-axis or variables.
'FORMAT VERS',x.x
Header format version: This is the version number of the file format, x.x is either 2.3 or 3.0.
'DATA VERS',x.x
Data version: This data version number is there to identify updates of the data. The first version will be: 'x.x = 1.0'; smaller updates of the data (new calibrations, small errors corrected, or time synchronized) update the last number: 1.1, 1.2. Bigger changes (new processing methods) update the first number 2.0, 3.0.
'INS NAME','name'
Instrument name: this name can also be the tool(s) or the algorithm used, in case of higher level products.
'LAT/LON',f1,f2
Latitude (f1=float) and longitude (f2) of the instrument; if not relevant or if changing (aircraft), use 0, 0.
'ELEV',f,'unit'
Elevation (f) of the instrument above the ground, in the given unit; if it is changing, as for the aircraft, use -9999 m. Please use height above the ground for the height axis of the products and/or specify this relation in the comments.
'START',YYYY:MM:DD,HH:MM:SS
Starting date and time of the data: year (YYYY) in four digits; month (MM), day (DD), hour (HH), minutes (MM) and seconds (SS) in two digits each.
'STOP',YYYY:MM:DD,HH:MM:SS
Ending date and time of the data: year (YYYY) in four digits; month (MM), day (DD), hour (HH), minutes (MM) and seconds (SS) in two digits each.
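As a sketch of how a reading routine could interpret the 'START' and 'STOP' lines, the colon-separated date and time map directly onto a standard date-time format string. The function name parse_time_line and the sample line in the note below are made up for illustration and are not taken from a real data file.

```python
from datetime import datetime
from typing import Tuple


def parse_time_line(line: str) -> Tuple[str, datetime]:
    """Parse a line like 'START',YYYY:MM:DD,HH:MM:SS into (key, datetime)."""
    key, date_part, time_part = [field.strip() for field in line.strip().split(",")]
    key = key.strip("'")
    if key not in ("START", "STOP"):
        raise ValueError(f"expected a START or STOP line, got {key!r}")
    # Both date and time use ':' as separator in this format.
    stamp = datetime.strptime(f"{date_part} {time_part}", "%Y:%m:%d %H:%M:%S")
    return key, stamp
```

For a hypothetical line 'START',2000:08:01,00:00:00 this returns the key START together with midnight on August 1, 2000.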
'DIM',x
Dimension of the data product, which also specifies the number of '* INFO' lines in the header. This is only used in Format version 3.0.
'DATA LINES',x
The number of lines (x) with data. This is only used in Format version 3.0.
'U INFO',f1,f2,f3,'name','unit',dt
These lines provide the axis information. The number of these lines is equal to the dimension given in DIM, and they are called U, V, W, X, Y, Z, A, B, ... INFO, respectively. f1 is the number of data bins, f2 the value of the starting bin, and f3 the value of the last bin. If the bins are not equidistant, please specify this in the comments. The first dimension should be time. Only this dimension has to specify a time shift (dt), which is the difference between the time the instrument indicated originally and the time used in this data after synchronization with other instruments. The last dimension specifies the data bins of the y-axis. If no line with y-axis or variables is given after the header, then f1,f2,f3,'name','unit' of the last dimension should be: 0.0,0.0,0.0,'null','null'. If there is a line with variables present, the last dimension INFO should be 1.0,0.0,0.0,'null','null'.
'X INFO',x,y,z,'name','unit',dt
'Y INFO',x,y,z,'name','unit'
See 'U INFO' for comments. These names are used in format version 2.2.
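To illustrate how the three numbers of an INFO line describe an axis, the sketch below rebuilds the bin values from the number of bins and the values of the first and last bin. It assumes equidistant bins (non-equidistant axes are only described in the comments, so they cannot be rebuilt this way). The function name parse_info_line and the sample 'V INFO' line in the note below are hypothetical.

```python
import csv
from typing import List, Tuple


def parse_info_line(line: str) -> Tuple[str, str, str, List[float]]:
    """Return (key, name, unit, bin values) for one '* INFO' header line."""
    # The csv module splits on commas while honouring single-quoted strings.
    fields = next(csv.reader([line], quotechar="'", skipinitialspace=True))
    key, f1, f2, f3, name, unit = fields[:6]  # an optional 7th field is dt
    n_bins = int(float(f1))
    first, last = float(f2), float(f3)
    if n_bins <= 1:
        bins = [first] * n_bins  # empty for the 'null' axis, one value for n=1
    else:
        # Assumption: equidistant bins between the first and last bin value.
        step = (last - first) / (n_bins - 1)
        bins = [first + i * step for i in range(n_bins)]
    return key, name, unit, bins
```

A made-up line 'V INFO',5,0.0,1000.0,'height','m' would give the bin values 0, 250, 500, 750 and 1000 m, while the 'null' axis 0.0,0.0,0.0,'null','null' gives an empty list.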
'# VAR TYPES',x
The number of variables (x), which is also the number of header lines that follow with variable information. Note that this does not have to be the same as the number of columns, as a certain variable can occupy more than one column. Furthermore, please note that the time column is assumed to always be present and is not listed as a variable, nor counted in this number of variables.
variables
'variable','unit',x
Name of the variable ('variable'), its unit ('unit') and how many columns hold this variable (x); this could for example be drop counts for each diameter bin of an FSSP or radiances at a number of frequencies. If x is bigger than one, please make sure that the user knows exactly what the columns are, either in the y-axis information (which can only be done for one variable) or in the comment lines.
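Because a variable can occupy several columns and the time column is not counted, it can help to work out which columns belong to which variable. The sketch below does this under one assumption that is not stated explicitly in the standard: that the data columns follow the time column in the same order as the variable lines in the header. The function name column_layout and the variable names in the note below are made up.

```python
import csv
from typing import Dict, List


def column_layout(variable_lines: List[str]) -> Dict[str, range]:
    """Map each variable name to the data column indices it occupies."""
    layout: Dict[str, range] = {}
    next_column = 1  # column 0 is always the time column
    for line in variable_lines:
        name, _unit, count = next(
            csv.reader([line], quotechar="'", skipinitialspace=True)
        )[:3]
        n_columns = int(float(count))
        # Assumption: columns appear in the order the variables are listed.
        layout[name] = range(next_column, next_column + n_columns)
        next_column += n_columns
    return layout
```

With two hypothetical variable lines, 'LWP','g/m2',1 and 'TB','K',2, the first variable is in column 1 and the second in columns 2 and 3, so each data line has four columns including time.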
%% BEGIN COMMENT
%% END COMMENT
Between these two lines you can put anything else you want to say. It is a good idea to give e-mail and web addresses for more information and cooperation.
y-axis or variables
This line is optional, although its presence should correspond with the right information in the last dimension specified in the header, see 'U INFO'. This line can be used if there is one dimension that is not sampled equidistantly. It also makes plotting 2D data easier in packages like Transform, and 1D data in spreadsheets and many graph packages. This line is not part of the header.
Data lines
The data lines should start with a time, followed by the other data, separated by tabs. Please do not include any non-numeric data here, but specify values for NaN, +INF, -INF, etc. in the comments.
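Putting the explanations together, a reading routine could split a file into header, optional y-axis or variables line, and data roughly as sketched below. This is a sketch under assumptions, not a reference reader: the function name split_ascii_file is made up, the presence of the optional line is inferred only from the number of bins (f1) in the last INFO line as described under 'U INFO', and values are split on any whitespace so that tabs as well as spaces are accepted.

```python
import csv
from typing import List, Optional, Tuple


def _fields(line: str) -> List[str]:
    """Split one header line on commas, honouring single-quoted strings."""
    return next(csv.reader([line], quotechar="'", skipinitialspace=True))


def split_ascii_file(text: str) -> Tuple[List[str], Optional[str], List[List[float]]]:
    """Return (header lines, optional axis/variables line, numeric data rows)."""
    lines = text.splitlines()
    first = _fields(lines[0])
    if first[0].strip() != "# HD LINES":
        raise ValueError("first line must be '# HD LINES',x")
    n_header = int(float(first[1]))  # counts the first line itself
    header, body = lines[:n_header], lines[n_header:]

    # Look for the last '* INFO' line before the comment block starts.
    last_info = None
    for line in header:
        if line.startswith("%%"):
            break
        fields = _fields(line)
        if fields and fields[0].endswith(" INFO"):
            last_info = fields
    # f1 = 0 in the last INFO line means no extra line after the header.
    has_axis_line = last_info is not None and float(last_info[1]) != 0.0

    body = [line for line in body if line.strip()]
    axis_line = body.pop(0) if has_axis_line and body else None
    data = [[float(value) for value in line.split()] for line in body]
    return header, axis_line, data
```

The first column of each returned data row is the time in decimal hours. The optional line is returned as raw text, because a line with variables need not be numeric.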

Examples of the headers used by the 4D-Clouds members can be found in this directory, and examples for seven other instruments are given below:

Some examples of complete files are given below. Please note that these example files have been shortened (only the start and end of each time series are present) to allow easy online viewing. The examples are taken from the CLARA measurement campaigns in 1996 in the Netherlands.