Constrained surrogate cloud fields

A colorful illustration of an evolutionary search algorithm using smileys
Figure 1. Illustration of the main components of an evolutionary search algorithm. In every cycle the best solutions are allowed to reproduce and are mutated to generate new solutions.
Picture showing the search progressing through increasingly high resolutions
Figure 2. By using the scaling properties of cloud fields, the search can be sped up by first searching at lower resolutions before going to higher resolutions. The figure at the top is the template cloud, on which the statistics for the fitness function (the opposite of a cost function) are based. For this example we used histograms, height profiles, power spectra and a length measure, for cloud base and top height, liquid water path (LWP) and the number of cloud layers.
An almost linear plot of the number of tries needed versus number of pixels.
Figure 3. The search can relatively easily be performed at high resolutions, as the number of clouds for which the fitness has to be calculated is about proportional to the number of pixels. To get this good scaling, one has to tune the criterion for moving to the next resolution carefully: errors that remain at low resolution take much time to repair at higher resolutions.


This page explains how we create 3-dimensional cloud fields (liquid water content fields) based on (almost) arbitrary statistical properties of measured clouds. The clouds are generated by searching for a 3D cloud field that has the same statistical properties as the measured clouds; these statistical properties provide the constraints for the search. We generate such cloud fields as a surrogate for measured cloud fields, which are not available, as it is close to impossible to measure a full 3D cloud field. These cloud fields are therefore called constrained surrogates.

Search algorithm

The search is performed with an evolutionary search algorithm; see Figure 1. The main difference between such a search algorithm and traditional ones is that it works on a population of possible solutions, and thus does not easily get stuck in local minima. It is a robust and flexible method. We used a population of 40 to 100 clouds. The fitness/cost of each cloud with respect to the statistical cloud properties determines its reproductive success. Mutations/permutations are used to keep the population diverse. Other global search algorithms may also work.
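The cycle of Figure 1 (select the fittest, let them reproduce, mutate the offspring) can be sketched in a few lines. This is a minimal illustration, not the actual IDL implementation; the function names, the single-pixel mutation, and the toy statistics (mean and variance standing in for the real cloud statistics) are all assumptions made for the example.

```python
import random

def evolutionary_search(random_field, fitness, mutate, pop_size=40,
                        n_survivors=10, max_generations=200):
    """Minimal evolutionary search: the fittest candidates survive each
    cycle, reproduce, and are mutated to create new candidates."""
    population = [random_field() for _ in range(pop_size)]
    for _ in range(max_generations):
        # Rank the population by fitness (here: lower cost = fitter).
        population.sort(key=fitness)
        survivors = population[:n_survivors]
        # Refill the population with mutated copies of the survivors.
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - n_survivors)]
    return min(population, key=fitness)

def demo():
    """Toy problem: find a 1D 'field' whose mean and variance match a
    template (stand-ins for the real statistical cloud properties)."""
    random.seed(1)
    n, target_mean, target_var = 32, 0.5, 0.05

    def random_field():
        return [random.random() for _ in range(n)]

    def fitness(field):  # cost: mismatch with the template statistics
        m = sum(field) / n
        v = sum((x - m) ** 2 for x in field) / n
        return abs(m - target_mean) + abs(v - target_var)

    def mutate(field):   # perturb one randomly chosen pixel
        child = list(field)
        child[random.randrange(n)] = random.random()
        return child

    return fitness(evolutionary_search(random_field, fitness, mutate))
```

Because the survivors are carried over unchanged, the best cost never increases from one generation to the next; only the mutation step explores new solutions.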

The search for large cloud fields is made possible by utilising the scaling properties of clouds (Figure 2): the largest variations are at the large scales. Only after a good solution is found at a coarse resolution is the resolution made finer. Without this key idea the search space would have been too large. For example, the number of possible clouds at a resolution of 64x64 pixels with 256 LWC values is 256^4096, much more than the number of atoms in the universe (about 10^78).
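The coarse-to-fine idea might be sketched as follows. This is only an illustration under simplifying assumptions: the fields are 1D, the resolution doubles at each level, the "good enough at this level" criterion (which, as Figure 3 notes, must really be tuned carefully) is replaced by a fixed number of generations, and all names are invented for the example.

```python
import random

def refine(field):
    # Nearest-neighbour upsampling: each coarse pixel becomes two fine
    # pixels, so the coarse solution seeds the search at the next level.
    return [x for x in field for _ in range(2)]

def mutated(field):
    child = list(field)
    child[random.randrange(len(child))] = random.random()
    return child

def coarse_to_fine_search(fitness, resolutions, pop_size=30, generations=60):
    """Evolutionary search that starts at the coarsest resolution and only
    refines the population afterwards (here after a fixed number of
    generations; resolutions must double from one entry to the next)."""
    population = [[random.random() for _ in range(resolutions[0])]
                  for _ in range(pop_size)]
    for level in range(len(resolutions)):
        for _ in range(generations):
            population.sort(key=fitness)
            parents = population[:pop_size // 4]
            population = parents + [mutated(random.choice(parents))
                                    for _ in range(pop_size - len(parents))]
        if level < len(resolutions) - 1:
            population = [refine(f) for f in population]
    return min(population, key=fitness)

def demo():
    """Match a toy statistic (the field mean) across four resolutions."""
    random.seed(7)
    cost = lambda f: abs(sum(f) / len(f) - 0.3)
    return coarse_to_fine_search(cost, [8, 16, 32, 64])
```

Note that nearest-neighbour upsampling preserves the coarse-scale statistics, so the fine-resolution search only has to add the small-scale variability.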

Convergence

The algorithm, which is written in IDL, needed 22 hours on a 700 MHz desktop PC to calculate one of the surrogate clouds (4096 pixels) shown here. It would not come as a surprise if we could still reduce the computation time by an order of magnitude, but the algorithm will remain a computationally intensive task. At the moment we are writing a C++ version that can also run on a parallel computer. With this version we will generate 3-dimensional clouds, in the same way we made the 2-dimensional surrogate cloud in Fig. 2. Going from 2D to 3D is in principle a trivial task, which we already did for a previous version of the algorithm that included only cloud boundary statistics. With a good network connection you can download a 6 MB animation of this 3D cloud field.

The calculation of the fitness of the clouds takes most of the computer resources. The number of clouds that are generated before convergence is approximately a linear function of the number of pixels; see Figure 3. This means that generating large fields is relatively easy, which is remarkable given how quickly the number of possible fields grows with their size.
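Since the fitness calculation dominates the cost, it is worth seeing what such a calculation involves. The sketch below compares only one of the statistics listed in Figure 2, the histogram; the real fitness function also includes height profiles, power spectra and a length measure. Bin count, value range and all function names are illustrative assumptions.

```python
def histogram(values, n_bins=16, lo=0.0, hi=1.0):
    """Normalised histogram of the field values (one of the template
    statistics; n_bins and the value range are illustrative choices)."""
    counts = [0] * n_bins
    for v in values:
        i = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[i] += 1
    return [c / len(values) for c in counts]

def template_statistics(template):
    """Precompute the template statistics once, before the search starts."""
    return {"histogram": histogram(template)}

def fitness(candidate, stats):
    # Cost: summed absolute difference between the candidate's histogram
    # and the template's (zero means a perfect match for this statistic).
    h = histogram(candidate)
    return sum(abs(a - b) for a, b in zip(h, stats["histogram"]))
```

Precomputing the template statistics once is what keeps the per-candidate cost down: during the search only the candidate's statistics have to be recomputed.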

This method of generating surrogate cloud fields is thus much slower than my iterative method, but it is much more flexible and may find a better match with the measured statistics. As it is easy to change the cost function and try new statistical parameters, it is ideally suited to investigate which statistical properties are needed to describe cloud structure.


I do not have a finished article on this topic yet. The most informative document is probably a poster presented at EGS in 2003. The talk in the multi-fractal session at EGU 2004 (pdf | ppt) does not contain newer work, but gives more detail on the reasons for generating cloud fields the way I do. Furthermore, I once started a report on a previous version of the search algorithm that included only cloud boundary statistics. This rather unpolished and outdated report does provide more detail on the search algorithm itself. Read at your own risk.
Schreiber (1998) wrote an article on how to make constrained time series for a totally different application.


Searching for a 3D-cloud field with measured cloud properties
EGS-AGU-EUG Joint Assembly, Nice, 06 - 11 April 2003.
Victor Venema, Sebastián Gimeno García, and Clemens Simmer

Iterative and constrained algorithms to generate cloud fields with measured properties (pdf | ppt)
Invited talk in the multi-fractals session, EGU2004, Nice, France.
Victor Venema, Clemens Simmer, and Susanne Crewell

An evolutionary search algorithm to generate 3D cloud fields with measured cloud boundary statistics
Report, 2003.
Victor Venema

Constrained randomization of time series data
Phys. Rev. Lett., 80, (1998), 2105.
Thomas Schreiber
Last update: 16 August 2004