
CarottAge and ArpentAge: two data mining tools for analyzing land-use successions in temporal and spatial domains

Jean-Francois Mari


We have developed a knowledge discovery system based on high-order hidden Markov models for analyzing temporal databases. This system, named CarottAge, takes as input an array of discrete data, in which the rows represent the spatial sites and the columns the time slots, and builds a partition together with its a posteriori probability. CarottAge was developed for studying the cropping patterns of a territory. It therefore uses a French agricultural database, named Ter-Uti, which records every year the land-use category of a set of regularly spaced sites. The results of CarottAge are interpreted by agronomists and used in research linking agricultural land use and water management. Moreover, CarottAge can be used to find and study crop sequences in large territories.

Display of the results given by CarottAge on a small agricultural area in the Seine basin.

CarottAge is free software distributed under a GNU Public License. It is written in C++ and runs on Unix systems with X11R6. It is based on second-order hidden Markov models (HMM2) and was designed specifically for mining land-use data. It can analyze temporal and spatial sequences of land use in a territory. CarottAge is now used by agronomists, and also by geneticists for mining genomic data, without any assistance from its designers.



  • ArpentAge on the French metropolitan territory
  • ArpentAge on the Yar watershed
  • ArpentAge (Analyse de Régularités dans les Paysages : Environnement, Territoires, Agronomie = Landscape Regularities Analysis: Environment, Territory, Agronomy) is an acronym that also means landscape surveying in French. It is the name of our knowledge discovery system based on high-order hidden Markov models for analyzing spatio-temporal databases.

    The way a farmer organizes his territory is a spatial and temporal process. The land-use category of a given site depends upon the land-use categories of its neighbourhood. For example, grasslands are mainly located close to the village, whereas maize fields are usually far away from the forests. The Markov random field (MRF) is an elegant mathematical model for taking into account the uncertainty of the land-use categories of the locations neighbouring a given place. This model clusters the territory into patches where the distribution of land-use categories follows some probability law. Following Benmiloud and Pieczynski (1995), we approximate an MRF by an HMM2 by introducing a total order on the sites. We use a Hilbert-Peano scan, which partly preserves the neighbourhood system of the points.
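The scanning idea can be illustrated with a minimal sketch. It uses the classic index-to-coordinate conversion for a Hilbert curve on a square grid whose side is a power of two. The function names `hilbert_d2xy` and `hilbert_scan` are hypothetical and are not taken from the ArpentAge sources, whose exact scanning variant may differ.

```python
def hilbert_d2xy(n, d):
    """Map index d along the Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the quadrant so that sub-curves join end to end.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(n):
    """Return the n * n grid sites in the total order induced by the curve."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


path = hilbert_scan(4)
print(path)

# Consecutive sites on the curve are always adjacent on the grid ...
steps = [abs(x1 - x2) + abs(y1 - y2)
         for (x1, y1), (x2, y2) in zip(path, path[1:])]
print(all(step == 1 for step in steps))

# ... but adjacent grid sites can be several steps apart on the curve.
print(path.index((0, 1)) - path.index((0, 0)))
```

Once the sites are listed in this order, a two-dimensional map becomes a one-dimensional sequence that a hidden Markov model can process, at the cost of losing some of the grid adjacencies.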

    A 4x4 Hilbert-Peano fractal curve. Two points that are neighbours on the Hilbert-Peano curve are neighbours in the plane. The converse is not true.

    The land-use category of a field at year t also depends upon its former categories at times t-1 and t-2. A higher-order Markov model adequately assigns a probability to the temporal successions of the land-use categories of the sites in a territory and reveals some temporal patterns. ArpentAge proposes a unified Markovian framework to represent both the spatial and temporal dependencies of the sites and to cluster the territory into patches where the successions of land-use categories are drawn by a higher-order Markov process.
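To make the dependence on the two previous years concrete, here is a minimal sketch that estimates P(category at t | categories at t-2 and t-1) by simple counting. It is an observed second-order Markov chain, not the hidden-state HMM2 that ArpentAge trains. The function name `second_order_transitions` and the crop sequences are invented for illustration.

```python
from collections import Counter, defaultdict


def second_order_transitions(sequences):
    """Estimate P(c | a, b) from the triples (a, b, c) seen in each sequence."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][c] += 1
    probs = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        probs[context] = {c: n / total for c, n in counter.items()}
    return probs


# One list per site, one entry per year (made-up data).
sites = [
    ["wheat", "sunflower", "wheat", "rapeseed", "wheat"],
    ["wheat", "sunflower", "wheat", "sunflower", "wheat"],
    ["maize", "maize", "maize", "maize", "maize"],
]

probs = second_order_transitions(sites)
for context in sorted(probs):
    print(context, probs[context])
```

In this toy data, wheat always follows the pair (wheat, sunflower), while the pair (sunflower, wheat) is followed equally often by rapeseed and sunflower. A first-order model conditioned on the previous year alone could not tell these two contexts apart when both end in wheat.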

    ArpentAge takes as input a four-dimensional array: two dimensions for space (the x, y coordinates), one for time (the time slot), and one for the composite observations associated with a site at a given time slot. ArpentAge allows the user to specify the architecture of the hidden Markov model according to the data and the objectives of the study. Display tools and shapefile generation are also provided (demo).
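As a rough sketch of this four-dimensional layout, the nested list below is indexed as `data[x][y][t][k]`, where `k` selects one component of the composite observation. The actual file format read by ArpentAge is not described here. The helper names and the two observation components (land use and soil type) are assumptions made for the example.

```python
def make_grid(nx, ny, nt, n_obs, fill=None):
    """Return a nested list indexed as data[x][y][t][k]."""
    return [[[[fill] * n_obs for _ in range(nt)]
             for _ in range(ny)]
            for _ in range(nx)]


def shape(data):
    """Return the sizes of the four dimensions (x, y, time, observation)."""
    return (len(data), len(data[0]), len(data[0][0]), len(data[0][0][0]))


def site_series(data, x, y, k):
    """Return component k of the observations at site (x, y) for every time slot."""
    return [obs[k] for obs in data[x][y]]


LAND_USE, SOIL = 0, 1  # hypothetical components of a composite observation

data = make_grid(nx=2, ny=2, nt=3, n_obs=2)
for t, crop in enumerate(["wheat", "sunflower", "wheat"]):
    data[0][1][t] = [crop, "clay"]

print(shape(data))
print(site_series(data, 0, 1, LAND_USE))
```

Reading one site across all time slots, as `site_series` does, gives the kind of temporal succession that the Markov model scores.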

    Legend: Map of homogeneous areas with regard to land-use successions for a 350 km^2 territory in the Niort Plain (France) during the period 1996-2007: (a) 70% of successions involving sunflower, wheat and rapeseed; (b) 60% urban and peri-urban area; (c) 60% of successions involving maize; (d) 50% grassland; (e) 70% forest; (f) 40% of successions with barley; (g) nearly 50% of successions with ryegrass; (h) 50% of successions with pea.


    J.-F. Mari, El-Ghali Lazrak, and Marc Benoît. Time space stochastic modelling of agricultural landscapes for environmental issues. Environmental Modelling & Software, 46:219-227, August 2013. In HAL repository.

    E.G. Lazrak, J.-F. Mari, and M. Benoît. Landscape regularity modelling for environmental challenges in agriculture. Landscape Ecology, Sept. 2009. In HAL repository.

    J.-F. Mari and F. Le Ber. Temporal and Spatial Data Mining with Second-Order Hidden Markov Models. Soft Computing, Springer-Verlag, 10(5):406-414, ISSN 1432-7643, doi:10.1007/s00500-005-0501-0. In HAL repository.

    G. Lazrak, M. Benoît, and J.-F. Mari. Arpentage: Analyse de Régularités Paysagères pour l'Environnement dans les territoires Agricoles. Symposium "Spatial landscape modelling: from dynamic approaches to functional evaluation", Toulouse, 2008.

    F. Le Ber, M. Benoît, C. Schott, J.-F. Mari, and C. Mignolet. Studying Crop Sequences With CarrotAge, a HMM-Based Data Mining Software. Ecological Modelling, 191(1):170-185, 2006.

    More information on the Orpailleur data mining team and its publications.


    Download CarottAge for Windows and Ter-Uti data with the user manual (zip archive, 59351432 bytes).

    Download CarottAge (generic) for Windows XP (63 MB zip file). Various Windows DLLs are provided.

    Download ArpentAge v1 (zip file, 5404411 bytes). You need the GNU tools "make" and "gcc".

    Download ArpentAge (using anonymous Subversion). Use the following command to check out the latest release of ArpentAge. The documentation of the project and a demo will be released soon. This command will not work if your system administrator does not allow anonymous svn access; in that case, contact your administrator.

    svn checkout svn://

    Jean-Francois Mari, Univ. de Lorraine, LORIA CNRS - INRIA-Grand Est, Campus scientifique BP 239 F54506 Vandoeuvre les Nancy