
PALDMC Project

Parallelization for Data Mining Components (PalDMC)

Iguassu (CZ)

The main objective of the PalDMC (Parallel Data Mining Components) project, proposed by a consortium of Iguassu Software Systems (ISS) and Advanced Computer Systems (ACS), is to design and implement parallelised versions of core Data Mining methods for use in Earth Observation (EO) Image Information Mining (IIM).

Project at a Glance

IIM, as the name suggests, is a set of techniques used to exploit, as efficiently as possible, the meaningful information contained in images. It generally requires the application of sophisticated algorithms with a high processing cost. Moreover, the analysis of EO data is further complicated by continually increasing data volumes, driven by the trend toward very high-resolution sensors and the use of image time series, in which several acquisition systems, large sets of multi-band images, and multiple acquisition times are considered. To make these calculations feasible, parallel processing must be taken into account, at least for the most computationally expensive 'bottleneck' algorithms involved in the data analysis. Notable examples of these core algorithms are feature space dimensionality reduction, synthesis, and similarity clustering.
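To illustrate what parallelising a 'bottleneck' clustering step can look like, the following is a minimal sketch of one data-parallel k-means iteration. It is not the PalDMC or KEO implementation: k-means is chosen here only as a familiar stand-in for similarity clustering, and the function names (`nearest`, `partial_stats`, `kmeans_step`) and the `n_workers` parameter are hypothetical. Each worker assigns its share of the feature vectors to the nearest centroid and returns partial sums, which are then combined.

```python
from concurrent.futures import ThreadPoolExecutor


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, centroid))
             for centroid in centroids]
    return dists.index(min(dists))


def partial_stats(chunk, centroids):
    """Map step: per-cluster coordinate sums and counts for one chunk."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in chunk:
        k = nearest(point, centroids)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += point[d]
    return sums, counts


def kmeans_step(points, centroids, n_workers=4):
    """One k-means iteration with the assignment step split across workers.

    Assumption: threads are used only to keep the sketch short. In CPython
    they will not speed up this pure-Python arithmetic; a real system would
    use processes, cluster nodes, or a GPU for the same map/reduce pattern.
    """
    # Split the feature vectors into n_workers interleaved chunks.
    chunks = [points[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(partial_stats, chunks,
                                [centroids] * n_workers))

    # Reduce step: combine the partial sums and counts from all workers.
    dim = len(centroids[0])
    new_centroids = []
    for k, old in enumerate(centroids):
        count = sum(counts[k] for _, counts in results)
        if count == 0:
            new_centroids.append(list(old))  # empty cluster: keep old centroid
            continue
        new_centroids.append(
            [sum(sums[k][d] for sums, _ in results) / count
             for d in range(dim)])
    return new_centroids
```

For example, `kmeans_step([[0, 0], [0, 2], [10, 10], [10, 12]], [[0, 1], [9, 9]])` returns `[[0.0, 1.0], [10.0, 11.0]]`. Because each worker only returns per-cluster sums and counts, the same decomposition carries over to cross-node or GPU execution, where the reduce step becomes a collective sum.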

ESA, as a significant EO data provider, has been active in this area, and several EO-IIM related projects have been developed so far. The main result of these activities is the KEO (Knowledge-centred Earth Observation) system. Nonetheless, the large processing costs of information mining systems like KEO demand robust and scalable (parallelised) machine learning core functionality integrated into the system architecture.

ISS has already developed parallelised versions of some KEO algorithms in the frame of the PECS and ESA IIM-TS projects, and will provide new parallelised IIM core algorithms.

Objectives and Benefits

The PalDMC project aims at achieving two major goals:

1) Extend the portfolio of the already parallelised algorithms;

2) Extend the level of parallelisation across nodes, cluster-wide, by using some of the grid paradigms and/or GP-GPU functionalities.

Application and Expected Results

The outcome of the project is the full design of the SW prototypes to be implemented, based on an overview of the theoretical fundamentals of parallel processing, an analysis of the parallel HW/SW options considered, and the results of the pre-prototyping experiments and test activities.

Contributors to this page: Michele Iapaolo.

Page last modified on Monday 25 of June 2012 16:26:10 CEST by Michele Iapaolo.