
EO Research and Beyond

What RSS can do for supporting your EO Research, application or service

Scaling up RSS on-demand processing capacity

One of the key facets of RSS support to EO researchers is the ability to provide flexible resources when needed. This distinctive strength has been built and consolidated over the years with an easily scalable service model as the main objective. As mentioned in our recent paper “A Model for the Scientific Exploitation of Earth Observation Missions: The ESA Research and Service Support”, published in the IEEE GRSS Newsletter of March 2012, the RSS model “is particularly adequate for those users who have started their work locally on their workstations with some data samples and want to ‘scale up’ to massive data processing”.

In fact, to meet the unexpected rise in RSS users’ processing requests for August, over the last four weeks we have successfully trialled an expansion of the RSS infrastructure using external cloud resources. This expansion concerns the RSS on-demand processing service (see Level C in the post “The 5-level RSS service model”), which provides scientists with the processing resources required for their EO algorithms integrated into G-POD.

As anticipated in the post “How RSS provides flexibility for compute-intensive EO algorithms”, the recent RSS Virtualization introduced a powerful new option: to “flexibly expand the G-POD infrastructure by resorting to temporary external resources where RSS virtual resources can be hosted and operated under the RSS full control”. This is what we did in August to respond effectively to the increase in on-demand processing requests from our users.

In terms of performance, we have verified that processing time and output are fully in line with those of the G-POD infrastructure. The total RSS on-demand processing capacity has been increased by only some 10%. However, thanks to this infrastructure enlargement, limited as it is at this initial stage, almost all user requests have been fulfilled.

It is also worth noting that the total set-up time of these additional resources was in the range of 1-2 working days, most of it spent on the initial set-up of the external environment, which required non-recurring activities. Further expansion of the external environment would require much less time, meaning that the total RSS on-demand processing capacity could be doubled in less than 1 week!

These results concretely demonstrate the maturity of the RSS service model for the Scientific Exploitation of Earth Observation Missions in the coming years. 

 

For more information

RSS Portal: rssportal.esa.int

Author's email: giancarlo.rivolta@esa.int

RSS at Helix Nebula General Assembly

Helix Nebula consortium members are meeting at CERN in Geneva this week to review the progress towards the launch of ‘Helix Nebula: the Science Cloud’, the European cloud computing platform for science that will be made available to research institutions and industry after the successful completion of the on-going pilot phase. 

Three flagship projects were proposed by the Helix Nebula scientific partners, CERN, EMBL and ESA, for the pilot phase with the following objectives:

-       CERN: to accelerate the search for the elusive Higgs particle;

-       EMBL: to boost large‐scale genomic analyses in biomedical research;

-       ESA: to support research into natural disasters

For further details on Helix Nebula, please refer to www.helix-nebula.eu.

RSS is attending the Helix Nebula General Assembly as well, having a central role in the ESA flagship project, the SuperSites Exploitation Platform, currently in its Proof of Concept phase. 

The project is aimed at the creation of a cloud environment supporting the exploitation of Earth Observation data available from GEO SuperSites for earthquake and volcano research. 

RSS started evaluating the cloud providers involved in the ESA flagship project several weeks ago. Preliminary RSS results will be shared with the Helix Nebula partners over the coming days.

Although not directly related to our mission to Geneva, we cannot help but notice the remarkable coincidence in space and time with the announcement, made during the ATLAS and CMS seminar held at CERN yesterday, July 4th, of the discovery of a particle consistent with the Higgs boson.

Our admiration and applause to Professor Peter Higgs and to the ATLAS and CMS teams! 

 


How RSS provides flexibility for compute-intensive EO algorithms

One of the scientific activities currently supported by RSS G-POD is the KLIMA-IASI project, led by Principal Investigators from IFAC-CNR, Florence. During the KLIMA-IASI algorithm development phase, CNR scientists used RSS resources both for procuring and storing an 8 TB IASI dataset and for running successive versions of the algorithm on the G-POD infrastructure.

The algorithm is now considered mature and optimised enough for massive processing, that is, ready to be run on the entire dataset. In other words, the KLIMA-IASI project is ready to enter the “Core Phase” of the EO Research Process supported by the RSS service.

From the RSS service standpoint, having successfully concluded level B (“Development Support”) of the 5-level RSS service model, we are now ready to enter level C (“Processing Support”). A detailed description of the RSS model proposed to support EO Research in the coming years is given in the paper introduced by the "Scientific exploitation of EO data" post in the RSS Join&Share "Near Real Time" Blog.

However, since the KLIMA-IASI algorithm is significantly compute-intensive, we need to plan enough processing resources to allow the CNR Principal Investigators to achieve the KLIMA-IASI scientific goal on time. Considering the most recent KLIMA-IASI project timetable (discussed just a couple of days ago), the average RSS G-POD processing capacity available over the coming months would allow approximately 60% of the selected input dataset to be processed. Additional processing capacity is therefore needed to achieve the scientific goal of the project.
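
As a back-of-the-envelope illustration of the figure above, the short Python sketch below estimates how much extra capacity would be needed when the average capacity covers only about 60% of the dataset in the available time. It assumes, purely for illustration, that throughput scales linearly with processing capacity; the function name extra_capacity_fraction is hypothetical and is not part of any RSS or G-POD tool.

```python
def extra_capacity_fraction(coverable_fraction: float) -> float:
    """Return the extra capacity needed, as a fraction of the current capacity.

    Assumption (not taken from the project plan): throughput scales linearly
    with capacity, so covering the whole dataset in the same time window
    needs 1 / coverable_fraction times the current capacity.
    """
    if not 0.0 < coverable_fraction <= 1.0:
        raise ValueError("coverable_fraction must be in the interval (0, 1]")
    return 1.0 / coverable_fraction - 1.0


# About 60% of the input dataset can be processed with the average capacity.
extra = extra_capacity_fraction(0.60)
print(f"Extra capacity needed: about {extra:.0%} of the current average")
```

Under this linear-scaling assumption, processing the remaining 40% in the same period would take roughly two thirds more capacity than the current average. Real scaling may well differ, for example if data access rather than computing power is the bottleneck.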

 

How can RSS provide flexible processing capacity?

G-POD processing capacity (388 cores) is shared among several PIs from different research centres, running different algorithms, and only part of it can be completely reserved for a single project. Had the KLIMA-IASI processing started some months ago, the average G-POD processing capacity available would have been sufficient. However, a project timeline may need to be updated during the algorithm development phase, in the light of preliminary results, in order to further improve the algorithm itself and possibly fine-tune the project goals. Flexibility is therefore key to successfully supporting EO research.

Then, the question is: how could RSS provide additional processing capacity over a period shorter than initially planned? 

Well, the recent RSS Virtualization provides the answer. Indeed, thanks to the RSS virtualization, we can flexibly expand the G-POD infrastructure by resorting to temporary external resources where RSS virtual resources can be hosted and operated under the RSS full control. 

The net response time for this kind of flexible expansion is on the order of a few working days. For the specific G-POD project considered here, KLIMA-IASI, the solution made available to the PI is appropriate for achieving the scientific goal within the project constraints.

 

In case of more stringent constraints, which solution could be made available? 

The next step is under analysis: we (RSS) are now comparing and evaluating the available solutions for evolving the RSS infrastructure, so as to offer EO data users even greater processing flexibility.

In the coming months, I will share the outcomes of this analysis. It will take into account the KLIMA-IASI flexible processing results as well. 

 


RSS Virtualization and Datafarm: a step towards the Cloud

In 2011 we started the RSS infrastructure virtualization project. Its goal was the virtualization of all RSS environments. Now that the project’s objective has been successfully achieved, we can list the results: 

-       Migrated 97 physical machines to 16 physical servers (12 cores each)

-       Reduced the total number of RSS physical servers from 147 to 66

-       Decreased power consumption from 53 kW to 24 kW (-55%)

-       Annual electricity cost saving of around 80k€ (cooling included)

-       Decreased carbon footprint by 220 t/year (Mg/yr)

-       Increased the average utilization per server from less than 15% to more than 60%

-       Disposed of 95 physical machines and released 4 racks
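
The figures in the list above can be cross-checked with a few lines of arithmetic. The Python sketch below is only an illustration: it assumes that the 16 virtualization hosts are counted among the remaining servers and that the power saving is constant all year round, neither of which is stated in the list. The helper name percent_change is hypothetical.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures quoted in the list above.
servers_before, servers_after = 147, 66
machines_migrated, virtualization_hosts = 97, 16
power_before_kw, power_after_kw = 53.0, 24.0

# Assumption: the 16 virtualization hosts are counted among the remaining servers.
servers_expected = servers_before - machines_migrated + virtualization_hosts

power_change = percent_change(power_before_kw, power_after_kw)

# Assumption: the power saving is constant, 24 hours a day, 365 days a year.
energy_saved_kwh = (power_before_kw - power_after_kw) * 24 * 365

print(f"Expected servers after migration: {servers_expected} (reported: {servers_after})")
print(f"Power change: {power_change:.1f}%")
print(f"Energy saved per year: {energy_saved_kwh:.0f} kWh")
```

With these assumptions the server count matches the reported 66, and the power reduction comes out at just under 55%, consistent with the rounded figure in the list.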

Besides these results, the following benefits have also been obtained:

-       Simplified hardware management

-       Reduced cost and time for setting up new processing nodes

-       Increased operations efficiency and flexibility

The RSS virtualization project started in September last year, when another major RSS project, the DataFarm project, was already ongoing. The DataFarm project, which aimed to allow direct access to all RSS data from every RSS environment, was also successfully completed a few months ago. The RSS DataFarm allows much more flexibility than before in accessing data. For example, it is now possible to ingest data directly from the former G-POD dedicated storage into the RSS WebMap Server, with no need to copy the data to local storage. The same applies to SSE, KEO and the other RSS environments.

Other benefits brought by the DataFarm are:

-       Optimized storage space utilization

-       Easy access control

-       Easy scalability

At the moment the RSS archive, composed of ENVISAT, ERS and third-party mission data, has a total volume of around 300 TB, growing by some 40 TB/yr.
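
To give a feel for what this growth rate means for storage planning, the Python sketch below projects the archive volume a few years ahead. It assumes the growth stays constant (linear) at the quoted rate, which is a simplification; the function name projected_volume_tb is hypothetical.

```python
def projected_volume_tb(current_tb: float, growth_tb_per_year: float, years: int) -> float:
    """Project the archive volume, assuming constant (linear) yearly growth."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return current_tb + growth_tb_per_year * years


# Figures from the text: about 300 TB today, growing by some 40 TB per year.
for years in range(0, 6):
    volume = projected_volume_tb(300.0, 40.0, years)
    print(f"After {years} year(s): about {volume:.0f} TB")
```

At this rate the archive would reach roughly 500 TB in five years. New missions or reprocessing campaigns could of course change the growth rate.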

RSS Virtualization and DataFarm move a step towards the Cloud as well. Indeed, this novel RSS infrastructure model can be naturally extended to the Cloud, therefore constituting a robust and scalable basis for providing more and more efficient and flexible support to EO data users in the coming years. 

 


To what extent can EO validation processes be supported by RSS?

Supporting algorithm validation in EO Research is undoubtedly within the scope of RSS. But where is the limit? Should RSS support only algorithm prototyping, or should its scope also include larger development cycles, for example until algorithms are mature enough to become operational processors? To what extent could such support help improve other existing validation processes in the EO domain?

Well, one of the strengths of RSS is its effectiveness in providing flexible solutions to users’ needs. Therefore, major processes designed to deliver value to EO data users, which for a number of valid reasons may prove too rigid in particular cases, could be effectively complemented by properly designed RSS processes. Concrete examples are processes that would benefit from direct interaction with PIs and software developers, such as validation processes.

During the test and validation phase, the direct involvement of the PI is in fact usual in RSS. For example, the RSS on-demand processing service, which relies on the ESA Grid Processing On-Demand (G-POD) infrastructure, entails active collaboration with the PI during all phases: algorithm integration, validation, processing and re-processing. The service has further streamlined its processes in recent years, thanks to the increased competence, “customer” focus and goal orientation of the team.

One of the keys to RSS’s success in delivering effective, efficient and flexible services to EO data users is its solution-oriented approach to dealing with problems and errors as soon as they emerge, enabled by the specific know-how that is continuously refreshed within the RSS team. Another key is the response time, which is minimized by reducing and simplifying the process steps.

An example of the support that RSS can provide to validation processes is the G-POD testbed approach presented at the SMOS Science Workshop in Arles in September 2011. Both a poster presentation and live demos were given to interested SMOS scientists to introduce the testbed approach.

Since then, more than 11 SMOS PIs have been enabled to use the L1 and/or L2 testbeds for validation, and 7 re-processing campaigns have been completed. In the meantime, the SMOS NRT ingestion service has been activated, allowing PIs to run NRT validation tests on L2 beta processors.

Testbed objectives and benefits for users

The SMOS testbed on the ESA Grid Processing On-Demand (G-POD) environment has been designed to support missions characterized by frequent new processor releases. The testbed’s main objective is to provide an environment supporting early processor tests, including validation of processor improvements, backed by substantial data and processing resources.

In combining the SMOS processors with G-POD’s flexibility, processing power and easy interface, the G-POD SMOS testbed allows:

- Scientists to create their own Level 1 and Level 2 products using custom configuration

- Scientists to integrate, free of charge, their own algorithms in the G-POD environment to process SMOS data

- The SMOS processor development team to test new processor releases, seamlessly plugging them into G-POD and then running the software over a large data set

- The SMOS quality control team to perform comparisons between the products processed in operations and products processed by G-POD, improving product quality

Can the RSS testbed approach be applied to other missions/sensors?

Yes, absolutely. The same testbed approach successfully applied to SMOS algorithm validation can be straightforwardly applied to other EO sensors.

 


Sentinels for Science document review

Some weeks ago we published on this site the Sen4Sci Announcement, inviting colleagues from the Land, Solid Earth, Ocean and Cryosphere communities to download for review the draft documents that, at this stage, give a general summary of potential Sentinel science products. The review of these documents, open until the end of September, is public: anybody can download the documents and read the comments that will be posted in the four thematic forums.

Indeed, the Sen4Sci forums are intended to facilitate an expert peer review in a standard scientific way, allowing experts to provide specific comments (referring to particular line and page numbers) within the appropriate thematic forum. However, the comments and the draft documents will also be accessible to the broad scientific public, which is encouraged to provide general and/or specific feedback.

Anyone interested in providing comments (either general or specific) will need to register and then log in to the Join&Share area. Once logged in, the reviewer can contribute to the topic of interest by replying to one or more of the following posts:

 

Review of SEN4SCI Land (& solid Earth) document

Review of SEN4SCI Ocean document

Review of SEN4SCI Cryosphere document

 

For reasonably short comments (no more than 1000-1500 characters), the “Reply” text box can be used. Of course, the “Attach file” functionality is available as well.

 

Regarding the interest measured in terms of "Reads" per topic, the table below summarizes what we have observed in the first few weeks (until this morning, Aug 25th). Some hundreds of people are interested in each of the Sen4Sci thematic areas, and some of them are probably already working on the document review. However, no replies have been posted so far.

   

Forum | Topic | Reads
Land Research | Review of SEN4SCI Land (& solid Earth) document | 672
Ocean Research | Review of SEN4SCI Ocean document | 451
Solid Earth Research | Review of SEN4SCI solid Earth (in Land) document | 397
Cryosphere Research | Review of SEN4SCI Cryosphere document | 380

   

In terms of document downloads directly from the Sen4Sci Wiki page, until this morning we had:

Document | Downloads
SEN4SCI_Land_v03.pdf | 83
SEN4SCI_Cryosphere_v03.pdf | 70
SEN4SCI_Ocean_v03.pdf | 83

 

 

During this document review phase, the Sen4Sci community can rely on the RSS Team for support on operational or technical matters related to Join&Share. In case of problems in accessing the Sen4Sci Wiki and Forums, and/or in downloading or uploading files, reviewers are invited to request support from the RSS Team (rss_team@esa.int).

Presumably, reviewers will start providing their reviews in the coming weeks. By mid-September, three weeks from now, we will see the number of replies to each topic and be able to estimate the number of reviews expected by the end of the month.

 

 


RSS, Cloud Computing and cost reduction

Flipping through an Italian magazine this weekend, I found an article about Cloud Computing that caught my attention. In a few pages it conveyed a clear message: in the next 10 years the Cloud Computing market will grow approximately sixfold. The reason is that, depending on the type of activity, the IT costs incurred today by Small and Medium Enterprises, Scientific Communities, Universities and Research Centres for setting up, operating and maintaining their own in-house infrastructure can be reduced by 30% to 60% or more by migrating to the Cloud.
Readers of this article will have formed a rough idea of what Cloud Computing means and of its potential impact on the finances of small and medium organizations. But what does it mean for EO Research, and for RSS users in particular?
 
Some examples
Before answering the question above, let me report a couple of impressive examples given in the article. As an example of cost reduction, the article mentions a simulation by researchers from the Pennsylvania State University indicating that a small e-commerce company using the Cloud would pay only 10% of the IT costs associated with the same business run in-house. Really impressive!
However, although the magazine does not cite it explicitly, I have found that these researchers recently published their work in the paper "To Move or Not to Move: The Economics of Cloud Computing", where they conclude that complete migration to the Cloud is for now appealing for small organizations, while a mixed Cloud/in-house approach can be convenient for medium-sized organizations.
Another example provided by the magazine concerns the continuously increasing need for processing power at Universities and Research Centres. It describes an experiment involving heavy data processing, conducted by researchers from the University of Newcastle, on the generation of predictive models of molecular behaviour: using the Cloud, the experiment took 3 months instead of 5 years!
 
Similarities with RSS
For years RSS has provided resources for EO Research, Applications and Services. There are some similarities between RSS and the Cloud Computing concept, as far as cost and processing time reduction on the user side are concerned. In fact, RSS has set up, operates and maintains an infrastructure made available to EO data users with the aim of facilitating EO data exploitation by reducing costs, processing time and barriers.
Demonstrating that RSS allows EO data users to save time is not a complex task. In my previous post I already mentioned an estimated 80% time saving for a student working on his master's thesis on Ash Detection from Satellite. Several cases of processing time reduction could be mentioned as well (e.g. IASI data processing time reduced from days to hours).
Estimating cost savings is more complex. Besides the evident savings in infrastructure set-up, operations and maintenance, there are other valuable but less obvious savings in the EO research process supported by RSS. Since this is a more complex and less obvious topic, it deserves a dedicated post rather than the last lines of this one.
 
Of course, it will be one of the next posts. 
 

RSS for University students, research and enterprise

Supporting the development of new EO applications and services can contribute to the growth and competitiveness of the EO industry. Indeed, in a business model focusing on EO data exploitation, the support provided by RSS can be likened to the value delivered by key suppliers, leaving only the key activities to RSS users. The advantage for RSS users focusing on EO data exploitation is therefore that the cost of their value propositions (EO applications or services) can be much lower.
Although it does not directly concern an EO enterprise, an example of such cost reduction comes from the recent activity of a student from the University of Rome La Sapienza, Department of Electronic Engineering, who successfully worked as a G-POD user for some months on his thesis on real-time Satellite Ash Detection. According to his estimate, the time saved in identifying, accessing, preparing and processing data is in the range of 80% of the total time initially planned without RSS support. Of course, he used the time saved to make further progress in his study of volcanic ash. The student graduated brilliantly with first-class honours, perfectly on time!
Another example comes from ALANIS, a research project on fire plumes and aerosol dispersion monitoring led by Noveltis and involving several European scientific institutions. Some new tools have been developed and integrated into G-POD to support this project, in particular: Lat-Lon Beam sub-setting, MERIS FSG re-projection, and AATSR Nadir-Forward stereo matcher. In this case, as in other similar cases regarding both G-POD and other RSS environments, the new tools developed by the G-POD team not only deliver value to the specific project, thus reducing its overall cost, but are also made available as additional resources to other RSS users.
The examples above show that a virtuous circle can be activated by RSS. The first example concerns the "learning curve effects" step: moving certain activities into RSS, rather than keeping them with the user, prevents such effects from adding to the overall cost, since RSS has already completed the learning curve. The second example concerns the "economies of scale" step: when supporting a project requires the development of new tools, RSS designs them to be applicable to other PIs or projects, thus ensuring a faster future response at no additional cost. These first two steps in the virtuous circle lead, as successive steps, to an increase in overall efficiency and a reduction in cost.


 

Communities on the Join&Share area

On March 22nd, 2011, more than 200 EO scientists, international experts in Cryosphere, Oceans, Land and Solid Earth research, will attend the 3-day Sen4Sci Workshop at ESRIN. In order to support the discussion of scientific priorities and potential Sentinel science products, and to allow the participation of the many scientists who will not be able to attend the Workshop, RSS has made available a new e-collaboration space in the Join&Share area, dedicated to the “Sen4Sci Community”. This space includes a Wiki and 4 thematic forums (Land Surface, Solid Earth, Ocean and Cryosphere) that will be moderated by selected experts. The forums’ content will initially be readable by anybody, while participation in the discussion, that is, posting in a forum of interest, will be restricted to registered users.

 

In general, RSS provides the Join&Share social network area to promote collaboration, offering EO actors the opportunity to share ideas, projects and skills in Earth science research and services. To access the Join&Share area, an SSE account is needed. If you don't have one, you can register here.

Besides the "Sen4Sci Community" mentioned above, the following e-collaboration spaces, with one or more forums each, are available as well:

The following table shows the number of posts and visits for some publicly visible forums available on Join&Share as of today, March 16th:

 

 

Name | Description | Visits | Posts | Last Post
SSE Forum | Forum for the SSE user and service provider community | 342317 | 208 | Fri 11 of March 2011 15:22 CET
Toolbox Forum | This forum is intended for use by the TOOLBOX user community. In here, users can post questions and problems found with the TOOLBOX. | 185920 | 156 | Mon 17 of Jan. 2011 14:10 CET
WebMapViewer Forum | This forum is targeted at the WebMapViewer community. WebMapViewer users can post questions and problems and the WebMapViewer team will reply. | 165563 | 117 | Fri 20 of Aug. 2010 16:43 CEST
HMA Forum | Forum for the HMA project. | 132126 | 94 | Wed 02 of March 2011 17:33 CET
GPOD Forum | Forum for the GPOD community | 8411 | 1 | Fri 16 of July 2010 14:17 CEST
IIM Forum | Forum for the IIM community | 7412 | 1 | Mon 17 of Jan. 2011 18:03 CET

 

 

Except for the two newest (GPOD and IIM), all the public forums have been visited more than 100,000 times during the last couple of years, and their number of posts is roughly in the range 100-200. For the GPOD and IIM communities the "public" numbers are much lower, not only because these forums are relatively new, but also because these communities prefer "restricted" communication. For example, GPOD community scientists currently have 7 restricted forums related to 7 different GPOD projects, each of them visible only to registered users belonging to the corresponding GPOD project. These numbers give an idea of the potential of the thematic forum as a collaboration tool.

Whether restricted or not, Join&Share Wikis and Forums bring the advantage of simultaneously making relevant, up-to-date information available to a selected target audience for a certain activity. The best way to exploit these e-collaboration tools is therefore to focus on the definition of "relevant information" and "target audience".

To request information about Join&Share Wikis and Forums, either existing or to be set up, you can contact us or send an email to rss_team@esa.int.

 

 

 

 

 

The 5-level RSS service model

In order to support the EO Research Process described in the previous post, RSS aims at realizing the following conditions for the benefit of EO data users:

-algorithm development made easy and validation supported;

-reference datasets available to the communities;

-dedicated processing environment available to EO science users and service developers. 

 

To this end, RSS has developed a specific service model. This model, with its different steps and phases, is summarized in the following table.

 

Level | Service Type | Accessible Resources | Context and examples
A | Basic Science Support | EO Data through standard Query-Order; Free Toolboxes for EO Data Processing | These services are used over the full research process from the first step Available Information Study. Examples: ESA Ordering Services, FTP Sites, Toolboxes like NEST, BEAM, BRAT.
B | Development Support | EO Reference Data Sets and toolboxes/processing components for algorithm development and test. | These services are employed during the Hypotheses Formulation step for the preparatory work leading to the Perform Experiment and Collect Data step of the research process. Examples: Component-based Processing Environment, Reference Data Sets and G-POD Sandbox Services.
C | Processing Support | Grid / Cloud Computing resources enabling mass data processing and collection of results. | These services are required during the Perform Experiment and Collect Data step of the research process. Examples: G-POD.
D | Support to Product Validation | Reference data like ground truth or products and related processing resources required to conduct the validation. | These services are essential during the Analyse Data step to establish the quality of the processing and a pre-requisite for any collaborative product. Examples: Reference Data Sets.
E | Production and Service Support | Systematic data processing in NRT and/or on long time series; Service publication; Service orchestration | These services are linked to the Publish Results step of the research process, when the research has been concluded and the results are made available to the community. Examples: Service Support Environment, G-POD, Join&Share.

 

The model is organized in 5 levels, from A to E: Basic Science Services, Development Support Services, Processing Support Services, Support to Product Validation, and Production and Service Support.

The first level, or level A, Basic Science Support, encompasses access to EO Data through standard Query-Order services and to free Toolboxes for EO Data Processing. These services are used over the full research process from the first step, Available Information Study (see the process steps in the "EO Research Process Re-engineering" post).

The second level, Development Support Services, covers access to EO resources and toolboxes/processing components for algorithm development and test. These services are employed during the Hypotheses Formulation step for the preparatory work leading to the Perform Experiment and Collect Data step. This type of service is provided through specialized facilities and applications supporting software development and re-use, data management and processor testing.

The third level, Processing Support Services, includes access to Grid / Cloud Computing resources enabling mass data processing and collection of results. These services are required during the Perform Experiment and Collect Data step. Processing and data handling needs are typically very high and concentrated in time, hence the use of a shared facility.

Level D, Support to Product Validation, provides reference data, such as ground truth or independently produced products, and the related processing resources required to conduct the validation. These services are essential to establish the quality of the processing and are a pre-requisite for any collaborative product.

The last level, or level E, Production and Service Support, covers the configuration of new services allowing systematic data processing in near real time (NRT) and/or on long time series. It also covers the management of user access to the new services and the possibility of service orchestration. These services are linked to the Publish Results activity, when the research has been concluded and the results are made available to the community.

 
