15th ~ 16th OCT 2015 MADRID, SPAIN #BDS15
THE 4th EDITION OF BIG DATA IN Oct 2015 WAS A RESOUNDING SUCCESS.
Thursday 15th
from 18:30 to 19:15
Room 25
Technical
Euclid is a high-precision survey mission developed within the framework of ESA's Cosmic Vision Programme to study dark energy and dark matter. Its Science Ground Segment (SGS) will have to handle around 175 PB of data coming from the Euclid satellite, complex pipeline processing, external ground-based observations and simulations, and will produce an output catalog describing around 10 billion objects with hundreds of attributes each. The implementation of the SGS is therefore a real challenge in terms of architecture and organization. This talk describes the challenges of the Euclid project, the foreseen architecture, the ongoing proof-of-concept challenges and the plan for the future.
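To put these figures in perspective, the short sketch below estimates the raw size of the output catalog and compares it with the overall data volume. Only the 10 billion objects and the 175 PB come from the abstract; the number of attributes (300) and the 8 bytes per attribute are illustrative assumptions, not Euclid specifications, and the helper name is hypothetical.

```python
# Back-of-envelope sizing of the output catalog.
# Constants marked "assumed" are illustrative, not Euclid specifications.
N_OBJECTS = 10_000_000_000   # ~10 billion objects (from the abstract)
TOTAL_DATA_PB = 175          # ~175 PB overall (from the abstract)
N_ATTRIBUTES = 300           # assumed: stands in for "hundreds of attributes"
BYTES_PER_ATTRIBUTE = 8      # assumed: one 64-bit value per attribute


def catalog_size_bytes(n_objects, n_attributes, bytes_per_attribute):
    """Raw (uncompressed, unindexed) catalog size in bytes."""
    return n_objects * n_attributes * bytes_per_attribute


catalog_bytes = catalog_size_bytes(N_OBJECTS, N_ATTRIBUTES, BYTES_PER_ATTRIBUTE)
catalog_tb = catalog_bytes / 1e12                    # decimal terabytes
share = catalog_bytes / (TOTAL_DATA_PB * 1e15)       # fraction of total volume

print(f"Raw catalog size: {catalog_tb:.1f} TB")
print(f"Share of total data volume: {share:.4%}")
```

Under these assumptions the catalog itself is a tiny fraction of the 175 PB, which suggests that most of the volume lies in the satellite data, intermediate pipeline products, external observations and simulations rather than in the final catalog.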
GROUND SEGMENT ARCHITECTURE
The Euclid SGS development is therefore a real challenge in terms of architecture design (storage, network, processing infrastructure) and of organization. On the architecture side, 9 Euclid SDCs (Science Data Centers) will have to be federated, ensuring an optimized distribution of data storage and processing and providing sufficient network interconnection and bandwidth. In terms of organization, more than 14 countries will be involved in the project, and hundreds of people, not necessarily co-located, will have to work together on scientific, engineering or IT aspects.
In particular, the reference architecture currently proposed for the SGS will be based on:
This architecture concept has already been validated through "SGS Challenges", which made it possible, in particular, to distribute and execute the first simulation prototypes on any of the SDCs thanks to the IAL and EAS prototypes. The first outcomes will be presented. This challenge-based approach allows working prototypes to be deployed at early stages and is a great motivating factor for the teams spread across different laboratories and computing centers around Europe.
VIRTUALIZATION
Another potential source of complexity is that most of the SDCs rely on existing computing centers, which usually do not share the same infrastructure or operating system. Rather than setting up, testing and maintaining different targets for the Euclid software, the choice has been made to rely on virtualization, so that the same guest operating system and the same Euclid software distribution can be deployed on the host operating system and infrastructure of any of the 10 Euclid SDCs.
This virtual processing node image, also called “EuclidVM”, will greatly simplify both the development of the Euclid processing software and its deployment. At present, we are studying the CernVM ecosystem (µCernVM, CernVM-FS, elastic virtual cluster based on OpenStack) with the support of CERN, which developed it. This technology seems relevant for the Euclid EC SGS and could be applied with few adaptations, thus avoiding having to “reinvent the wheel”.
ESA Euclid Science Operations Center System Engineer