Poster Abstract

P.11 Teake Nutma (Kapteyn Astronomical Institute, University of Groningen)

Managing the Euclid data model

The Euclid common data model is central to the Euclid science ground
segment. It defines the format of all data exchanged between
the pipelines and stored in the Euclid Archive. It not only ensures
all components can communicate with each other, but also enforces strict
requirements on data lineage.

With more than 25 active contributors, however, managing the data model has
been a challenge. Care must be taken that changes to the XML of the data
model do not break its Python, C++, or database bindings. Furthermore,
downstream pipelines have to be aware of potentially breaking changes in the
data definitions of upstream pipelines.

This poster describes recent progress in tackling these problems. Breakage
of the bindings has been mitigated with a new data model validator tool that
runs during continuous integration. Awareness of breaking changes has been
partially addressed through git management rules for data model custodians,
pipeline custodians, and other contributors. Both approaches became possible
only after the migration from SVN to git, which allowed the introduction of
modern tooling.
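
As an illustration of the kind of check a validator could run during
continuous integration, the sketch below compares two versions of a schema
and reports changes likely to break downstream bindings. It is not the Euclid
validator: it assumes the data model is written as XML Schema, and the
function names, element names, and example schemas are all hypothetical.

```python
import xml.etree.ElementTree as ET

# XML Schema namespace in the form ElementTree expects for tag matching.
XS = "{http://www.w3.org/2001/XMLSchema}"


def element_types(xsd_text):
    """Map each named xs:element in a schema to its declared type.

    Simplification: elements with the same name in different complex
    types would collide in this flat dictionary.
    """
    root = ET.fromstring(xsd_text)
    return {
        el.get("name"): el.get("type")
        for el in root.iter(XS + "element")
        if el.get("name") is not None
    }


def breaking_changes(old_xsd, new_xsd):
    """List removed elements and changed types between two schema versions."""
    old, new = element_types(old_xsd), element_types(new_xsd)
    problems = []
    for name, old_type in sorted(old.items()):
        if name not in new:
            problems.append(f"removed element: {name}")
        elif new[name] != old_type:
            problems.append(f"type changed: {name} ({old_type} -> {new[name]})")
    return problems


# Hypothetical schema versions, invented for this example.
OLD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Id" type="xs:string"/>
  <xs:element name="Exposure" type="xs:double"/>
</xs:schema>
"""

NEW = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Id" type="xs:long"/>
  <xs:element name="Filter" type="xs:string"/>
</xs:schema>
"""

if __name__ == "__main__":
    for problem in breaking_changes(OLD, NEW):
        print(problem)
```

In a CI job, a non-empty result could fail the build or flag the change for
review by the custodians. Added elements, such as Filter here, are treated as
non-breaking in this sketch.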