
The LAGO project, in which Eviden led the task “Secure Dataset Publishing”, ended successfully last April after 30 months of work involving several colleagues from the Computer Vision Unit.
The consortium presented the outcomes of the project, which consist of a trusted, legally compliant framework and tooling for sharing FCT research data.
The main results include the following:
• Design of a reference architecture and governance framework, accompanied by a reference implementation that demonstrates practical viability.
• A set of tools for data characterization, labelling, privacy preservation, data synthesis and monitoring, all of which are core to a secure and reliable research data ecosystem.
With the project's objectives achieved, the recommended next steps are to scale the reference implementation, onboard additional stakeholders, and establish continuous compliance and governance monitoring to ensure long-term sustainability.
Eviden's contribution to the project has been crucial, with the team developing five of the 17 tools (Synthetic Generation, Autolabelling, Face Anonymisation, Watermarking and Data Poisoning).
The team also contributed its expertise to shaping the reference architecture and governance framework.
In addition, Eviden’s developments have been highly beneficial for Ipsotek. In particular, they established the foundations for adopting synthetic data generation to augment datasets with new or under-represented classes when real data is unavailable. This capability accelerates model development, reduces collection costs and mitigates class imbalance and scarcity.
The data produced in the project has already been used to train Ipsotek’s current models, improving class coverage and robustness.