Secure Distributed-Learning on Threat Intelligence
An armasuisse S+T – EPFL/LDS Collaboration Project in the Framework of C4DT
In the current interconnected world, the number of new threats and incident indicators is constantly increasing, to the point that it is impossible to adapt the detection and mitigation systems without an updated and comprehensive knowledge base that can be used to decipher the patterns of the incidents and to train advanced models that can predict and detect them.
Current efforts for sharing threat-intelligence data (e.g., the Malware Information Sharing Platform, MISP) rely on a centralized database, to which all participating organizations have to upload their threat data.
Because cybersecurity information is often extremely sensitive and confidential, sharing it introduces a tradeoff between the benefits of improved threat-response capabilities and the drawbacks of disclosing national-security-related information to foreign agencies or institutions. This results in the retention of valuable information (also known as the free-rider problem), which considerably limits the efficacy of data sharing. The purpose of this project is to resolve the cybersecurity information-sharing tradeoff by enabling more accurate insights on larger amounts of more relevant collective threat-intelligence data.
This project will enable institutions to build better models by securely collaborating on valuable, sensitive data that is not normally shared. This will expand the range of available intelligence, leading to new and better threat analyses and predictions.
This is made possible by provable technological guarantees: authorized users of the platform can access only the global insights (cyberthreat models) built on the data of the whole network, whereas no access to, or transfer of, the locally contributed data is granted; that data remains under the control of its source. We achieve this purpose
- by designing a MISP-compatible distributed architecture without a centralized database (each institution keeps full control over its data records, which never leave its security perimeter), and
- by integrating advanced cryptographic techniques based on the paradigm of multiparty homomorphic encryption, which makes it possible to compute machine-learning models on encrypted distributed data efficiently and scalably, and to securely release either the model or only the predictions produced by the model (model-as-a-service), according to the system model.
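The idea behind these two points, local data that never leaves its owner and a global model computed only from protected contributions, can be illustrated with a deliberately simplified sketch. It does not implement multiparty homomorphic encryption and is not the project's protocol: pairwise additive masking is swapped in as a much simpler stand-in that shares one property, namely that the aggregator learns only the network-wide sum of the model updates. The linear model, the fixed-point encoding, the modulus, and all function names are assumptions made for illustration.

```python
import random

MODULUS = 2**61 - 1  # illustrative prime modulus for the additive masks
SCALE = 10**6        # illustrative fixed-point scale for encoding floats


def encode(x: float) -> int:
    """Map a float to a fixed-point integer modulo MODULUS."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Invert encode(); the upper half of the range represents negatives."""
    if v > MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def local_gradient(weights, records):
    """Mean squared-error gradient of a linear model on one site's records."""
    grad = [0.0] * len(weights)
    for features, label in records:
        err = sum(w * f for w, f in zip(weights, features)) - label
        for j, f in enumerate(features):
            grad[j] += err * f / len(records)
    return grad


def mask_updates(updates, rng):
    """Add pairwise masks that cancel in the sum, hiding each site's update.

    Assumption: a real deployment would have each pair of sites derive its
    mask from a shared secret; one function simulates all sites here.
    """
    n, dim = len(updates), len(updates[0])
    masked = [[encode(x) for x in u] for u in updates]
    for i in range(n):
        for k in range(i + 1, n):
            for j in range(dim):
                r = rng.randrange(MODULUS)
                masked[i][j] = (masked[i][j] + r) % MODULUS
                masked[k][j] = (masked[k][j] - r) % MODULUS
    return masked


def aggregate(masked):
    """Sum the masked updates; only the network-wide total is recovered."""
    dim = len(masked[0])
    return [decode(sum(m[j] for m in masked) % MODULUS) for j in range(dim)]


# Three hypothetical institutions, each holding (features, label) records.
sites = [
    [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)],
    [([1.0, 1.0], 1.0)],
    [([0.5, 2.0], 0.0), ([2.0, 0.5], 1.0)],
]
weights = [0.0, 0.0]
updates = [local_gradient(weights, s) for s in sites]
masked = mask_updates(updates, random.Random(0))
total = aggregate(masked)  # equals the sum of the local gradients
```

Each masked vector on its own looks like random values modulo `MODULUS`, yet the masks cancel when all contributions are added. A homomorphic-encryption-based design offers considerably stronger guarantees than this toy scheme, which, for example, does not handle a site dropping out or colluding parties.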
CYD Campus Collaborators
Dr. Vincent Lenders, Director of the Cyber-Defence Campus and the head of the Cyber Security and Data Science Department at armasuisse Science and Technology
Dr. Alain Mermoud, Scientific Project Manager at the Cyber-Defence Campus of armasuisse Science and Technology, EPFL Innovation Park
Dr. Colin Barschel, Scientific Project Manager at the Cyber-Defence Campus of armasuisse Science and Technology, EPFL Innovation Park
Dr. Metin Feridun, Scientific Project Manager at the Cyber-Defence Campus of armasuisse Science and Technology, ETH Zurich
Damian Pfammatter, Scientific Project Manager at the Cyber-Defence Campus of armasuisse Science and Technology, ETH Zurich
- C. Wagner, A. Dulaunoy, G. Wagener, and A. Iklody. “MISP: The Design and Implementation of a Collaborative Threat Intelligence Sharing Platform.” In Proceedings of the 2016 ACM on Workshop on Information Sharing and Collaborative Security (WISCS ’16). Association for Computing Machinery, New York, NY, USA, 49–56. 2016
- A. Mermoud, M. M. Keupp, K. Huguenin, M. Palmié, D. Percia David: “To share or not to share: a behavioral perspective on human participation in security information sharing”, Journal of Cybersecurity, Volume 5, Issue 1, 2019
- C. Mouchet, J. R. Troncoso-Pastoriza, and J.P. Hubaux. “Multiparty Homomorphic Encryption: From Theory to Practice.” IACR ePrint Archive: Report 2020/304, 2020
- D. Froelicher, J. R. Troncoso-Pastoriza, J. S. Sousa and J. Hubaux, “Drynx: Decentralized, Secure, Verifiable System for Statistical Queries and Machine Learning on Distributed Datasets,” in IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3035-3050, 2020
- D. Froelicher, J. R. Troncoso-Pastoriza, A. Pyrgelis, S. Sav, J. S. Sousa, J.P. Bossuat, and J.P. Hubaux. “Scalable Privacy-Preserving Distributed Learning.” CoRR abs/2005.09532, 2020
- S. Sav, A. Pyrgelis, J.R. Troncoso-Pastoriza, J.-P. Bossuat, J.S. Sousa, J.-P. Hubaux, “POSEIDON: Privacy-Preserving Federated Neural Network Learning”, CoRR abs/2009.00349, 2020
- Lattigo library
- S. Gillard, T. Maillart, D. Percia David, A. Mermoud: “Machine-Learning Predictive Models for Distributed Privacy-Preserving Threat-Intelligence Platforms” Information Science Institute, Geneva School of Economics and Management, University of Geneva