Distributed Privacy-Preserving Insurance Insight-Sharing Platform
A Swiss Re – EPFL/LDS Collaboration Project in the Framework of C4DT
The collection and analysis of risk data are essential to the insurance business model. The models for evaluating risk and predicting events that trigger insurance policies are primarily based on knowledge derived from risk data. Currently, sharing risk data means losing control over it and risking a competitive edge; as a result, there is no systematic way to access global insurance data. Mutual sharing of the information derived from the global data of a network of insurers and reinsurers is a win-win for all the involved parties: it enables better risk models and better fraud detection, which lead to higher margins and better pricing. More generally, B2B data sharing and re-use were identified by the European Commission as one of the main factors for enhancing business opportunities and improving internal efficiency across different sectors [1].
In this project, we consider a simplified scenario, depicted in Figure 1, to develop a proof of concept that can evaluate and quantify the aforementioned benefits. In particular, we consider two actors:
- Primary insurers, acting as data providers, have direct access to their risk data.
- Reinsurer(s), acting as a hub that provides global and authoritative knowledge. They do not have direct access to individual risk data and, by default, obtain it from insurers only when a case triggers a policy.
Figure 1: Insurance insight-sharing scenario
Both primary insurers and reinsurers can benefit from sharing in two main domains: benchmarking and data checking. Benchmarking offers clear statistical benefits (larger data samples enable more accurate insights), and it applies to all prediction cases (losses, risk profiles, claims). In turn, better predictions improve the operational processes of data checking (e.g., disambiguation, fraud checks).
Sebastian Eckhardt, Senior Business Analyst IT, Swiss Re: “Confidential computing in combination with privacy preserving algorithms will not only raise the bar in terms of cloud security, it will – for the first time – allow use cases with data we cannot see, which will truly be a game changer for insurance and reinsurance companies over the next years.”
The purpose of this project is to assess, in an insurance benchmarking scenario, the scalability and flexibility of the software-based secure computing techniques being developed at the Laboratory for Data Security (LDS) of EPFL, and to demonstrate the range of analytics capabilities they provide. These techniques offer provable technological guarantees that only authorized users can access the global models (fraud and loss models) built on the data of the whole network. No additional access to, or transfer of, the individual local data contributed by a participating institution is granted. This sensitive data is used in the computation, but it remains under the full control of the contributing institution and never leaves that institution's security perimeter unencrypted. The evaluated system and protocols (a) rely on a fully distributed architecture without a centralized database, and (b) implement advanced privacy-protection techniques based on the paradigm of multiparty homomorphic encryption [2], which makes it possible to efficiently and scalably compute machine-learning models on encrypted distributed data, and to securely release either the model or only the predictions it produces (model-as-a-service) [3,4].
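The core idea of computing a global result without exposing any participant's input can be illustrated with a much simpler technique than the one evaluated in the project. The sketch below uses additive secret sharing, not multiparty homomorphic encryption, and it is only meant to show the principle. The names `MODULUS`, `make_shares`, and `secure_sum`, as well as the claim figures, are hypothetical and do not come from the project.

```python
import secrets

# Public modulus for the shares (assumption: larger than any possible total).
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split an integer into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Compute the total of all inputs from shares and partial sums only."""
    n = len(private_values)
    # Each party splits its own value; share j is meant for party j.
    distributed = [make_shares(v, n) for v in private_values]
    # Each party adds up the shares it received and publishes only that sum.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Combining the published partial sums yields the global total.
    return sum(partial_sums) % MODULUS


# Hypothetical yearly claim totals held by three primary insurers.
claims = [1_200_000, 850_000, 2_430_000]
total = secure_sum(claims)
mean_claims = total / len(claims)
```

In this toy version a single process plays every party, so nothing is actually hidden at runtime. In a real deployment each share would travel over a separate channel, and any single share or partial sum on its own would look like a uniformly random number.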
Bianca Scheffler, Head Data Culture & Innovation, Swiss Re: “For our future data challenge we will be in a situation that we do only get access to some valuable anonymized data through such virtual interfaces. This piece of work is building the technological and security foundation with generating the right policy for the right access on a fully distributed architecture.”
The tangible outcome of the project will be an evaluation of the performance and scalability of multiparty homomorphic computation and secure distributed machine-learning technologies on insurance-related data, specifically for training generalized linear models and/or simple neural networks on distributed encrypted insurance data.
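To make the training setting more concrete, here is a minimal plaintext sketch of how a generalized linear model (logistic regression in this case) could be fitted across several insurers when only aggregated gradients are combined. It is an illustration under stated assumptions, not the project's protocol: the aggregation step is done in the clear here, whereas the evaluated system would perform it on encrypted values. The function names, the hyperparameters, and the two tiny datasets are all hypothetical.

```python
import math


def predict(weights, x):
    """Logistic prediction for one feature vector."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def local_gradient(weights, features, labels):
    """Sum of logistic-loss gradients over one insurer's local records."""
    grad = [0.0] * len(weights)
    for x, y in zip(features, labels):
        error = predict(weights, x) - y
        for k, xi in enumerate(x):
            grad[k] += error * xi
    return grad


def train_distributed(datasets, n_features, rounds=200, lr=0.5):
    """Gradient descent in which only per-insurer gradients are combined."""
    weights = [0.0] * n_features
    total = sum(len(labels) for _, labels in datasets)
    for _ in range(rounds):
        # Each insurer computes its gradient locally; raw records never move.
        grads = [local_gradient(weights, X, y) for X, y in datasets]
        # Plaintext aggregation (assumption: done under encryption in practice).
        agg = [sum(g[k] for g in grads) for k in range(n_features)]
        weights = [w - lr * a / total for w, a in zip(weights, agg)]
    return weights


# Hypothetical records: [bias term, risk score] -> claim filed (1) or not (0).
insurer_a = ([[1.0, 0.1], [1.0, 0.4], [1.0, 0.9]], [0, 0, 1])
insurer_b = ([[1.0, 0.2], [1.0, 0.7], [1.0, 0.8]], [0, 1, 1])
weights = train_distributed([insurer_a, insurer_b], n_features=2)
```

Because the aggregated gradient equals the gradient over the pooled records, this sketch produces the same model (up to floating-point rounding) as training on the combined data, even though no insurer hands over its records.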
[1] “Study on data sharing between companies in Europe”, Luxembourg: Publications Office of the European Union, 2018, ISBN 978-92-79-77360-0, doi: 10.2759/354943.
[2] C. Mouchet, J. R. Troncoso-Pastoriza, and J.-P. Hubaux, “Multiparty Homomorphic Encryption: From Theory to Practice”, IACR ePrint Archive, Report 2020/304, 2020.
[3] D. Froelicher, J. R. Troncoso-Pastoriza, J. S. Sousa, and J.-P. Hubaux, “Drynx: Decentralized, Secure, Verifiable System for Statistical Queries and Machine Learning on Distributed Datasets”, IEEE Transactions on Information Forensics and Security, vol. 15, pp. 3035-3050, 2020, doi: 10.1109/TIFS.2020.2976612.
[4] D. Froelicher, J. R. Troncoso-Pastoriza, A. Pyrgelis, S. Sav, J. S. Sousa, J.-P. Bossuat, and J.-P. Hubaux, “Scalable Privacy-Preserving Distributed Learning”, CoRR abs/2005.09532, 2020 (accepted at PETS 2021).
[5] S. Sav, A. Pyrgelis, J. R. Troncoso-Pastoriza, J.-P. Bossuat, J. S. Sousa, and J.-P. Hubaux, “POSEIDON: Privacy-Preserving Federated Neural Network Learning”, CoRR abs/2009.00349, 2020 (accepted at NDSS 2021).