AlgoRail: Is France neglecting health by improving health data access?

In March 2019, the French National Assembly voted to establish a ‘Health Data Hub’ to give AI projects easy access to health-related data. At the fifth stop of our AlgoRail through Europe, Nicolas Kayser-Bril takes us back 200 years, to when France first started collecting health information on its citizens because of a widespread epidemic.

In the 19th century, France was a rapidly industrializing country, where railroads sped up transport and factories brought together large numbers of workers packed in unhealthy, precarious dwellings. The main beneficiaries of this combination were bacteria, which could jump from host to host rapidly over wide distances. Governments across Europe understood the need to collect data to follow the spread of cholera and order quarantines in time to stop its progression. After the first cholera epidemic in 1832, the French government began gathering information throughout the country.

Health police

Until the First World War, health was the purview of the ministry of the interior – it was a matter of public order rather than well-being. To monitor and predict the spread of tuberculosis in large cities, authorities built a “health record” for each building, modeled on the criminal record of each individual. The health record was discontinued around 1908 after hospital personnel pointed out that building-level data was not of much use.

A change of perspective occurred in the first decades of the 20th century. First, the general health level of the population acquired a new military importance. Second, eugenics, a pseudo-scientific craft that claimed to improve a population by rooting out its unhealthy members, gained in popularity. Health and hygiene became political goals in themselves and gained their own ministry in 1920.

Health statistics, once concerned only with epidemics and controlling the poor, started to record well-being. Not all French doctors were enthusiastic about the change. They complained that such data collection would endanger doctor-patient confidentiality, but their main concern may well have been the loss of status.

The drive towards centralization

During the Second World War, the French government made great strides in implementing eugenics. From 1942, the National Hygiene Institute (Institut national d’hygiène, INH) conducted large-scale data collection to track the effects of the government’s crackdown on alcoholism and venereal diseases. It also built a central repository of information on 35,000 cancer patients. After the war, INH expanded and kept monitoring the nation’s health (it became the French National Institute of Health and Medical Research, Inserm, in 1964). Furthermore, a social security number was given to each citizen as a unique identifier, which revived the old dream of governance through numbers.

In France, as in other countries of the Western bloc, central planning was considered a necessity. The government felt it had to collect comprehensive data on morbidity (that is, on the illnesses affecting the population). A first attempt to force hospital doctors to fill out forms after each procedure, to be sent to a central authority, failed in 1945. Another attempt was made in 1958, and another in 1972. Doctors, however, did not comply with their new obligations. They criticized the methodology, complained about the added workload, and failed to see any benefits for themselves.


This changed in the 1980s. A new attempt at centralizing morbidity data was started in 1982. By the beginning of the next decade, all hospitals were feeding data to a central authority and by 1991, the system was mandatory.

The data-gathering effort was first and foremost a cost control mechanism. Knowing how many procedures each hospital carried out, and how much money each hospital received, the health ministry could rank their performance, in financial terms at least. However, all information on hospital procedures is of an accounting, not a medical, nature. This focus on numbers has led to practices aimed at maximizing revenue, as AlgorithmWatch reported in May 2019.

Data quality

Optimizing the way procedures are encoded does a great disservice to data quality. Despite these concerns, the French government pressed ahead in 2019 with building an even bigger database, called the “Health Data Hub”. The Hub aims to provide any health-related data to AI projects that request it.

The French data protection authority criticized the project because of its broad aims. Data from the Hub can be used for any “public interest” goal, opening the door to any commercial application. Critics also pointed out that personal data in the Hub is pseudonymized but not aggregated, so that it can be easily de-anonymized.
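The difference between pseudonymized and aggregated data can be shown with a small sketch. Everything in it is an assumption made for illustration: the records are invented, the field names are hypothetical, and the plain hash stands in for whatever pseudonymization method the Hub actually uses, which the article does not describe. It only shows why keeping one row per person leaves room for linking that row to outside knowledge, while counts per diagnosis contain no such row.

```python
import hashlib
from collections import Counter

# Invented example records; these are NOT real Health Data Hub data.
records = [
    {"name": "Alice", "birth_year": 1980, "postcode": "75011", "diagnosis": "diabetes"},
    {"name": "Bob", "birth_year": 1975, "postcode": "69003", "diagnosis": "asthma"},
    {"name": "Chloe", "birth_year": 1980, "postcode": "13001", "diagnosis": "asthma"},
]


def pseudonymize(record):
    """Replace the name with a token but keep every other field per person.

    A bare SHA-256 hash is a simplification chosen for this sketch.
    """
    token = hashlib.sha256(record["name"].encode("utf-8")).hexdigest()[:12]
    rest = {key: value for key, value in record.items() if key != "name"}
    return {"id": token, **rest}


def aggregate(rows):
    """Keep only a count per diagnosis; no row describes a single person."""
    return Counter(row["diagnosis"] for row in rows)


def reidentify(pseudonymized_rows, birth_year, postcode):
    """Match outside knowledge (birth year, postcode) against pseudonymized rows."""
    return [
        row
        for row in pseudonymized_rows
        if row["birth_year"] == birth_year and row["postcode"] == postcode
    ]


pseudonymized = [pseudonymize(record) for record in records]

# Someone who knows a person's birth year and postcode finds exactly one row,
# and with it the diagnosis, although the name itself was removed.
matches = reidentify(pseudonymized, 1980, "75011")

# The aggregated view only says how many people have each diagnosis.
counts = aggregate(records)
```

In this toy data set, two attributes are enough to single out one row. Real data sets hold far more attributes per person, which is the core of the critics' argument.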

Toxic relationships

A doctor who wished to be identified only as Gilles started a “data strike” when the Health Data Hub was officially launched in December 2019. He and others called on colleagues to stop filling out the forms that feed the Hub. Since the 1980s, he said, France has moved from “a healthcare that cures to a healthcare that counts,” pointing to the cost management systems.

No one knows what the Health Data Hub will bring, but history offers some insights: a large amount of health data does not automatically lead to better health.

That’s it for this fifth stop of our AlgoRail through Europe, on which we want to learn more about how algorithmic systems are used in our European neighborhood. Next week we will continue south to Spain.

This story was shortened by Julia Gundlach. The unabridged story was published on the AlgorithmWatch website.

The blog series AlgoRail is part of the Automating Society Report 2020 by Bertelsmann Stiftung and AlgorithmWatch, which will be published this fall and is coordinated by Dr. Sarah Fischer. In addition to journalistic stories like this one, the report gives an overview of various examples of algorithmic systems as well as current debates, policy responses and key players in 15 countries. A first issue of the report was published in January 2019.

This text is licensed under a Creative Commons Attribution 4.0 International License