SNDS Data Scientist Apprentice
Job reference: DATA-Appr-2026-02
Join us
Would you like to build a career in a public body whose mission is to protect the health of populations effectively? Join us.
Santé publique France is France’s national public health agency. A public institution under the supervision of the Minister of Health, created through the merger of several public institutions by Order 2016-246 of April 15, 2016, the agency works to promote public health. As a scientific, expert, and public health safety agency, its missions include:
- Epidemiological observation and monitoring of the health status of the population;
- Monitoring health risks threatening the population;
- Promoting health and reducing health risks;
- Developing prevention and health education;
- Preparation for and response to health threats, alerts, and crises;
- Issuing health alerts.
The agency is organized into 12 scientific, cross-functional, or support divisions.
The agency’s strategic priorities and work program, established by its Board of Directors, are organized into three areas:
- Strengthening the capacity for anticipation and rapid response to address health threats;
- Measuring and assessing the extent of diseases and risk factors to guide their prevention and control;
- Strengthening the health impact of all public policies and the prevention and promotion of health.
Data Support, Processing, and Analysis Division
Mission
The DATA Division leads key projects to modernize medical and administrative data processing chains, with a particular focus on the National Health Data System (SNDS). These initiatives employ innovative methods in statistics and data engineering to address major public health challenges through advanced big data analysis, predictive modeling, and the integration of heterogeneous data sources. Three priority areas illustrate this approach:
- Reconstructing patient trajectories within the SNDS to better assess the impact of environmental exposures and social and territorial inequalities on health.
- Developing prognostic scores based on the SNDS by deploying an infrastructure that facilitates the creation and validation of models (data selection and engineering, modeling, validation, and evaluation). The goal is to standardize the production of robust and reproducible scores, while accelerating their availability for research and decision-making.
- Democratizing access to and analysis of complementary medical-administrative data, such as EDP-santé, and facilitating their linkage with other health, socioeconomic, and/or environmental data to promote more integrated and cross-cutting analyses (One Health, life-course approach, …).
Responsibilities
Within the ABISS unit of the DATA Division and under the responsibility of the unit manager, the SNDS Data Engineer/Scientist apprentice will actively participate in the design, development, and optimization of data processing and analysis pipelines for the National Health Data System (SNDS). They will contribute to key projects aimed at modernizing the use of medical-administrative data by applying advanced methods in data engineering, statistics, and artificial intelligence.
Their activities will focus on the following tasks:
- Contributing to the modernization of SNDS data processing workflows: Participating in the design and improvement of medical-administrative data extraction and processing, in collaboration with teams from the DATA Division and the agency’s business units. This includes automating extraction and transformation processes, optimizing queries and processing, and making these processes available to business units and as open source.
- Analysis and utilization of medical-administrative data: Participating in the reconstruction of residential trajectories and the matching of SNDS data with complementary sources (socio-economic, environmental, etc.).
- Development of predictive models: Support for the construction and validation of advanced models, from the selection and engineering of features to their evaluation. The apprentice will contribute to the implementation of an automated infrastructure to standardize the production of robust and reproducible models, with a view to their deployment for research and decision support.
- Support for methodological and technical monitoring: Contributing to documentation, training, and the facilitation of the internal network on best practices for using the SNDS. The apprentice may also participate in making data available through analysis and visualization tools.
- Collaboration with institutional and academic partners: Involvement in collaborative projects with external stakeholders to enhance the scientific quality of the work and integrate methodological innovations into the agency’s practices.
- Technology watch and continuing education: Active participation in internal training sessions and technical support sessions organized by the ABISS unit to maintain up-to-date expertise on emerging tools and methods in data science and public health.
These activities take place within a dynamic and collaborative technical environment that relies on modern development tools, languages tailored to data science, and high-performance computing infrastructure. The apprentice will work within a multidisciplinary team, interacting closely with epidemiologists, data scientists, statisticians, engineers, and members of the IT Department, as well as the Chief Information Security Officer (CISO).
The main tools and technologies used include:
- Languages: R, Python, and SAS
- Collaborative environment: GitLab (version control, continuous integration, issue management)
- Formats and databases: PostgreSQL, DuckDB, Parquet files, CSV
- Visualization: Quarto, Shiny (R and Python)
- Development environments: VS Code, RStudio, Mistral AI