Baptiste Marc Moka

Supermalter

Senior Data Scientist / AI Engineer & Professor

€675/day
3 projects
Paris, FR
8-15 years

Average response time: 1 hour

About Baptiste Marc

**I build efficient data pipelines and solutions that make the most of your data and strengthen your competitiveness.**

Data Science

Data Processing
— Python (Poetry) / R
— ETL (Airflow, dbt, Fivetran)
— Data Analysis / EDA
— Stream (Beam, Dataflow, Kafka)
— Hadoop / Snowflake
— PySpark

Data Collection
— Data Engineering / UML
— Hadoop / Spark
— Web Scraping (BeautifulSoup, Selenium, Scrapy)
— LLM RAG / ChatGPT

LLM / ChatGPT API
— Prompt Engineering (CoT & Tool Calling)
— RAG (LangChain, LlamaIndex)
— Vector DB (FAISS, Chroma)
— Agents & Tool Use
— Guardrails, HPO & Monitoring

Prediction & Classification
— Modeling
— Statistical, AI & Machine Learning Models (50+ models)
— Deep Learning (TensorFlow)
— PyTorch, Scikit-learn
— Computer Vision
— Time Series
— NLP

Monitoring / Alerting
— Dashboards (Power BI, Tableau, Looker, Shiny, Dash)
— Data Visualization
— ELK Stack

Product Development

Prototyping
— UX/UI & Research (Figma, Adobe XD)

Development
— Front (React, JS, HTML, SCSS)
— Back / APIs (Flask, FastAPI)
— Advanced SQL / NoSQL
— Automation (n8n, Make, Zapier)

Production
— CI/CD: GitHub, Jenkins, Serverless Framework
— Cloud: GCP / AWS (Lambda, API Gateway, S3, RDS)
— TensorFlow, PyTorch, MLflow, Scikit-learn
— Databricks, Dataiku
— Docker, Kubernetes
— Cybersecurity (OWASP, ISO 27001, EBIOS)

**Experience**:
  • 📊 Lead Senior Data Scientist (9 years)
  • 🔬 AI Researcher / Mathematics
  • 🏛️ Professor of Data Science (Catholic University of Lille)
  • 🦄 AI SaaS Founder, weeki.io (5 years at Euratechnologies)

Education:
- PhD Student in Machine Learning
- Trained in the USA in NYC & Stanford 🇺🇸

Languages
  • English: Native or bilingual
  • French: Native or bilingual
  • Greek: Native or bilingual

Can work on-site
Paris (up to 50km), Lille (up to 50km), Ferney-Voltaire (up to 50km), Thonon-les-Bains (up to 50km)

Experience

  • DECATHLON | TECHNOLOGY
    Senior Data Scientist for Supply Chain Forecast
    RETAIL (LARGE RETAILERS)
    January 2024 - January 2025 (1 year)
    Paris, France
    Lead Senior Data Scientist – Forecast & Supply Chain

    Mission

    Demand forecasting, inventory optimization, and replenishment.
    Models adapted to seasonal variations.

    Team: 1 PM, 4 DS, 2 DA, 1 ML

    Context

    Retail, complex supply chain, e-commerce, omnichannel
    Robust modeling for demand variability, consumer behavior, supply constraints

    Objectives

    Multi-scale forecasting
    Probabilistic forecasts (distributions, confidence intervals)
    Incremental learning (cutoffs)
    Deployment automation
    Inventory optimization, reduction of stockouts

    Technologies & Infrastructure

    Languages: Python (Pandas, NumPy, PySpark, Scikit-learn, TensorFlow)
    Frameworks: FastAPI, Streamlit, Poetry
    Cloud & Infra: Databricks, AWS S3, Git
    Storage & Pipeline: Collibra, Delta files, Parquet, Databricks Jobs template
    Orchestration: Airflow
    CI/CD & DevOps: GitHub Actions, Docker
    ML Management: MLflow, Databricks
    Monitoring & Viz: Tableau

    Data & Pipelines

    Centralization on Collibra / S3 / Parquet
    Data quality validated before ingestion
    Sources: sales history, economic indicators, external signals

    Transformations & Analyses
    Clustering, seasonality, PCA, statistical tests (ANOVA, chi-squared, t-test, Fisher's F-test)
    Feature Engineering: lags, macroeconomics, KNN, clustering

    Modeling & Prediction

    Feature Selection: SelectKBest, Boruta, RFE, SFS, Random Forest Importance
    ML Forecast: LightGBM, XGBoost, RandomForest, CatBoost, AdaBoost
    Time Series: SARIMA, TFT, RNN, STL, ARIMA, Fourier, Seasonal Polynomial
    Optimization: HPO (Optuna, HyperOpt)
    Cost Functions: RMSE, WAPE, MAE, Tweedie, Quantile Loss

    Feature Importance

    Local: SHAP, LIME
    Global: Beta Coefficients, Friedman H, Permutation

    Deployment & Operations

    API via FastAPI, UI via Streamlit, hosted on Databricks Apps
    Airflow automation, error/drift monitoring
    Drift alerts

    Reporting & Visualization

    Tableau Dashboards
    Comparative Backtesting
    Scenarios
    Databricks · MLflow · Forecast · MLOps · Data Pipeline
  • UPFUND
    Senior Lead Data Scientist for Real Estate
    REAL ESTATE
    January 2024 - January 2025 (1 year)
    Paris, France
    Machine Learning (ML) →

    - Prediction of real estate indicators (commercial, apartments, houses) using Machine Learning models, geospatial analysis, and time series forecasting.

    Data Engineering (DE) & Data Analysis (DA) →

    - Creation of data pipelines, preprocessing, exploratory data analysis (EDA)

    Research & Development (R&D) →

    - Identification and definition of research problems to guide projects in a structured and scientific manner. Definition of working hypotheses, with production of summaries of the models used.

    Knowledge Management (KM) →

    - Creation of a state-of-the-art review of spatial statistics, time series, forecasting, and Machine Learning applied to real estate.
    - Centralization, structuring, and management of scientific knowledge to leverage expertise and facilitate its reuse.
    Data Pipeline · Data Analysis · MLOps · Machine Learning · Statistics
  • UNIVERSITE CATHOLIQUE DE LILLE
    Professor in Data Science / ML / Probability & Statistics
    BIOTECH
    January 2023 - January 2025 (2 years)
    Lille, France
    Course Program – Visiting Professor in Mathematics

    Foundations
    • 0.1: Elements of Calculus and Tools
    • 0.2: Epistemology and Theory of Knowledge

    Part 1 – Systems Theory
    • 1.1: Dynamical Systems
    • 1.2: Complex Adaptive Systems

    Part 2 – Stochastic Dynamics and Probabilities
    • 2.1: Measure Theory
    • 2.2: Probability Theory
    • 2.3: Common Probability Distributions
    • 2.4: Asymptotic Statistics
    • 2.5: Stochastic Processes and Time Series
    • 2.6: Information Geometry

    Part 3 – Data Observation
    • 3.1: Descriptive Statistics
    • 3.2: Exploratory Data Analysis

    Part 4 – Inference and Estimation Theory
    • 4.1: Parameter Estimation
    • 4.2: Experimental Design, Sampling, and Hypothesis Testing
    • 4.4: Decision Trees and Model Selection
    • 4.5: Bayesian Inference

    Part 5 – Examples of Linear and Regression Models
    • 5.1: Simple Linear Regression
    • 5.2: Multiple Linear Regression
    • 5.3: Other Regression Methods

    Part 6 – Other Examples of Classical Models
    • 6.1: Common Univariate Tests
    • 6.2: Common Multivariate Tests
    • 6.3: Non-parametric Statistics

    Part 7 – Examples of Non-linear Models
    • 7.1: Probabilistic Graphical Models
    • 7.2: Percolation Theory
    • 7.3: Spatial Statistics
    • 7.4: Extreme Value Theory
    • 7.5: Agent-Based Modeling
    • 7.6: Network Dynamics

    Technologies and tools used:
    MATLAB, R, Python, LaTeX, Jupyter Notebooks, SPSS, SAS, Excel, NumPy, SciPy, Pandas, Matplotlib, TensorFlow, PyTorch, Tableau, Power BI, SQL, GitHub.
    Data Science · Data Analysis · Bayesian Inference · Data Engineer

Recommendations

Ceren D. and 11 other people have recommended Baptiste Marc

Education

  • MASTER in MATHEMATICS and COMPUTER SCIENCE applied to COGNITIVE SCIENCE for BUSINESS
    Lille University
    2019
    — Data Science with Python: Machine Learning — Probability — Statistical Linear Models & Regression — English M1 — Web — Computing for Neurocognitive Science — Digital Development for Neuropsychology — Philosophy of Mind — Ergonomics & Product Design — R Programming — Non-parametric Statistics — SAS for Data Science — E-marketing — Technology for Psychological Research M2 — Ethics & Deontology — Functional Neuroscience — Emotional Process & Affective Neuroscience — Neurocognition — Artificial Neural Networks — Programming for Experimental Research — UX Design / Product and Experience Optimization
  • NEW YORK CITY DATA SCIENCE ACADEMY
    New York Data Science Academy (NYCDSA)
    2019
    — Deep Learning — Statistical Models — Hadoop — Spark — AWS — Data Visualization — Linux System — Advanced SQL — NoSQL — Web Scraping — Time Series Analysis — Reinforcement Learning — Computer Vision — Generalized Linear Models — Tree Methods — Support Vector Machines — Natural Language Processing — Code Optimization — Advanced Python — Advanced R

Skill set

Categories