Amelie Medem

Supermalter

Data engineer 🚀 - Data expert - Python Spark SQL

€650/day
5 projects
Paris, FR
8-15 years

Average response time: 12 hours

About Amelie

Welcome to my Malt profile 👋

I regularly work on Data Engineering, Data Science, and Application Development missions, including the following: ✅

📒Data Engineer

Data collection via various sources (Website, API, Database) - Hadoop - Data Ingestion
Data storage in various formats
Data Modeling
Creation of efficient data architectures - Data Architecture
Creation and performance optimization of data pipelines - Airflow - Data Pipeline
Development and optimization of data processing - Spark/PySpark/SQL - Data processing
Ensuring data quality with adequate monitoring - Data Quality
DataOps: Git, GitLab CI/CD, Jenkins
Making data available to Data Science teams

📒Data Science

I can collect raw data and perform statistical analyses to identify underlying trends and the most relevant features. I present analysis results in business dashboards (Tableau), model business needs with machine learning algorithms (prediction, recommendation, classification, clustering, ...), expose the resulting models through APIs, and put the entire prediction chain into production (with real-time updates).

📒API Development

Django / Flask APIs

📒Search Engine

I work across the entire process of setting up a search engine: from text extraction (images, PDFs, ...) through indexing and grouping by theme to deployment on a Web platform.

🚀🚀🚀 This list is not exhaustive 🚀🚀🚀

I offer useful solutions to companies.

  • French: Native or bilingual
  • English: Fluent

Remote only
Primarily works remotely

Experience

  • Vizcab
    Data engineer / Developer
    SOFTWARE PUBLISHING
    April 2024 - Today (2 years and 2 months)
    Paris, France
    - Designs and develops new data pipelines in Azure Databricks for data ingestion to/from product applications, Azure Data Lake, and PostgreSQL databases.
    - Implements Datadog metric ingestion pipelines in Databricks, joins this data with other datasets, and exposes insights in Power BI reports.
    - Creates and optimizes models to organize and structure data from various applications and sources, making it usable by users.
    - Develops and maintains Power BI and Databricks dashboards to visualize information, monitor pipeline performance, and ensure data quality.
    - Improves code quality by applying best practices and establishing robust CI/CD pipelines using Databricks Asset Bundles, GitLab, and SonarQube.
    - Implements unit and integration tests.
    - Develops and implements data contracts as a framework for monitoring data models and defining clear specifications.
    - Collaborates with business teams to identify their needs and provide tailored data solutions that deliver value.
    PySpark · Databricks · Microsoft Power BI · GitLab CI/CD · MySQL · Data Modeling · Data contracts · Extract, Transform, Load (ETL) · Data Pipeline · Data Quality · Data visualization · Microsoft Azure
  • Cour des comptes, Paris.
    Machine learning engineer / Project Lead
    PUBLIC SECTOR
    December 2017 - August 2022 (4 years and 8 months)
    ● Designs and supervises the architecture and development of the Court of Auditors' unified search platform, built on a Hadoop data lake.
    ● Builds Python scraping pipelines to collect HTML pages of reports produced by the Court of Auditors from 1870 to 2022 (180k+).
    ● Creates and develops Python projects to extract raw text from 250k+ reports in PDF, Word, HTML, and image formats (the latter via OCR), among others.
    ● Implements Python programs to clean, process, and structure heterogeneous data, and especially to identify connections between data for indexing (Elasticsearch) and textual analysis.
    ● Leads and develops Spark pipelines for ingesting content from various databases (e.g., audits, Court of Auditors' agent registry, ...).
    ● Collaboratively develops the search engine's Web platform (React, Django).
    ● Conducts a NER (Named Entity Recognition) POC to automatically extract relevant names and expressions from the text of reports (spaCy, deep learning).
    ● Organizes and leads manual report annotation workshops (Doccano) to build a training set for the NER POC specific to the Court of Auditors' context.
    ● Organizes several user workshops to gather internal needs regarding efficient text search, document organization, and logical links between information.
    ● Works hand-in-hand with the UX designer to create mockups for the search platform.
    Python · Scala · SQL · PySpark · Hadoop · Elasticsearch · Docker · BeautifulSoup · Tesseract · spaCy · Tika · Pandas · NumPy · data engineer · Statistical Modeling · Natural Language Processing (NLP) · Requirements Analysis · Project Management · Team Management
  • SOLOCAL
    Data Engineer / Full Stack Developer
    E-COMMERCE
    February 2016 - October 2018 (2 years and 9 months)
    Paris, France
    ● Develops a data visualization application from scratch for Pages Jaunes professionals. The application provides a 360° view of each professional (subscribed products, audience, click share, reviews, paid/free content, ...).
    ● Refactors and develops an application that allows geographical visualization of Pages Jaunes clients' audiences and activities (migration from Java to React+Node).
    ● Develops Spark pipelines for data ingestion.
    ● Collects, processes, and loads data into Elasticsearch search engines.
    ● Writes technical documentation.
    ● Trains a student in Web development (3 months).
    ● Trains a group of 10+ professionals in Scala.
    TypeScript · Scala · Hadoop · Spark · Apache Kafka · Elasticsearch · React.js · PostgreSQL · Python · data engineer · Statistical Analysis · GitHub

Recommendations

Mariette Jusselme and 3 other people have recommended Amelie

Education

  • Doctorate
    Université Pierre et Marie Curie - France
    2011
    Subject: Automatic methods for the classification and prediction of network failures (Méthodes automatiques pour la classification et la prédiction des pannes de réseaux)
