About Sami
French: Native or bilingual
English: Fluent
Experience
- Euler Hermes
NLP Engineer / ML Specialist
April 2021 - Today (5 years and 2 months)
92800 Puteaux, France
Designed and implemented an innovative automated news monitoring system for credit analysts to proactively detect events affecting companies' credit risk profiles. This full-scale NLP solution processes 70,000 news articles daily with near real-time analysis.
Key Achievements:
- Developed specialized Named Entity Recognition (NER) models to accurately identify and disambiguate primary companies mentioned in articles, achieving 80%+ F1 score
- Created sentiment classification models for English, French, and German articles, reaching 85% F1 score for sentiment analysis and 90% F1 score for importance classification
- Enhanced the system with targeted sentiment analysis capabilities to independently evaluate multiple companies mentioned within the same article
- Designed generative AI-powered annotation pipelines for NER, importance and targeted sentiment analysis, creating 5,000+ annotated articles with minimal human intervention
- Designed and deployed a complete AWS-based architecture using Lambda functions, S3 storage, data streams, and Elasticsearch for article indexing and retrieval
Technical Environment:
- Generative AI: Advanced prompt engineering for synthetic data generation
- NLP: Fine-tuned BERT-based models, custom NER systems, multilingual sentiment analysis
- Cloud: AWS (Lambda, S3, Kinesis, Elasticsearch, SageMaker for real-time inference)
- Machine Learning: Supervised classification, transformer models, LSH for semantic similarity
Innovation:
- Leveraged generative AI to transition from global sentiment analysis to targeted entity-level analysis for complex multi-company articles
- Created a novel LLM-powered workflow for generating high-quality financial news annotations specifically tailored to credit risk analysis
- Developed a distillation approach combining large-scale LLMs for synthetic data creation with efficient fine-tuned models for production
- Samsung Electronics
NLP Engineer
October 2018 - December 2020 (2 years and 2 months)
Paris, France
Localization and enhancement of Samsung's Bixby voice assistant across mobile and connected devices, working within a multidisciplinary team of developers, data scientists, linguists, and NLP specialists.
Key Achievements:
- Developed, tested, and debugged multiple voice modules for Samsung's Bixby voice assistant, ensuring seamless French language functionality
- Created an automated transliteration system for geographic entities from Russian, Korean, and Japanese into French, leveraging Conditional Random Fields (CRF) for candidate sequence scoring
- Built a proof-of-concept emotion recognition system capable of identifying 7 distinct emotions from voice input using SVM classification and acoustic feature engineering
- Enhanced and maintained the French geographic entity database, improving voice recognition accuracy for location-based queries
- Developed user log pattern classification tools to identify improvement opportunities and optimize voice assistant performance
Technical Environment:
- Machine Learning: Conditional Random Fields (CRF), Support Vector Machines (SVM), feature selection and engineering
- Programming: Python, PyTorch, Scikit-learn, Git
- NLP: Speech processing, transliteration systems, entity recognition, voice emotion detection
- Data Collection: Developed scrapers with Selenium, extracted structured datasets from DBpedia and Wikidata
- Database: PostgreSQL for geographic entity management
- Development: Bixby Developer environment
- Implicity
NLP Research Intern
January 2018 - June 2018 (5 months)
Île-de-France, France
Designed and developed an automated solution for extracting critical medical concepts from unstructured clinical reports, enabling more efficient analysis of patient data and enhancing the Implicity platform's capabilities.
Key Achievements:
- Developed a system to automatically extract medical concepts from unstructured text, specifically targeting hospitalization details, clinical test results, and treatment information
- Developed a multi-stage NLP pipeline combining document classification, paragraph segmentation, and named entity recognition for comprehensive medical text analysis
- Created a document classification system using linear SVM and TF-IDF of unigrams to categorize medical reports by subject with high accuracy
- Implemented paragraph-level classification to identify specific content types (medical history, clinical examination, etc.) using supervised machine learning techniques
- Built unsupervised extraction systems for dates and clinical examination values, enhancing the structured data available for clinical analysis
Innovation:
- Developed hybrid rule-based and statistical approaches for medical concept extraction
- Implemented a progressive bootstrapping technique to incrementally improve named entity extraction performance with minimal manual annotation
Education
- Machine Learning; Sequence Models; Neural Networks and Deep Learning; Convolutional Neural Networks; Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization
- Master of Science (MS), École Nationale Supérieure d'Arts et Métiers, 2017