Oscar Javier Pérez Lora

Data Scientist | Machine Learning | n8n | RAG LLM

€270/day
Madrid, ES
3-7 years

Average response time: 4 hours

About Oscar Javier

I transform data into strategic decisions and competitive advantage. As a Data Scientist and AI Consultant, I specialize in developing and implementing high-impact solutions that solve complex business problems.

My approach combines the rigor of econometrics and statistical analysis with the power of Machine Learning, Generative Artificial Intelligence (GenAI), and process automation to deliver measurable results.

My areas of expertise include:
  • **Predictive Modeling and Machine Learning**: Development of models (regression, classification, clustering) to optimize demand forecasts, detect anomalies, or segment customers.
  • **Generative Artificial Intelligence and LLMs**: Implementation of solutions based on Large Language Models (LLMs), including Retrieval-Augmented Generation (RAG) systems, to automate workflows and create custom conversational assistants.
  • **Automation with AI Agents**: Development of automated workflows with AI Agents in n8n, for both data processing and administrative and educational processes.
  • **Data Engineering and Analysis**: Processing and analysis of large volumes of data with Python, PySpark, and SQL, ensuring data quality and preparation for modeling.
  • **Business Intelligence & Data Storytelling**: Creation of interactive dashboards in Power BI and Tableau that communicate complex insights clearly and actionably, turning analysis into a visual narrative that drives action.

Languages

  • Spanish: Native or bilingual
  • English: Conversational

Can work on-site
Madrid (up to 50km), Barcelona (up to 50km), Málaga (up to 50km), Bilbao (up to 50km), Salamanca (up to 50km)

Experience

  • Corporación Unificada Nacional de Educación Superior - CUN
    AI Developer
    July 2025 - Present (11 months)
    Problem: The entity needed a quantitative methodology to identify and prioritize high-risk territories. In addition, the descriptive analyses based on these findings were produced manually, which was slow and resource-intensive.

    Phases and Results

    1. Statistical Modeling and Risk Map Creation Phase
    • Designed and executed an end-to-end data pipeline, processing and unifying 129 indicators from various sources using R scripts and advanced PostgreSQL queries.
    • Applied dimensionality reduction techniques such as Principal Component Analysis (PCA) to objectively weight risk factors and built a log-linear model to calculate incidence risk at the territorial level.
    2. Report Automation with Generative AI (RAG) Phase
    • To automate analysis generation, implemented a Retrieval-Augmented Generation (RAG) system that connects local Large Language Models (LLMs) like Gemma and Phi with the documentary corpus on Human Rights and recent news related to this topic at the territorial level.
    • Developed the complete workflow using Python, LangChain, and Hugging Face for document vectorization and prompt design (Prompt Engineering), ensuring contextualized and accurate responses for the reports.
    Results and Deliverables
    • Interactive Risk Maps allowing the client to make data-driven decisions for resource allocation.
    • Automated analysis generation system that significantly reduced report generation times.
    • Complete technical documentation of the entire solution (data pipeline, statistical model, and RAG architecture), ensuring its reproducibility and scalability.
    LLM, Cloud Computing, n8n, Automation, API
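To illustrate the retrieval step of a RAG system like the one described above, here is a minimal sketch. It is not the project's implementation: the profile mentions LangChain and Hugging Face for vectorization, whereas this sketch swaps in a plain bag-of-words cosine similarity so that it runs with the standard library only. The function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`), the prompt wording, and the sample documents are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return the top_k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda d: cosine(q, Counter(tokenize(d))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents, top_k=2):
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical corpus; a real system would index reports and news articles.
docs = [
    "Territory A reported a rise in threats against community leaders.",
    "Power BI dashboards summarize court backlog by region.",
    "News report: displacement events increased in Territory A during the quarter.",
]
prompt = build_prompt("What happened in Territory A?", docs)
```

The resulting `prompt` string would then be passed to a local model. In a production setup, dense embeddings and a vector store would typically replace the bag-of-words scoring used here.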
  • Observatorio Presidencial Derechos Humanos
    Data Scientist
    RESEARCH
    March 2025 - July 2025 (4 months)
    Problem: The entity needed a quantitative methodology to identify and prioritize high-risk territories. In addition, the descriptive analyses based on these findings were produced manually, which was slow and resource-intensive.

    Phases and Results

    1. Statistical Modeling and Risk Map Creation Phase
    • Designed and executed an end-to-end data pipeline, processing and unifying 129 indicators from various sources using R scripts and advanced PostgreSQL queries.
    • Applied dimensionality reduction techniques such as Principal Component Analysis (PCA) to objectively weight risk factors and built a log-linear model to calculate incidence risk at the territorial level.
    2. Report Automation with Generative AI (RAG) Phase
    • To automate analysis generation, implemented a Retrieval-Augmented Generation (RAG) system that connects local Large Language Models (LLMs) like Gemma and Phi with the documentary corpus on Human Rights and recent news related to this topic at the territorial level.
    • Developed the complete workflow using Python, LangChain, and Hugging Face for document vectorization and prompt design (Prompt Engineering), ensuring contextualized and accurate responses for the reports.
    Results and Deliverables
    • Interactive Risk Maps allowing the client to make data-driven decisions for resource allocation.
    • Automated analysis generation system that significantly reduced report generation times.
    • Complete technical documentation of the entire solution (data pipeline, statistical model, and RAG architecture), ensuring its reproducibility and scalability.
    RAG, Python, LangChain, Machine Learning, Statistics
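One common way to use PCA for weighting indicators, as mentioned in the modeling phase above, is to take the squared loadings of the first principal component as weights. The sketch below illustrates that idea under stated assumptions and is not the project's code (which used R): the data are invented, all indicators are assumed to be oriented so that higher values mean higher risk, and the helper names are hypothetical. The log-linear model is not covered here.

```python
import math


def standardize(rows):
    """Convert each column to z-scores (assumes no constant column)."""
    n = len(rows)
    z_cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        z_cols.append([(x - mean) / std for x in col])
    return [list(r) for r in zip(*z_cols)]


def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, k = len(z), len(z[0])
    return [
        [sum(row[i] * row[j] for row in z) / n for j in range(k)]
        for i in range(k)
    ]


def first_component(matrix, iterations=500):
    """Approximate the leading eigenvector by power iteration."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pca_weights(rows):
    """Return (weights, composite scores) from the first principal component."""
    z = standardize(rows)
    loadings = first_component(correlation_matrix(z))
    # Squared loadings of a unit-length eigenvector sum to 1.
    weights = [l * l for l in loadings]
    scores = [sum(w * x for w, x in zip(weights, row)) for row in z]
    return weights, scores


# Hypothetical data: 4 territories (rows) x 3 indicators (columns).
data = [[1, 2, 1], [2, 3, 3], [3, 5, 4], [4, 6, 7]]
weights, scores = pca_weights(data)
```

Here `scores` gives one composite value per territory, which could then feed a ranking or a map. With many indicators, a library implementation such as scikit-learn's `PCA` would be the more practical choice.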
  • Fundación APG
    Data and BI Consultant
    RESEARCH
    November 2024 - January 2025 (2 months)
    As lead data consultant, I directed the design and implementation of a pilot access-to-justice observatory for the Ministry of Justice. My objective was to transform complex, dispersed data into a consolidated database and interactive visual tools for identifying and analyzing barriers to access to justice in the country.

    Key Phases and Deliverables:

    1. Key Indicator (KPI) Design and Strategy:
    • Defined and standardized a baseline set of indicators aligned with international standards to measure and monitor access barriers.
    • Conducted an exhaustive mapping and analysis of data sources, using SQL for extraction and R (Tidyverse) for cleaning and preparation, ensuring the quality and reliability of the information.
    • Delivered a strategic report on the status of information sources, identifying key allies for the observatory's sustainability.
    2. Statistical Analysis and Insight Generation:
    • Applied statistical models and data analysis in R (dplyr, ggplot2) to process large volumes of quantitative and qualitative data.
    • Discovered and documented significant patterns, trends, and correlations that allowed for the identification of geographic and demographic areas with the greatest barriers to justice access.
    • Prepared analytical reports with data-driven recommendations for the formulation of preliminary public policies.
    3. Interactive Dashboard Development in Power BI:
    • Designed and built a series of interactive dashboards and visualization pilots in Power BI for data exploration.
    • Created interactive visualizations and data modeling with DAX to translate complex findings into clear and visually appealing information, enabling rapid trend identification.
    • Generated automated reports that served as a basis for strategic decision-making within the Ministry.
    DAX Language, Microsoft Power BI, R, Python, SQL

Education

  • Diploma in Artificial Intelligence and Deep Learning
    Universidad Nacional de Colombia
    2021
    • Theoretical training in Deep Learning (deep neural networks) and the main related algorithms.
    • Hands-on practice with the main Deep Learning algorithms through Python and case studies.
    • Final course project focused on semantic search of court rulings.
  • Economics
    Universidad Nacional de Colombia
    2008
