Gabriel Oduor

Fabric Data Engineer specialising in ETL/ELT Pipelines, Data Warehousing and Data Quality

Building reliable, scalable data systems with Python, SQL, Spark, and Microsoft Fabric. 3x Azure-Certified.

Gabriel Oduor
Data Engineer
AI & Data Science

About Me

What I build

⚙️

Data Engineering

End-to-end pipelines, Lakehouse architectures, and Medallion patterns using Microsoft Fabric, Spark, and Databricks.

📊

Analytics & BI

Interactive Power BI dashboards and data models that turn raw data into business decisions.

🛡️

Data Quality

Validation frameworks, governance standards, and data protection across cloud platforms.

Skills

Technologies and tools I work with

Data Engineering

Apache Spark · PySpark · Databricks · Hadoop · Delta Lake · ETL/ELT · Data Modelling

Cloud & Platforms

Microsoft Fabric · Microsoft Azure · Lakehouse · OneLake · Streaming Analytics

SQL & Databases

SQL · T-SQL · PostgreSQL · Oracle SQL · MongoDB · SQLite

Programming

Python · Pandas · NumPy · TensorFlow · scikit-learn

Analytics & BI

Power BI · DAX · Matplotlib · Seaborn · Power Query

Tools

Git · Docker · Jupyter · LaTeX

Education & Certifications

Academic background and professional credentials

Education

MSc Artificial Intelligence & Data Science

Keele University
📅 2023 – 2024 · 📍 Staffordshire, UK
Distinction

Vice Chancellor Scholarship recipient. Activities: First Aid Society, Student Ambassador, External Auditor.

AI for Healthcare: ACAIRA Summer School

Aston University
📅 2024 · 📍 Birmingham, UK

Project proposal on patient readmission and bed management. Lectures on AI in healthcare.

Certifications

Experience

My professional journey in data

GLASS AMR Data Engine
Independent Project
📅 2025

Engineered a fully automated, end-to-end data analytics platform in Microsoft Fabric to process, model, and visualise global Antimicrobial Resistance (AMR) data from WHO surveillance datasets.

Key Achievements:

  • Built Medallion architecture pipeline transforming raw surveillance data into analytics-ready models
  • Bridged data engineering with microbiological domain logic
Python · Microsoft Fabric · Medallion Architecture · Data Engineering

Worldwide Earthquake Intelligence Pipeline
Independent Project
📅 2025

Built an end-to-end data engineering and analytics project on Microsoft Fabric. Ingested real-time earthquake data, processed it through a Medallion architecture, and delivered insights through an interactive Power BI dashboard.

Key Achievements:

  • Implemented real-time data ingestion and processing pipeline
  • Designed interactive Power BI dashboard for earthquake intelligence
Python · Microsoft Fabric · Power BI · Medallion

Mental Health Data Analysis
Academic Project, Keele University
📅 2024

Explored mental health trends across 292,364 responses from 35 countries using PySpark and Python. Analysed treatment-seeking behaviour, work interest, and family history impacts.

Key Achievements:

  • Processed 292K+ records using PySpark for big data handling
  • Delivered visualisations informing better mental health strategies
PySpark · Python · Big Data · Visualisation

Sentiment Analysis of Product Reviews
Academic Project, HyperionDev
📅 Jan – Mar 2024

Built an NLP pipeline to analyse the sentiment of product reviews, extracting actionable insights from unstructured text data.

Python · NLP · Machine Learning

CarSharing Data Analysis
Academic Project, Keele University
📅 2024

Database management and predictive analytics project using SQLite and multiple ML models to analyse car-sharing demand patterns.

Key Achievements:

  • Implemented Random Forest, Neural Networks, and clustering for demand prediction
  • Built end-to-end pipeline from database management to predictive modelling
Python · SQLite · Random Forest · Neural Networks

Math Prompt Engineer
Scale AI
📅 Oct 2024 – Oct 2025 · 📍 Remote

Utilised Reinforcement Learning from Human Feedback (RLHF) to improve machine learning model efficiency. Applied mathematical expertise to evaluate and refine model outputs.

Key Achievements:

  • Applied probability, calculus, linear algebra, and statistics to assess model quality
  • Improved training data quality through structured mathematical evaluation
RLHF · Mathematics · Statistics · LaTeX

Projects

Recent work and experiments

GLASS AMR Data Engine

End-to-end WHO AMR surveillance pipeline in Microsoft Fabric with Medallion architecture.

Python · Microsoft Fabric · Medallion

Earthquake Intelligence Pipeline

Real-time earthquake data ingestion with Medallion architecture and Power BI dashboard.

Python · Fabric · Power BI · Medallion

Mental Health Data Analysis

Big data project analysing 292K+ responses across 35 countries using PySpark and Python.

PySpark · Python · Big Data

Sentiment Analysis of Product Reviews

NLP pipeline to analyse product review sentiment and extract actionable insights.

Python · NLP · ML

CarSharing Data Analysis

Database management and predictive analytics with SQLite, Random Forest, and Neural Networks.

Python · SQLite · ML

Power BI Projects

Interactive dashboards demonstrating analytical storytelling and BI capabilities.

Power BI · DAX · Data Viz

Let's Connect

Open to new opportunities

Contact Info

Netherlands