Welcome to the homepage of Stephen Oates
I am a Sydney-based Data Scientist with expertise in machine learning, LLMs, and insurance analytics.
Blog
Check out my blog posts for articles on data science, machine learning, and AI.
Interactive Demos
Anthropic Performance Challenge Simulator (January 2026)
Browser-based learning interface for Anthropic’s GPU performance engineering challenge. Includes a CUDA kernel simulator and step-by-step tutorials for understanding matrix multiplication optimization.
Recent Projects
AI Agents & Frameworks
Dory - AI Agent Framework Comparison (January 2026)
Comprehensive evaluation of 11 AI agent frameworks for insurance claims processing. Includes tutorials, demos, and integration with DSPy and MLflow.
Australian Insurance Obligations Extractor (October 2025)
LLM-powered system for extracting and validating insurance compliance obligations from Australian legislation. Extracted 343 obligations from 16 legislative documents.
PromptOpt (July 2025)
Enterprise prompt optimization framework combining DSPy and GRPO approaches for systematic prompt engineering.
Python Packages
Allyanonimiser (February 2025)
A Python package for removing PII and related sensitive fields from free text. Designed for data privacy compliance in production environments.
Freamon (March 2025)
A package to make data science projects on tabular data easier. Named after the great character from The Wire played by Clarke Peters.
Meno (March 2025)
Topic modelling package for free text analysis, designed especially for the insurance domain.
Heraclitus (March 2025)
A library for making process mining accessible to new users. Works well with PM4Py while adding new features.
Classic Projects
Customer Lifetime Value Models (2014)
A discussion of CLV models based on a review of the literature.
Churn Analysis (2014)
Using a dataset, we walk through building a model to predict which customers will leave, so that they can be enticed to stay.
Survival Analysis (2014)
Using techniques originally developed in demography and biology, we predict the average lifespan of customers.