Paul Chambaz

Hey! I’m finishing my Master’s in Computer Science (AI2D track) at Sorbonne Université. I’m interested in reinforcement learning because it can achieve incredible performance in real-world tasks, yet we still don’t understand why many algorithms work or fail. I’m currently at ISIR with Prof. Olivier Sigaud, working on a new RL algorithm we aim to submit to ICLR. I believe in transparent, controllable AI rather than closed black boxes. Outside of my studies, I read, run, and write open source software, including tools for ALIAS, the CS student association at Sorbonne, and a self-hosted audiobook stack.


Research

Actor-Free critic Updates for Off-Policy and Offline Learning
Paul Chambaz, Frédéric Li Combeau
M1 AI2D Sorbonne Université

Blog posts

Some notes on the TQC figure (Aug 2025)

Estimation biases represent a persistent challenge in reinforcement learning, where errors in value estimation can accumulate through bootstrapping and compromise learning efficiency. Among these …

Projects

Polybase - student paper handout distribution system
ALIAS Student Association
go, templ, htmx, tailwindcss, sqlite3
Ssherd - RL training job orchestrator over SSH
paulchambaz
go, templ, htmx, websocket, ssh, nfs, slurm
Mpcube - album-focused terminal music client
paulchambaz
go, bubbletea, mpd
Iliad - self-hosted audiobook server
paulchambaz
rust, rest, sqlite3
Odyssey - Android audiobook client for Iliad
paulchambaz
kotlin, android
Odyssey TUI - terminal audiobook client for Iliad
paulchambaz
go, bubbletea

Education

Master in Computer Science - AI2D
Sorbonne Université (2024-2026)
M2 S3: 15.8/20 (1st/45); M1: 16.4/20 (1st/53), with S1: 15.85/20 and S2: 17.03/20
Excellence diploma program
Bachelor in Computer Science
Université Paris Cité (2019-2023)
Mention Très Bien - 16/20
First year of Bachelor in Computer Science
Université Claude Bernard Lyon 1 (2018-2019)

Work Experience

M2 Research Intern
ISIR Laboratory, Sorbonne Université (February - August 2026)
Supervised by Prof. Olivier Sigaud
Python, JAX, PyTorch, Reinforcement Learning, Matplotlib
M1 Research Intern
ISIR Laboratory, Sorbonne Université (Summer 2025)
Supervised by Prof. Olivier Sigaud
Python, JAX, PyTorch, Reinforcement Learning, Matplotlib
Cybersecurity Developer
Mobeta (February - August 2024)
Supervised by Arthur Le Corguillé
TypeScript, Go, Python, Docker, Cybersecurity
OSINT Developer Intern
Lexfo (Summer 2023)
Supervised by Armand Sylvain
Python, Ansible, Active Directory, Proxmox

Coursework

M2 Sorbonne Université (S3)

UM5IN253 - Models and Algorithms for Decision under Uncertainty
UM5IN256 - Models and Algorithms for Multicriteria and Collective Decision
UM5IN257 - Algorithms for Optimization and Game Theory
UM5IN259 - Artificial Intelligence and Robotics
UM5IN861 - Deep Learning

M1 Sorbonne Université (S2)

MU4IN204 - Decision and Games
MU4IN201 - Problem Solving
MU4IN202 - Foundations of Multi-agent Systems
MU4IN811 - Machine Learning
MU4IN206 - AI2D Research and Development Project
MU4IN207 - Learning and Robotics

M1 Sorbonne Université (S1)

MU4IN800 - Logic and Knowledge Representations
MU4IN601 - Probabilistic and Statistical Methods and Algorithms for Computer Science
MU4IN200 - Modeling, Optimization, Graphs, and Linear Programming
MU4IN600 - Basics of Image Processing
MU4IN900 - Complexity, Randomized and Approximate Algorithms
MU4IN400 - Concurrent and Distributed System Programming