cat ./learning
Learning.
What I'm currently studying. Some of this becomes writing; most of it stays in the notebook. Updated roughly quarterly.
> Last updated
- Paper
Constitutional AI: Harmlessness from AI Feedback
Anthropic's foundational paper on RLAIF. Re-reading with fresh eyes for the agentic-AI threat-modelling work.
- Paper
Sleeper Agents: Training Deceptive LLMs
Backdoors that survive safety training. Implications for ML supply chain security and MAESTRO's training-time risk layer.
- Framework
MAESTRO (CSA)
Reading deeply to extract the practitioner subset that maps onto real engagements, separating framework from cargo cult.
- Book
Threat Modeling, by Adam Shostack
Re-reading the chapters on system-level reasoning and applying that lens to agentic AI deployments.
- Tool
PyRIT internals
Going deeper than user-level usage: reading the harness internals so I can extend it for domain-specific test corpora (healthcare, finance).
- Course
Probabilistic Machine Learning, by Kevin Murphy (Vol. 2)
Refreshing the deep-learning math that a credible ML security review depends on.
- Side project
Five-Zone worksheet
Translating the method from the STRIDE-breaks post into something engineering teams can run without me. Aim: publish it as open source by Q3.