
April 28, 2026 · recipe · data-analysis · tutorial · milestones · decision-log
# Recipe: Data analysis with project-state
Data analysis projects have a shape that most PM tools handle badly. The work isn't a linear task list — it's an iterative cycle of data acquisition, cleaning, exploration, modelling, and insight delivery, with multiple stakeholders who want different things from the same analysis. project-state handles this well because it's structured around milestones and stakeholder reporting, not task boards.
Here's how to adapt it for a data analysis engagement.
## How data analysis maps to project-state concepts
| Data analysis concept | project-state concept |
|---|---|
| Analysis phases (acquire, clean, explore, model, deliver) | Phase preset |
| Dataset versions, model iterations | Milestones + technical_progress notes |
| Client, analyst team, exec sponsor | Stakeholder groups |
| Weekly analysis brief, final report | Reporting matrix entries |
| Scope change (new data source, new question) | Change register |
| Key analytical decisions (model choice, exclusion logic) | Decision log |
| Published findings, methodology notes | Document index |
## Step 1: Scaffold with a custom phase preset
```
ask claude: "scaffold a new v2 project, kind: research, phases: data-acquisition, data-cleaning, exploratory-analysis, modelling, insight-delivery"
```
Define gate criteria for each phase:
```yaml
phases:
  - name: data-acquisition
    gate_criteria:
      - all source datasets received and stored
      - data dictionary documented
      - access permissions confirmed for all team members
  - name: data-cleaning
    gate_criteria:
      - null/missing value audit complete
      - outlier policy documented and applied
      - cleaning log committed to project docs
      - "clean dataset version locked (document index entry: status=approved)"
  - name: exploratory-analysis
    gate_criteria:
      - EDA summary document approved by analyst lead
      - key hypotheses documented as decisions
      - at least one stakeholder review of preliminary findings
  - name: modelling
    gate_criteria:
      - model selection decision logged
      - validation approach documented
      - baseline model milestone complete
  - name: insight-delivery
    gate_criteria:
      - final report milestone complete
      - client review meeting conducted
      - all deliverables in document index (status=delivered)
```
These gate criteria become the checklist the agent evaluates when you ask "can we advance the phase?"
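When you're ready to test a gate, ask in the same conversational style; the exact phrasing below is illustrative:
```
ask claude: "can we advance from data-cleaning to exploratory-analysis?"
```
The agent walks the data-cleaning criteria one by one, so an uncommitted cleaning log or an unapproved dataset version blocks the advance.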
## Step 2: Set up stakeholders and the reporting matrix
A typical data analysis project has three stakeholder groups:
**Analyst team** — the people doing the work. They need internal status: what's blocked, what decisions are pending, what the current model state is.
**Client / sponsor** — the people who commissioned the analysis. They need progress updates and access to the findings as they emerge.
**Exec / decision-maker** — the end consumer of insights. They need a clean, concise view of findings and recommendations, not methodology.
```yaml
entries:
  - stakeholder_group: analyst_team
    report_type: internal_status
    cadence: weekly
    format: slack_message
    surface: slack
    channel: "#analysis-[project-name]"
  - stakeholder_group: client
    report_type: progress_update
    cadence: biweekly
    format: email_draft
    surface: gmail
  - stakeholder_group: exec_sponsor
    report_type: findings_brief
    cadence: on_milestone
    trigger_milestones: ["eda-complete", "modelling-complete", "final-report"]
    format: email_draft
    surface: gmail
```
The `on_milestone` cadence is key here — the exec sponsor doesn't need weekly noise, just signal when something significant lands.
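When one of those trigger milestones closes, you can prompt for the brief explicitly (illustrative phrasing):
```
ask claude: "mark milestone eda-complete as complete and draft the findings brief for the exec sponsor"
```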
## Step 3: Define milestones around analytical outputs, not tasks
Milestones in data analysis should be analytical outputs, not work activities. "Clean dataset" not "clean the data". "EDA complete" not "run exploratory analysis".
```
ask claude: "add milestones:
- Clean dataset v1, due [date], owner: data engineer, definition of done: clean dataset file versioned and documented in project docs
- EDA summary, due [date], owner: lead analyst, definition of done: EDA document approved by team
- Baseline model, due [date], owner: ML engineer, definition of done: baseline results documented with evaluation metrics
- Model v1, due [date], owner: ML engineer, definition of done: model validated, assumptions documented
- Final report, due [date], owner: project lead, definition of done: report delivered and accepted by client"
```
The `technical_progress` note on each milestone is where the analytical narrative lives:
```
ask claude: "update milestone clean-dataset-v1: 70% complete, technical progress: missing value treatment complete for main tables, working on date normalization across three source systems which have inconsistent timezone handling"
```
This note feeds directly into the next status report. The client update doesn't carry the raw detail, but the analyst team brief does.
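As a rough illustration (not verbatim tool output), the same note might surface differently in each report:
```
analyst brief:  clean-dataset-v1 at 70%; missing value treatment done,
                blocked on timezone normalization across three source systems
client update:  data cleaning roughly 70% complete; one remaining issue
                with date handling is being worked through
```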
## Step 4: Log analytical decisions
Data analysis is full of decisions that need to be traceable: why a particular exclusion criterion was applied, why one model was chosen over another, why an outlier was treated a certain way. Log them as they happen:
```
ask claude: "log a decision: excluding records with NULL in [field] rather than imputing, rationale: imputation would introduce systematic bias in the low-income cohort, decided by: analyst team, date: today"
```
```
ask claude: "log a decision: using XGBoost rather than logistic regression, rationale: non-linear interactions between [var1] and [var2] were significant in EDA, decided by: ML lead, approved by: client"
```
When the client asks "why did you exclude those records?" three months later, the decision is in the log with full rationale, not lost in a Slack thread.
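Retrieval follows the same conversational pattern (illustrative phrasing):
```
ask claude: "why did we exclude records with NULL in [field]? pull up the decision and its rationale"
```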
## Step 5: Use the change register for scope changes
Scope changes in data analysis are common and dangerous. A new data source mid-project. A new question the client wants answered. A change in the target variable definition. These are material changes that need to be logged and approved.
```
ask claude: "log a change: client wants to add [new_datasource] to the analysis pipeline, classify it"
```
The change register classifies it (material: this expands scope and timeline) and creates a change record. The next status report to the client mentions it as a pending change request. Nothing moves until the change is approved.
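For reference, a change record might look roughly like this. The field names are an assumption for illustration, not project-state's documented schema:
```yaml
# Illustrative sketch. Field names are assumptions, not the canonical schema.
- id: CH-003
  summary: add [new_datasource] to the analysis pipeline
  classification: material        # expands scope and timeline
  status: pending_approval
  raised_by: client
```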
## Step 6: Deliver findings through the document index
As deliverables are produced — EDA summaries, model documentation, final reports — register them in the document index:
```
ask claude: "add document: EDA Summary v1.2, type: analytical-report, file: docs/eda-summary-v1.2.pdf, status: under-review, description: exploratory analysis covering [scope], author: [name]"
```
The document index tracks the approval lifecycle: `draft` → `under-review` → `approved` → `delivered`. Phase gate criteria can check document status — "can't advance to modelling until EDA Summary is approved."
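Moving a deliverable through that lifecycle uses the same conversational pattern (illustrative phrasing):
```
ask claude: "update document EDA Summary v1.2: status approved, approved by: analyst lead"
```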
## The result
A data analysis project running on project-state has:
- Full decision traceability from day one
- Automatic status reports that don't require manual preparation
- Phase gates that enforce analytical rigor before advancing
- A change register that catches scope creep
- Stakeholder-appropriate reporting: analyst brief, client update, exec findings brief
- A document index that tracks every deliverable through its approval lifecycle
The analyst team focuses on the analysis. The system handles the reporting.