# Data Flow & Command Network

This document maps how commands interconnect through shared data sources and output-to-input dependencies.

---

## 1. Command Network Graph

An arrow from A to B means the output of A feeds the input of B:

```
                    ┌─────────┐
                    │ search  │─────────────────────────────┐
                    └────┬────┘                             │
                         │ iid                              │ topic
                    ┌────▼────┐                        ┌────▼─────┐
              ┌─────│ issues  │◄───────────────────────│ timeline │
              │     │ mrs     │ (detail)               └──────────┘
              │     └────┬────┘                             ▲
              │          │ iid                              │ entity ref
              │     ┌────▼────┐     ┌───────────────┐       │
              │     │ related │     │ file-history  │───────┘
              │     │ drift   │     └──────┬────────┘
              │     └─────────┘            │ MR iids
              │                       ┌────▼────┐
              │                       │  trace  │──── issues (linked)
              │                       └────┬────┘
              │                            │ paths
              │                       ┌────▼────┐
              │                       │   who   │
              │                       │ (expert)│
              │                       └─────────┘
              │
         file paths                   ┌─────────┐
              │                       │   me    │──── issues, mrs (dashboard)
              ▼                       └─────────┘
        ┌──────────┐                       ▲
        │  notes   │                       │ (~same data)
        └──────────┘                  ┌────┴───────┐
                                      │who workload│
                                      └────────────┘
```

### Feed Chains (output of A -> input of B)

| From | To | What Flows |
|---|---|---|
| `search` | `issues`, `mrs` | IIDs from search results -> detail lookup |
| `search` | `timeline` | Topic/query -> chronological history |
| `search` | `related` | Entity IID -> semantic similarity |
| `me` | `issues`, `mrs` | IIDs from dashboard -> detail lookup |
| `trace` | `issues` | Linked issue IIDs -> detail lookup |
| `trace` | `who` | File paths -> expert lookup |
| `file-history` | `mrs` | MR IIDs -> detail lookup |
| `file-history` | `timeline` | Entity refs -> chronological events |
| `timeline` | `issues`, `mrs` | Referenced IIDs -> detail lookup |
| `who expert` | `who reviews` | Username -> review patterns |
| `who expert` | `mrs` | MR IIDs from expert detail -> MR detail |
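The feed table can also be read as a directed graph. The sketch below encodes the rows above as edges and walks them breadth-first to list every command that can be reached downstream of a starting command. The names `FEEDS` and `downstream` are illustrative and not part of the tool; command labels are copied from the table as written.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Feed edges copied from the table above as (from, to) pairs.
const FEEDS: &[(&str, &str)] = &[
    ("search", "issues"),
    ("search", "mrs"),
    ("search", "timeline"),
    ("search", "related"),
    ("me", "issues"),
    ("me", "mrs"),
    ("trace", "issues"),
    ("trace", "who"),
    ("file-history", "mrs"),
    ("file-history", "timeline"),
    ("timeline", "issues"),
    ("timeline", "mrs"),
    ("who expert", "who reviews"),
    ("who expert", "mrs"),
];

/// Returns every command reachable from `start` by following feed edges.
fn downstream(start: &str) -> BTreeSet<&'static str> {
    let mut adjacency: HashMap<&'static str, Vec<&'static str>> = HashMap::new();
    for &(from, to) in FEEDS {
        adjacency.entry(from).or_default().push(to);
    }

    let mut seen: BTreeSet<&'static str> = BTreeSet::new();
    let mut queue: VecDeque<&str> = VecDeque::new();
    queue.push_back(start);

    while let Some(cmd) = queue.pop_front() {
        // Commands with no outgoing edges simply contribute nothing.
        for &next in adjacency.get(cmd).into_iter().flatten() {
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    seen
}
```

With these edges, `downstream("file-history")` yields `issues`, `mrs` and `timeline`, because `timeline` in turn feeds the two detail commands.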

---

## 2. Shared Data Source Map

Which database tables each command reads. The more tables two commands share, the stronger the signal that they could be consolidated.

### Primary Entity Tables

| Table | Read By |
|---|---|
| `issues` | issues, me, who-workload, search, timeline, trace, count, stats |
| `merge_requests` | mrs, me, who-workload, search, timeline, trace, file-history, count, stats |
| `notes` | notes, issues-detail, mrs-detail, who-expert, who-active, search, timeline, trace, file-history |
| `discussions` | notes, issues-detail, mrs-detail, who-active, who-reviews, timeline, trace |

### Relationship Tables

| Table | Read By |
|---|---|
| `entity_references` | trace, timeline |
| `mr_file_changes` | trace, file-history, who-overlap |
| `issue_labels` | issues, me |
| `mr_labels` | mrs, me |
| `issue_assignees` | issues, me |
| `mr_reviewers` | mrs, who-expert, who-workload |

### Event Tables

| Table | Read By |
|---|---|
| `resource_state_events` | timeline, me-activity |
| `resource_label_events` | timeline |
| `resource_milestone_events` | timeline |

### Document/Search Tables

| Table | Read By |
|---|---|
| `documents` + `documents_fts` | search, stats |
| `embeddings` | search, related, drift |
| `document_labels` | search |
| `document_paths` | search |

### Infrastructure Tables

| Table | Read By |
|---|---|
| `sync_cursors` | status |
| `dirty_sources` | stats |
| `embedding_metadata` | stats, embed |
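One way to quantify the overlap is a Jaccard score over the set of tables each command reads. The sketch below does this for the four primary entity tables only, with reader lists copied from the table above. The metric itself, and the names `PRIMARY_READERS`, `tables_for` and `overlap`, are assumptions for illustration rather than something the tool computes.

```rust
use std::collections::BTreeSet;

/// Readers of the primary entity tables, copied from the table above.
const PRIMARY_READERS: &[(&str, &[&str])] = &[
    ("issues", &["issues", "me", "who-workload", "search", "timeline", "trace", "count", "stats"]),
    ("merge_requests", &["mrs", "me", "who-workload", "search", "timeline", "trace", "file-history", "count", "stats"]),
    ("notes", &["notes", "issues-detail", "mrs-detail", "who-expert", "who-active", "search", "timeline", "trace", "file-history"]),
    ("discussions", &["notes", "issues-detail", "mrs-detail", "who-active", "who-reviews", "timeline", "trace"]),
];

/// Primary tables read by `command`.
fn tables_for(command: &str) -> BTreeSet<&'static str> {
    PRIMARY_READERS
        .iter()
        .filter(|(_, readers)| readers.contains(&command))
        .map(|(table, _)| *table)
        .collect()
}

/// Jaccard overlap of the table sets of two commands, in the range 0.0..=1.0.
fn overlap(a: &str, b: &str) -> f64 {
    let (ta, tb) = (tables_for(a), tables_for(b));
    let union = ta.union(&tb).count();
    if union == 0 {
        return 0.0; // neither command reads a primary table
    }
    ta.intersection(&tb).count() as f64 / union as f64
}
```

On this data, `trace` and `timeline` read all four primary tables and score 1.0, while `issues` and `mrs` share none and score 0.0.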

---

## 3. Shared-Data Clusters

Commands that read from the same primary tables form natural clusters:

### Cluster A: Issue/MR Entities

`issues`, `mrs`, `me`, `who workload`, `count`

All read `issues` + `merge_requests` with similar filter patterns (state, author, labels, project). These commands share the same underlying WHERE-clause builder logic.

### Cluster B: Notes/Discussions

`notes`, `issues detail`, `mrs detail`, `who expert`, `who active`, `timeline`

All traverse the `discussions` -> `notes` join path. The `notes` command does it with independent filters; the others embed notes within parent context.

### Cluster C: File Genealogy

`trace`, `file-history`, `who overlap`

All use `mr_file_changes` with a rename-chain BFS (forward: old_path -> new_path, backward: new_path -> old_path), implemented once in the shared `resolve_rename_chain()` function.

### Cluster D: Semantic/Vector

`search`, `related`, `drift`

All use `documents` + `embeddings` via Ollama. `search` adds an FTS component, `related` is pure vector, and `drift` uses vectors for divergence scoring.

### Cluster E: Diagnostics

`health`, `auth`, `doctor`, `status`, `stats`

All check system state. `health` is a strict subset of `doctor`. `status` checks sync cursors, `stats` checks document/index health, and `auth` checks token/connectivity.

---

## 4. Query Pattern Sharing

### Dynamic Filter Builder (used by issues, mrs, notes)

All three list commands follow the same pattern: they build the WHERE clause dynamically from filter flags, binding every value through a parameterized placeholder. Label filters use an `EXISTS` subquery against the junction table.
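A minimal sketch of that pattern for the issues list follows. The struct, the function name and every column name (`i.state`, `i.author_username`, `il.issue_id`, `il.label`) are hypothetical; only the `issues` and `issue_labels` table names come from this document. It returns the SQL text together with the values to bind, so no user input is ever spliced into the query string.

```rust
/// Hypothetical filter flags; field names are illustrative only.
#[derive(Default)]
struct IssueFilters {
    state: Option<String>,
    author: Option<String>,
    labels: Vec<String>,
}

/// Builds a parameterized query and the ordered list of values to bind.
fn build_issue_query(filters: &IssueFilters) -> (String, Vec<String>) {
    let mut clauses: Vec<String> = Vec::new();
    let mut params: Vec<String> = Vec::new();

    if let Some(state) = &filters.state {
        params.push(state.clone());
        clauses.push(format!("i.state = ?{}", params.len()));
    }
    if let Some(author) = &filters.author {
        params.push(author.clone());
        clauses.push(format!("i.author_username = ?{}", params.len()));
    }
    // One EXISTS subquery per label, so every requested label must be present.
    for label in &filters.labels {
        params.push(label.clone());
        clauses.push(format!(
            "EXISTS (SELECT 1 FROM issue_labels il WHERE il.issue_id = i.id AND il.label = ?{})",
            params.len()
        ));
    }

    let mut sql = String::from("SELECT i.iid FROM issues i");
    if !clauses.is_empty() {
        sql.push_str(" WHERE ");
        sql.push_str(&clauses.join(" AND "));
    }
    (sql, params)
}
```

Numbering each placeholder from `params.len()` keeps the SQL and the bind list in step no matter which flags are set.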

### Rename Chain BFS (used by trace, file-history, who overlap)

Forward query:
```sql
SELECT DISTINCT new_path FROM mr_file_changes
WHERE project_id = ?1 AND old_path = ?2 AND change_type = 'renamed'
```

Backward query:
```sql
SELECT DISTINCT old_path FROM mr_file_changes
WHERE project_id = ?1 AND new_path = ?2 AND change_type = 'renamed'
```

Cycles are detected with a `HashSet` of visited paths, and traversal is capped at `MAX_RENAME_HOPS = 10`.
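The traversal can be sketched as follows. To keep the example self-contained, the two SQL queries are replaced by a scan over an in-memory slice of rename rows; the `Rename` struct and the function signature are assumptions, and the function is named `resolve_rename_chain_sketch` to make clear it is not the real `resolve_rename_chain()`. Treating the hop limit as a BFS depth limit is also an assumption.

```rust
use std::collections::{HashSet, VecDeque};

const MAX_RENAME_HOPS: usize = 10;

/// One `renamed` row, reduced to the two columns the queries read.
struct Rename {
    old_path: &'static str,
    new_path: &'static str,
}

/// Collects every path linked to `start` by renames, in both directions.
fn resolve_rename_chain_sketch(renames: &[Rename], start: &str) -> Vec<String> {
    let mut visited: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<(String, usize)> = VecDeque::new();
    let mut chain: Vec<String> = Vec::new();

    visited.insert(start.to_string());
    queue.push_back((start.to_string(), 0));

    while let Some((path, hops)) = queue.pop_front() {
        chain.push(path.clone());
        if hops >= MAX_RENAME_HOPS {
            continue; // hop budget spent: keep the path, do not expand it
        }
        for r in renames {
            // Forward: old_path -> new_path. Backward: new_path -> old_path.
            let neighbor = if r.old_path == path.as_str() {
                r.new_path
            } else if r.new_path == path.as_str() {
                r.old_path
            } else {
                continue;
            };
            // `insert` returns false for paths already seen, which breaks cycles.
            if visited.insert(neighbor.to_string()) {
                queue.push_back((neighbor.to_string(), hops + 1));
            }
        }
    }
    chain
}
```

Given renames `a -> b`, `b -> c` and `c -> a`, starting from `b` returns `b`, `a`, `c` once each, even though the rows form a cycle.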

### Hybrid Search (used by search, timeline seeding)

RRF ranking (reciprocal rank fusion, `k = 60`): `score = 1 / (60 + fts_rank) + 1 / (60 + vector_rank)`

FTS5 queries go through `to_fts_query()`, which sanitizes input and builds MATCH expressions. Vector search calls Ollama to embed the query, then runs a cosine-similarity search against the `embeddings` vec0 table.
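A minimal sketch of the fusion step, using reciprocal rank fusion in its standard form: each ranked list contributes `1 / (k + rank)` for a document, with `k = 60` as the conventional smoothing constant and ranks starting at 1. The function name, the `u64` document ids and the tie-breaking rule are assumptions for illustration.

```rust
use std::collections::HashMap;

const RRF_K: f64 = 60.0;

/// Fuses two ranked lists of document ids (best first) into one scored list.
fn rrf_fuse(fts: &[u64], vector: &[u64]) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in [fts, vector] {
        for (i, &doc) in list.iter().enumerate() {
            let rank = (i + 1) as f64; // ranks are 1-based
            *scores.entry(doc).or_insert(0.0) += 1.0 / (RRF_K + rank);
        }
    }

    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest score first; ties fall back to id order so output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

A document that appears in only one list still gets a score from that list alone, so FTS-only and vector-only hits both survive fusion; documents found by both rank higher.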

### Project Resolution (used by most commands)

`resolve_project(conn, project_filter)` fuzzy-matches the filter against `path_with_namespace` using suffix and substring matching, and returns `(project_id, path_with_namespace)`.
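The matching can be sketched over an in-memory project list. Everything beyond "suffix and substring matching" is an assumption here: trying suffix matches before substring matches, requiring the suffix to start at a `/` boundary, and reporting an error when several projects match. The function name `resolve_project_sketch` and its signature are hypothetical and take a slice instead of a database connection.

```rust
/// Resolves `filter` against (project_id, path_with_namespace) pairs.
/// Tier 0: exact or suffix match on a path-segment boundary.
/// Tier 1: substring match anywhere in the path.
fn resolve_project_sketch(
    projects: &[(i64, &str)],
    filter: &str,
) -> Result<(i64, String), String> {
    let suffix = format!("/{}", filter);

    for tier in 0..2 {
        let hits: Vec<(i64, &str)> = projects
            .iter()
            .copied()
            .filter(|&(_, path)| match tier {
                0 => path == filter || path.ends_with(&suffix),
                _ => path.contains(filter),
            })
            .collect();

        match hits.len() {
            0 => continue, // nothing at this tier, try the looser one
            1 => return Ok((hits[0].0, hits[0].1.to_string())),
            _ => {
                let names: Vec<&str> = hits.iter().map(|&(_, p)| p).collect();
                return Err(format!("'{}' is ambiguous: {}", filter, names.join(", ")));
            }
        }
    }
    Err(format!("no project matches '{}'", filter))
}
```

With projects `group/backend`, `group/frontend` and `other/backend-tools`, the filter `backend` resolves to `group/backend` at the suffix tier, `front` resolves to `group/frontend` at the substring tier, and `group` is rejected as ambiguous.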