riemann-flow-gnn/burn_manifold_graph_ai_handoff.md

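# AI Handoff Document: Burn-Based Pose Manifold Graph Implementation

## 0. Purpose of This Document

This document is a handoff that defines exactly what the AI agent must implement for the **single-ligand pose overfitting experiment**.

There are three implementation targets.

1. **A manifold tensor representation and graph implementation that conform to Burn's conventions**
2. **A visualizable structure representation and rendering pipeline that takes the graph as input**
3. **A simulator in which translation / rotation / torsion vary with the time parameter `t ∈ [0, 1]`**

The goal of this phase is not a **research-grade finished product** but a **working prototype** for presentations and initial validation.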

---

## 1. Problem Setup

### 1.1 State Space

The ligand pose state lives in the following product space.

- `translation`: `R^3`
- `rotation`: `SO(3)`
- `torsion`: `T^m` where `m = number of rotatable bonds`

That is, a single pose state is conceptually:

```text
x = (p, R, theta)

p     : global translation, shape [3]
R     : global rotation
theta : torsion angles for rotatable bonds, shape [m]
```

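### 1.2 Graph Input

The input ligand is represented as a graph.

- node = atom
- edge = bond
- torsion-available edge = rotatable bond

**Protein / pocket context is excluded** in this phase.
Only a single ligand graph and its target pose are handled.

### 1.3 Experiment Goal

- Given a single ligand graph,
- start from an arbitrary initial pose,
- generate a trajectory that converges to one fixed target pose, and
- make that trajectory visualizable.

This does not mean that the entire training logic should be implemented from this document.
The focus of this implementation is **data representation + simulator + visualization**.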

---

## 2. Implementation Scope Summary

The AI agent implements the items below.

### Required

- Burn-friendly tensor container
- ligand graph data structure
- torsion metadata extraction/representation
- pose state container
- explicit simulator for `t=0..1`
- 3D/2D visualization export utility

### Out of Scope for This Phase

- full training loop
- optimizer / scheduler
- dataset loader for many molecules
- protein-ligand interaction scoring
- the equivariant GNN model itself

---

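## 3. Design Principles

### 3.1 Burn-Friendly Design

The implementation must connect easily to Burn models.
It therefore follows these principles.

- The **tensor payload must be easy to convert into Burn tensors**.
- Structs keep geometry / graph / rendering responsibilities separate.
- Manifold-specific logic is factored out into pure math utilities.
- Visualization lives in a module that is independent of the model.

### 3.2 A Design Less Sensitive to API Changes

To stay robust against Burn version changes, do not entangle the internal representation too deeply with backend dependencies.

Recommended approach:

- Keep domain objects as plain Rust structs
- Where needed, provide adapters such as `to_tensor::<B>(&device)` with `B: Backend`
- Design so that the Autodiff backend for training and the backend for inference can be swapped easily

### 3.3 Presentation Prototype First

For this implementation, the following matter more than performance optimization.

- The structure is clear
- The simulator behaves deterministically and reliably
- The geometry is visible to the eye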

---

## 4. Core Data Structures to Implement

### 4.1 Atom / Bond / LigandGraph

#### Atom

Minimum fields:

```rust
pub struct Atom {
    pub index: usize,
    pub atomic_number: u8,
    pub formal_charge: i8,
    pub is_aromatic: bool,
}
```

#### Bond

Minimum fields:

```rust
pub struct Bond {
    pub index: usize,
    pub src: usize,
    pub dst: usize,
    pub bond_order: u8,
    pub is_aromatic: bool,
    pub is_rotatable: bool,
}
```

#### TorsionEdge

For each rotatable bond, we need to know which set of atoms rotates, that is, which side of the bond forms the moving fragment.
This metadata is essential to the simulator.

```rust
pub struct TorsionEdge {
    pub bond_index: usize,
    pub atom_left: usize,
    pub atom_right: usize,
    pub rotating_side: Vec<usize>,
}
```

Notes:

- `atom_left` and `atom_right` are the two atoms of the bond that forms the torsion axis
- `rotating_side` is the list of atom indices the rotation is applied to
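One way `rotating_side` could be derived is a breadth-first search that starts at `atom_right` and never crosses the torsion bond. The sketch below assumes bonds are given as plain `(src, dst)` pairs and that the rotating fragment is the `atom_right` side; both choices, and the function name, are illustrative rather than prescribed by this document. `atom_right` itself is included in the result, which is harmless because it lies on the rotation axis.

```rust
use std::collections::VecDeque;

/// Collects the atoms on the `atom_right` side of the bond (atom_left, atom_right).
/// `bonds` is an undirected list of (src, dst) atom index pairs; indices are
/// assumed to be in range (out-of-range indices would panic in this sketch).
/// Returns None when the bond is part of a ring, because then `atom_left` is
/// reachable without crossing the bond and no rotating fragment is well defined.
pub fn rotating_side(
    num_atoms: usize,
    bonds: &[(usize, usize)],
    atom_left: usize,
    atom_right: usize,
) -> Option<Vec<usize>> {
    // Adjacency list of the undirected bond graph.
    let mut adj: Vec<Vec<usize>> = vec![Vec::new(); num_atoms];
    for &(a, b) in bonds {
        adj[a].push(b);
        adj[b].push(a);
    }

    let mut visited = vec![false; num_atoms];
    visited[atom_right] = true;
    let mut queue = VecDeque::from([atom_right]);

    while let Some(u) = queue.pop_front() {
        for &v in &adj[u] {
            // Never cross the torsion bond itself.
            if u == atom_right && v == atom_left {
                continue;
            }
            // Reaching atom_left by any other route means the bond is in a ring.
            if v == atom_left {
                return None;
            }
            if !visited[v] {
                visited[v] = true;
                queue.push_back(v);
            }
        }
    }

    // Indices come out in ascending order because we scan 0..num_atoms.
    Some((0..num_atoms).filter(|&i| visited[i]).collect())
}
```

For a four-atom chain `0-1-2-3` and the bond `(1, 2)`, this yields the fragment `[2, 3]`.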

#### LigandGraph

```rust
pub struct LigandGraph {
    pub atoms: Vec<Atom>,
    pub bonds: Vec<Bond>,
    pub edge_index: Vec<(usize, usize)>,
    pub torsion_edges: Vec<TorsionEdge>,
}
```

Requirements:

- The undirected bond graph is the base representation, but
- it must be convertible to directed edge pairs for GNN input
- The torsion edge list must be maintained separately
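The undirected-to-directed conversion can be sketched as follows. Each bond is emitted in both directions and the result is laid out as two rows, sources and destinations, so it maps directly onto a `[2, num_edges]` index tensor. The function name and the use of `i64` for indices are assumptions made for illustration; the integer type a given Burn backend expects may differ.

```rust
/// Expands undirected bonds (src, dst) into a directed edge list.
/// Row 0 holds source atoms and row 1 holds destination atoms, so the pair
/// of rows corresponds to a [2, num_edges] layout with num_edges = 2 * bonds.len().
pub fn to_directed_edge_index(bonds: &[(usize, usize)]) -> [Vec<i64>; 2] {
    let mut src = Vec::with_capacity(bonds.len() * 2);
    let mut dst = Vec::with_capacity(bonds.len() * 2);
    for &(a, b) in bonds {
        // Forward direction a -> b.
        src.push(a as i64);
        dst.push(b as i64);
        // Reverse direction b -> a.
        src.push(b as i64);
        dst.push(a as i64);
    }
    [src, dst]
}
```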

---

### 4.2 PoseState

The pose state holds the geometric state of the ligand.

#### Recommended Representation

```rust
pub struct PoseState {
    pub translation: [f32; 3],
    pub rotation_quat_xyzw: [f32; 4],
    pub torsions: Vec<f32>,
}
```

Notes:

- The quaternion must be guaranteed to be normalized
- Torsions are in radians
- The torsion range is kept in `[-pi, pi)` by default

The rotation is stored internally as a quaternion,
with axis-angle / rotation matrix conversion utilities provided as needed.

---

### 4.3 LigandConformer

A structure representation that carries the actual atom coordinates.

```rust
pub struct LigandConformer {
    pub graph: LigandGraph,
    pub coords: Vec<[f32; 3]>,
}
```

Requirements:

- `coords.len() == graph.atoms.len()` must hold
- It must be possible to apply a pose transform relative to the reference conformer

---

## 5. Tensor Adapter for Connecting to Burn

The AI agent provides an adapter along the following lines.

```rust
pub struct GraphTensors<B: burn::tensor::backend::Backend> {
    pub node_features: burn::tensor::Tensor<B, 2>,
    pub edge_index: burn::tensor::Tensor<B, 2, burn::tensor::Int>,
    pub edge_features: Option<burn::tensor::Tensor<B, 2>>,
    pub torsion_edge_index: burn::tensor::Tensor<B, 2, burn::tensor::Int>,
}
```

The goal is not to freeze the exact final types but to satisfy the contract below.

### contract

- node features: `[num_nodes, node_feat_dim]`
- edge index: `[2, num_edges]`
- torsion edge index: `[2, num_rotatable_bonds]` or an equivalent representation
- pose tensor export: translation / quaternion / torsion can be converted into batchable tensors

### Minimum Node Features

The following is sufficient for the first version.

- atomic number
- formal charge
- aromatic flag
- degree (optional)

### Minimum Edge Features

- bond order
- aromatic flag
- rotatable flag
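To keep the domain layer independent of any Burn version, featurization can first produce a flat row-major buffer plus a shape, which the adapter then hands to whatever tensor constructor the chosen Burn version offers. The sketch below covers the node features listed above; `NODE_FEAT_DIM`, `node_features`, and the raw (unnormalized, non-one-hot) encoding are illustrative assumptions. Edge features could be flattened the same way.

```rust
#[derive(Debug, Clone)]
pub struct Atom {
    pub index: usize,
    pub atomic_number: u8,
    pub formal_charge: i8,
    pub is_aromatic: bool,
}

/// Columns: atomic number, formal charge, aromatic flag, degree.
pub const NODE_FEAT_DIM: usize = 4;

/// Returns a row-major buffer and its shape [num_nodes, NODE_FEAT_DIM].
/// Assumes `atoms[i].index == i` and that every bond index is in range.
pub fn node_features(atoms: &[Atom], bonds: &[(usize, usize)]) -> (Vec<f32>, [usize; 2]) {
    // Degree = number of bonds touching each atom.
    let mut degree = vec![0u32; atoms.len()];
    for &(a, b) in bonds {
        degree[a] += 1;
        degree[b] += 1;
    }

    let mut data = Vec::with_capacity(atoms.len() * NODE_FEAT_DIM);
    for (i, atom) in atoms.iter().enumerate() {
        data.push(f32::from(atom.atomic_number));
        data.push(f32::from(atom.formal_charge));
        data.push(if atom.is_aromatic { 1.0 } else { 0.0 });
        data.push(degree[i] as f32);
    }
    (data, [atoms.len(), NODE_FEAT_DIM])
}
```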

### pose tensor export

```rust
pub struct PoseTensors<B: burn::tensor::backend::Backend> {
    pub translation: burn::tensor::Tensor<B, 2>, // [batch, 3]
    pub rotation: burn::tensor::Tensor<B, 2>,    // [batch, 4] quaternion
    pub torsions: burn::tensor::Tensor<B, 2>,    // [batch, m]
}
```

---

## 6. Manifold Math Utilities to Implement

This module must be a pure math utility that is completely separate from the model.

Example files:

```text
src/geometry/
  euclidean.rs
  so3.rs
  torus.rs
  pose.rs
```

### 6.1 Translation (`R^3`)

Required functions:

- `lerp_translation(p0, p1, t)`
- `translation_delta(p0, p1)`

### 6.2 Rotation (`SO(3)`)

Required functions:

- quaternion normalize
- quaternion multiply / inverse
- quaternion to rotation matrix
- rotation matrix to quaternion (optional)
- `slerp(q0, q1, t)`
- `relative_rotation(q_from, q_to)`
- `so3_log(q_rel) -> [f32; 3]`
- `so3_exp(omega) -> quat`

Notes:

- For the simulator, implementing only `slerp`-based interpolation first is sufficient
- Keeping the `log/exp` utilities is recommended for later tangent vector supervision
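A minimal sketch of `slerp` on `xyzw` quaternions follows. Two details are worth making explicit: `q` and `-q` encode the same rotation, so the second endpoint is negated when the dot product is negative to follow the shorter arc, and for nearly parallel inputs the code falls back to normalized linear interpolation to avoid dividing by a tiny sine. The `0.9995` threshold and the identity fallback for a zero-norm input are assumptions of this sketch; a stricter implementation might return a `Result` instead.

```rust
pub type Quat = [f32; 4]; // [x, y, z, w]

/// Normalizes a quaternion. A (near) zero quaternion is mapped to identity,
/// which is a simplification chosen for this sketch.
pub fn quat_normalize(q: Quat) -> Quat {
    let n = (q[0] * q[0] + q[1] * q[1] + q[2] * q[2] + q[3] * q[3]).sqrt();
    if n < 1e-12 {
        return [0.0, 0.0, 0.0, 1.0];
    }
    [q[0] / n, q[1] / n, q[2] / n, q[3] / n]
}

/// Spherical linear interpolation along the shorter arc from q0 to q1.
pub fn slerp(q0: Quat, q1: Quat, t: f32) -> Quat {
    let q0 = quat_normalize(q0);
    let mut q1 = quat_normalize(q1);
    let mut dot = q0[0] * q1[0] + q0[1] * q1[1] + q0[2] * q1[2] + q0[3] * q1[3];

    // q and -q are the same rotation: flip q1 to stay on the shorter arc.
    if dot < 0.0 {
        q1 = [-q1[0], -q1[1], -q1[2], -q1[3]];
        dot = -dot;
    }
    let dot = dot.min(1.0);

    let (w0, w1) = if dot > 0.9995 {
        // Nearly parallel: linear weights, renormalized below.
        (1.0 - t, t)
    } else {
        let theta = dot.acos();
        let s = theta.sin();
        (((1.0 - t) * theta).sin() / s, (t * theta).sin() / s)
    };

    quat_normalize([
        w0 * q0[0] + w1 * q1[0],
        w0 * q0[1] + w1 * q1[1],
        w0 * q0[2] + w1 * q1[2],
        w0 * q0[3] + w1 * q1[3],
    ])
}
```

Interpolating halfway between the identity and a 90-degree rotation about z gives a 45-degree rotation about z, and the output stays unit-norm because of the final normalization.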

### 6.3 Torsion (`T^m`)

Required functions:

- `wrap_pi(x)`
- `wrap_vec(theta)`
- `shortest_angle_delta(a, b)`
- `interpolate_torsion(theta0, theta1, t)`

Definitions:

```text
shortest_angle_delta(a, b) = wrap_pi(b - a)
interpolate_torsion(theta0, theta1, t) = wrap_pi(theta0 + t * shortest_angle_delta(theta0, theta1))
```
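These definitions translate almost directly into code. The sketch below wraps into `[-pi, pi)` using `rem_euclid`; the extra guard handles the floating-point corner case where `rem_euclid` rounds up to the modulus itself. Treat the exact function bodies as one possible implementation.

```rust
use std::f32::consts::PI;

/// Wraps an angle in radians into [-pi, pi).
pub fn wrap_pi(x: f32) -> f32 {
    let y = (x + PI).rem_euclid(2.0 * PI) - PI;
    // rem_euclid may round to exactly 2*pi for tiny negative inputs,
    // which would produce +pi; fold that case back to -pi.
    if y >= PI { y - 2.0 * PI } else { y }
}

/// Wraps every component of a torsion vector.
pub fn wrap_vec(theta: &[f32]) -> Vec<f32> {
    theta.iter().map(|&x| wrap_pi(x)).collect()
}

/// Signed shortest rotation that takes angle `a` to angle `b`.
pub fn shortest_angle_delta(a: f32, b: f32) -> f32 {
    wrap_pi(b - a)
}

/// Interpolates each torsion along its shortest arc on the circle.
/// Assumes both slices have the same length; extra entries are ignored by zip.
pub fn interpolate_torsion(theta0: &[f32], theta1: &[f32], t: f32) -> Vec<f32> {
    theta0
        .iter()
        .zip(theta1)
        .map(|(&a, &b)| wrap_pi(a + t * shortest_angle_delta(a, b)))
        .collect()
}
```

With `a = pi - 0.1` and `b = -pi + 0.1`, the delta is `0.2` rather than roughly `-2*pi`, so the interpolation crosses the `±pi` seam instead of sweeping the long way around.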

### 6.4 Product Pose Geometry

Required functions:

- `interpolate_pose(pose0, pose1, t) -> PoseState`
- `pose_delta(pose0, pose1) -> PoseDelta`

Expected structure:

```rust
pub struct PoseDelta {
    pub translation: [f32; 3],
    pub rotation_tangent: [f32; 3],
    pub torsions: Vec<f32>,
}
```
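The `rotation_tangent` field is the least obvious part of `PoseDelta`, so here is a hedged sketch of how it could be computed with the `SO(3)` utilities from section 6.2. It assumes `xyzw` unit quaternions and the left-multiplication convention `q_to = q_rel * q_from`; the opposite convention is equally valid and would only change the order in `relative_rotation`. The translation and torsion components are omitted because they reduce to a vector difference and `shortest_angle_delta`.

```rust
pub type Quat = [f32; 4]; // [x, y, z, w], assumed unit length

/// Hamilton product a * b.
pub fn quat_mul(a: Quat, b: Quat) -> Quat {
    [
        a[3] * b[0] + a[0] * b[3] + a[1] * b[2] - a[2] * b[1],
        a[3] * b[1] - a[0] * b[2] + a[1] * b[3] + a[2] * b[0],
        a[3] * b[2] + a[0] * b[1] - a[1] * b[0] + a[2] * b[3],
        a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2],
    ]
}

/// Conjugate, which equals the inverse for a unit quaternion.
pub fn quat_conj(q: Quat) -> Quat {
    [-q[0], -q[1], -q[2], q[3]]
}

/// Rotation q_rel such that q_to = q_rel * q_from (assumed convention).
pub fn relative_rotation(q_from: Quat, q_to: Quat) -> Quat {
    quat_mul(q_to, quat_conj(q_from))
}

/// Log map: unit quaternion -> rotation vector (axis * angle), angle in [0, pi].
pub fn so3_log(q: Quat) -> [f32; 3] {
    // Use the representative with w >= 0 so the shorter rotation is returned.
    let sign = if q[3] < 0.0 { -1.0 } else { 1.0 };
    let (x, y, z, w) = (sign * q[0], sign * q[1], sign * q[2], sign * q[3]);
    let s = (x * x + y * y + z * z).sqrt();
    if s < 1e-8 {
        // Small-angle limit: angle ~ 2 * |v|.
        return [2.0 * x, 2.0 * y, 2.0 * z];
    }
    let k = 2.0 * s.atan2(w) / s;
    [k * x, k * y, k * z]
}

/// Exp map: rotation vector -> unit quaternion.
pub fn so3_exp(omega: [f32; 3]) -> Quat {
    let angle = (omega[0] * omega[0] + omega[1] * omega[1] + omega[2] * omega[2]).sqrt();
    if angle < 1e-8 {
        return [0.5 * omega[0], 0.5 * omega[1], 0.5 * omega[2], 1.0];
    }
    let (s, c) = (0.5 * angle).sin_cos();
    let k = s / angle;
    [k * omega[0], k * omega[1], k * omega[2], c]
}
```

Under these assumptions, `rotation_tangent` would be `so3_log(relative_rotation(pose0.rotation_quat_xyzw, pose1.rotation_quat_xyzw))`.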

---

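## 7. Coordinate Reconstruction / Simulator Requirements

The core requirement is:

> Take `reference conformer + pose state` and produce the current coordinates in a form that can be visualized.

### 7.1 Application Order

Recommended order:

1. Copy the reference conformer coordinates
2. Apply the torsion rotation for each rotatable bond
3. Apply the global rotation
4. Apply the global translation

### 7.2 Applying Torsion Rotations

For each `TorsionEdge`:

- axis = bond direction
- pivot = position of an atom on the bond axis
- rotate only the atoms contained in `rotating_side`
- angle = the corresponding torsion value

In other words, torsion is a local internal transform.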
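The per-bond rotation can be sketched with Rodrigues' formula. The code below takes the axis from `atom_left` to `atom_right`, uses `atom_right` as the pivot, and interprets `angle` as a rotation applied on top of the current coordinates rather than as an absolute dihedral value. Those three choices, along with the function name and the `String` error type, are assumptions of this sketch; the document leaves them open.

```rust
pub type Vec3 = [f32; 3];

fn sub(a: Vec3, b: Vec3) -> Vec3 {
    [a[0] - b[0], a[1] - b[1], a[2] - b[2]]
}

fn dot(a: Vec3, b: Vec3) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

fn cross(a: Vec3, b: Vec3) -> Vec3 {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

/// Rotates the atoms in `rotating_side` by `angle` radians about the bond axis
/// atom_left -> atom_right. All other atoms are left untouched.
pub fn apply_torsion(
    coords: &mut [Vec3],
    atom_left: usize,
    atom_right: usize,
    rotating_side: &[usize],
    angle: f32,
) -> Result<(), String> {
    let n = coords.len();
    if atom_left >= n || atom_right >= n || rotating_side.iter().any(|&i| i >= n) {
        return Err("atom index out of range".to_string());
    }

    // Unit axis along the bond; reject coincident atoms.
    let d = sub(coords[atom_right], coords[atom_left]);
    let len = dot(d, d).sqrt();
    if len < 1e-6 {
        return Err("degenerate torsion axis".to_string());
    }
    let axis = [d[0] / len, d[1] / len, d[2] / len];
    let pivot = coords[atom_right];
    let (s, c) = angle.sin_cos();

    for &i in rotating_side {
        let v = sub(coords[i], pivot);
        let kxv = cross(axis, v);
        let kdv = dot(axis, v);
        // Rodrigues: v' = v cos(a) + (k x v) sin(a) + k (k . v) (1 - cos(a))
        for a in 0..3 {
            coords[i][a] = pivot[a] + v[a] * c + kxv[a] * s + axis[a] * kdv * (1.0 - c);
        }
    }
    Ok(())
}
```

Because the axis is read from the current coordinates, calling this once per `TorsionEdge` keeps each rotation attached to its bond even after earlier torsions have moved it.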

### 7.3 Applying the Global Transform

- Rotate all atom coordinates by the rotation quaternion
- Add the translation vector
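The two steps above can be sketched as a single pass over the coordinates. The quaternion is assumed to be unit length in `xyzw` order, and the rotation is taken about the coordinate origin; whether the reference conformer should first be centered (for example on its centroid) is a design decision this sketch does not make. The function names are illustrative.

```rust
fn cross(a: [f32; 3], b: [f32; 3]) -> [f32; 3] {
    [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]
}

/// Rotates `v` by the unit quaternion `q` ([x, y, z, w]).
/// Uses v' = v + 2w (u x v) + 2 u x (u x v) with u = (x, y, z).
pub fn quat_rotate(q: [f32; 4], v: [f32; 3]) -> [f32; 3] {
    let u = [q[0], q[1], q[2]];
    let w = q[3];
    let uv = cross(u, v);
    let uuv = cross(u, uv);
    [
        v[0] + 2.0 * (w * uv[0] + uuv[0]),
        v[1] + 2.0 * (w * uv[1] + uuv[1]),
        v[2] + 2.0 * (w * uv[2] + uuv[2]),
    ]
}

/// Applies the global rotation first, then the global translation, in place.
pub fn apply_global_transform(coords: &mut [[f32; 3]], q: [f32; 4], translation: [f32; 3]) {
    for p in coords.iter_mut() {
        let r = quat_rotate(q, *p);
        *p = [
            r[0] + translation[0],
            r[1] + translation[1],
            r[2] + translation[2],
        ];
    }
}
```

Since this is a rigid transform, pairwise distances between atoms are preserved, which is the property the rigid-invariance sanity check relies on.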

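### 7.4 simulator

Input:

- `reference conformer`
- `pose_start`
- `pose_target`
- `t in [0,1]`

Output:

- `PoseState at t`
- `LigandConformer at t`

That is, an API of the following form must exist:

```rust
pub fn simulate_at_t(
    reference: &LigandConformer,
    pose_start: &PoseState,
    pose_target: &PoseState,
    t: f32,
) -> LigandConformer
```

This signature returns only the `LigandConformer`; the `PoseState` at `t` is obtained from `interpolate_pose(pose_start, pose_target, t)`.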

### 7.5 trajectory

In addition, implement the helper below.

```rust
pub fn simulate_trajectory(
    reference: &LigandConformer,
    pose_start: &PoseState,
    pose_target: &PoseState,
    num_steps: usize,
) -> Vec<LigandConformer>
```

- Divide `t` uniformly over `[0, 1]`
- Include both endpoints
- The result can be used for animation or for saving frames
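The uniform sampling with both endpoints included can be sketched as a small helper. It assumes `num_steps` counts frames rather than intervals, so at least two are needed to hit both `t = 0` and `t = 1`; the helper name and the `String` error are illustrative.

```rust
/// Returns `num_steps` values of t spread uniformly over [0, 1], endpoints included.
/// Fewer than two steps cannot cover both endpoints, so that case is an error.
pub fn time_grid(num_steps: usize) -> Result<Vec<f32>, String> {
    if num_steps < 2 {
        return Err("num_steps must be at least 2 to include both endpoints".to_string());
    }
    let last = (num_steps - 1) as f32;
    // i / (n - 1) gives exactly 0.0 for the first frame and 1.0 for the last.
    Ok((0..num_steps).map(|i| i as f32 / last).collect())
}
```

`simulate_trajectory` could then map each `t` from this grid through `simulate_at_t`. With `num_steps = 5` the grid is `0.0, 0.25, 0.5, 0.75, 1.0`.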

---

## 8. Visualization Requirements

Visualization is very important in this phase.
It must be directly usable in the presentation demo.

### Required Outputs

- 2D/3D static view for checking ligand graph connectivity
- Single-frame rendering of the current pose
- Frame sequence export for trajectory animation

### Minimum Features

#### A. static plot

Input:
- atom coordinates
- bond list

Output:
- atoms = points
- bonds = lines
- rotatable bonds drawn in a different color or otherwise highlighted

#### B. trajectory frames

Input:
- `Vec<LigandConformer>`

Output:
- PNG sequence or intermediate frames for a GIF

#### C. optional: HTML viewer

If possible, consider adding one of the following:

- plotly-based HTML
- a format for three.js/json export
- simple `.xyz` / `.sdf` frame export
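Of these options, `.xyz` export is the cheapest to implement and is read by most molecular viewers. An XYZ frame is the atom count, a comment line, and then one `symbol x y z` line per atom; a trajectory is the frames concatenated. The sketch below assumes atomic numbers and coordinates are passed as parallel slices, and its element table covers only a few common elements, with `X` as a placeholder for anything else.

```rust
/// Maps a few common atomic numbers to element symbols.
/// Unknown elements become "X"; extend the table as needed.
pub fn element_symbol(z: u8) -> &'static str {
    match z {
        1 => "H",
        6 => "C",
        7 => "N",
        8 => "O",
        9 => "F",
        15 => "P",
        16 => "S",
        17 => "Cl",
        35 => "Br",
        53 => "I",
        _ => "X",
    }
}

/// Formats one XYZ frame. `comment` must not contain a newline,
/// because the format reserves exactly one line for it.
pub fn to_xyz_frame(
    atomic_numbers: &[u8],
    coords: &[[f32; 3]],
    comment: &str,
) -> Result<String, String> {
    if atomic_numbers.len() != coords.len() {
        return Err("atomic_numbers and coords differ in length".to_string());
    }
    if comment.contains('\n') {
        return Err("comment must be a single line".to_string());
    }
    let mut out = format!("{}\n{}\n", coords.len(), comment);
    for (&z, p) in atomic_numbers.iter().zip(coords) {
        out.push_str(&format!(
            "{} {:.4} {:.4} {:.4}\n",
            element_symbol(z),
            p[0],
            p[1],
            p[2]
        ));
    }
    Ok(out)
}
```

Writing the frames for every trajectory step into one file, with the comment line carrying the value of `t`, gives a multi-frame file that viewers can typically play back as an animation.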

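### Visualization Priorities

1. Static PNG output
2. Trajectory frame sequence output
3. Optionally gif/mp4/HTML

Notes:

- The visualization implementation is completely separate from the model
- The purpose is to verify geometry correctness, not to build a pretty UI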

---

## 9. Recommended Directory Structure

```text
src/
  graph/
    atom.rs
    bond.rs
    torsion.rs
    ligand_graph.rs
    featurize.rs

  geometry/
    euclidean.rs
    so3.rs
    torus.rs
    pose.rs

  conformer/
    conformer.rs
    apply_torsion.rs
    apply_pose.rs

  burn_adapter/
    graph_tensors.rs
    pose_tensors.rs

  simulator/
    interpolate.rs
    trajectory.rs

  viz/
    plot_static.rs
    plot_trajectory.rs
    export.rs

  lib.rs
```

---

## 10. Detailed Implementation Contract

### 10.1 Guarantees That Must Hold

- Quaternion normalization is always maintained
- Torsion angles are always within the wrapped range
- The `rotating_side` computation is consistent
- Rotation about the bond axis is correct
- `simulate_at_t(..., 0.0)` matches the start pose
- `simulate_at_t(..., 1.0)` matches the target pose

### 10.2 numerical sanity checks

The AI agent writes the tests below alongside the implementation.

#### Test 1: wrap correctness

- Verify that the difference between `pi - eps` and `-pi + eps` is treated as a small value

#### Test 2: quaternion norm

- quaternion norm ≈ 1 before and after interpolation

#### Test 3: endpoint correctness

- The first and last trajectory frames match the input poses

#### Test 4: rigid transform invariance

- For a molecule with no torsions, the internal geometry stays rigid

#### Test 5: torsion rotation locality

- When a specific torsion is rotated, confirm that only the `rotating_side` atoms move

---

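## 11. Input Data Assumptions

This implementation does not mandate a general-purpose loader.
Supporting just one of the two input paths below is enough to start.

### Path A: Manual Construction

- atom list
- bond list
- reference coordinates
- rotatable bond metadata
- target pose

### Path B: Importing Externally Preprocessed Results (e.g. RDKit)

- Load graph + conformer metadata from JSON or another serde-friendly format

Recommendation:

- Do not tightly couple the Rust internals to an external chemistry toolkit
- A structure that imports externally preprocessed results is safer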

---

## 12. Implementation Priorities

### Phase 1

- `Atom`, `Bond`, `TorsionEdge`, `LigandGraph`
- `PoseState`
- `wrap_pi`, `shortest_angle_delta`
- quaternion normalize + slerp
- `interpolate_pose`

### Phase 2

- `LigandConformer`
- torsion application
- global rotation/translation application
- `simulate_at_t`

### Phase 3

- Burn tensor adapter
- graph/node/edge feature export
- pose tensor export

### Phase 4

- visualization static export
- trajectory frame export
- tests / sanity checks

---

## 13. Expected Deliverables

The AI agent must ultimately provide the following.

### Code Deliverables

- Rust crate source
- Basic unit tests
- Runnable example code

### Example Requirements

The example must demonstrate the following.

1. Creating or loading a ligand graph
2. Preparing the reference conformer
3. Setting the start pose / target pose
4. Generating frames at `t=0.0, 0.25, 0.5, 0.75, 1.0`
5. Exporting a static image or trajectory frames
6. An example of graph tensor / pose tensor conversion

---

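## 14. Implementation Guidelines for the AI Agent

### Must Follow

- Do not introduce too many abstractions at once
- First build a **working, explicit implementation**
- Put geometry correctness first
- Do not over-extend into Burn model training code

### Preferred Implementation Style

- Centered on plain Rust structs
- Explicit types
- Plenty of unit tests
- Prefer `Result`-based error handling over panicking on failure
- Set up visualization/output paths so the examples run out of the box

### Prohibited

- Introducing unnecessary heavyweight frameworks
- Stretching the scope to include protein context
- Tightly coupling training code and visualization code
- Embedding chemistry toolkit dependencies deep inside core domain objects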

---

## 15. Completion Criteria

This phase is considered complete when the conditions below are met.

- The ligand graph and torsion metadata can be represented
- The pose state can be represented as translation / rotation / torsion
- The explicit interpolation simulator over `t=0..1` works
- Conformer coordinates are reconstructed correctly
- Static frames and trajectory frames can be exported
- A tensor adapter for Burn input exists
- All of the minimal numerical sanity tests pass

---

## 16. Optional Extension Ideas

Extension candidates after this phase:

- batch support
- adding pocket context
- connecting a learned vector field model
- a generator for explicit path supervision datasets
- connecting an SE(3)-equivariant / graph neural model
- supporting multiple conformer targets instead of a single target pose

---

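## 17. Final Note

This implementation is not the final model but a prototype for answering the following question.

> "Can a state on the ligand pose manifold (`R^3 × SO(3) × T^m`) be represented in a Rust/Burn-friendly way,
> connected to graph input,
> and run end to end with an explicit path simulator and visualization?"

If the answer to this question can be made yes, this phase is a sufficient success.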