Document anti-leakage early stopping and rollback policy.

Forbid using the final RMSD metric (or equivalent) as a training-time stop signal; allow train-side stopping only. Add rollback hygiene when failures accumulate without beating BEST_PRACTICE, and log the executed rollback to the attempt log.

Made-with: Cursor
2026-04-16 18:05:46 +09:00
parent b0eca31865
commit ba1c1a3892
2 changed files with 18 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -18,6 +18,7 @@ This repository is intentionally pinned to CUDA 12.6 PyTorch wheels and matching
 - Attempt log is mandatory for both successful and failed trials.
 - Flow-matching training time must stay random (middle-time supervision is mandatory).
 - Independent attempts must be research-level changes (architecture/training strategy/loss design). Pure hyperparameter-only runs are not counted as standalone attempts.
+- When failures accumulate without beating `BEST_PRACTICE.json`, follow `GUIDELINES.md` rollback rules (partial or full reset to anchor); log rollback in the attempt log; do not use `mean_rmsd_100` (or equivalent) as a training-time early-stopping signal.
 - Commits touching `train.py` must include:
  - `reports/latest_eval.json`
  - strictly better `mean_rmsd_100` compared to previous best (enforced by pre-commit).
@@ -66,3 +67,5 @@ This repository is intentionally pinned to CUDA 12.6 PyTorch wheels and matching
 - 2026-04-16: Failed attempt K (research-level: added terminal-consistency auxiliary loss from `x_t` to `x_1`) reached `mean_rmsd_100=2.722863`; no commit.
 - 2026-04-16: Failed attempt L (research-level: decoupled architecture with centered-coordinate trunk + separate translation head, with terminal auxiliary term) reached `mean_rmsd_100=2.637292`; no commit.
 - 2026-04-16: Failed attempt M (research-level: decoupled centered-coordinate architecture only, no terminal auxiliary term) reached `mean_rmsd_100=2.479326`; close to best but no commit.
+- 2026-04-16: Failed attempt N (training-strategy: added configurable early stopping with large max-epoch budget, patience/min-delta/check cadence controls) ran to max epoch with ongoing improvements (`stop_reason=max_epochs`) and reached `mean_rmsd_100=2.764940`; no commit.
+- 2026-04-16: Rollback (per `GUIDELINES.md`): restored `train.py`, `reports/latest_eval.json`, and `artifacts/latest_eval_best_model.pt` to last committed baseline after attempts K–N; `mean_rmsd_100` anchor unchanged at `2.461592` (`BEST_PRACTICE.json`). Objective-aligned early stopping remains disallowed for training control.
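
For illustration only, here is a minimal sketch of what "train-side stopping only" could look like. It is not the code in `train.py`: the class `TrainSideEarlyStopper`, the helper `run`, and the `"patience"` stop reason are hypothetical names. Only the `max_epochs` stop reason and the patience/min-delta/check-cadence controls are mentioned in the attempt log above. The stopper sees the training loss alone, so `mean_rmsd_100` never influences when training ends.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainSideEarlyStopper:
    """Hypothetical helper: stops on training loss only, never on mean_rmsd_100."""
    max_epochs: int
    patience: int = 3        # non-improving checks tolerated before stopping
    min_delta: float = 0.0   # minimum drop in train loss that counts as improvement
    check_every: int = 1     # check cadence, in epochs
    best_loss: float = float("inf")
    bad_checks: int = 0

    def update(self, epoch: int, train_loss: float) -> Optional[str]:
        """Return a stop reason, or None to keep training. `epoch` is 1-based."""
        if epoch % self.check_every == 0:
            if train_loss < self.best_loss - self.min_delta:
                self.best_loss = train_loss
                self.bad_checks = 0
            else:
                self.bad_checks += 1
            if self.bad_checks >= self.patience:
                return "patience"
        if epoch >= self.max_epochs:
            return "max_epochs"
        return None


def run(losses, **kwargs):
    """Drive the stopper over per-epoch train losses; returns (epoch, reason)."""
    stopper = TrainSideEarlyStopper(max_epochs=len(losses), **kwargs)
    for epoch, loss in enumerate(losses, start=1):
        reason = stopper.update(epoch, loss)
        if reason is not None:
            return epoch, reason
    return len(losses), "max_epochs"
```

With a steadily decreasing training loss, `run` reaches the epoch budget and reports `max_epochs`, which matches the behaviour logged for attempt N. With a flat training loss, it stops once the patience budget is exhausted. The evaluation metric would then be computed once, after training ends, and compared against the `BEST_PRACTICE.json` anchor.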