docs: sync attempt log from feature branch (README only).

Bring canonical Attempt Log lines onto main without merging train.py changes, so cherry-picks stay clean.

Made-with: Cursor
2026-04-16 23:58:51 +09:00
parent ac0fd8ff41
commit 9e221c62a6
1 changed file with 7 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -93,3 +93,10 @@ This repository is intentionally pinned to CUDA 12.6 PyTorch wheels and matching
 - 2026-04-16: Continued parallel sweep with rotation curriculum variants (`start=0.85/0.95` and lower-lr schedule) produced `2.450391`, `2.457748`, and `2.426384`; no improvement over branch best `2.388103`.
 - 2026-04-16: Deep schedule parallel sweep (`epochs=320~380`, `start=1.0` with warmup variants, multi-seed) produced `2.464117`, `2.410706`, and `2.419527`; still worse than branch best `2.388103` and showed late-epoch fallback instability in some runs.
 - 2026-04-16: Post-reset attempt on `attempt/s3-tail-risk-next` (trajectory-tail-risk focus) using residual-geodesic with clipped omega and scheduled rotation weight (`lr=6.8e-4`, `grad_clip=0.7`, `start=1.0`, warmup `120`) reached `mean_rmsd_100=2.464730`; no improvement.
 - 2026-04-16: Restarted branch `attempt/s3-restart-after-doc-sync` and ran immediate S3 continuation (`lr=6.0e-4`, `grad_clip=0.7`, geodesic+residual, `omega_max_norm=5.0`, warmup `120`), obtaining `mean_rmsd_100=2.474573`; no improvement over best `2.388103`.
 - 2026-04-16: Strategy S4 start (tail-risk suppressor): added upper-quantile tail penalty in training loss and ran first trial (`tail-risk-weight=0.2`, `tail-risk-quantile=0.85`, `lr=6.8e-4`, geodesic+residual), yielding `mean_rmsd_100=2.466082`; no improvement over best `2.388103`.
 - 2026-04-16: Strategy S4 micro-tuning #2 lowered tail penalty (`tail-risk-weight=0.1`, quantile `0.85`, `lr=6.4e-4`) to reduce over-regularization, but result was `mean_rmsd_100=2.476267`; no improvement.
 - 2026-04-16: Strategy S4 micro-tuning #3 softened tail coverage (`tail-risk-quantile=0.9`, `tail-risk-weight=0.2`, `lr=6.8e-4`) and improved to `mean_rmsd_100=2.440570`, but still worse than best `2.388103`.
 - 2026-04-16: Strategy S4 micro-tuning #4 increased tail penalty (`tail-risk-weight=0.25`, quantile `0.9`) and regressed sharply to `mean_rmsd_100=2.601258`; indicates over-penalization risk.
 - 2026-04-16: Strategy S4 micro-tuning #5 changed seed (`seed=2`, `tail-risk-weight=0.2`, quantile `0.9`) and encountered prolonged fallback-to-1000 behavior with `mean_rmsd_100=2.709563`; S4 hit 5-run cap with no best update.
 - 2026-04-16: Structural torsion head (`--torsion-head bond_pair`, GCN only): translation/rotation still use the full-graph mean-pooled trunk+time; each torsion `k` runs the **same GCN weights** on the **movable-side induced subgraph** (the mask only selects nodes/edges for that subgraph; mask values are not fed as features), mean-pools that subgraph, concatenates the result with the global pooled context, applies `LayerNorm`, then a small MLP to produce one scalar. This replaces the prior mask-as-feature design. One calibration run (`epochs=320`, geodesic+residual) reached `mean_rmsd_100=2.598530` with long `train_mse=1000` plateaus; worse than best `2.388103`. The gap is likely dominated by the cost of multiple forward passes plus the same geodesic instability, rather than by the readout alone.
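
The S4 entries above describe an upper-quantile tail penalty controlled by `tail-risk-weight` and `tail-risk-quantile`, but the log does not give its exact form. The sketch below shows one plausible reading in plain Python, for the arithmetic only: the mean error plus `weight` times the mean of the errors at or above the chosen quantile. The function names `upper_quantile` and `tail_risk_loss` and the linear-interpolation quantile are assumptions, not the code in `train.py`; a training version would apply the same formula to tensors so gradients can flow.

```python
import math


def upper_quantile(values, q):
    """Linearly interpolated q-quantile of a non-empty list, 0 <= q <= 1."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    # Clamp so rounding can never push the threshold above xs[hi].
    return min(xs[hi], xs[lo] + frac * (xs[hi] - xs[lo]))


def tail_risk_loss(per_sample_errors, weight=0.2, quantile=0.85):
    """Hypothetical tail-penalized loss: mean error + weight * mean of the upper tail."""
    base = sum(per_sample_errors) / len(per_sample_errors)
    threshold = upper_quantile(per_sample_errors, quantile)
    # The maximum error always satisfies e >= threshold, so the tail is never empty.
    tail = [e for e in per_sample_errors if e >= threshold]
    tail_mean = sum(tail) / len(tail)
    return base + weight * tail_mean


if __name__ == "__main__":
    errors = [1.0, 2.0, 3.0, 4.0, 10.0]
    print(tail_risk_loss(errors, weight=0.2, quantile=0.85))
```

In this sketch, a higher quantile narrows the tail to fewer samples, and a larger weight makes those samples count more against the mean term.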