Learn

Program database and continual learning across experiments

Experiments Logged: 127
Evals Tested: 14
Domains Explored: 3
Avg Score Gain / Run: +0.9%
Program Database (Live · 22 entries)
13 kept · 6 reverted · 3 flagged
| ID | Hypothesis | Domain | Score | Δ | Status | Time |
|---|---|---|---|---|---|---|
| EXP-047 | LR schedule: cosine annealing with T_max=500 | Transformer opt. | 0.742 | +1.4% | KEPT | 2m ago |
| | Dropout rate increase 0.2 → 0.3 in FFN layers | Transformer opt. | 0.728 | -0.3% | REVERTED | 8m ago |
| | Weight decay 0.01 → 0.001 | Transformer opt. | 0.731 | +0.9% | KEPT | 14m ago |
| | Gradient clipping max_norm 1.0 → 0.5 | Transformer opt. | 0.724 | +0.4% | KEPT | 19m ago |
| | WordPiece tokenization for code-specific tokens | Transformer opt. | 0.718 | +3.4% BLEU | FLAGGED | 25m ago |
| EXP-046 | Graph conv pooling: mean → attention-weighted sum | Drug discovery | 0.902 | +1.2% | KEPT | 28m ago |
| | SMILES dropout augmentation rate 0.1 → 0.2 | Drug discovery | 0.896 | +0.7% | KEPT | 34m ago |
| | Node feature normalization: batch norm → layer norm | Drug discovery | 0.888 | -0.5% | REVERTED | 41m ago |
| EXP-047 | Label smoothing ε=0.1 applied to SFT objective | Transformer opt. | 0.713 | +0.1% | KEPT | 47m ago |
| | Batch size linear ramp 32 → 128 over first 1k steps | Transformer opt. | 0.704 | +0.5% | KEPT | 53m ago |
| EXP-015 | TD3 policy update delay: 2 → 4 steps | RL robotics | 0.661 | +2.1% | KEPT | 58m ago |
| | Exploration noise std 0.1 → 0.2 with decay schedule | RL robotics | 0.653 | +0.9% | KEPT | 1h 5m ago |
| | Critic network: 2 layers → 3 layers, hidden 256 | RL robotics | 0.648 | -0.6% | REVERTED | 1h 12m ago |
| | Polyak averaging coefficient 0.995 → 0.999 | RL robotics | 0.655 | +1.1% | FLAGGED | 1h 19m ago |
| EXP-046 | 3D conformer features concatenated to fingerprint | Drug discovery | 0.881 | +2.3% | FLAGGED | 1h 27m ago |
| | Flash Attention 2 kernel: latency vs throughput | Transformer opt. | 0.700 | -1 ms | KEPT | 1h 44m ago |
| | Layer norm epsilon: 1e-5 → 1e-8 | Transformer opt. | 0.698 | +0.2% | KEPT | 1h 59m ago |
| | Increased attention dropout 0.0 → 0.1 | Transformer opt. | 0.691 | -1.1% | REVERTED | 2h 14m ago |
| EXP-091 | Multi-task loss weighting: inverse class frequency | Drug discovery | 0.874 | +1.8% | KEPT | 2h 38m ago |
| | Dropout 0.1 → 0.3 in message-passing layers | Drug discovery | 0.857 | -0.4% | REVERTED | 2h 52m ago |
| EXP-013 | Reward normalization with running mean/std | RL robotics | 0.633 | +0.8% | KEPT | 3h 10m ago |
| | Policy network hidden size 256 → 512 | RL robotics | 0.622 | -0.5% | REVERTED | 3h 25m ago |
Eval Reliability

Predictive power across experiments

Val Loss: r = 0.91 · 94% · 47 experiments
Pass@1: r = 0.85 · 88% · 47 experiments
BLEU-4: r = 0.69 · 72% · 47 experiments
Inference Latency: r = 0.58 · 61% · 47 experiments