47 hypotheses · composite score 0.742
- Reducing attention head dropout from 0.1 to 0.05 in the top 4 transformer layers may improve convergence on the validation set without overfitting (see the layer-wise dropout sketch below).
- Applying a cosine learning rate schedule with warm restarts every 1000 steps may balance exploration and exploitation across the full eval battery (see the scheduler sketch below).
- Replacing learned absolute position embeddings with rotary position embeddings (RoPE) may improve generalization to longer sequences (see the RoPE sketch below).
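A minimal sketch of the layer-wise dropout hypothesis, assuming a custom block that owns its own `nn.MultiheadAttention`; the layer count, model width, and head count are placeholders, not values from the run log:

```python
import torch.nn as nn

class Block(nn.Module):
    """Attention block whose attention-probability dropout is set per layer."""
    def __init__(self, d_model: int, nhead: int, attn_dropout: float):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, nhead,
                                          dropout=attn_dropout, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):
        out, _ = self.attn(x, x, x, need_weights=False)
        return self.norm(x + out)

def build_blocks(num_layers: int = 12, d_model: int = 512, nhead: int = 8):
    # lower layers keep the 0.1 baseline; only the top 4 drop to 0.05
    return nn.ModuleList([
        Block(d_model, nhead, attn_dropout=0.05 if i >= num_layers - 4 else 0.1)
        for i in range(num_layers)
    ])
```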
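A minimal sketch of the warm-restart schedule using PyTorch's built-in `CosineAnnealingWarmRestarts`; the model, learning rate, and loop length are placeholders, only the 1000-step restart period comes from the hypothesis above, and the weight decay of 0.001 mirrors the kept experiment in the log below:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = torch.nn.Linear(512, 512)          # stand-in for the real network
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.001)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=1000, eta_min=1e-6)

for step in range(10_000):                 # training loop placeholder
    # ... forward / backward / optimizer.step() would go here ...
    scheduler.step()                       # advance the cosine cycle by one step
```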
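A minimal sketch of rotary position embeddings applied to the query/key tensors before the attention score computation, in place of adding learned absolute position vectors; the tensor layout and the base of 10000 follow the common RoPE convention and are assumptions, not settings from the experiment log:

```python
import torch

def rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate (even, odd) feature pairs of x by position-dependent angles.

    x: (batch, seq_len, num_heads, head_dim) with an even head_dim.
    """
    seq_len, dim = x.shape[1], x.shape[-1]
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos = angles.cos()[None, :, None, :]   # broadcast to (1, seq, 1, dim/2)
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# queries and keys would be rotated before computing attention scores:
# q, k = rope(q), rope(k)
```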
| ID | Status | Δ Score | Hypothesis | Time |
|---|---|---|---|---|
| EXP-047 | KEPT | +1.4% | LR schedule: cosine annealing T_max=500 | 2m ago |
| ╰ | REVERTED | -0.3% | Dropout rate: 0.2→0.3 in FFN | 8m ago |
| ╰ | KEPT | +0.9% | Weight decay 0.01→0.001 | 14m ago |
| ╰ | FLAGGED | +3.4% BLEU | WordPiece tokenization for code tokens | 25m ago |
| ╰ | KEPT | +0.5% | Batch size schedule: linear 32→128 | 36m ago |
| ╰ | KEPT | +0.2% | Layer norm epsilon: 1e-5→1e-8 | 48m ago |
| ╰ | REVERTED | -0.9% | Increased FFN hidden size 2048→3072 | 1h ago |
| ╰ | KEPT | +4.2% | Mixed precision FP16 baseline | 1h 22m ago |
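For the FP16 baseline kept at the bottom of the log, a minimal sketch using `torch.cuda.amp`, assuming a CUDA device and placeholder model, optimizer, and data:

```python
import torch

model = torch.nn.Linear(512, 512).cuda()   # stand-in for the real network
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
scaler = torch.cuda.amp.GradScaler()       # rescales gradients to avoid FP16 underflow

for batch in [torch.randn(32, 512).cuda() for _ in range(4)]:  # stand-in loader
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():        # run the forward pass in FP16 where safe
        loss = model(batch).pow(2).mean()  # placeholder loss
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```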
Average gain over the last 10 experiments: +0.9%
3 of 4 evals still improving
78% of budget consumed
Sampler split 2:1