Tags
adamw
1 post
ai
2 posts
ai-investment
1 post
architecture
2 posts
attention
1 post
capex
1 post
colossus
1 post
common-crawl
1 post
cost
1 post
大语言模型 (large language models)
1 post
data-quality
1 post
datasets
1 post
dclm
1 post
deepseek
3 posts
deepseek-v3
3 posts
economics
1 post
energy
1 post
fineweb
1 post
fineweb-edu
1 post
fp8
1 post
frontier
1 post
gb200
1 post
gpt-5
1 post
gpu
1 post
gpu-cluster
1 post
h100
1 post
h200
1 post
hardware
1 post
infrastructure
2 posts
机器学习 (machine learning)
1 post
kv-cache
1 post
llama
1 post
llm
12 posts
machine-learning
1 post
mixed-precision
1 post
mixture-of-experts
1 post
mla
1 post
muon
1 post
nemotron-cc
1 post
optimization
1 post
optimizer
1 post
power
1 post
pre-training
3 posts
project-rainier
1 post
quantization
1 post
qwen
3 posts
shampoo
1 post
soap
1 post
stargate
2 posts
tpu
1 post
training
1 post
training-budget
1 post
trainium
1 post
预训练 (pre-training)
1 post