llms-from-scratch-cn/Model_Architecture_Discussions/llama3/params.txt
2024-05-27 15:11:03 +08:00

{'dim': 4096,
'n_layers': 32,
'n_heads': 32,
'n_kv_heads': 8,
'vocab_size': 128256,
'multiple_of': 1024,
'ffn_dim_multiplier': 1.3,
'norm_eps': 1e-05,
'rope_theta': 500000.0}
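
The values above determine several tensor shapes that are not stored in the file. The following is a minimal sketch, not part of the original file, that parses the dict literal with `ast.literal_eval` and derives the per-head dimension, the grouped-query repeat factor, and the feed-forward hidden size. The helper name `derive_shapes` is hypothetical, and the feed-forward rounding rule is an assumption based on the convention commonly used in Llama-style reference code (take 2/3 of 4*dim, scale by `ffn_dim_multiplier`, round up to a multiple of `multiple_of`); check it against the model code you actually load these parameters into.

```python
import ast

# The file content, reproduced as a string so the sketch is self-contained.
PARAMS_TEXT = """{'dim': 4096,
'n_layers': 32,
'n_heads': 32,
'n_kv_heads': 8,
'vocab_size': 128256,
'multiple_of': 1024,
'ffn_dim_multiplier': 1.3,
'norm_eps': 1e-05,
'rope_theta': 500000.0}"""


def derive_shapes(params):
    """Hypothetical helper: compute shapes implied by the hyperparameters."""
    dim = params["dim"]
    n_heads = params["n_heads"]
    n_kv_heads = params["n_kv_heads"]

    # Each attention head works on an equal slice of the model dimension.
    head_dim = dim // n_heads

    # Grouped-query attention: several query heads share one key/value head.
    kv_repeat = n_heads // n_kv_heads

    # Assumed feed-forward sizing rule (Llama-style convention, not stated in the file).
    hidden = int(2 * (4 * dim) / 3)
    hidden = int(params["ffn_dim_multiplier"] * hidden)
    multiple_of = params["multiple_of"]
    hidden = multiple_of * ((hidden + multiple_of - 1) // multiple_of)

    return {
        "head_dim": head_dim,
        "kv_repeat": kv_repeat,
        "kv_dim": n_kv_heads * head_dim,
        "ffn_hidden_dim": hidden,
    }


# The file uses single quotes, so it is a Python literal rather than strict JSON;
# ast.literal_eval parses it without executing arbitrary code.
params = ast.literal_eval(PARAMS_TEXT)
shapes = derive_shapes(params)
print(shapes)
```

Under these assumptions the sketch yields a head dimension of 128, 4 query heads per key/value head, a key/value projection width of 1024, and a feed-forward hidden size of 14336.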