Newsletter
Added 2024-09-19 16:09:22 +0000 UTC
Microsoft releases GRIN MoE. With only 6.6B active parameters, it performs exceptionally well across a diverse set of tasks, particularly coding and mathematics. Its level is above GPT-3.5 and close to the first version of GPT-4. It uses little RAM due to the low number of active parameters. Its benchmark results are slightly higher than those of Phi-3.5-MoE, also from Microsoft.
Benchmarks:
MMLU: 79.4
GSM-8K: 90.4
Source: Hugging Face
Demo: GRIN-MoE-Demo