Mistral Small 3: A Leap in Efficient Open-Source AI
This blog post was automatically generated (and translated). It is based on the following original, which I selected for publication on this blog:
Mistral Small 3 | Mistral AI | Frontier AI in your hands.
Mistral AI has recently unveiled Mistral Small 3, a 24B-parameter model designed for speed and efficiency, released under the permissive Apache 2.0 license. This model aims to provide a robust and open alternative to larger models and even closed-source solutions.
Key Features and Capabilities
Mistral Small 3 stands out for its ability to compete with models such as Llama 3.3 70B and Qwen 32B while delivering significantly lower latency. Key highlights include:
- Performance: Competitive with Llama 3.3 70B instruct, yet more than three times faster on the same hardware.
- Efficiency: Optimized for generative AI tasks requiring robust language and instruction following with low latency.
- Architecture: Designed with fewer layers than competing models to shorten each forward pass, achieving over 81% accuracy on MMLU at a throughput of 150 tokens/s.
- Open Source: Released under Apache 2.0, encouraging community adoption and customization.
The model is released both pre-trained and instruction-tuned, making it suitable for a wide range of applications and a powerful base for accelerating progress across AI domains. This raises the question: how will the model shape the open-source AI landscape, and what innovations will it spur?
Evaluation and Benchmarking
Evaluations against proprietary and open-source models highlight Mistral Small 3’s competitive edge:
- Performance rivals models three times its size, such as Llama 3.3 70B.
- Competitive with the proprietary GPT-4o mini across code, math, general-knowledge, and instruction-following benchmarks.
Use Cases
Several distinct use cases are emerging for pre-trained models of this size:
- Fast-response Conversational Assistance: Excels in scenarios where quick, accurate responses are critical, such as virtual assistants.
- Low-latency Function Calling: Handles rapid function execution in automated or agentic workflows (a hedged API sketch follows this list).
- Fine-tuning for Subject Matter Expertise: Can be fine-tuned for specific domains like legal advice, medical diagnostics, and technical support.
- Local Inference: Ideal for handling sensitive information privately on local hardware.
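To make the function-calling use case concrete, here is a minimal Python sketch against Mistral's OpenAI-compatible chat-completions endpoint, using the `mistral-small-latest` identifier listed under Availability below. The `get_order_status` tool, its schema, and the prompt are hypothetical illustrations, not part of the announcement; consult the official API documentation for the authoritative interface.

```python
import os
import requests

# Minimal sketch: low-latency function calling via Mistral's
# chat-completions endpoint. Requires MISTRAL_API_KEY in the environment.
API_URL = "https://api.mistral.ai/v1/chat/completions"

# A hypothetical tool the model may decide to call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical helper, for illustration
        "description": "Look up the shipping status of an order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-small-latest",
        "messages": [{"role": "user", "content": "Where is order 42?"}],
        "tools": tools,
        "tool_choice": "auto",  # let the model decide whether to call the tool
    },
    timeout=30,
)
response.raise_for_status()

# If the model chose to call the tool, the arguments arrive as JSON;
# otherwise it answers in plain text.
message = response.json()["choices"][0]["message"]
print(message.get("tool_calls") or message["content"])
```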
Availability
Mistral Small 3 is available on various platforms:
- La Plateforme (as mistral-small-latest or mistral-small-2501)
- Hugging Face
- Ollama (see the local-inference sketch below)
- Kaggle
- Together AI
- Fireworks AI
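For the local-inference use case, here is a minimal Python sketch against Ollama's local REST API. It assumes the model has already been pulled (for example with `ollama pull mistral-small`); the exact model tag on the Ollama registry is an assumption and may differ.

```python
import requests

# Minimal sketch: private local inference through Ollama's REST API,
# which listens on port 11434 by default.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral-small",  # assumed tag; verify in the Ollama library
        "prompt": "Summarize the Apache 2.0 license in one sentence.",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])
```

Because inference runs entirely on local hardware, prompts and outputs never leave the machine, which is what makes this setup attractive for sensitive data.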
Future Directions
Mistral AI is committed to enhancing its models, with plans to release small and large models with boosted reasoning capabilities in the coming weeks. This commitment is reinforced by the decision to progressively move away from models under the Mistral Research License (MRL) in favor of Apache 2.0, promoting open access and collaboration.
Does this trend toward smaller yet powerful AI models signal a shift in the industry? And what impact will it have on the accessibility and democratization of AI technology?