French AI startup Mistral has released its first model that can process both images and text. Called Pixtral 12B, the 12-billion-parameter model is approximately 24GB in size. Parameter count roughly tracks a model’s problem-solving ability, and models with more parameters generally outperform those with fewer.
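As a rough check on the 24GB figure, the arithmetic can be sketched as below. The article does not say how the weights are stored; the sketch assumes 16 bits (2 bytes) per parameter, and the helper name weights_size_gb is made up for illustration.

```python
def weights_size_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Estimate the size of a model's weights in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 is an assumption (16-bit weights), not a figure
    taken from the article.
    """
    return num_params * bytes_per_param / 1e9


# 12 billion parameters at 2 bytes each
pixtral_size_gb = weights_size_gb(12_000_000_000)
print(pixtral_size_gb)  # 24.0
```

Under the same assumption, storing the weights at 8 bits per parameter would halve the estimate to about 12GB.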
Pixtral 12B Features
Built on Mistral’s Nemo 12B text model, Pixtral 12B can answer questions about multiple images of varying sizes, supplied either as URLs or as base64-encoded image data. Like other multimodal models, it is designed for tasks such as captioning images and counting the objects in them.
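The two input forms mentioned above, a URL or a base64-encoded image, can be illustrated with a short sketch using only Python's standard library. The helper names and the dictionary layout are hypothetical and chosen for illustration; they are not Mistral's documented request format.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_image_parts(sources):
    """Turn a mix of URL strings and raw image bytes into image references.

    The {"type": ..., "image_url": ...} layout is an assumption made for
    this sketch, not a confirmed API schema.
    """
    parts = []
    for src in sources:
        if isinstance(src, bytes):
            # Raw bytes are embedded inline as base64
            parts.append({"type": "image_url", "image_url": to_data_url(src)})
        else:
            # Strings are passed through as ordinary URLs
            parts.append({"type": "image_url", "image_url": src})
    return parts


# The 8-byte signature that opens every PNG file stands in for a real image
png_signature = b"\x89PNG\r\n\x1a\n"
parts = build_image_parts(["https://example.com/cat.png", png_signature])
```

Base64 encoding inflates the payload by about a third, so a URL is the lighter option when the image is already hosted somewhere reachable.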
Availability and Usage
Pixtral 12B can be downloaded, fine-tuned, and used under an Apache 2.0 license. Mistral distributes the model through a torrent link posted on GitHub and through the Hugging Face platform.
Mistral’s Growth and Strategy
Mistral closed a $645 million funding round led by General Catalyst and has drawn attention in the AI community. The company is a newer entrant than industry giants like OpenAI. Its strategy is to release open models for free, charge for managed versions of those models, and provide consulting services to corporate customers.
Overall, Pixtral 12B signals Mistral’s commitment to advancing its models and reinforces the company’s position as a significant player in the European AI landscape.
