A new family of AI models is emerging that sets itself apart as one of the few that can be reproduced from scratch. On Tuesday, AI2, the nonprofit AI research organization founded by the late Microsoft co-founder Paul Allen, introduced OLMo 2, the second set of models in its OLMo series. OLMo stands for “open language model.” While there are numerous “open” language models available (such as Meta’s Llama), OLMo 2 adheres to the Open Source Initiative’s definition of open-source AI, meaning the tools and data used in its development are publicly accessible.
Open and Transparent Development Process
The Open Source Initiative finalized its definition of open-source AI in October, with the original OLMo models from February already meeting the criteria. AI2 emphasized that OLMo 2 was created entirely with open and transparent practices, including open and accessible training data, open-source training code, reproducible training recipes, transparent evaluations, and more. By openly sharing their data, methodologies, and discoveries, AI2 aims to equip the open-source community with the necessary resources to explore new and innovative techniques.
Model Specifications and Capabilities
The OLMo 2 family consists of two models: OLMo 2 7B, with 7 billion parameters, and OLMo 2 13B, with 13 billion. Parameters roughly correspond to a model’s problem-solving ability, and models with more parameters generally perform better. Like most language models, OLMo 2 7B and 13B handle a range of text-based tasks, such as answering questions, summarizing documents, and generating code.
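As a concrete illustration of the kinds of tasks described above, here is a minimal sketch of prompting OLMo 2 7B through the Hugging Face Transformers library. The repository ID below is an assumption rather than something stated in this article; check AI2’s release materials for the exact checkpoint name.

```python
# Minimal sketch: prompting OLMo 2 7B via Hugging Face Transformers.
# The repo ID "allenai/OLMo-2-1124-7B" is an assumed checkpoint name;
# consult AI2's official release for the correct identifier.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-2-1124-7B"  # assumed model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Summarize the idea of open-source language models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```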
Training Methodology and Dataset
To train these models, AI2 used a vast dataset of 5 trillion tokens, where a token is the smallest discrete unit of raw data a model processes. One million tokens equate to roughly 750,000 words. The training set was curated from high-quality websites, academic papers, Q&A forums, and both synthetic and human-generated math workbooks.
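For a rough sense of scale, the token-to-word ratio quoted above (one million tokens to roughly 750,000 words) can be applied to the full 5-trillion-token training set. The sketch below is just that back-of-the-envelope calculation, using only the figures given in this article.

```python
# Back-of-the-envelope conversion using the ratio cited above:
# ~750,000 words per 1,000,000 tokens, i.e. about 0.75 words per token.
TOKENS_IN_TRAINING_SET = 5_000_000_000_000  # 5 trillion tokens
WORDS_PER_TOKEN = 750_000 / 1_000_000       # ~0.75

approx_words = TOKENS_IN_TRAINING_SET * WORDS_PER_TOKEN
print(f"~{approx_words:.2e} words")  # on the order of 3.75 trillion words
```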
Competitive Performance and Accessibility
AI2 asserts that the OLMo 2 models are competitive with open models such as Meta’s Llama 3.1 release. Moreover, the OLMo 2 models and all of their components are freely available for download from AI2’s website under the Apache 2.0 license, which permits commercial use.
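Because the weights and components are released under Apache 2.0, they can also be fetched programmatically. The sketch below assumes the checkpoints are mirrored on the Hugging Face Hub under the same hypothetical repository ID used earlier; the article itself only states that downloads are available from AI2’s website.

```python
# Minimal sketch: fetching OLMo 2 weights locally with huggingface_hub.
# The repo ID is an assumed mirror location; AI2's website lists the
# official download links.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="allenai/OLMo-2-1124-7B")
print(f"Model files downloaded to: {local_dir}")
```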
Ethical Considerations and Future Implications
Despite concerns over the potential misuse of open models, AI2 remains confident that the benefits outweigh the risks. By fostering technical progress, enhancing ethical considerations, ensuring verifiability and reproducibility, and promoting equitable access, open models like OLMo contribute to a more inclusive and innovative AI landscape.
