Microsoft open-sources its Phi-4 small language model
January 11, 2025

Microsoft today released the weights of Phi-4, a small language model that can generate text and solve math problems.

The company first detailed the model last month. Phi-4 was initially available only through Microsoft's Azure AI Foundry development service. It can now be downloaded from Hugging Face, a popular website for hosting open-source artificial intelligence projects.

Phi-4 is the fourth iteration in a series of small language models that Microsoft introduced in 2023. It contains 14 billion parameters, the configuration settings that determine how a neural network processes data. Microsoft researchers trained it on a cluster of 1,920 Nvidia Corp. H100 graphics cards over 21 days.

The model is based on the standard Transformer architecture that underlies most large language models. Upon receiving a prompt, a Transformer model breaks the input into individual tokens and determines the meaning of each token by analyzing the surrounding text. In doing so, it gives the most weight to the parts of the surrounding text it deems most relevant.
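The relevance weighting described above is the Transformer's attention mechanism. The following is a minimal NumPy sketch of scaled dot-product attention, with illustrative token counts and embedding sizes; it is not Phi-4's actual code:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: each token's output is a
    relevance-weighted average over all surrounding tokens."""
    scores = q @ k.T / np.sqrt(k.shape[-1])  # pairwise relevance scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))   # 5 tokens, 8-dimensional embeddings
out = attention(x, x, x)
print(out.shape)              # (5, 8): one updated vector per token
```

Each output vector blends information from every token in the input, weighted by how relevant the model judges that token to be.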

Phi-4 implements a decoder-only variant of the Transformer architecture. A standard Transformer analyzes the text both before and after a word to determine its meaning. Decoder-only models consider only the text that precedes the word, which reduces the amount of data they must process and thereby lowers inference costs.
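In practice, a decoder-only model enforces this restriction with a causal mask that blocks each token from attending to anything that comes after it. A simplified NumPy sketch of the idea (the dimensions are illustrative, not Phi-4's):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def causal_attention(q, k, v):
    """Decoder-only attention: a causal mask prevents each token
    from attending to tokens that appear later in the sequence."""
    n = q.shape[0]
    scores = q @ k.T / np.sqrt(k.shape[-1])
    mask = np.triu(np.ones((n, n), dtype=bool), k=1)  # future positions
    scores = np.where(mask, -np.inf, scores)          # -inf -> weight 0
    weights = softmax(scores, axis=-1)
    return weights, weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w, out = causal_attention(x, x, x)
print(np.round(w, 2))  # upper triangle is all zeros: no looking ahead
```

Because masked positions contribute nothing, the model can generate text one token at a time without recomputing attention over future tokens it has not produced yet.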

In a research paper, Microsoft detailed how it honed the quality of Phi-4's output using two post-training optimization methods: direct preference optimization and supervised fine-tuning. Both involve supplying the language model with examples that demonstrate how it should respond to prompts.
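Direct preference optimization trains on pairs of responses, one preferred and one rejected, nudging the model toward the preferred answer relative to a frozen reference copy of itself. A hedged sketch of the per-pair DPO loss (the log-probability values below are made up for illustration):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: the loss shrinks as the model
    raises the likelihood of the preferred response more than a frozen
    reference model does."""
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))  # -log sigmoid

# A model that already favors the chosen answer more than the reference
# does incurs a smaller loss than a neutral model.
print(dpo_loss(-5.0, -9.0, -6.0, -8.0))  # positive margin, loss < log(2)
print(dpo_loss(-6.0, -8.0, -6.0, -8.0))  # zero margin, loss = log(2)
```

Supervised fine-tuning, by contrast, simply continues standard next-token training on curated prompt-response examples.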

In an internal evaluation, Microsoft compared Phi-4 against Llama 3.3 70B, an LLM with five times as many parameters. The company says Phi-4 performed better on the popular GPQA and MATH benchmarks, which contain science questions and math problems, respectively.

Phi-4 joins a growing list of small language models that have been open sourced by major technology companies over the past year.

Last February, Google LLC introduced a series of small language models called Gemma. The models in the series range from 2 billion to 27 billion parameters. According to Google, the 27-billion-parameter version can outperform models more than twice its size.

More recently, Meta Platforms Inc. released two Llama 3.2 models with fewer than five billion parameters. The company followed up by open-sourcing more hardware-efficient versions of those models that implement a machine learning technique called quantization. This method compresses the data a neural network ingests, reducing the amount of hardware needed to process it.
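One common form of quantization stores each weight as an 8-bit integer plus a shared scale factor instead of a 32-bit float, cutting memory use roughly fourfold. A minimal NumPy sketch of symmetric int8 quantization (simplified relative to the per-channel schemes production models use):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric int8 quantization: map float weights to 8-bit
    integers plus one float scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
err = np.abs(w - dequantize(q, scale)).max()
print(q.nbytes, w.nbytes)  # 65536 vs 262144 bytes: about 4x smaller
print(err < scale)         # rounding error stays below one step
```

The trade-off is a small loss of precision, which is why quantized models are typically validated against the full-precision originals before release.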

Photo: Microsoft
