The Transformer architecture has made training models for Natural Language Processing (NLP) far more efficient. Instead of processing a sequence one token at a time, the attention mechanism lets a model weigh and process all tokens in parallel.
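To make that parallelism concrete, here is a minimal sketch of scaled dot-product attention in NumPy. This is an illustration, not code from any particular library; the shapes and names are chosen for the example. Note that the outputs for every token come out of a single matrix operation rather than a per-token loop:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute attention outputs for all tokens at once.

    Q, K, V: (seq_len, d) arrays of query/key/value vectors,
    one row per token.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len) token-to-token scores
    # Softmax over each row turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted mix of value vectors

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))  # 4 tokens, 8-dimensional embeddings
out = scaled_dot_product_attention(tokens, tokens, tokens)
print(out.shape)  # one output vector per token, computed in parallel
```

Because the whole sequence is handled as one matrix multiplication, the computation parallelizes well on GPUs, which is a large part of why Transformers train efficiently.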
To train a model using the Transformer architecture, you need a large amount of text data as input. The many models trained this way differ mostly in the data they were trained on, or in how they implement attention within their architectures. Because both the training datasets and the models themselves are large, these models are often referred to as Large Language Models (LLMs).
Many LLMs are open-source and publicly available through communities like Hugging Face. Azure also offers the most commonly used LLMs as foundation models in the Azure Machine Learning model catalog. Foundation models are pretrained on large text corpora and can be fine-tuned for specific tasks with a relatively small dataset.
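The fine-tuning idea can be sketched in miniature. The toy example below is an assumption-laden illustration, not Azure ML or Hugging Face code: a stand-in "pretrained" embedding table stays frozen, and only a small task-specific head is trained on a handful of labeled examples.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for frozen pretrained embeddings: 10-word vocabulary, 16-dim vectors.
# In practice these would come from a large pretrained model.
pretrained_emb = rng.normal(size=(10, 16))

# Tiny task-specific dataset: 20 sequences of word ids with binary labels.
X_ids = rng.integers(0, 10, size=(20, 5))
y = rng.integers(0, 2, size=20)

# Represent each sequence as the mean of its (frozen) token embeddings.
X = pretrained_emb[X_ids].mean(axis=1)  # shape (20, 16)

# Fine-tune only a logistic-regression head; the embeddings are not updated.
w, b = np.zeros(16), 0.0
losses = []
for _ in range(200):
    p = 1 / (1 + np.exp(-(X @ w + b)))  # sigmoid predictions
    losses.append(-np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9)))
    grad = p - y  # gradient of cross-entropy w.r.t. the logits
    w -= 0.5 * (X.T @ grad) / len(y)
    b -= 0.5 * grad.mean()

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Only the small head's parameters are learned, which is why a relatively small dataset can be enough; real fine-tuning with libraries such as Hugging Face Transformers follows the same principle at scale.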
Explore the model catalog
In the Azure Machine Learning studio, you can navigate to the model catalog to explore all available foundation models. Additionally, you can import any model from the Hugging Face open-source library into the model catalog.