Digital Event Horizon

Align Meta Llama 3 to human preferences with DPO, Amazon SageMaker Studio, and Amazon SageMaker Ground Truth

In this post, we show you how to enhance the performance of Meta Llama 3 8B Instruct by fine-tuning it using direct preference optimization (DPO) on data collected with SageMaker Ground Truth. Clean up After you complete your tasks in the SageMaker Studio notebook, remember to stop your JupyterLab workspace to prevent incurring additional charges. You can do this by choosing Stop next to your JupyterLab space. Additionally, you have the option to set up lifecycle configuration scripts that will automatically shut down resources when they’re not in use. If you deployed the model to a SageMaker endpoint, run the following code at the end of the notebook to delete the endpoint: #delete your endpoint sm_client.delete_endpoint(EndpointName=tg_sm_model.endpoint_name) Conclusion Amazon SageMaker offers tools to streamline the process of fine-tuning LLMs to align with human preferences. With SageMaker Studio, you can experiment interactively with different models, questions, and fine-tuning techniques. With SageMaker Ground Truth, you can set up workflows, manage teams, and collect consistent, high-quality human feedback. In this post, we showed how to enhance the performance of Meta Llama 3 8B Instruct by fine-tuning it using DPO on data collected with SageMaker Ground Truth. To get started, launch SageMaker Studio and run the notebook available in the following GitHub repo. Share your thoughts in the comments section! About the Authors Anastasia Tzeveleka is a GenAI/ML Specialist Solutions Architect at AWS. As part of her work, she helps customers build foundation models and create scalable generative AI and machine learning solutions using AWS services. Pranav Murthy is an AI/ML Specialist Solutions Architect at AWS. He focuses on helping customers build, train, deploy and migrate machine learning (ML) workloads to SageMaker. He previously worked in the semiconductor industry developing large computer vision (CV) and natural language processing (NLP) models to improve semiconductor processes. In his free time, he enjoys playing chess and traveling. Sundar Raghavan is an AI/ML Specialist Solutions Architect at AWS, helping customers build scalable and cost-efficient AI/ML pipelines with Human in the Loop services. In his free time, Sundar loves traveling, sports and enjoying outdoor activities with his family.

Published: 2024-09-09T22:40:48

Today's AI/ML headlines are brought to you by ThreatPerspective

Align Meta Llama 3 to human preferences with DPO, Amazon SageMaker Studio, and Amazon SageMaker Ground Truth