Short Course: Efficiently Serving LLMs
Understand how LLMs predict the next token and how techniques like KV caching can speed up text generation. Write code to serve LLM applications efficiently to multiple users.
Topics: Fine-Tuning, Generative Models, LLMOps, LLM Serving, Transformers
Partner: Predibase
Short Course: Reinforcement Learning from Human Feedback
Get an introduction to tuning and evaluating LLMs using Reinforcement Learning from Human Feedback (RLHF), and fine-tune the Llama 2 model.
Topics: Fine-Tuning, Generative Models, LLMOps, Transformers
Partner: Google Cloud