The most relevant topics, told in the simplest way
Cache Rules Everything Around Me
Prompt caching helps LLMs by reusing computation from repeated portions of prompts (think system instructions or boilerplate context included with every prompt) so the LLM doesn't have to process the entire prompt fresh each time. Anthropic and OpenAI both offer this and have reported cutting input-token cost and latency by as much as 80-90% for long prompts. Salesforce doesn't support this today (at least not in the traditional sense), but there are still ways we can have conversations with customers and devs about it, including methods to cache output responses, which is a different approach.
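To make the idea concrete, here is a minimal toy sketch in Python. It is not how any vendor implements prompt caching: real systems cache the model's internal state for a prefix, while this toy just remembers which prefixes it has seen and counts how many "tokens" still need fresh processing. The class name `ToyPrefixCache`, the whitespace tokenizer, and the sample prompts are all assumptions made for illustration.

```python
import hashlib


class ToyPrefixCache:
    """Toy model of prompt caching (hypothetical, for illustration only).

    It remembers which prompt prefixes it has already "processed" and
    reports how many tokens had to be processed fresh on each call.
    """

    def __init__(self) -> None:
        self._seen_prefixes: set[str] = set()

    def process(self, prefix: str, suffix: str) -> int:
        """Return the number of whitespace-split tokens processed fresh."""
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        suffix_tokens = len(suffix.split())
        if key in self._seen_prefixes:
            # Cache hit: the shared prefix is reused, only the suffix is new work.
            return suffix_tokens
        # Cache miss: process the whole prompt and remember the prefix.
        self._seen_prefixes.add(key)
        return len(prefix.split()) + suffix_tokens


# A long system prompt that is identical on every request (300 toy tokens).
system_prompt = "You are a helpful support agent. " * 50

cache = ToyPrefixCache()
first_call = cache.process(system_prompt, "How do I reset my password?")
second_call = cache.process(system_prompt, "Where is my invoice?")

print(first_call)   # 306: the full prefix plus the 6-token question
print(second_call)  # 4: only the new question, the prefix was a cache hit
```

The takeaway is that the cache only pays off when the repeated part comes first and is byte-for-byte identical across requests, which is why system instructions and shared boilerplate are the usual candidates.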