The “chat” model component is used to power question-answer experiences and is typically a 30B+ parameter model. Latency is not as important as it is for the “autocomplete” model, so most people choose the one that gives them the best possible responses, oftentimes opting for SaaS API endpoints. When SaaS isn’t possible or preferred, open-source models are self-hosted on a server for the entire team to use. Examples of models used for chat experiences include GPT-4, DeepSeek Coder 33B, Claude 3, Code Llama 70B, Llama 3 70B etc.