Skip to content

Best Practices

  • Securely Store Keys: Never hard-code API keys in client-side code. It is recommended to use environment variables or a key management service provided by a cloud provider to store them.
  • Implement Graceful Retries: For transient errors such as 429 and 5xx, it is recommended to implement a retry logic with exponential backoff on the client side.
  • Make Good Use of Streaming Responses: For interactive chat applications, it is strongly recommended to use streaming responses (stream: true), which can significantly reduce the response time of the first token and improve user experience.
  • Configure Cost Budgets and Alerts: Set your cost budget and alert thresholds in the R9S console. When usage approaches the budget, the system will automatically notify you to avoid accidental overspending.
  • Utilize Caching: For queries with high repetition and content that does not change frequently, consider adding a caching layer on your application side to effectively reduce API call costs and latency.