Handle max tokens exception in chat response generation

Introduced error handling for the 'max_tokens' exception during chat response generation. When the maximum token count is exceeded, the bot now falls back to a no-tools response instead of halting the conversation, ensuring a smoother chat experience even when the limit is reached. Any other exception is still raised as before, maintaining error transparency.
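A minimal sketch of the fallback pattern this commit introduces. `APIErrorStub`, `generate`, and `chat_with_fallback` are hypothetical stand-ins for `openai.APIError` and the bot's `generate_chat_response`; only the control flow mirrors the change.

```python
class APIErrorStub(Exception):
    """Stand-in for openai.APIError, which carries a machine-readable code."""
    def __init__(self, code):
        super().__init__(code)
        self.code = code

def generate(messages, use_tools=True):
    """Hypothetical generator: tool-augmented calls overflow the context."""
    if use_tools and len(messages) > 2:
        raise APIErrorStub("max_tokens")
    return f"reply to {len(messages)} messages (tools={use_tools})"

def chat_with_fallback(messages):
    try:
        # First attempt: the normal, tool-augmented response.
        return generate(messages, use_tools=True)
    except APIErrorStub as e:
        if e.code == "max_tokens":
            # Fall back to a no-tools response instead of halting the chat.
            return generate(messages, use_tools=False)
        raise  # any other API error is still surfaced unchanged
```

Only the `max_tokens` code triggers the retry; everything else propagates, preserving error transparency.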
Kumi 2023-11-30 14:18:47 +01:00
parent 670667567e
commit eccca2a624
Signed by: kumi
GPG key ID: ECBCC9082395383F


@@ -303,9 +303,15 @@ class OpenAI:
                 self.logger.log(f"No more responses received, aborting.")
                 result_text = False
             else:
-                messages = original_messages[:-1] + [choice.message] + tool_responses + original_messages[-1:]
-                result_text, additional_tokens = await self.generate_chat_response(messages, user, room)
+                try:
+                    messages = original_messages[:-1] + [choice.message] + tool_responses + original_messages[-1:]
+                    result_text, additional_tokens = await self.generate_chat_response(messages, user, room)
+                except openai.APIError as e:
+                    if e.code == "max_tokens":
+                        self.logger.log(f"Max tokens exceeded, falling back to no-tools response.")
+                        result_text, additional_tokens = await self.generate_chat_response(original_messages, user, room, allow_override=False, use_tools=False)
+                    else:
+                        raise e
         elif not self.chat_model == chat_model:
             new_messages = []