Handle max tokens exception in chat response generation

Introduced error handling for the 'max_tokens' exception during chat response generation. When the maximum token count is exceeded, the bot now falls back to a no-tools response instead of halting the conversation, ensuring a smoother chat experience even when the limit is reached. Any other exception is still raised as before, maintaining error transparency.
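A minimal sketch of the fallback pattern this commit introduces. `APIErrorStub`, `generate`, and `chat_with_fallback` are hypothetical stand-ins for `openai.APIError` and the bot's `generate_chat_response`; only the control flow mirrors the change.

```python
class APIErrorStub(Exception):
    """Stand-in for openai.APIError, which carries a machine-readable code."""
    def __init__(self, code):
        super().__init__(code)
        self.code = code

def generate(messages, use_tools=True):
    """Hypothetical generator: tool-augmented calls overflow the context."""
    if use_tools and len(messages) > 2:
        raise APIErrorStub("max_tokens")
    return f"reply to {len(messages)} messages (tools={use_tools})"

def chat_with_fallback(messages):
    try:
        # First attempt: the normal, tool-augmented response.
        return generate(messages, use_tools=True)
    except APIErrorStub as e:
        if e.code == "max_tokens":
            # Fall back to a no-tools response instead of halting the chat.
            return generate(messages, use_tools=False)
        raise  # any other API error is still surfaced unchanged
```

Only the `max_tokens` code triggers the retry; everything else propagates, preserving error transparency.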
Kumi 2023-11-30 14:18:47 +01:00
parent 670667567e
commit eccca2a624
Signed by: kumi
GPG key ID: ECBCC9082395383F


@@ -303,9 +303,15 @@ class OpenAI:
                 self.logger.log(f"No more responses received, aborting.")
                 result_text = False
             else:
-                messages = original_messages[:-1] + [choice.message] + tool_responses + original_messages[-1:]
-                result_text, additional_tokens = await self.generate_chat_response(messages, user, room)
+                try:
+                    messages = original_messages[:-1] + [choice.message] + tool_responses + original_messages[-1:]
+                    result_text, additional_tokens = await self.generate_chat_response(messages, user, room)
+                except openai.APIError as e:
+                    if e.code == "max_tokens":
+                        self.logger.log(f"Max tokens exceeded, falling back to no-tools response.")
+                        result_text, additional_tokens = await self.generate_chat_response(original_messages, user, room, allow_override=False, use_tools=False)
+                    else:
+                        raise e
         elif not self.chat_model == chat_model:
             new_messages = []