Upgraded bot features to interpret and respond to text, image, and voice prompts in Matrix rooms using advanced OpenAI models, including vision preview and text-to-speech. Streamlined installation process with bot now available via PyPI, simplifying setup and extending accessibility. Eliminated planned features section, signaling a shift towards realized functionalities over prospective development.
Configured Pantalaimon as an optional dependency to enable bot use in E2EE rooms while maintaining compatibility with non-encrypted rooms. Removed trackingmore dependency, indicating a refinement in the feature set towards core functionalities. Version bumped to 0.3.0, signifying major enhancements over previous iteration.
Introduced a new systemd service configuration for GPTbot to ensure Pantalaimon starts as a background process on system boot, maintaining persistent Matrix encryption handling. Ensures seamless restarts and network dependency management for improved reliability.
Integrated Pantalaimon support with updated configuration instructions and examples, facilitating secure communication when using the Matrix homeserver. The .gitignore is now extended to exclude a Pantalaimon configuration file, preventing sensitive information from accidental commits. Removed encryption callbacks and related functions as the application leverages Pantalaimon for E2EE, simplifying the codebase and shifting encryption responsibilities externally. Streamlined dependency management by removing the requirements.txt in favor of pyproject.toml, aligning with modern Python practices. This change overall improves security handling and eases future maintenance.
Resolved a syntax error in the allowed_users property within the GPTBot class by adding the missing 'self' parameter. This correction ensures the proper functioning of the property method, enabling the bot to correctly retrieve the list of users authorized to use it.
Extended the condition for the audio message handling in the chatbot to recognize MP3 audio files sent as file attachments. This ensures that MP3 files will be properly processed as audio messages, improving the bot's media handling capabilities. This is just a test at this point, and may be rolled back.
Migrated several hardcoded bot configuration settings to dynamic properties with fallbacks, enhancing flexibility and customization. The properties now read from a configuration file, allowing changes without code modification. Simplified the instantiation logic by removing immediate attribute setting in favor of lazy-loaded properties. Additionally, prepared to segregate OpenAI-related settings into a dedicated class (noted with TODO comments).
Note: Verify the presence of necessary configuration parameters or include defaults to prevent potential runtime issues.
Added options to extract specific info and summarize content from Wikipedia pages within the gptbot's Wikipedia tool. The 'extract' option enables partial retrieval of page data based on a user-defined string, leveraging the bot's existing chat API for extraction. The 'summarize' option allows users to get concise versions of articles, again utilizing the bot's chat capabilities. These additions provide users with more granular control over the information they receive, potentially reducing response clutter and focusing on user-specified interests.
Cast user objects to strings to standardize ID handling across API calls. Enhanced logging statements now include user and room context, providing better traceability for response generation. Also, refined error handling for API token limits by falling back to an altered response flow, removing tool roles from messages when a max token error occurs, before reattempting. This targets more graceful handling of response generation without tool assistance when constraints are hit.
Introduced error handling for the 'max_tokens' exception during chat response generation. In cases where the maximum token count is exceeded, the bot now falls back to a no-tools response, avoiding the halt of conversation. This ensures a smoother chat experience and persistence in responding, even when a limit is reached. Any other exceptions will still be raised as before, maintaining error transparency.
- Introduce error handling for the keys upload process, logging failures to assist with troubleshooting.
- Improve exception handling in the OpenAI class by returning a more informative response based on the exception arguments if available.
- Replace a return statement in the Newroom tool with an exception raise to standardize tool action termination and provide clearer flow control.
Resolves issue with silent key upload failures. Refines response and control flow for better clarity and debugging.
Enabled asynchronous key upload in the roommember callback to improve efficiency. Fixed the chat response generation by properly referencing the event sender rather than the room ID, aligning user context with chat messages. Corrected the user parameter misuse in the OpenAI class to utilize the room ID. Extended the toolkit to include a 'newroom' feature for creating and setting up new Matrix rooms, thereby enhancing bot functionality.
This commit significantly improves bot response times and contextual accuracy while interacting within rooms and adds a valuable feature for users to create rooms seamlessly.
Enhanced the speech generation logging to display the word count of the input text instead of the full text. This change prioritizes user privacy and improves log readability. Implemented a new feature to generate descriptions for images within a conversation, expanding the bot's capabilities. Also, refactor `BaseTool` class to securely access arguments through `.get` method and to include `messages` by default, ensuring graceful handling of missing arguments.
Enhanced the audio processing in speech-to-text conversion by converting the input audio to MP3 format before transcription. The logging now reflects the word count of the recognized text, providing clearer insight into the output. This should improve compatibility with the transcription service and result in more accurate transcriptions.
Introduced a new 'datetime' tool to the gptbot, which provides the current date and time in UTC. This enhancement caters to the need for time-related queries within the bot's functionality, expanding its utility for users dealing with time-sensitive information.
Temporarily commented out callbacks for test responses, event handling, and encrypted messages to focus on core functionality stabilization. This change aims to simplify the debugging process and enhance the reliability of active features during the development phase. Encryption handling will be reintroduced after refining base features.
Refined the exception details in the Wikipedia tool to include the search query when no results are found, enhancing the clarity of error outputs for end-users. This change helps in debugging by indicating the exact query that led to a no-results situation. Additionally, the existing failure-to-connect error message was left as-is, maintaining accurate API connectivity diagnostics.
Refactor the message concatenation logic within the chat response to ensure the original final message remains intact at the end of the sequence. Introduce a new 'Wikipedia' tool to the bot's capabilities, allowing users to query and retrieve information from Wikipedia directly through the bot's interface. This enhancement aligns with efforts to provide a more informative and interactive user experience.
Eliminated a print statement that was outputting the API request URL in the weather fetching tool, ensuring sensitive key information is not displayed in logs. This increases security by preventing potential API key exposure.
Eliminated the printing of traceback in the exception handling block when the GPTBot encounters an error calling a tool. This change cleans up the logs by removing a redundant error output since relevant information is already being logged. The update aims to enhance the clarity and readability of the logs in case of tool calling errors.
Refactored `call_tool` to pass `room` and `user` for improved context during tool execution.
Introduced `Handover` and `StopProcessing` exceptions to better control the flow when calling tools involves managing exceptions and handovers between tools and text generation.
Enabled flexibility with `room` param in sending images and files, now accepting both `MatrixRoom` and `str` types.
Updated `generate_chat_response` in OpenAI class to incorporate tool usage flag and more pruned message handling for tool responses.
Introduced `orientation` option for image generation to specify landscape or portrait.
Implemented two new tool classes, `Imagine` and `Imagedescription`, to streamline image creation and description processes accordingly.
This improved error handling and additional granularity in tool invocation ensure that the bot behaves more predictably and transparently, particularly when interacting with generative AI and handling dialogue. The flexibility in both response and image generation caters to a wider range of user inputs and scenarios, ultimately enhancing the bot's user experience.
This commit adds functionality to call tools within the chat completion model. By introducing the `call_tool()` method in the `GPTBot` class, tools can now be invoked with the appropriate tool call. The commit also includes the necessary changes in the `OpenAI` class to handle tool calls during response generation. Additionally, new tool classes for geocoding and dice rolling have been implemented. This enhancement aims to expand the capabilities of the bot by allowing users to leverage various tools directly within the chat conversation.
This change adds support for voice input and output to the GPTbot. Users can enable this feature using the new `!gptbot roomsettings` command. Voice input and output are currently supported via OpenAI's TTS and Whisper models. However, note that voice input may be unreliable at the moment. This enhancement expands the capabilities of the bot, allowing users to interact with it using their voice. This addresses the need for a more user-friendly and natural way of communication.
- Replaced synchronous room check with asynchronous room check using `await`.
- Updated the code to use the `await` keyword before calling `self.room_uses_assistant(room)`.
- This change enables the code to generate assistant response asynchronously.
This commit adds a new method `room_uses_assistant` to the OpenAI class. This method allows checking whether a given room uses an assistant. It uses the `room_settings` table in the database to determine if the specified room has the `openai_assistant` setting.
The commit modifies the image generation code in the OpenAI class. The size and model of the generated image can now be dynamically set based on the provided prompt. The code has been refactored to handle different image sizes and models correctly.
- Fixed bot command prefix recognition to include prefixes starting with an asterisk (= edited messages)
- Added handling of ignoring bot commands in the '_last_n_messages' method.
Previously, there was no option to specify the model for image generation in the OpenAI configuration. This commit adds a new option called "ImageModel" where you can specify the desired model. The default value for this option is "dall-e-2".
In the `GPTBot` class, the OpenAI object is now initialized with the `ImageModel` option if it is provided in the configuration. This allows the bot to use the specified image generation model in addition to the chat model.
Furthermore, in the `OpenAI` class, the `image_api` attribute has been renamed to `image_model` to reflect its purpose more accurately. The default value has also been updated to "dall-e-2" to align with the new configuration option.
This commit ensures that the OpenAI configuration is up-to-date and allows users to specify the desired image generation model.