Introduced the `ForceVision` configuration option to allow third-party models to be used for image recognition within the OpenAI setup. This broadens the flexibility and applicability of the bot's image processing capabilities by no longer restricting recognition to a predefined set of vision models. Also added missing properties to the `OpenAI` class to provide comprehensive control over the bot's behavior, including options for forcing vision and tool usage, along with emulating tool capabilities in models that do not officially support them. These enhancements make the bot more adaptable to various models and user needs, especially for self-hosted setups.
Additionally, updated the documentation and incremented the version to 0.3.12 to reflect these changes and improvements.
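A minimal sketch of how such flags might be surfaced, assuming a configparser-backed `OpenAI` section; only the `ForceVision` and `EmulateTools` option names come from entries in this changelog, and the surrounding class shape is illustrative:

```python
from configparser import ConfigParser


class OpenAI:
    """Illustrative excerpt: configuration-backed behaviour flags."""

    def __init__(self, config: ConfigParser):
        self._config = config

    @property
    def force_vision(self) -> bool:
        # Attempt image recognition even if the configured model is not
        # one of the predefined vision models.
        return self._config["OpenAI"].getboolean("ForceVision", fallback=False)

    @property
    def emulate_tools(self) -> bool:
        # Emulate tool usage for models without native tool support.
        return self._config["OpenAI"].getboolean("EmulateTools", fallback=False)
```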
Refactored the handling of AI providers to support multiple AI services efficiently, introducing a `BaseAI` class from which all AI providers now inherit. This change modernizes our approach to AI integration, providing a more flexible and maintainable architecture for future expansions and enhancements.
- Adopted `gpt-4o` and `dall-e-3` as the default models for chat and image generation, respectively, aligning with the latest advancements in AI capabilities.
- Integrated `ruff` as a development dependency to enforce coding standards and improve code quality through consistent linting.
- Removed unused API keys and sections from `config.dist.ini` to streamline configuration management and clarify setup processes for new users.
- Updated the command line tool for improved usability and fixed previous issues preventing its effective operation.
- Enhanced OpenAI integration with advanced settings for temperature, top_p, frequency_penalty, and presence_penalty, enabling finer control over AI-generated outputs.
This comprehensive update not only enhances the bot's performance and usability but also lays the groundwork for incorporating additional AI providers, ensuring the project remains at the forefront of AI-driven chatbot technologies.
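A sketch of the provider abstraction described above; the `BaseAI` name and the default models are taken from this entry, while the method names and signatures are assumptions:

```python
from abc import ABC, abstractmethod


class BaseAI(ABC):
    """Common interface that every AI provider inherits from (illustrative)."""

    def __init__(self, bot, config):
        self.bot = bot
        self.config = config

    @abstractmethod
    async def generate_chat_response(self, messages: list) -> str:
        """Produce a chat completion for the given message history."""

    @abstractmethod
    async def generate_image(self, prompt: str) -> bytes:
        """Produce an image for the given prompt."""


class OpenAI(BaseAI):
    async def generate_chat_response(self, messages: list) -> str:
        ...  # call the chat completions endpoint (default model: gpt-4o)

    async def generate_image(self, prompt: str) -> bytes:
        ...  # call the images endpoint (default model: dall-e-3)
```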
Resolves #13
Refactored the main execution pathway to introduce a `main_sync` function that wraps the existing asynchronous `main` function, facilitating compatibility with environments that necessitate or prefer synchronous execution. This change enhances the bot's flexibility in various deployment scenarios without altering the core asynchronous functionality.
In addition, expanded the exception handling in `get_version` to catch all exceptions instead of limiting it to `DistributionNotFound`. This makes version retrieval more robust, ensuring the application can gracefully handle unexpected issues during version lookup.
Whitespace adjustments improve code readability by clearly separating function definitions.
These adjustments contribute to the maintainability and operability of the application, allowing for broader usage contexts and easier integration into diverse environments.
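A minimal sketch of the wrapper, assuming the existing coroutine is called `main`; `asyncio.run` is one straightforward way to provide the synchronous entry point:

```python
import asyncio


async def main():
    ...  # existing asynchronous entry point


def main_sync():
    # Blocking wrapper for environments (e.g. console-script entry points)
    # that need or prefer synchronous execution.
    asyncio.run(main())


if __name__ == "__main__":
    main_sync()
```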
This commit removes unnecessary imports across several modules, enhancing code readability and potentially improving performance. Notably, `KeysUploadError` and `requests` were removed where no longer used, reflecting a cleaner dependency structure. Furthermore, logging calls have been standardized, removing dynamic string generation in favor of static messages. This change not only makes the logs more consistent but also slightly reduces the computational overhead associated with log generation. The removal of unused type hints also contributes to a more focused and maintainable code base.
Additionally, the commit includes minor text adjustments for user messages, replacing dynamic content with fixed strings where the dynamism was not needed. This enhances both the clarity and security of user-directed messages by avoiding unnecessary string formatting operations.
Finally, the simplification of the migration script and the adjustment in the tools module underscore an ongoing effort to maintain clean and efficient code infrastructure.
Enhanced bot flexibility by enabling the specification of room IDs in the allowed users' list, broadening access control capabilities. This change allows for more granular control over who can interact with the bot, particularly useful in scenarios where the bot's usage needs to be restricted to specific rooms. Additionally, updated documentation and configurations reflect the inclusion of new AI models and self-hosted API support, catering to a wider range of use cases and setups. The README.md and config.dist.ini files have been updated to offer clearer guidance on setup, configuration, and troubleshooting, aiming to improve user experience and ease of deployment.
- Introduced the ability for room-specific bot access, enhancing user and room management flexibility.
- Expanded AI model support, including `gpt-4o` and `ollama`, increasing the bot's versatility and the range of supported scenarios.
- Updated Python version compatibility to 3.12 to ensure users are leveraging the latest language features and improvements.
- Improved troubleshooting documentation to assist users in resolving common issues more efficiently.
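A hedged sketch of the access check this enables, assuming the allow list mixes Matrix user IDs and room IDs; the function name is illustrative:

```python
def is_allowed(allowed_users: list[str], user_id: str, room_id: str) -> bool:
    # An empty allow list keeps the bot open to everyone.
    if not allowed_users:
        return True
    # Entries may be user IDs (@user:example.com) or room IDs
    # (!room:example.com); either grants access.
    return user_id in allowed_users or room_id in allowed_users
```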
Introduces logging for cases where OpenAI's API returns an empty response, ensuring that such occurrences are captured for debugging purposes. This change enhances visibility into the interaction with OpenAI's endpoint, facilitating easier identification and resolution of issues where empty responses are received, potentially indicating API limitations, network issues, or unexpected behavior from the AI model.
This update allows users to provide a location name for their weather reports, which can be useful when requesting weather information for specific locations.
When processing large volumes of data, it is essential to handle errors gracefully and provide clear feedback to users. This change introduces additional checks to ensure robust error handling during user authentication, reducing the likelihood of errors propagating further down the pipeline.
This improvement not only enhances the overall stability of the system but also improves the user experience by surfacing more informative error messages when an issue occurs.
This update introduces the ability for the bot to use a Matrix UserID and password for authentication, in addition to the existing Access Token method. Upon the first run with UserID and password, the bot automatically converts these credentials into an Access Token, updates the configuration with this token, and removes the password for security purposes. This enhancement simplifies the initial setup process for new users by directly utilizing Matrix login credentials, aligning with common user authentication workflows and enhancing security by not storing passwords long-term.
Refactored the bot initialization process in `GPTBot.from_config` to support dynamic login method selection based on provided credentials, and implemented automatic configuration updating to reflect the newly obtained Access Token and cleaned credentials.
Minor adjustments include formatting and comment clarification for better readability and maintenance.
This change addresses the need for a more straightforward and secure authentication process for bot deployment and user experience improvement.
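A sketch of the first-run conversion using matrix-nio (which the bot builds on); apart from the `login` call, the names and the configuration handling are illustrative:

```python
from nio import AsyncClient, LoginResponse


async def obtain_access_token(homeserver: str, user_id: str, password: str) -> str:
    client = AsyncClient(homeserver, user_id)
    response = await client.login(password)
    await client.close()
    if isinstance(response, LoginResponse):
        # The access token replaces the password in the stored
        # configuration; the password itself is then removed.
        return response.access_token
    raise RuntimeError(f"Login failed: {response}")
```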
Transitioned from the deprecated `pkg_resources` to `importlib.metadata` for package version retrieval, improving performance and future compatibility.
- Included the `ffmpeg` package in the Docker environment to support multimedia content processing.
- Added `trackingmore-api-tool` as a dependency to expand the bot's functionality with tracking capabilities.
- Adjusted the `all` dependencies list in `pyproject.toml` to include the `trackingmore` module, indicating a broader feature set for the application.
- Updated the bot class to prepare for integrating `TrackingMore` alongside existing services like `OpenAI` and `WolframAlpha`, highlighting an intention to make such integrations configurable in the future.
This enhancement enables the bot to interact with multimedia content more effectively and introduces package tracking features, laying groundwork for configurable service integrations.
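After the switch, the version lookup reduces to something like this (the distribution name shown is hypothetical, and the broad exception handling mirrors the earlier `get_version` change):

```python
from importlib.metadata import version


def get_version() -> str | None:
    try:
        return version("matrix-gptbot")  # hypothetical distribution name
    except Exception:
        # Any failure during lookup degrades gracefully instead of
        # crashing the application.
        return None
```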
Removed an extraneous log statement that recorded the first message content in the OpenAI class. This change streamlines the logging process by eliminating unnecessary log clutter, improving the readability of logs and potentially enhancing performance by reducing I/O operations on the logging system. This adjustment is pivotal for maintaining a clean and efficient codebase, especially in production environments where excessive logging can lead to inflated log sizes and make troubleshooting more challenging.
Upgraded project version to 0.3.6 to introduce a critical fix for message type detection failing on certain messages. This version also amends the package directory structure for improved organization, moving from `src/gptbot` to just `gptbot`. Additionally, updated the CHANGELOG to reflect this fix and organizational change, ensuring that it stays current with the project's progress.
- Fixes message type detection issue
Improve event type determination in the message fetching logic by adding try-except blocks to handle AttributeError and KeyError exceptions gracefully. This change allows the bot to continue processing other events if it encounters an event without a recognizable type or msgtype, avoiding premature termination and enhancing its ability to process diverse event streams more robustly. This approach also includes logging unprocessable events for debugging purposes, providing clearer insights into event handling anomalies.
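In essence, the fetching logic now degrades per event rather than aborting; a sketch, with the field access on the events being an assumption:

```python
import logging


def filter_processable(events: list) -> list:
    """Keep only events whose type and msgtype can be determined (sketch)."""
    processable = []
    for event in events:
        try:
            event_type = event.source["type"]
            msgtype = event.source["content"].get("msgtype")
        except (AttributeError, KeyError):
            # Unprocessable events are logged for debugging and skipped
            # instead of terminating the whole fetch.
            logging.debug("Skipping unprocessable event: %s", event)
            continue
        processable.append((event_type, msgtype, event))
    return processable
```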
Fixes #5
Conditional avatar update
Introduces a new method `get_state_event` to asynchronously retrieve
state events for a given room and event type, enhancing the bot's
ability to fetch specific room states before performing actions. This
functionality is leveraged to conditionally update room avatars only if
they are not already set, reducing unnecessary updates and improving
efficiency. Additionally, the commit includes minor formatting
adjustments for better code readability.
Refactoring the avatar updating process to assess the current state
before action prevents redundant network calls and aligns with optimal
resource usage practices, contributing to a smoother operation and
potentially reducing the workload on the server.
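A hedged sketch of the pattern, with matrix-nio's `room_get_state_event` doing the lookup; the `get_state_event` wrapper comes from this entry, and the avatar handling around it is illustrative:

```python
from nio import RoomGetStateEventResponse


async def get_state_event(self, room_id: str, event_type: str, state_key: str = ""):
    response = await self.matrix_client.room_get_state_event(
        room_id, event_type, state_key
    )
    if isinstance(response, RoomGetStateEventResponse):
        return response.content
    return None


async def update_room_avatar(self, room_id: str):
    # Only upload and set an avatar if the room does not already have one,
    # avoiding redundant network calls.
    current = await self.get_state_event(room_id, "m.room.avatar")
    if current and current.get("url"):
        return
    ...  # upload the avatar and send the m.room.avatar state event
```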
Enhanced event processing in the bot's message retrieval logic to
improve message relevance and accuracy. Changes include accepting all
'gptbot'-prefixed events, matching the 'ignoreolder' command exactly
rather than by prefix, and passing through custom commands that start
with '!'. Notices are now excluded by default unless explicitly
included. This update allows for more precise command interactions and
reduces clutter from irrelevant notices.
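The filtering described above might look roughly like this; the helper name and the field access are assumptions:

```python
def should_include(event, include_notices: bool = False) -> bool:
    source = event.source
    content = source.get("content", {})
    body = content.get("body", "")

    if source.get("type", "").startswith("gptbot"):
        return True                      # accept all gptbot-prefixed events
    if body == "!gptbot ignoreolder":
        return True                      # exact match, not a prefix check
    if body.startswith("!"):
        return True                      # pass custom commands through
    if content.get("msgtype") == "m.notice":
        return include_notices           # notices excluded by default
    return True
```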
Standardize the passing of the 'messages' argument across various calls to
generate_chat_response method to ensure consistency and prevent
potential bugs in the GPT bot's response generation. The 'model'
parameter in one instance has been corrected to 'original_model' for
proper context loading. These changes improve code clarity and maintain
the intended message flow within the bot's conversation handling.
Refactored initialization of OpenAI APIs to correct a redundancy and
enhance clarity. Improved content extraction logic for robust handling
of various message types. Enhanced logging and messaging by including
user and room context, facilitating better debugging and user
interaction. Extended `send_message` to support custom message types,
allowing for richer interaction within the chat ecosystem. Updated
hardcoded chat models to leverage newer versions for potentially more
accurate tool overrides. Fixed async method call in recursion handling
to ensure proper response generation. Finally, increased message history
retrieval limit based on the `max_messages` attribute for more effective
conversation context.
Resolves issues with message context and enhances user feedback during
operations.
Improved the error handling in the OpenAI class to prevent infinite recursion issues by retaining the original chat model during recursive calls. Enhanced logging within the recursion depth check for better debugging and traceability. Ensured consistency in chat responses by passing the initial model reference throughout the entire call stack. This is crucial when fallbacks due to errors or tool usage occur.
Refactored code for clarity and readability, ensuring that any recursion retains the original model and tool parameters. Additionally, proper logging and condition checks now standardize the flow of execution, preventing unintended modifications to the model's state that could lead to incorrect bot behavior.
Introduced the ability to specify and retrieve different OpenAI models on a per-room basis, thereby allowing enhanced customization of the bot's response behavior according to the preferences for each room. Cleaned up code formatting across the bot implementation files for improved readability and maintainability. Additional logic now checks for model overrides when generating responses, ensuring the correct model is used as configured.
Refactors include streamlined database and API initializations and a refined method for processing message formatting to accommodate images, texts, and system messages consistently. This change differentiates default behavior from room-specific configurations, catering to diverse user needs without compromising on default settings.
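A minimal sketch of the per-room lookup, assuming the override lives in the bot's SQLite database; the table and column names are hypothetical:

```python
import sqlite3


def room_model(db: sqlite3.Connection, room_id: str, default_model: str) -> str:
    # Return the model configured for this room, or the global default.
    row = db.execute(
        "SELECT value FROM room_settings WHERE room_id = ? AND setting = 'model'",
        (room_id,),
    ).fetchone()
    return row[0] if row else default_model
```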
Added a safety check to prevent infinite recursion within the response generation function. When `use_tools` is active, the code now inspects the call stack and terminates the process if a certain recursion depth is exceeded. This ensures that the code is robust against potential infinite loops that could block or crash the service. A default threshold is set with a TODO for revisiting the hard-coded limit, and the recursion detection logs the occurrence for easier debugging and maintenance.
Note: Recursion limit handling may require future adjustments to the `allow_override` parameter based on real-world feedback or testing.
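A sketch of the guard, assuming the depth is derived from the interpreter call stack as described; the threshold and the logging are illustrative:

```python
import inspect

MAX_DEPTH = 3  # TODO: hard-coded limit, revisit as noted above


def within_recursion_limit(logger) -> bool:
    # Count how many generate_chat_response frames are already on the
    # stack; abort once the threshold is exceeded to avoid infinite loops.
    depth = sum(
        1 for frame in inspect.stack() if frame.function == "generate_chat_response"
    )
    if depth > MAX_DEPTH:
        logger.warning("Recursion depth exceeded, aborting tool usage")
        return False
    return True
```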
Introduced a configuration option to emulate tool usage in models that do not natively support tools, facilitating the use of such functionality without direct model support. This should benefit users aiming to leverage tool-based features without relying on specific AI models. Additionally, enhanced error logging in the GPTBot class by including traceback details, aiding in debugging and incident resolution.
- Added `EmulateTools` option in the `config.dist.ini` for flexible tool handling.
- Enriched error logging with stack traces in `bot.py` for better insight during failures.
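In rough outline, the flag might steer response generation like this; every name below other than `EmulateTools` is hypothetical, and the emulation path itself is only hinted at:

```python
async def generate_response(self, messages, model):
    if self.supports_tools(model):
        return await self.generate_with_native_tools(messages, model)
    if self.emulate_tools:
        # Hypothetical emulation path: describe the available tools to the
        # model in plain text and parse its reply for tool invocations.
        return await self.generate_with_emulated_tools(messages, model)
    return await self.generate_plain(messages, model)
```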
Introduced additional debug log entries in the `GPTBot` class to provide clarity on the initial sync and callback setup process. This helps with monitoring and troubleshooting during the early stages of bot deployment, making it easier to pinpoint issues around bot startup and room joining behavior.
Bumped project version to 0.3.3-dev to signal ongoing development.
Resolved an issue that prevented the bot from responding when files were uploaded to encrypted rooms by implementing a workaround. The bot now tries to generate text from uploaded files and logs errors without interrupting the message flow. Upgraded the Pantalaimon dependency to ensure compatibility. Also, refined the message processing logic to handle different message types correctly and made the download_file method asynchronous to match the matrix client's expected behavior. Additionally, updated the changelog and bumped the project version to reflect these fixes and improvements.
Known issues have been documented, including a limitation when using Pantalaimon where the bot cannot download/use files uploaded to encrypted rooms.
Integrated Pantalaimon support with updated configuration instructions and examples, facilitating secure communication when using the Matrix homeserver. The .gitignore is now extended to exclude a Pantalaimon configuration file, preventing sensitive information from accidental commits. Removed encryption callbacks and related functions as the application leverages Pantalaimon for E2EE, simplifying the codebase and shifting encryption responsibilities externally. Streamlined dependency management by removing the requirements.txt in favor of pyproject.toml, aligning with modern Python practices. This change overall improves security handling and eases future maintenance.
Resolved a syntax error in the allowed_users property within the GPTBot class by adding the missing 'self' parameter. This correction ensures the proper functioning of the property method, enabling the bot to correctly retrieve the list of users authorized to use it.
Extended the condition for the audio message handling in the chatbot to recognize MP3 audio files sent as file attachments. This ensures that MP3 files will be properly processed as audio messages, improving the bot's media handling capabilities. This is just a test at this point, and may be rolled back.
Migrated several hardcoded bot configuration settings to dynamic properties with fallbacks, enhancing flexibility and customization. The properties now read from a configuration file, allowing changes without code modification. Simplified the instantiation logic by removing immediate attribute setting in favor of lazy-loaded properties. Additionally, prepared to segregate OpenAI-related settings into a dedicated class (noted with TODO comments).
Note: Verify the presence of necessary configuration parameters or include defaults to prevent potential runtime issues.
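The lazy-property pattern described above, sketched with configparser; the section and option names are placeholders:

```python
from configparser import ConfigParser


class GPTBot:
    def __init__(self, config: ConfigParser):
        self.config = config

    @property
    def display_name(self) -> str:
        # Read lazily from the configuration file, with a fallback so the
        # bot still starts when the option is absent.
        return self.config["GPTBot"].get("DisplayName", fallback="GPTBot")

    @property
    def max_messages(self) -> int:
        return self.config["GPTBot"].getint("MaxMessages", fallback=20)
```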
Added options to extract specific info and summarize content from Wikipedia pages within the gptbot's Wikipedia tool. The 'extract' option enables partial retrieval of page data based on a user-defined string, leveraging the bot's existing chat API for extraction. The 'summarize' option allows users to get concise versions of articles, again utilizing the bot's chat capabilities. These additions provide users with more granular control over the information they receive, potentially reducing response clutter and focusing on user-specified interests.
Cast user objects to strings to standardize ID handling across API calls. Enhanced logging statements now include user and room context, providing better traceability for response generation. Also, refined error handling for API token limits by falling back to an altered response flow, removing tool roles from messages when a max token error occurs, before reattempting. This targets more graceful handling of response generation without tool assistance when constraints are hit.
Introduced error handling for the 'max_tokens' exception during chat response generation. In cases where the maximum token count is exceeded, the bot now falls back to a no-tools response, avoiding the halt of conversation. This ensures a smoother chat experience and persistence in responding, even when a limit is reached. Any other exceptions will still be raised as before, maintaining error transparency.
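A hedged sketch of the fallback path; the exception check and the message pruning are approximations of the behaviour described:

```python
async def respond_with_fallback(self, messages):
    try:
        return await self.generate_chat_response(messages, use_tools=True)
    except Exception as exc:
        if "max_tokens" not in str(exc):
            raise  # other errors keep propagating as before
        # Drop tool-role messages and retry without tools so the
        # conversation can continue despite the token limit.
        pruned = [m for m in messages if m.get("role") != "tool"]
        return await self.generate_chat_response(pruned, use_tools=False)
```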
- Introduce error handling for the keys upload process, logging failures to assist with troubleshooting.
- Improve exception handling in the OpenAI class by returning a more informative response based on the exception arguments if available.
- Replace a return statement in the Newroom tool with an exception raise to standardize tool action termination and provide clearer flow control.
Resolves issue with silent key upload failures. Refines response and control flow for better clarity and debugging.
Enabled asynchronous key upload in the roommember callback to improve efficiency. Fixed the chat response generation by properly referencing the event sender rather than the room ID, aligning user context with chat messages. Corrected the user parameter misuse in the OpenAI class to utilize the room ID. Extended the toolkit to include a 'newroom' feature for creating and setting up new Matrix rooms, thereby enhancing bot functionality.
This commit significantly improves bot response times and contextual accuracy while interacting within rooms and adds a valuable feature for users to create rooms seamlessly.
Enhanced the speech generation logging to display the word count of the input text instead of the full text. This change prioritizes user privacy and improves log readability. Implemented a new feature to generate descriptions for images within a conversation, expanding the bot's capabilities. Also refactored the `BaseTool` class to access arguments safely through the `.get` method and to include `messages` by default, ensuring graceful handling of missing arguments.
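A sketch of the refactored base class; only the `.get` access and the `messages` default are taken from this entry, the remaining attributes are assumptions:

```python
class BaseTool:
    def __init__(self, **kwargs):
        self.kwargs = kwargs
        self.bot = kwargs.get("bot")
        self.room = kwargs.get("room")
        self.user = kwargs.get("user")
        # Message history is available to every tool by default; a missing
        # argument degrades to an empty list instead of raising KeyError.
        self.messages = kwargs.get("messages", [])

    async def run(self):
        raise NotImplementedError
```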
Enhanced the audio processing in speech-to-text conversion by converting the input audio to MP3 format before transcription. The logging now reflects the word count of the recognized text, providing clearer insight into the output. This should improve compatibility with the transcription service and result in more accurate transcriptions.
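One way to do the conversion, assuming pydub on top of the ffmpeg dependency already present in the image; the helper itself is illustrative:

```python
from io import BytesIO

from pydub import AudioSegment  # relies on ffmpeg being installed


def to_mp3(audio_bytes: bytes) -> bytes:
    # Convert arbitrary input audio to MP3 before handing it to the
    # transcription endpoint.
    segment = AudioSegment.from_file(BytesIO(audio_bytes))
    out = BytesIO()
    segment.export(out, format="mp3")
    return out.getvalue()
```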
Introduced a new 'datetime' tool to the gptbot, which provides the current date and time in UTC. This enhancement caters to the need for time-related queries within the bot's functionality, expanding its utility for users dealing with time-sensitive information.
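The tool itself is tiny; a sketch reusing the `BaseTool` shape outlined earlier, with the class name and description being illustrative:

```python
from datetime import datetime, timezone


class Datetime(BaseTool):  # BaseTool as sketched above
    DESCRIPTION = "Returns the current date and time in UTC."

    async def run(self) -> str:
        return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
```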
Temporarily commented out callbacks for test responses, event handling, and encrypted messages to focus on core functionality stabilization. This change aims to simplify the debugging process and enhance the reliability of active features during the development phase. Encryption handling will be reintroduced after refining base features.
Refined the exception details in the Wikipedia tool to include the search query when no results are found, enhancing the clarity of error outputs for end-users. This change helps in debugging by indicating the exact query that led to a no-results situation. Additionally, the existing failure-to-connect error message was left as-is, maintaining accurate API connectivity diagnostics.
Refactor the message concatenation logic within the chat response to ensure the original final message remains intact at the end of the sequence. Introduce a new 'Wikipedia' tool to the bot's capabilities, allowing users to query and retrieve information from Wikipedia directly through the bot's interface. This enhancement aligns with efforts to provide a more informative and interactive user experience.
Eliminated a print statement that was outputting the API request URL in the weather fetching tool, ensuring sensitive key information is not displayed in logs. This increases security by preventing potential API key exposure.
Eliminated the printing of traceback in the exception handling block when the GPTBot encounters an error calling a tool. This change cleans up the logs by removing a redundant error output since relevant information is already being logged. The update aims to enhance the clarity and readability of the logs in case of tool calling errors.
Refactored `call_tool` to pass `room` and `user` for improved context during tool execution.
Introduced `Handover` and `StopProcessing` exceptions to better manage control flow and handovers between tool calls and text generation.
Enabled flexibility with `room` param in sending images and files, now accepting both `MatrixRoom` and `str` types.
Updated `generate_chat_response` in the OpenAI class to incorporate a tool-usage flag and tighter pruning of message handling for tool responses.
Introduced `orientation` option for image generation to specify landscape or portrait.
Implemented two new tool classes, `Imagine` and `Imagedescription`, to streamline image creation and description processes accordingly.
The improved error handling and additional granularity in tool invocation ensure that the bot behaves more predictably and transparently, particularly when interacting with generative AI and handling dialogue. The flexibility in both response and image generation caters to a wider range of user inputs and scenarios, ultimately enhancing the bot's user experience.
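A condensed sketch of the control-flow exceptions and how `call_tool` might use them; only the `Handover` and `StopProcessing` names come from this entry, the handling shown is an assumption:

```python
class Handover(Exception):
    """Raised by a tool to hand its result back to text generation."""


class StopProcessing(Exception):
    """Raised by a tool to stop any further processing."""


async def call_tool(self, tool_name: str, room, user, **arguments):
    tool = self.tools[tool_name](room=room, user=user, **arguments)
    try:
        return await tool.run()
    except StopProcessing:
        return None  # the tool already produced its output
    except Handover as handover:
        # Fall back to plain text generation with whatever context the
        # tool provided.
        return await self.generate_text_response(room, user, str(handover))
```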
This commit adds functionality to call tools within the chat completion model. By introducing the `call_tool()` method in the `GPTBot` class, tools can now be invoked with the appropriate tool call. The commit also includes the necessary changes in the `OpenAI` class to handle tool calls during response generation. Additionally, new tool classes for geocoding and dice rolling have been implemented. This enhancement aims to expand the capabilities of the bot by allowing users to leverage various tools directly within the chat conversation.
This change adds support for voice input and output to the GPTbot. Users can enable this feature using the new `!gptbot roomsettings` command. Voice input and output are currently supported via OpenAI's TTS and Whisper models. However, note that voice input may be unreliable at the moment. This enhancement expands the capabilities of the bot, allowing users to interact with it using their voice. This addresses the need for a more user-friendly and natural way of communication.
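For reference, the two directions map onto the current OpenAI Python SDK roughly as follows; this shows the SDK shape rather than the bot's own wrappers, and the model and voice names are common SDK choices rather than values confirmed by this entry:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def text_to_speech(text: str) -> bytes:
    speech = client.audio.speech.create(model="tts-1", voice="alloy", input=text)
    return speech.content


def speech_to_text(mp3_bytes: bytes) -> str:
    transcription = client.audio.transcriptions.create(
        model="whisper-1", file=("audio.mp3", mp3_bytes)
    )
    return transcription.text
```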
- Replaced synchronous room check with asynchronous room check using `await`.
- Updated the code to use the `await` keyword before calling `self.room_uses_assistant(room)`.
- This change enables the code to generate the assistant response asynchronously.