Introduced the capability to handle video files as input for AI models that support it, enhancing the bot's versatility in processing media. This update includes a new configuration option to enable or disable video input, catering to different model capabilities. Additionally, integrated Google's Generative AI through the addition of a Google dependency and a corresponding AI class implementation. This move broadens the AI options available, providing users with more flexibility in choosing their desired AI backend. The update involves refactoring and simplifying message preparation and handling, ensuring compatibility and extending functionality to include the new video input feature and Google AI support.
- Added `ForceVideoInput` configuration option to toggle video file processing.
- Integrated Google Generative AI as an optional dependency and included it in the bot's AI choices.
- Implemented a unified method for preparing messages for AI processing, streamlining how the bot handles various message types.
- Removed obsolete code related to message truncation and specialized handling for images, files, and audio, reflecting a shift towards a more flexible and generalized message processing approach.
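A minimal sketch of how a boolean toggle such as `ForceVideoInput` could gate video parts during message preparation. The `OpenAI` section name, the part dictionaries, and both helper functions are assumptions made for illustration; they are not taken from the bot's code.

```python
from configparser import ConfigParser


def video_input_enabled(config: ConfigParser, section: str = "OpenAI") -> bool:
    # The section name is an assumption; a missing section or option means "disabled".
    return config.getboolean(section, "ForceVideoInput", fallback=False)


def filter_parts(parts: list, allow_video: bool) -> list:
    # Hypothetical message parts: dicts with a "type" key such as "text" or "video".
    # Video parts are dropped unless the toggle is on; other parts pass through.
    return [part for part in parts if allow_video or part.get("type") != "video"]
```

With the option unset, only non-video parts would reach the model; setting it to a true value lets video parts through unchanged.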
Improved the login logic in the bot's initialization process to require a UserID when a Password is provided for login. This update ensures a more secure and fail-proof login procedure by validating the presence of a UserID before attempting to log in, and by handling LoginError more explicitly with a clear error message. This change addresses the need for better error handling and validation during the bot's login phase to avoid silent failures and improve debuggability.
- Added LoginError import to handle login-related exceptions more gracefully.
- Refined the login process to create the AsyncClient instance with a UserID when password authentication is used, following best practices for client identification.
- Introduced explicit error raising for missing UserID configuration, enhancing configuration validation before attempting a login.
- Improved clarity and security by clearing the password from the configuration post-login, preventing inadvertent storage or reuse.
This update enhances the bot's robustness and configuration validation, ensuring smoother operations and better error handling during the initialization phase.
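The validation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the bot's actual code: the `Matrix` section name, the `AccessToken` option, the `ConfigurationError` class, and the `resolve_login` helper are all hypothetical. The real implementation creates an `AsyncClient` and handles `LoginError`, which is left out here so the sketch runs without extra dependencies.

```python
from configparser import ConfigParser


class ConfigurationError(Exception):
    """Hypothetical exception for an invalid bot configuration."""


def resolve_login(config: ConfigParser) -> dict:
    # "Matrix" as the section name is an assumption for this sketch.
    matrix = config["Matrix"]

    if "Password" in matrix:
        # Password login needs a UserID so the client can identify itself.
        if "UserID" not in matrix:
            raise ConfigurationError("UserID is required when Password is set")
        return {"method": "password", "user_id": matrix["UserID"]}

    if "AccessToken" in matrix:
        # Token login; UserID stays optional in this sketch.
        return {"method": "token", "user_id": matrix.get("UserID")}

    raise ConfigurationError("Either AccessToken or Password must be configured")
```

Failing early with an explicit exception is what turns a silent login failure into a clear configuration error.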
Introduced changes to the tool request behavior and image processing. Now, the configuration allows a dedicated model for tool requests (`ToolModel`) and enforces automatic resizing of context images to maximal dimensions, improving compatibility and performance with the AI model. The update shifts away from a rigid tool model use, accommodating varied model support for tool requests, and optimizes image handling for network and processing efficiency. These adjustments aim to enhance user experience with more flexible tool usage and efficient image handling in chat interactions.
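As a rough illustration of resizing to maximal dimensions, the sketch below computes target dimensions that fit inside a bounding box while keeping the aspect ratio. The function name and the example limits are assumptions; the commit does not state the actual limits or which imaging library performs the resize.

```python
def fit_within(width: int, height: int, max_width: int, max_height: int) -> tuple:
    """Return (width, height) scaled down to fit the box, never scaled up."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")

    # Use the tighter of the two constraints; cap at 1.0 so small images stay as-is.
    scale = min(max_width / width, max_height / height, 1.0)

    # Guard against a dimension rounding down to zero on extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Images already inside the box are returned unchanged, so only oversized context images would cost a re-encode.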
This commit simplifies the pyproject.toml structure for better readability and maintenance. Key changes include formatting author and license information, consolidating dependency lists into a more concise format, and adding the `future` package to dependencies to ensure forward-compatibility. Optional dependencies are now listed in a more compact style, and the development dependencies section has been cleaned up. These adjustments make the project configuration cleaner and more accessible, facilitating future updates and dependency management.
Introduced the `ForceVision` configuration option to allow usage of third-party models for image recognition within the OpenAI setup. This change broadens the flexibility and applicability of the bot's image processing capabilities by not restricting to predefined vision models only. Also, added missing properties to the `OpenAI` class to provide comprehensive control over the bot's behavior, including options for forcing vision and tools usage, along with emulating tool capabilities in models not officially supporting them. These enhancements make the bot more adaptable to various models and user needs, especially for self-hosted setups.
Additionally, updated the documentation and incremented the version to 0.3.12 to reflect these changes and improvements.
Refactored the handling of AI providers to support multiple AI services efficiently, introducing a `BaseAI` class from which all AI providers now inherit. This change modernizes our approach to AI integration, providing a more flexible and maintainable architecture for future expansions and enhancements.
- Adopted `gpt-4o` and `dall-e-3` as the default models for chat and image generation, respectively, aligning with the latest advancements in AI capabilities.
- Integrated `ruff` as a development dependency to enforce coding standards and improve code quality through consistent linting.
- Removed unused API keys and sections from `config.dist.ini` to streamline configuration management and clarify setup processes for new users.
- Updated the command line tool for improved usability and fixed previous issues preventing its effective operation.
- Enhanced OpenAI integration with advanced settings for temperature, top_p, frequency_penalty, and presence_penalty, enabling finer control over AI-generated outputs.
This comprehensive update not only enhances the bot's performance and usability but also lays the groundwork for incorporating additional AI providers, ensuring the project remains at the forefront of AI-driven chatbot technologies.
Resolves #13
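The inheritance structure described in this commit might look roughly like the sketch below. Only the `BaseAI` name and the `generate_chat_response` method name appear in these commit messages; the constructor signature, the synchronous call style, and the `EchoAI` provider are simplifying assumptions for illustration.

```python
from abc import ABC, abstractmethod


class BaseAI(ABC):
    """Common interface that every AI provider is expected to implement."""

    def __init__(self, chat_model: str):
        self.chat_model = chat_model

    @abstractmethod
    def generate_chat_response(self, messages: list) -> str:
        """Return the assistant's reply for a list of chat messages."""


class EchoAI(BaseAI):
    """Hypothetical stand-in provider that repeats the last message."""

    def generate_chat_response(self, messages: list) -> str:
        return messages[-1]["content"] if messages else ""
```

Because the bot would depend only on the base interface, adding a provider means adding one subclass rather than touching the message handling code.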
Enhanced bot flexibility by enabling the specification of room IDs in the allowed users' list, broadening access control capabilities. This change allows for more granular control over who can interact with the bot, particularly useful in scenarios where the bot's usage needs to be restricted to specific rooms. Additionally, updated documentation and configurations reflect the inclusion of new AI models and self-hosted API support, catering to a wider range of use cases and setups. The README.md and config.dist.ini files have been updated to offer clearer guidance on setup, configuration, and troubleshooting, aiming to improve user experience and ease of deployment.
- Introduced the ability for room-specific bot access, enhancing user and room management flexibility.
- Expanded AI model support to include `gpt-4o` and `ollama`, increasing the bot's versatility and application scenarios.
- Updated Python version compatibility to 3.12 to ensure users are leveraging the latest language features and improvements.
- Improved troubleshooting documentation to assist users in resolving common issues more efficiently.
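A minimal sketch of how an allow list containing both user IDs and room IDs could be checked. It relies on the Matrix convention that user IDs start with `@` and room IDs start with `!`. The function name, its arguments, and the choice to deny everyone when the list is empty are assumptions of this sketch, not the bot's documented behavior.

```python
def is_allowed(sender: str, room_id: str, allowed: list) -> bool:
    """Return True if the sender or the room appears in the allow list."""
    for entry in allowed:
        entry = entry.strip()
        # Entries starting with "!" are treated as room IDs.
        if entry.startswith("!") and entry == room_id:
            return True
        # Entries starting with "@" are treated as user IDs.
        if entry.startswith("@") and entry == sender:
            return True
    return False
```

Listing a room ID grants access to everyone in that room, while listing a user ID grants access to that user in any room.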
Renamed `pantalaimon_first_login.py` to `fetch_access_token.py` to better reflect its purpose. Additionally, updated README to remove obsolete instructions for using pantalaimon with the bot.
- Updated the Docker CI/CD workflow to trigger on pushes to the main branch, aligning with standard Git flow practices for production deployment.
- Advanced project version to 0.3.9, marking a new release with consolidated features and bug fixes.
This adjustment ensures that the Docker images are built and deployed in a more streamlined manner, reflecting our shift towards a unified branching strategy for releases. The version bump signifies the stabilization of new functionalities and enhancements for broader usage.
- Included the `ffmpeg` package in the Docker environment to support multimedia content processing.
- Added `trackingmore-api-tool` as a dependency to expand the bot's functionality with tracking capabilities.
- Adjusted the `all` dependencies list in `pyproject.toml` to include the `trackingmore` module, indicating a broader feature set for the application.
- Updated the bot class to prepare for integrating `TrackingMore` alongside existing services like `OpenAI` and `WolframAlpha`, highlighting an intention to make such integrations configurable in the future.
This enhancement enables the bot to interact with multimedia content more effectively and introduces package tracking features, laying groundwork for configurable service integrations.
- Initialized preparations for the unreleased 0.3.9 version in the changelog.
- Updated copyright information to include 2024 and added Private.coffee Team alongside Kumi Mitterer to reflect the collaborative nature of the project going forward.
- Incremented the project version to 0.3.9.dev0 in pyproject.toml to align with upcoming development efforts.
- Modified all references from Kumi's personal repo to the Private.coffee Team's repo in README.md, LICENSE, and pyproject.toml, ensuring future contributions and issues are directed to the correct repository. This change facilitates a broader collaboration platform and acknowledges the team's growing involvement in the project's development.
These updates are critical for the upcoming development phase and for accurately representing the collaborative efforts behind the project.
Bumped project version to 0.3.8 for the next release cycle. Updated Homepage and Bug Tracker URLs to reflect the new hosting location, aiming for improved accessibility and collaboration. Additionally, introduced a Source Code URL for direct access to the repository, facilitating developers' engagement and contributions.
Upgraded project version to 0.3.6 to introduce a critical fix for message type detection failing on certain messages. This version also amends the package directory structure for improved organization, moving from `src/gptbot` to just `gptbot`. Additionally, updated the CHANGELOG to reflect this fix and organizational change, ensuring that it stays current with the project's progress.
- Fixes message type detection issue
Standardize the passing of 'messages' argument across various calls to
generate_chat_response method to ensure consistency and prevent
potential bugs in the GPT bot's response generation. The 'model'
parameter in one instance has been corrected to 'original_model' for
proper context loading. These changes improve code clarity and maintain
the intended message flow within the bot's conversation handling.
Removed the '-dev' suffix from the project version indicating the transition from a development state to the official release of version 0.3.3. This version bump aligns with the completion of features and fixes slated for this iteration.
Introduced additional debug log entries in the `GPTBot` class to provide clarity on the initial sync and callback setup process. This helps with monitoring and troubleshooting during the early stages of bot deployment, making it easier to pinpoint issues around bot startup and room joining behavior.
Bumped project version to 0.3.3-dev to signal ongoing development.
Resolved an issue that prevented the bot from responding when files were uploaded to encrypted rooms by implementing a workaround. The bot now tries to generate text from uploaded files and logs errors without interrupting the message flow. Upgraded the Pantalaimon dependency to ensure compatibility. Also, refined the message processing logic to handle different message types correctly and made the download_file method asynchronous to match the matrix client's expected behavior. Additionally, updated the changelog and bumped the project version to reflect these fixes and improvements.
Known issues have been documented, including a limitation when using Pantalaimon where the bot cannot download/use files uploaded to encrypted rooms.
Upgraded bot features to interpret and respond to text, image, and voice prompts in Matrix rooms using advanced OpenAI models, including vision preview and text-to-speech. Streamlined the installation process: the bot is now available via PyPI, simplifying setup and extending accessibility. Eliminated the planned features section, signaling a shift towards realized functionalities over prospective development.
Configured Pantalaimon as an optional dependency to enable bot use in E2EE rooms while maintaining compatibility with non-encrypted rooms. Removed the trackingmore dependency, indicating a refinement in the feature set towards core functionalities. Version bumped to 0.3.0, signifying major enhancements over the previous iteration.
Integrated Pantalaimon support with updated configuration instructions and examples, facilitating secure communication when using the Matrix homeserver. The .gitignore is now extended to exclude a Pantalaimon configuration file, preventing sensitive information from accidental commits. Removed encryption callbacks and related functions as the application leverages Pantalaimon for E2EE, simplifying the codebase and shifting encryption responsibilities externally. Streamlined dependency management by removing the requirements.txt in favor of pyproject.toml, aligning with modern Python practices. This change overall improves security handling and eases future maintenance.
This commit adds functionality to call tools within the chat completion model. By introducing the `call_tool()` method in the `GPTBot` class, tools can now be invoked with the appropriate tool call. The commit also includes the necessary changes in the `OpenAI` class to handle tool calls during response generation. Additionally, new tool classes for geocoding and dice rolling have been implemented. This enhancement aims to expand the capabilities of the bot by allowing users to leverage various tools directly within the chat conversation.
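The dispatch idea behind tool calls can be sketched as below. In the bot, `call_tool()` is a method of the `GPTBot` class and the tools are classes; here it is simplified to a module-level function with a plain registry. The `roll_dice` function, the `TOOLS` mapping, and the argument names are hypothetical.

```python
import random


def roll_dice(sides: int = 6, rolls: int = 1, rng=None) -> dict:
    """Hypothetical dice tool: roll `rolls` dice with `sides` sides each."""
    if sides < 2 or rolls < 1:
        raise ValueError("need at least 2 sides and 1 roll")
    rng = rng or random.Random()
    results = [rng.randint(1, sides) for _ in range(rolls)]
    return {"results": results, "total": sum(results)}


# Registry mapping the tool name requested by the model to a callable.
TOOLS = {"dice": roll_dice}


def call_tool(name: str, arguments: dict) -> dict:
    """Look up the requested tool and invoke it with the given arguments."""
    try:
        tool = TOOLS[name]
    except KeyError:
        raise ValueError(f"Unknown tool: {name}") from None
    return tool(**arguments)
```

A call such as `call_tool("dice", {"sides": 20})` would return the rolled values, which could then be passed back to the model as the tool result.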
This change adds support for voice input and output to GPTBot. Users can enable this feature using the new `!gptbot roomsettings` command. Voice input and output are currently supported via OpenAI's TTS and Whisper models. Note, however, that voice input may be unreliable at the moment. This enhancement expands the capabilities of the bot, allowing users to interact with it using their voice, and addresses the need for a more user-friendly and natural way of communicating.