Releases: uezo/ChatdollKit

v0.8.1

18 Sep 14:49

🏷️ User-Defined Tags

You can now include custom tags in AI responses, enabling dynamic actions. For instance, embed language codes in replies to switch between multiple languages on the fly during conversations.

  • Add support for user-defined tags in response messages #342
  • Add support for user-defined tags (Claude, Gemini and Dify) #350
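The release notes do not specify a tag syntax, but the idea can be sketched as follows. This is an illustrative Python sketch only: the `[key:value]` format and the `extract_tags` helper are assumptions, not ChatdollKit's actual API.

```python
import re

# Hypothetical tag syntax: "[key:value]" embedded in the AI response text.
TAG_PATTERN = re.compile(r"\[(\w+):([^\]]+)\]")

def extract_tags(response: str) -> tuple[dict, str]:
    """Return (tags, clean_text) for a response containing inline tags."""
    tags = {key: value for key, value in TAG_PATTERN.findall(response)}
    clean_text = TAG_PATTERN.sub("", response).strip()
    return tags, clean_text

tags, text = extract_tags("[language:en]Hello! Nice to meet you.")
# tags -> {"language": "en"}, text -> "Hello! Nice to meet you."
```

A handler could then switch the TTS voice or language based on `tags["language"]` while showing only the clean text to the user.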

🌐 External Control via Socket

ChatdollKit now accepts external commands over socket communication. You can direct the conversation flow, trigger specific phrases, or control expressions and gestures, unlocking new use cases such as AI VTubers and remote customer service.

  • Add SocketServer to enable external request handling via socket communication #345
  • Add DialogPriorityManager for handling prioritized dialog requests #346
  • Add option to hide user message window #347
  • Add ModelRequestBroker for simplified model control via tagged text #348

Check out the client-side demo here: https://gist.github.com/uezo/9e56a828bb5ea0387f90cc07f82b4c15
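The actual wire protocol is defined by the demo client linked above. As a hedged sketch of what an external controller could look like, the JSON message shape, the endpoint name, and the port below are all assumptions for illustration:

```python
import json
import socket

def build_command(endpoint: str, text: str) -> bytes:
    # Hypothetical message shape; check the linked gist for the real protocol.
    return json.dumps({"endpoint": endpoint, "text": text}).encode("utf-8")

def send_command(host: str, port: int, endpoint: str, text: str) -> None:
    """Open a TCP connection, send one JSON command, and close."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_command(endpoint, text))

# e.g. queue a phrase for the avatar to speak (values are illustrative):
# send_command("127.0.0.1", 8080, "dialog", "Welcome to the stream!")
```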

🍩 Other Updates

  • Fix bug where expressions on error don't work #344
  • Improve text splitting logic in SplitString method #349
  • Update demo for v0.8.1 #351

Full Changelog: 0.8.0...0.8.1

v0.8.0

08 Sep 08:12
191d61c

💎 What's New in Version 0.8 Beta

To run the demo for version 0.8.0 beta, follow these steps after importing the dependencies:

  • Open the scene Demo/Demo08.
  • Select the AIAvatarVRM object in the scene.
  • Set your OpenAI API key on the following components in the inspector:
    • ChatGPTService
    • OpenAITTSLoader
    • OpenAISpeechListener
  • Run in the Unity Editor.
  • Say "こんにちは" ("hello") or any word longer than three characters.
  • Enjoy👍

⚡ Optimized AI Dialog Processing

We've boosted response speed with parallel processing and made it easier for you to customize behavior with your own code. Enjoy faster, more flexible AI conversations!

  • Optimize AI-driven interactions by @uezo in #335

🥰 Emotionally Rich Speech

Speech synthesis now adjusts its vocal tone dynamically to match the conversation, delivering more engaging and natural interactions.

  • Improve expressiveness of text-to-speech output by @uezo in #336
  • Allow adding emotion to speech synthesis by @uezo in #337

🎤 Enhanced Microphone Control

Microphone control is now more flexible than ever! Easily start/stop devices, mute/unmute, and adjust voice recognition thresholds independently.

  • Add new SpeechListener namespace with voice input modules by @uezo in #334

🍩 Other Changes

  • Fix some bugs in StyleBertVITSTTSLoader by @uezo in #333
  • Update for v0.8.0 beta by @uezo in #338

Full Changelog: 0.7.7...0.8.0

v0.7.7

02 Sep 14:44
b3d67b5

🥰 Support StyleBertVits2

We've added support for Text-to-Speech using the StyleBertVits2 API! 🎙️✨ Now, your AI characters can speak with even more expressive and dynamic voices, making them shine brighter than ever! 😎 Get ready to take your character's charm to the next level! 🚀💫

  • Support StyleBertVits2 API as TTS service by @uezo in #327

💕 Support Cohere Command R 💕

  • Add experimental support for Command R by @uezo in #329
  • Add experimental support for Command R on WebGL by @uezo in #331

🐸 Other Changes

  • Fix bug in handling response when using Azure OpenAI by @uezo in #325
  • Add option to completely disable WakeWordListener by @uezo in #326
  • Fix bug causing ToolCalls to fail by @uezo in #328
  • Provide workaround to clear state data, including LLM context by @uezo in #330
  • Update WebGLMicrophone.jslib by @uezo in #332

Full Changelog: 0.7.6...v0.7.7

v0.7.6

20 Jul 17:09

What's Changed

🎓LLM related updates

  • Add support for Dify Agents by @uezo in #315
  • Add support for custom logic at the end of LLM streaming by @uezo in #316
  • Internalize Dify ConversationId in state data by @uezo in #318
  • Use GPT-4o mini as the default model for ChatGPT by @uezo in #321

🗣️ Dialog control

  • Fix WebGL microphone input handling by @uezo in #317
  • Improve WakewordListener functionality and debugging by @uezo in #320

🥰 3D model control

  • Enable runtime loading of VRM models from URL and byte data by @uezo in #322

Full Changelog: 0.7.5...0.7.6

v0.7.5

02 Jul 16:43
0443468

Dify Support 💙

  • Add support for Dify💙 by @uezo in #309
  • Add support for Dify TTS and STT by @uezo in #311

Other changes

  • Fix bug where mic volume changes are not applied immediately by @uezo in #307
  • Enhance camera functionality with manual still capture and sub-camera switching by @uezo in #308

Full Changelog: 0.7.4...0.7.5

v0.7.4

23 Jun 03:34
7165a1a

👀 Enhanced Vision Capabilities

This update introduces autonomous vision input for Gemini and Claude, and adds vision input support for WebGL. Now, various AIs can offer richer conversational experiences with integrated vision input across different platforms.

  • Support autonomous vision input for Gemini✨ #302
  • Refactor Vision input and various related improvements #303
  • Support autonomous vision input for Claude✨ #304
  • Add vision input support for WebGL #305

Full Changelog: 0.7.3...0.7.4

v0.7.3

15 Jun 16:31
d0c3655

👀 Support dynamic vision input for ChatGPT

Add a SimpleCamera to the scene and include [vision:camera] in the response message, and the system will autonomously capture an image whenever visual input is required for a response.

  • Add autonomous image input handling for ChatGPT #298
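The detect-capture-resend flow described above can be sketched as follows. Python is used for illustration only; `check_vision_request` and the placeholder helpers in the comments are assumptions, not the actual ChatdollKit implementation.

```python
VISION_TAG = "[vision:camera]"

def check_vision_request(response_text: str) -> tuple[bool, str]:
    """Return (needs_image, text_without_tag) for a model response."""
    if VISION_TAG in response_text:
        return True, response_text.replace(VISION_TAG, "").strip()
    return False, response_text

# Hypothetical handling loop (capture/resend helpers are placeholders):
# needs_image, text = check_vision_request(reply)
# if needs_image:
#     image = capture_image()            # e.g. from SimpleCamera
#     reply = resend_with_image(image)   # repeat the request with the photo
```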

📦 Easy setup by modularized UI components

Microphone volume sliders and request input forms have been modularized. These can now be used immediately by simply adding the prefabs to the scene without any additional setup.

  • Modularize UI components for easy scene addition in #300

🎙️ dB-based microphone volume

  • Change volume measurement from amplitude to decibels #296
  • Fix incorrect volume measurement bug #299

✨ Other changes

  • Switch from function call to tool call for ChatGPT Function Calling #297
  • Remove deprecated ChatGPT-related modules in #301

Full Changelog: v0.7.2...0.7.3

v0.7.2

21 Mar 19:17
f304506

🖼️ Support vision

Attach an image to the request message as a payload for multimodal conversation.

  • Support message with image in GPT-4V and Claude3 #286

🚀 Configuration-free demo

Just start the demo without any configuration; set the API key (and other settings if you like) at runtime.

  • Add configuration-free demo #288

Full Changelog: v0.7.1...v0.7.2

v0.7.1

08 Jan 02:43
5e85d06

🥰😇 Change avatar at runtime

Avatars can now be changed at runtime. Call ModelController.SetAvatar to switch to another avatar.
To try it in the editor, assign another avatar in the scene to the ModelController inspector and press the Change Avatar button (appears at runtime only).

  • Add support for changing avatar on runtime #280

🎙️ Better microphone management

The microphone is now muted when not listening for a wake word or voice request.
We also added an IsMuted property to DialogController so that you can control mute/unmute manually. The default is false.

  • IsMuted == false: the microphone is on while WakeWordListener or VoiceRequestProvider is listening.
  • IsMuted == true: the microphone is off.

You can change the value in the inspector or from a script.
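The resulting microphone state is a simple function of the two conditions above. As a sketch (Python for illustration, not the actual DialogController logic):

```python
def microphone_is_on(is_muted: bool, is_listening: bool) -> bool:
    """Mic is live only when unmuted AND a listener (WakeWordListener or
    VoiceRequestProvider) is currently listening."""
    return (not is_muted) and is_listening
```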

  • Mute microphone when not listening for WakeWord or VoiceRequest #281
  • Mute microphone when avatar speaking #282

🍱 Other changes

  • Set message windows automatically without configurations #278
  • Prevent UnityWebRequest error when calling LLMs #279
  • Update demo for v0.7.1 #283

Full Changelog: v0.7.0...v0.7.1

v0.7.0

03 Jan 08:04

🤖 LLM-based Dialog Processing

✅Multiple LLMs: ChatGPT / Azure OpenAI Service, Anthropic Claude, Google Gemini Pro and others
✅Agents: Function Calling (ChatGPT / Gemini) or your prompt engineering
✅Multimodal: GPT-4V and Gemini-Pro-Vision are supported
✅Emotions: Autonomous face expression and animation

We've developed a versatile framework that standardizes the processes of Routing, Chatting (including facial expressions and motions), and executing Tools using various Large Language Models (LLMs). This framework allows for easy customization by simply swapping out LLM-specific components.

Additionally, you can support any LLM by creating your own components that implement the ILLMService interface.

To use this new LLM-based dialog, attach LLMRouter, LLMContentSkill, and an ILLMService component such as ChatGPTService from the ChatdollKit.LLM package. Also attach function skills that extend LLMFunctionSkillBase if you are building an AI agent. See DemoChatGPT, which works right out of the box🎁.

  • Support multiple LLMs: ChatGPT, Claude and Gemini✨ #271
  • Support custom request parameters and headers for LLM APIs #272
  • Fix bug that iOS build failed on Xcode #273
  • Fix bug that LLMFunctionSkill fails #275

NOTE: ChatdollKit.Dialog.Processor.ChatGPT* components are deprecated.

🐉 Other Changes

  • Refactoring for Future tech integration and reduced dependencies #270
  • Improve stability of AzureStreamVoiceRequestProvider #274
  • Make microphone volume controller work with NonRecordingVRP #276
  • Update demo for v0.7.0 #277

Full Changelog: v0.6.6...v0.7.0