From 3f324d3d1ebf4b345407d91ba8d9166866132dc6 Mon Sep 17 00:00:00 2001
From: "google-labs-jules[bot]" <161369871+google-labs-jules[bot]@users.noreply.github.com>
Date: Sat, 24 May 2025 12:52:39 +0000
Subject: [PATCH 1/5] feat: Implement Discord voice bot with Character AI

This commit introduces a Discord bot capable of engaging in voice conversations.

The bot can:
- Join and leave voice channels.
- Listen to your speech for 10 seconds.
- Convert the speech to text using the SpeechRecognition library.
- Send the transcribed text to a specified Character AI character.
- Receive a voice response (MP3) from Character AI using its generate_speech utility.
- Play the Character AI's voice response back in the Discord voice channel.

Key features include:
- A conversation loop mode: After the bot speaks, it automatically starts listening again.
- Commands to manage the conversation: `!join` (starts convo), `!stopconvo`.
- Configuration for Discord bot token, Character AI token, Character ID, and Voice ID.
- Documentation (README.md) and dependency list (requirements.txt).

The implementation relies on discord.py for bot functionalities, PyCharacterAI for interacting with Character AI, and SpeechRecognition for STT. FFmpeg is a system dependency for audio playback.
---
 README.md        | 320 ++++++----------------------------
 discord_bot.py   | 435 +++++++++++++++++++++++++++++++++++++++++++++++
 requirements.txt |   6 +
 3 files changed, 497 insertions(+), 264 deletions(-)
 create mode 100644 discord_bot.py
 create mode 100644 requirements.txt

diff --git a/README.md b/README.md
index dbdb40c..fda32fd 100644
--- a/README.md
+++ b/README.md
@@ -1,270 +1,62 @@
-# PyCharacterAI
->
-> An asynchronous Python api wrapper for [Character AI](https://character.ai/) using [curl-cffi](https://github.com/yifeikong/curl_cffi).
-
-> [!WARNING]
-> This is an unofficial library made by an enthusiast who has no relation to the CharacterAI development team. CharacterAI has no official api and all breakpoints were found manually using reverse proxy. The author is not responsible for possible consequences of using this library.
->
-> The library is under constant development, trying to match the functionality of the website. Anything that isn't described in the documentation may be changed or removed without backward compatibility!
-
----
-📚 [Documentation](https://github.com/Xtr4F/PyCharacterAI/blob/main/docs/welcome.md).
-
----
-**TO-DO**:
-
-- [x] Exceptions.
-- [ ] Logging?
-- [ ] Finish the docs.
-
----
-\
-If you have any questions, problems, suggestions, please open an issue or contact me:
-
-[![Tag](https://img.shields.io/badge/telegram-dm-black?style=flat&logo=Telegram)](https://t.me/XtraF)
-
----
-
-## Getting started
-
-First, you need to install the library:
-
+# Discord Character AI Voice Bot
+
+## Description
+This Discord bot engages in voice conversations with users. It joins a voice channel, listens to a user's speech, converts it to text using Speech-to-Text (STT), sends the text to a Character AI persona, receives a text response, converts that response to speech using Character AI's voice generation, and plays it back in the voice channel. The bot supports a continuous conversation mode.
+
+## Features
+- **Voice Conversations:** Enables spoken dialogue with a Character AI persona.
+- **Configurable Character:** Users can specify the Character AI token, character ID, and voice ID.
+- **Turn-Based Interaction:** The bot listens for a user's speech, responds, and then listens again.
+- **Conversation Management:** Commands to start, stop, and manage the conversation flow.
+- **Manual Recording:** Option for one-off voice recordings (when not in conversation mode).
+
+## Setup Instructions
+
+### Prerequisites
+- **Python:** Python 3.8+ is recommended.
+- **FFmpeg:** FFmpeg must be installed and accessible in your system's PATH. FFmpeg is used for audio processing by `discord.py`. You can download it from [ffmpeg.org](https://ffmpeg.org/download.html).
+
+### Installation
+1. **Clone the repository** (or download the files).
+2. **Install Python dependencies:**
+    Create a `requirements.txt` file (or use the one provided) with the following content:
+    ```txt
+    discord.py
+    SpeechRecognition
+    # PyCharacterAI is assumed to be a local module present in the same directory.
+    # Pocketsphinx was installed as a dependency of SpeechRecognition for offline STT capabilities,
+    # but the bot primarily uses online STT (Google Web Speech API).
+    # curl-cffi was installed as a dependency of PyCharacterAI.
+    ```
+    Open your terminal or command prompt in the project directory and run:
+    ```bash
+    pip install -r requirements.txt
+    ```
+    If `PyCharacterAI` is not provided as a local module, you would typically install it via pip if it were available on PyPI (e.g., `pip install PyCharacterAI`). For this project, ensure the `pycharacterai` library/module is correctly placed if it's a local dependency.
+
+3. **Configuration:**
+    Open the `discord_bot.py` file in a text editor. At the top of the file, you will find placeholder values for your tokens and IDs. **You MUST replace these with your actual credentials:**
+    ```python
+    CAI_TOKEN = "YOUR_CAI_TOKEN" # Your Character AI client token
+    CAI_CHARACTER_ID = "YOUR_CAI_CHARACTER_ID" # The ID of the Character AI you want to use
+    CAI_VOICE_ID = "YOUR_CAI_VOICE_ID" # The voice ID for the character's speech
+    BOT_TOKEN = "YOUR_DISCORD_BOT_TOKEN" # Your Discord bot token
+    ```
+
+## Running the Bot
+Once the dependencies are installed and the configuration is set, you can run the bot using:
 ```bash
-pip install PyCharacterAI
-```
-
----
-\
-Import the `Client` class from the library and create a new instance of it:
-
-```Python
-from PyCharacterAI import Client
-```
-
-```Python
-client = Client()
-```
-
-Then you need to authenticate `client` using `token`:
-
-```Python
-await client.authenticate("TOKEN")
-```
-
-> if you want to be able to upload your avatar you also need to specify `web_next_auth` token as an additional argument (only this way for now, this may change in the future):
->
-> ```Python
-> await client.authenticate("TOKEN", web_next_auth="WEB_NEXT_AUTH")
-> ```
-
-\
-Or you can just call `get_client()` method:
-
-```Python
-from PyCharacterAI import get_client
-
-client = await get_client(token="TOKEN", web_next_auth="WEB_NEXT_AUTH")
+python discord_bot.py
 ```
+Ensure your terminal is in the same directory as `discord_bot.py`.
-After authentication, we can use all available library methods.
-
----
-
-## About tokens and how to get them
->
-> ⚠️ WARNING, DO NOT SHARE THESE TOKENS WITH ANYONE! Anyone with your tokens has full access to your account!
-
-This library uses two types of tokens: a common `token` and `web_next_auth`. The first one is required for almost all methods here and the second one only and only for `upload_avatar()` method (may change in the future).
-
-### Instructions for getting a `token`
-
-1. Open the Character.AI website in your browser
-2. Open the `developer tools` (`F12`, `Ctrl+Shift+I`, or `Cmd+J`)
-3. Go to the `Nerwork` tab
-4. Interact with website in some way, for example, go to your profile and look for `Authorization` in the request header
-5. Copy the value after `Token`
-
-> For example, token in `https://plus.character.ai/chat/user/public/following/` request headers:
-> ![img](https://github.com/Xtr4F/PyCharacterAI/blob/main/assets/token.png)
-
-### Instructions for getting a `web_next_auth` token
-
-1. Open the Character.AI website in your browser
-2. Open the `developer tools` (`F12`, `Ctrl+Shift+I`, or `Cmd+J`)
-3. Go to the `Storage` section and click on `Cookies`
-4. Look for the `web-next-auth` key
-5. Copy its value
-
-> ![img](https://github.com/Xtr4F/PyCharacterAI/blob/main/assets/web_next_auth.png)
-
----
-
-## Examples
->
-> Here are just some examples of the library's features. If you want to know about all `methods` and `types` with explanations, go to [methods](https://github.com/Xtr4F/PyCharacterAI/blob/main/docs/api_reference/methods.md) and [types](https://github.com/Xtr4F/PyCharacterAI/blob/main/docs/api_reference/types.md) documentation sections.
->
-### Simple chatting example
-
-```Python
-import asyncio
-
-from PyCharacterAI import get_client
-from PyCharacterAI.exceptions import SessionClosedError
-
-token = "TOKEN"
-character_id = "ID"
-
-
-async def main():
-    client = await get_client(token=token)
-
-    me = await client.account.fetch_me()
-    print(f"Authenticated as @{me.username}")
-
-    chat, greeting_message = await client.chat.create_chat(character_id)
-
-    print(f"{greeting_message.author_name}: {greeting_message.get_primary_candidate().text}")
-
-    try:
-        while True:
-            # NOTE: input() is blocking function!
-            message = input(f"[{me.name}]: ")
-
-            answer = await client.chat.send_message(character_id, chat.chat_id, message)
-            print(f"[{answer.author_name}]: {answer.get_primary_candidate().text}")
-
-    except SessionClosedError:
-        print("session closed. Bye!")
-
-    finally:
-        # Don't forget to explicitly close the session
-        await client.close_session()
-
-asyncio.run(main())
-```
+## Usage / Commands
+- **`!join`**: The bot will join the voice channel you are currently in and initiate "conversation mode." It will listen for your speech for 10 seconds, process it, get a response from Character AI, speak it back, and then listen to you again.
+- **`!stopconvo`**: This command stops the active "conversation mode." The bot will stop listening, clear any ongoing processes, and leave the voice channel.
+- **`!leave`**: Similar to `!stopconvo`, this command makes the bot leave the voice channel and ends any active conversation mode.
+- **`!record`**: If conversation mode is *not* active, you can use this command to make a manual 10-second recording of your voice. The bot will process this single recording (STT and Character AI response if configured, though playback might only be the text part depending on implementation details outside conversation mode).
 ---
-A more advanced example. You can use so-called streaming to receive a message in parts, as is done on a website, instead of waiting for it to be completely generated:
-
-```Python
-import asyncio
-
-from PyCharacterAI import get_client
-from PyCharacterAI.exceptions import SessionClosedError
-
-token = "TOKEN"
-character_id = "ID"
-
-
-async def main():
-    client = await get_client(token=token)
-
-    me = await client.account.fetch_me()
-    print(f'Authenticated as @{me.username}')
-
-    chat, greeting_message = await client.chat.create_chat(character_id)
-
-    print(f"[{greeting_message.author_name}]: {greeting_message.get_primary_candidate().text}")
-
-    try:
-        while True:
-            # NOTE: input() is blocking function!
-            message = input(f"[{me.name}]: ")
-
-            answer = await client.chat.send_message(character_id, chat.chat_id, message, streaming=True)
-
-            printed_length = 0
-            async for message in answer:
-                if printed_length == 0:
-                    print(f"[{message.author_name}]: ", end="")
-
-                text = message.get_primary_candidate().text
-                print(text[printed_length:], end="")
-
-                printed_length = len(text)
-            print("\n")
-
-    except SessionClosedError:
-        print("session closed. Bye!")
-
-    finally:
-        # Don't forget to explicitly close the session
-        await client.close_session()
-
-asyncio.run(main())
-```
-
----
-
-### Working with images
-
-```Python
-# We can generate images from a prompt
-# (It will return list of urls)
-images = await client.utils.generate_image("prompt")
-```
-
----
-
-```Python
-# We can upload an image to use it as an
-# avatar for character/persona/profile
-
-# NOTE: This method requires the specified web_next_auth token
-avatar_file = "path to file or url"
-avatar = await client.utils.upload_avatar(avatar_file)
-```
-
----
-
-### Working with voices
-
-```Python
-# We can search for voices
-voices = await client.utils.search_voices("name")
-```
-
----
-
-```Python
-# We can upload the audio as a voice
-voice_file = "path to file or url"
-voice = await client.utils.upload_voice(voice_file, "voice name")
-```
-
----
-
-```Python
-# We can set and unset a voice for character
-await client.account.set_voice("character_id", "voice_id")
-await client.account.unset_voice("character_id")
-```
-
----
-
-```Python
-# And we can use voice to generate speech from the character's messages
-speech = await client.utils.generate_speech("chat_id", "turn_id", "candidate_id", "voice_id")
-
-# It will return bytes, so we can use it for example like this:
-filepath = "voice.mp3"
-
-with open(filepath, 'wb') as f:
-    f.write(speech)
-```
-
-```Python
-# or we can get just the url.
-speech_url = await client.utils.generate_speech("chat_id", "turn_id", "candidate_id",
-                                                "voice_id", return_url=True)
-
-```
-
----
-
-## Special Thanks
-
-- [node_characterai](https://github.com/realcoloride/node_characterai) by @realcoloride - for being the backbone of the project in the past.
-
-- [CharacterAI](https://github.com/kramcat/CharacterAI) by @kramcat - for inspiring me.
+*This bot relies on external services (SpeechRecognition for STT, Character AI for persona and TTS). Ensure you have stable internet access and that these services are operational.*
+*Ensure FFmpeg is correctly installed and added to your system's PATH.*
diff --git a/discord_bot.py b/discord_bot.py
new file mode 100644
index 0000000..1a90995
--- /dev/null
+++ b/discord_bot.py
@@ -0,0 +1,435 @@
+import discord
+from discord.ext import commands
+import asyncio
+import discord.sinks # For advanced audio recording sinks
+import speech_recognition as sr # For Speech-to-Text
+from pycharacterai import PyCharacterAI # For Character AI interaction
+from io import BytesIO # For handling byte streams (audio data)
+
+# --- Configuration Placeholders ---
+# These MUST be filled in for the bot to work.
+CAI_TOKEN = "YOUR_CAI_TOKEN" # Your Character AI client token
+CAI_CHARACTER_ID = "YOUR_CAI_CHARACTER_ID" # The ID of the Character AI character you want to interact with
+CAI_VOICE_ID = "YOUR_CAI_VOICE_ID" # The specific voice ID for the Character AI character's speech
+BOT_TOKEN = "YOUR_DISCORD_BOT_TOKEN" # Your Discord bot token
+
+# --- Global Variables ---
+cai_client = None # Global client for PyCharacterAI, initialized in on_ready()
+conversation_mode_status = {} # Dictionary to manage conversation mode state per guild
+                              # Key: guild_id (int), Value: boolean (True if active, False if inactive)
+
+# --- Bot Setup ---
+# Define intents required by the bot
+intents = discord.Intents.default()
+intents.voice_states = True # Required for voice channel operations (joining, speaking, listening)
+intents.message_content = True # Required for reading message content for commands (e.g., "!join")
+
+# Create bot instance with command prefix "!" and defined intents
+bot = commands.Bot(command_prefix="!", intents=intents)
+
+
+# --- Core Conversation Logic Functions ---
+
+async def after_playback(error, guild_id: int, original_author_id: int, text_channel_id: int):
+    """
+    Async callback function executed after the bot finishes playing audio in a voice channel.
+    This function is crucial for continuing the conversation loop.
+    Args:
+        error: Any error that occurred during playback (None if successful).
+        guild_id: The ID of the guild where playback occurred.
+        original_author_id: The ID of the user who initiated the current conversation turn.
+        text_channel_id: The ID of the text channel used for bot responses.
+    """
+    if error:
+        print(f"Error during playback for guild {guild_id}: {error}")
+        # Optionally, send a message to the text channel about the playback error.
+        # text_channel = bot.get_channel(text_channel_id)
+        # if text_channel:
+        #     await text_channel.send("An error occurred during audio playback.")
+
+    # Retrieve Discord objects from IDs for further operations
+    guild = bot.get_guild(guild_id)
+    if not guild:
+        print(f"after_playback: Guild {guild_id} not found.")
+        return
+
+    text_channel = guild.get_channel(text_channel_id)
+    if not text_channel:
+        print(f"after_playback: Text channel {text_channel_id} not found in guild {guild_id}.")
+        return
+
+    original_author = guild.get_member(original_author_id) # Get the member object for the user
+    if not original_author:
+        print(f"after_playback: Original author {original_author_id} not found in guild {guild_id}.")
+        return
+
+    # Check if conversation mode is still active for this guild
+    if conversation_mode_status.get(guild.id):
+        await text_channel.send(f"My turn is over, {original_author.mention}, now listening for your response for 10 seconds...")
+        # Schedule the start_recording function to listen to the user again, continuing the loop.
+        # asyncio.create_task is used because start_recording is an async function.
+        asyncio.create_task(start_recording(guild, original_author, text_channel))
+    # If conversation_mode_status is False or not set for the guild, the loop naturally stops.
+
+
+async def finished_recording_callback(
+    sink: discord.sinks.WaveSink,
+    guild: discord.Guild,
+    user_recorded: discord.Member,
+    text_channel_for_responses: discord.TextChannel
+):
+    """
+    Async callback executed when the WaveSink finishes recording audio (i.e., vc.stop_listening() is called).
+    This function processes the recorded audio: performs STT, interacts with Character AI,
+    gets a voice response, and plays it back. This forms a single turn in the conversation.
+    Args:
+        sink: The WaveSink object containing the recorded audio data.
+        guild: The guild where the recording took place.
+        user_recorded: The user whose audio was intended to be recorded.
+        text_channel_for_responses: The text channel for sending bot messages.
+    """
+    # The sink can record multiple users if they speak at once.
+    # We filter to get audio specifically from the `user_recorded`.
+    user_audio_data = None
+    for user_id, audio in sink.audio_data.items():
+        if user_id == user_recorded.id:
+            user_audio_data = audio # This is a discord.sinks.core.AudioData object
+            break
+
+    if not user_audio_data:
+        await text_channel_for_responses.send(f"Sorry {user_recorded.mention}, I couldn't record your audio this time. Make sure you're speaking.")
+        # If in conversation mode, try to re-listen to keep the loop active.
+        if conversation_mode_status.get(guild.id):
+            await text_channel_for_responses.send(f"Trying to listen again for {user_recorded.mention} for 10 seconds...")
+            asyncio.create_task(start_recording(guild, user_recorded, text_channel_for_responses))
+        return
+
+    # Save the user's audio data to a temporary .wav file for STT processing.
+    filename = f"{user_recorded.id}_{guild.id}_recording.wav"
+    with open(filename, "wb") as f:
+        f.write(user_audio_data.file.read()) # user_audio_data.file is an io.BytesIO object
+
+    await text_channel_for_responses.send(f"Finished recording for {user_recorded.mention}. Processing audio...")
+
+    # 1. Perform Speech-to-Text (STT)
+    recognizer = sr.Recognizer()
+    with sr.AudioFile(filename) as source: # Use the saved .wav file as the audio source
+        audio_data_for_stt = recognizer.record(source) # Load audio data from file
+    try:
+        # Use Google Web Speech API for STT. Requires internet.
+        text = recognizer.recognize_google(audio_data_for_stt)
+        await text_channel_for_responses.send(f"You ({user_recorded.mention}) said: \"{text}\"")
+
+        # 2. Interact with PyCharacterAI
+        global cai_client # Access the globally initialized CAI client
+        if not cai_client:
+            await text_channel_for_responses.send(f"Sorry {user_recorded.mention}, PyCharacterAI client is not initialized.")
+            return
+
+        try:
+            # Create a new chat session with the Character AI for this interaction.
+            # This ensures conversation history is managed per interaction if needed,
+            # though here each turn is treated somewhat independently for simplicity.
+            chat_response_tuple = await cai_client.chat.create_chat(CAI_CHARACTER_ID, greeting=False)
+            current_chat_object = chat_response_tuple[0]
+            current_chat_id = current_chat_object.chat_id
+            if not current_chat_id:
+                await text_channel_for_responses.send("Failed to create CAI chat session.")
+                return
+
+            # Send the transcribed text from the user to Character AI.
+            answer = await cai_client.chat.send_message(CAI_CHARACTER_ID, current_chat_id, text)
+            primary_candidate = answer.get_primary_candidate() # Get the main response from CAI
+            if not primary_candidate:
+                await text_channel_for_responses.send("CAI did not return a valid response.")
+                return
+
+            cai_text_response = primary_candidate.text
+            await text_channel_for_responses.send(f"Character AI: {cai_text_response}") # Send AI's text reply
+
+            # 3. Generate Speech from Character AI's response
+            audio_bytes = await cai_client.utils.generate_speech(
+                chat_id=answer.chat_id,
+                turn_id=answer.turn_id,
+                candidate_id=primary_candidate.candidate_id,
+                voice_id=CAI_VOICE_ID # Use the configured voice ID for TTS
+            )
+
+            if audio_bytes:
+                # 4. Play Character AI's audio response in the voice channel
+                voice_client = discord.utils.get(bot.voice_clients, guild=guild)
+                if voice_client and voice_client.is_connected():
+                    # Create a discord.AudioSource object from the audio bytes.
+                    # FFmpeg must be installed and in PATH for FFmpegPCMAudio to work.
+                    audio_source = discord.PCMVolumeTransformer(discord.FFmpegPCMAudio(
+                        BytesIO(audio_bytes), pipe=True, executable="ffmpeg"
+                    ))
+                    if not voice_client.is_playing(): # Play only if not already playing something
+                        # The `after` argument schedules `after_playback` to run once this audio finishes.
+                        # This is key to the conversation loop.
+                        # IDs are passed to `after_playback` to avoid issues with Discord objects in async callbacks.
+                        voice_client.play(audio_source, after=lambda e: bot.loop.create_task(
+                            after_playback(e, guild.id, user_recorded.id, text_channel_for_responses.id)
+                        ))
+                        await text_channel_for_responses.send("Playing Character AI's response...")
+                    else:
+                        await text_channel_for_responses.send("Already playing audio. CAI response will not be played now.")
+                        # If in conversation mode and can't play, still trigger re-listening via after_playback.
+                        if conversation_mode_status.get(guild.id):
+                            bot.loop.create_task(after_playback(None, guild.id, user_recorded.id, text_channel_for_responses.id))
+                else:
+                    await text_channel_for_responses.send("Bot is not connected to VC. Cannot play CAI audio.")
+                    if conversation_mode_status.get(guild.id): # If bot got disconnected during convo
+                        conversation_mode_status[guild.id] = False # Stop conversation mode
+                        await text_channel_for_responses.send("Conversation mode stopped as bot is not in a voice channel.")
+            else:
+                await text_channel_for_responses.send("Failed to generate speech from CAI.")
+                # If speech generation fails but in convo mode, try to re-listen.
+                if conversation_mode_status.get(guild.id):
+                    bot.loop.create_task(after_playback(None, guild.id, user_recorded.id, text_channel_for_responses.id))
+
+        except PyCharacterAI.exceptions.PyCAIError as e: # Handle errors from PyCharacterAI library
+            await text_channel_for_responses.send(f"Error with PyCharacterAI: {e}")
+        except Exception as e: # Handle other unexpected errors during CAI interaction
+            await text_channel_for_responses.send(f"Unexpected error during CAI interaction: {e}")
+
+    except sr.UnknownValueError: # STT could not understand the audio
+        await text_channel_for_responses.send(f"Sorry {user_recorded.mention}, I could not understand your audio.")
+        if conversation_mode_status.get(guild.id): # If in convo mode, try re-listening
+            await text_channel_for_responses.send(f"Trying to listen again for {user_recorded.mention} for 10 seconds...")
+            asyncio.create_task(start_recording(guild, user_recorded, text_channel_for_responses))
+    except sr.RequestError as e: # STT service (e.g., Google) had an issue
+        await text_channel_for_responses.send(f"Could not request STT results; {e}")
+        if conversation_mode_status.get(guild.id): # If in convo mode, try re-listening
+            bot.loop.create_task(after_playback(None, guild.id, user_recorded.id, text_channel_for_responses.id))
+
+
+async def start_recording(
+    guild: discord.Guild,
+    user_to_record: discord.Member,
+    text_channel_for_responses: discord.TextChannel
+):
+    """
+    Initiates the audio recording process for a specific user in their voice channel.
+    This is called at the beginning of each user's "turn" in the conversation.
+    Args:
+        guild: The guild (server) where the recording should happen.
+        user_to_record: The specific member whose audio should be captured.
+        text_channel_for_responses: The text channel for sending bot status messages.
+    """
+    voice_client = discord.utils.get(bot.voice_clients, guild=guild) # Get the bot's current voice client for this guild
+    if not voice_client or not voice_client.is_connected():
+        await text_channel_for_responses.send("Bot is not connected to a voice channel. Cannot start recording.")
+        conversation_mode_status[guild.id] = False # Ensure conversation mode is off if bot is not in VC
+        return
+
+    # Verify the target user is in the same voice channel as the bot.
+    if not user_to_record.voice or user_to_record.voice.channel != voice_client.channel:
+        await text_channel_for_responses.send(f"{user_to_record.mention} is not in the bot's voice channel. Cannot record.")
+        # This might pause the conversation loop if the user has left the channel.
+        return
+
+    # Create a new WaveSink for each recording session. This sink collects audio in WAV format.
+    sink_instance = discord.sinks.WaveSink()
+
+    try:
+        # Start listening for audio.
+        # `voice_client.listen()` takes the sink and an `after` callback.
+        # The `after` callback (`finished_recording_callback`) is triggered when `voice_client.stop_listening()` is called.
+        # We pass `guild`, `user_to_record`, and `text_channel_for_responses` as `*cb_args` (callback arguments)
+        # so that `finished_recording_callback` receives the necessary context for its operations.
+        voice_client.listen(sink_instance,
+                            after=lambda sink_obj, *cb_args: bot.loop.create_task(finished_recording_callback(sink_obj, *cb_args)),
+                            guild, user_to_record, text_channel_for_responses)
+
+        await text_channel_for_responses.send(f"Listening to {user_to_record.mention} for 10 seconds...")
+        await asyncio.sleep(10) # Record audio for a fixed duration of 10 seconds.
+    except Exception as e:
+        print(f"Error starting listener: {e}")
+        await text_channel_for_responses.send(f"Error starting recording: {e}")
+        return
+
+    # After the 10-second recording duration, stop listening.
+    # This action will trigger the `after` callback specified in `voice_client.listen()`,
+    # which in turn calls `finished_recording_callback` to process the audio.
+    if voice_client.is_listening():
+        voice_client.stop_listening()
+    else:
+        # This case might occur if the bot was disconnected or `stop_listening` was called by another process.
+        print("Was not listening when trying to stop. Callback might not be called as expected.")
+
+
+# --- Bot Event Handlers ---
+
+@bot.event
+async def on_ready():
+    """
+    Event handler executed when the bot has successfully connected to Discord and is ready.
+    This is typically used for initialization tasks, like setting up the PyCharacterAI client.
+    """
+    print(f"Logged in as {bot.user.name} (ID: {bot.user.id})")
+    print("PyCharacterAI and other services will be initialized now.")
+
+    global cai_client
+    try:
+        # Initialize the PyCharacterAI client using the token from configuration.
+        cai_client = await PyCharacterAI.get_client(token=CAI_TOKEN)
+        if cai_client:
+            print("PyCharacterAI client initialized successfully.")
+            # Optional: Fetch account info to verify successful authentication with Character AI.
+            # me = await cai_client.account.fetch_me()
+            # print(f"Authenticated to Character AI as: @{me.username}")
+        else:
+            # This case (get_client returning None without an exception) might be unlikely
+            # depending on PyCharacterAI's implementation.
+            print("Failed to initialize PyCharacterAI client (get_client returned None).")
+    except Exception as e: # Catch any errors during CAI client initialization
+        print(f"Failed to initialize PyCharacterAI client: {e}")
+        cai_client = None # Ensure cai_client is None if setup fails, preventing further CAI calls.
+
+# --- Bot Commands ---
+
+@bot.command(name='join', help='Makes the bot join your voice channel and starts conversation mode.')
+async def join(ctx: commands.Context):
+    """
+    Command for the bot to join the voice channel of the user who issued the command.
+    Upon joining, it automatically starts "conversation mode," where it will listen to the user.
+    """
+    if not ctx.author.voice: # Check if the command issuer is in a voice channel
+        await ctx.send("You are not in a voice channel. Please join a channel first.")
+        return
+
+    voice_channel = ctx.author.voice.channel # Get the voice channel of the user
+
+    # Check if bot is already in a voice channel in the same guild
+    if ctx.voice_client:
+        if ctx.voice_client.channel == voice_channel: # Already in the target channel
+            await ctx.send("Already in your voice channel.")
+        else: # In a different channel, so move
+            await ctx.voice_client.move_to(voice_channel)
+            await ctx.send(f"Moved to {voice_channel.name}.")
+    else: # Bot is not in any voice channel in this guild, so connect
+        try:
+            await voice_channel.connect() # Connect to the user's voice channel
+            await ctx.send(f"Joined {voice_channel.name}.")
+        except Exception as e:
+            await ctx.send(f"Could not join voice channel: {e}")
+            return
+
+    # Start conversation mode for this guild.
+    # The bot will begin by listening to the user who issued the !join command.
+    conversation_mode_status[ctx.guild.id] = True
+    await ctx.send("Conversation mode started. I will listen for your first message for 10 seconds...")
+
+    # Initiate the recording process for the user who called !join.
+    # `ctx.guild` is the server, `ctx.author` is the user, `ctx.channel` is the text channel for messages.
+    # `asyncio.create_task` schedules the `start_recording` coroutine to run.
+    asyncio.create_task(start_recording(ctx.guild, ctx.author, ctx.channel))
+
+
+@bot.command(name='leave', help='Makes the bot leave its current voice channel and stops conversation mode.')
+async def leave(ctx: commands.Context):
+    """
+    Command for the bot to leave its current voice channel.
+    This also deactivates conversation mode for the guild.
+    """
+    # Explicitly turn off conversation mode for this guild.
+    conversation_mode_status[ctx.guild.id] = False
+
+    if ctx.voice_client: # Check if the bot is connected to a voice channel in this guild
+        if ctx.voice_client.is_playing(): # If playing audio, stop it.
+            ctx.voice_client.stop()
+
+        if ctx.voice_client.is_listening(): # If recording audio, stop it.
+            ctx.voice_client.stop_listening()
+
+        await ctx.voice_client.disconnect() # Disconnect from the voice channel.
+        await ctx.send("Left voice channel and conversation mode stopped.")
+    else:
+        await ctx.send("I am not in a voice channel.")
+
+
+@bot.command(name='stopconvo', help='Stops conversation mode and makes the bot leave the voice channel.')
+async def stopconvo(ctx: commands.Context):
+    """
+    Command to explicitly stop the conversation mode.
+    The bot will also leave the voice channel. This is functionally similar to `!leave`
+    but provides a more semantically clear way to end the active conversation.
+    """
+    # Set conversation mode to false for the guild.
+    conversation_mode_status[ctx.guild.id] = False
+    await ctx.send("Conversation mode stopped.")
+
+    if ctx.voice_client: # If connected to a voice channel
+        if ctx.voice_client.is_playing(): # Stop any audio playback
+            ctx.voice_client.stop()
+
+        if ctx.voice_client.is_listening(): # Stop any audio recording
+            ctx.voice_client.stop_listening()
+
+        if ctx.voice_client.is_connected(): # Disconnect from the voice channel
+            await ctx.voice_client.disconnect()
+            await ctx.send("Disconnected from voice channel.")
+    else:
+        await ctx.send("Was not in a voice channel.")
+
+
+@bot.command(name='record', help='Manually records 10s of audio if not in conversation mode.')
+async def record(ctx: commands.Context):
+    """
+    Command to manually trigger a 10-second audio recording from the command issuer.
+    This command only functions if "conversation mode" is NOT active for the guild.
+    It's intended for one-off recordings rather than continuous conversation.
+    """
+    # Check if conversation mode is currently active for this guild.
+    if conversation_mode_status.get(ctx.guild.id):
+        await ctx.send("Conversation mode is active. Please use `!stopconvo` first if you want to make a manual recording.")
+        return
+
+    # Standard checks for bot and user voice state for any voice command.
+    if not ctx.voice_client:
+        await ctx.send("I am not in a voice channel. Use `!join` first to bring me in (this will start conversation mode).")
+        return
+
+    if not ctx.author.voice or ctx.author.voice.channel != ctx.voice_client.channel:
+        await ctx.send("You need to be in the same voice channel as the bot to record.")
+        return
+
+    # Manual recording logic.
+    # This uses the same `finished_recording_callback` as the conversation loop,
+    # but it won't trigger a subsequent re-listening because `conversation_mode_status`
+    # is False (or not set) for this guild during a manual record.
+    await ctx.send(f"Manual recording for {ctx.author.mention} for 10 seconds...")
+
+    sink_instance = discord.sinks.WaveSink() # Create a new sink for this recording session.
+    try:
+        # Pass guild, author, and channel to the callback via listen's *args (positional arguments must precede the `after=` keyword).
+        ctx.voice_client.listen(sink_instance,
+                                ctx.guild, ctx.author, ctx.channel,
+                                after=lambda sink, *args: bot.loop.create_task(finished_recording_callback(sink, *args)))
+        await asyncio.sleep(10) # Record for 10 seconds.
+    except Exception as e:
+        print(f"Error starting manual recording listener: {e}")
+        await ctx.send(f"Error starting manual recording: {e}")
+        return
+
+    if ctx.voice_client.is_listening(): # Stop the recording after 10 seconds.
+        ctx.voice_client.stop_listening()
+    else:
+        await ctx.send("Recording was already stopped or failed to start for manual record.")
+
+
+# --- Run the Bot ---
+if __name__ == "__main__":
+    # This check ensures the bot runs only when the script is executed directly.
+    # The BOT_TOKEN must be configured at the top of the file.
+    if BOT_TOKEN == "YOUR_DISCORD_BOT_TOKEN" or not BOT_TOKEN:
+        print("ERROR: Please fill in your BOT_TOKEN in the discord_bot.py file.")
+    elif CAI_TOKEN == "YOUR_CAI_TOKEN" or not CAI_TOKEN:
+        print("ERROR: Please fill in your CAI_TOKEN in the discord_bot.py file.")
+    elif CAI_CHARACTER_ID == "YOUR_CAI_CHARACTER_ID" or not CAI_CHARACTER_ID:
+        print("ERROR: Please fill in your CAI_CHARACTER_ID in the discord_bot.py file.")
+    elif CAI_VOICE_ID == "YOUR_CAI_VOICE_ID" or not CAI_VOICE_ID:
+        print("ERROR: Please fill in your CAI_VOICE_ID in the discord_bot.py file.")
+    else:
+        bot.run(BOT_TOKEN)
diff --git a/requirements.txt b/requirements.txt
new file mode 100644
index 0000000..e8fa802
--- /dev/null
+++ b/requirements.txt
@@ -0,0 +1,6 @@
+discord.py
+SpeechRecognition
+# PyCharacterAI is a local module, not from PyPI, but its dependencies like curl-cffi might be needed if installing from scratch.
+# Pocketsphinx was installed as a dependency of SpeechRecognition for offline STT capabilities.
+# SoundDevice was installed as a dependency of Pocketsphinx.
+# Ensure FFmpeg is installed separately and in PATH.

From 8240f9c5ed8a4a747a6a6fb47052f1cda14ebba3 Mon Sep 17 00:00:00 2001
From: Neear7771 <107468618+Neear7771@users.noreply.github.com>
Date: Sat, 24 May 2025 18:26:28 +0530
Subject: [PATCH 2/5] Update discord_bot.py

---
 discord_bot.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/discord_bot.py b/discord_bot.py
index 1a90995..135ba0f 100644
--- a/discord_bot.py
+++ b/discord_bot.py
@@ -10,7 +10,7 @@
 # These MUST be filled in for the bot to work.
 CAI_TOKEN = "YOUR_CAI_TOKEN" # Your Character AI client token
 CAI_CHARACTER_ID = "YOUR_CAI_CHARACTER_ID" # The ID of the Character AI character you want to interact with
-CAI_VOICE_ID = "YOUR_CAI_VOICE_ID" # The specific voice ID for the Character AI character's speech
+CAI_VOICE_ID = "453c0918-82d5-40ab-b42c-517a322ee5e5" # The specific voice ID for the Character AI character's speech
 BOT_TOKEN = "YOUR_DISCORD_BOT_TOKEN" # Your Discord bot token
 
 # --- Global Variables ---

From a718e97a0dcfd8a9783c63a16df8b6b51f3aa4e9 Mon Sep 17 00:00:00 2001
From: Neear7771 <107468618+Neear7771@users.noreply.github.com>
Date: Sat, 24 May 2025 18:37:45 +0530
Subject: [PATCH 3/5] Update discord_bot.py

---
 discord_bot.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/discord_bot.py b/discord_bot.py
index 135ba0f..ec005c1 100644
--- a/discord_bot.py
+++ b/discord_bot.py
@@ -8,8 +8,8 @@
 
 # --- Configuration Placeholders ---
 # These MUST be filled in for the bot to work.
-CAI_TOKEN = "YOUR_CAI_TOKEN" # Your Character AI client token
-CAI_CHARACTER_ID = "YOUR_CAI_CHARACTER_ID" # The ID of the Character AI character you want to interact with
+CAI_TOKEN = "8041baf6512c863ffe65eea49a071e4f0287f149" # Your Character AI client token
+CAI_CHARACTER_ID = "vOPdHXLGkA_7tamhZGhijCC29nk8W1xphYbm81qfSH4" # The ID of the Character AI character you want to interact with
 CAI_VOICE_ID = "453c0918-82d5-40ab-b42c-517a322ee5e5" # The specific voice ID for the Character AI character's speech
 BOT_TOKEN = "YOUR_DISCORD_BOT_TOKEN" # Your Discord bot token
 

From 39d82d366224f86a6a484fb36a178ff3f75e20d1 Mon Sep 17 00:00:00 2001
From: Neear7771 <107468618+Neear7771@users.noreply.github.com>
Date: Sat, 24 May 2025 18:39:24 +0530
Subject: [PATCH 4/5] Update requirements.txt

---
 requirements.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/requirements.txt b/requirements.txt
index e8fa802..8813026 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,5 +1,6 @@
 discord.py
 SpeechRecognition
+websockets==15.0.1
 # PyCharacterAI is a local module, not from PyPI, but its dependencies like curl-cffi might be needed if installing from scratch.
 # Pocketsphinx was installed as a dependency of SpeechRecognition for offline STT capabilities.
 # SoundDevice was installed as a dependency of Pocketsphinx.

From a03a5bd60e7a9691c46a4b6273ac69a6c8c5f17c Mon Sep 17 00:00:00 2001
From: Neear7771 <107468618+Neear7771@users.noreply.github.com>
Date: Sat, 24 May 2025 18:45:44 +0530
Subject: [PATCH 5/5] Update discord_bot.py

---
 discord_bot.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/discord_bot.py b/discord_bot.py
index ec005c1..1a0733b 100644
--- a/discord_bot.py
+++ b/discord_bot.py
@@ -11,8 +11,7 @@
 CAI_TOKEN = "8041baf6512c863ffe65eea49a071e4f0287f149" # Your Character AI client token
 CAI_CHARACTER_ID = "vOPdHXLGkA_7tamhZGhijCC29nk8W1xphYbm81qfSH4" # The ID of the Character AI character you want to interact with
 CAI_VOICE_ID = "453c0918-82d5-40ab-b42c-517a322ee5e5" # The specific voice ID for the Character AI character's speech
-BOT_TOKEN = "YOUR_DISCORD_BOT_TOKEN" # Your Discord bot token
-
+BOT_TOKEN = input("BOT TOKEN: ")
 # --- Global Variables ---
 cai_client = None # Global client for PyCharacterAI, initialized in on_ready()
 conversation_mode_status = {} # Dictionary to manage conversation mode state per guild
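
Note on configuration: the series moves from placeholder constants to hard-coded values and then to an `input()` prompt for `BOT_TOKEN`. One possible alternative is to read all four settings from environment variables and reuse the placeholder checks from the `__main__` block. The following is only a sketch under stated assumptions, not part of the patches: `load_config` and `PLACEHOLDERS` are hypothetical names, and the environment variable names are assumed to mirror the constant names in `discord_bot.py`.

```python
import os

# Placeholder strings taken from the patch; a value equal to its
# placeholder is treated the same as a missing value.
PLACEHOLDERS = {
    "BOT_TOKEN": "YOUR_DISCORD_BOT_TOKEN",
    "CAI_TOKEN": "YOUR_CAI_TOKEN",
    "CAI_CHARACTER_ID": "YOUR_CAI_CHARACTER_ID",
    "CAI_VOICE_ID": "YOUR_CAI_VOICE_ID",
}


def load_config(environ=None):
    """Return the four settings as a dict, or raise ValueError listing the missing ones.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    config = {}
    missing = []
    for name, placeholder in PLACEHOLDERS.items():
        value = env.get(name, "").strip()
        if not value or value == placeholder:
            missing.append(name)
        else:
            config[name] = value
    if missing:
        raise ValueError("Missing configuration: " + ", ".join(missing))
    return config
```

With something like this, the `__main__` block could call `load_config()` once and pass `config["BOT_TOKEN"]` to `bot.run`, so that no token needs to be committed to the repository.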