I developed an AI chatbot application with FastAPI and deployed it on the Azure cloud platform. The chatbot combines a language model, speech recognition, and text-to-image generation to support interactive text and voice conversations.
- FastAPI: Handles continuous streams of audio efficiently on the server side.
- HTML, CSS and JavaScript: Used to build the web-based UI for the chatbot and to add styling and interactivity to the web app.
- AI models: Several AI models generate responses based on the user's text and voice input.
- Gemini 2.0 Flash: Utilized for generating natural language responses to user inputs and speech-to-text conversion, enabling voice interaction.
- black-forest-labs/FLUX.1-schnell-Free: Employed for creating images from textual descriptions provided by users.
- Speech Recognition: Implemented using the Gemini 2.0 Flash model to convert user speech to text accurately.
- Text Response Generation: Integrated Google's Gemini model to handle text-based queries, ensuring conversational relevance and coherence.
- Text-to-Image Generation: Leveraged Black Forest Labs' FLUX.1-schnell model to transform user-provided text into high-quality images.
- Continuous Audio Streaming: Used FastAPI to handle continuous streams of audio, enabling real-time processing and interaction on server side.
- Memory Retention Feature: Implemented a chat memory management feature for each user, ensuring that the chat conversation is maintained in memory as long as the backend instance is running.
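
The memory retention feature above can be illustrated with a minimal sketch. This is not the project's actual code: the class name `ChatMemory`, the `max_turns` limit, and the message dictionary shape are all hypothetical, chosen only to show one way of keeping per-user history in process memory so that it lasts until the backend instance stops.

```python
from __future__ import annotations

from collections import defaultdict, deque


class ChatMemory:
    """Hypothetical per-user chat history held in process memory.

    Everything stored here disappears when the backend process restarts,
    which matches the "as long as the instance is running" behaviour.
    """

    def __init__(self, max_turns: int = 20) -> None:
        # Assumption: cap the history so memory use stays bounded per user.
        self._history = defaultdict(lambda: deque(maxlen=max_turns))

    def add(self, user_id: str, role: str, text: str) -> None:
        # Append one message; the oldest message is dropped once the cap is hit.
        self._history[user_id].append({"role": role, "text": text})

    def get(self, user_id: str) -> list[dict]:
        # Return a copy, without creating an entry for unknown users.
        return list(self._history.get(user_id, ()))

    def clear(self, user_id: str) -> None:
        # Forget a single user's conversation.
        self._history.pop(user_id, None)


memory = ChatMemory(max_turns=2)
memory.add("alice", "user", "hi")
memory.add("alice", "assistant", "hello")
memory.add("alice", "user", "draw a cat")
print(memory.get("alice"))  # only the two most recent messages remain
```

A bounded `deque` is just one design choice; an unbounded list per user would also fit the description, at the cost of memory growing with conversation length.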
Here’s a sneak peek of the frontend and the sample conversation between me and my assistant.
- Designed the architecture of the chatbot application, ensuring seamless integration of various AI models.
- Developed the front end using the Jinja2 template library, creating an intuitive and user-friendly interface.
- Successfully integrated Gemini 2.0 Flash to handle text-based queries, ensuring conversational relevance and accurate speech recognition.
- Implemented Black Forest Labs' FLUX.1-schnell model for generating images based on user descriptions, enhancing the visual interaction capabilities of the chatbot.
- Managed and optimized the backend processes to handle real-time user interactions efficiently.
- Used FastAPI to handle continuous streams of audio, ensuring smooth and responsive audio processing.
- Implemented a chat memory management feature that retains each conversation while the backend instance is running.
- Ensured smooth communication between the UI and the AI models, reducing latency and improving performance.
- Improved accessibility through voice interactions, making the application user-friendly.
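
The continuous audio streaming mentioned above can be sketched independently of any framework. The example below is an assumption-laden illustration, not the project's implementation: `frames`, `fake_source`, and `frame_size` are hypothetical names. It regroups arbitrarily sized incoming byte chunks into fixed-size frames, which is a common step before passing audio to a speech model. In a FastAPI app the chunks might come from a WebSocket (for example Starlette's `websocket.iter_bytes()`), but the source does not confirm which transport the project uses.

```python
import asyncio
from typing import AsyncIterator, Iterable, List


async def frames(chunks: AsyncIterator[bytes], frame_size: int) -> AsyncIterator[bytes]:
    """Regroup an async stream of byte chunks into frames of frame_size bytes."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
        # Emit as many complete frames as the buffer currently holds.
        while len(buffer) >= frame_size:
            yield bytes(buffer[:frame_size])
            del buffer[:frame_size]
    # Flush whatever is left when the stream ends (may be a short frame).
    if buffer:
        yield bytes(buffer)


async def fake_source(parts: Iterable[bytes]) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for a network audio stream."""
    for part in parts:
        await asyncio.sleep(0)  # yield control, as real network I/O would
        yield part


async def collect(parts: Iterable[bytes], frame_size: int) -> List[bytes]:
    # Gather all frames into a list so the result is easy to inspect.
    return [frame async for frame in frames(fake_source(parts), frame_size)]


print(asyncio.run(collect([b"abc", b"de", b"fghi"], 4)))
# → [b'abcd', b'efgh', b'i']
```

Processing frames as they arrive, rather than waiting for the whole recording, is what keeps the interaction responsive.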
Clone the repository using git:

    git clone https://github.com/Akshat2512/AI_Voice_Assistant.git
    cd AI_Voice_Assistant  # move to the root folder of the application

Optionally, create and activate a separate Python virtual environment. The activation command below is for Windows; on Linux or macOS, run source my_env/bin/activate instead.

    python -m venv my_env
    my_env\Scripts\activate

Then install the required libraries:

    pip install -r requirements.txt

To start the application, run the FastAPI server (app.py) from the terminal:

    uvicorn app:app --host localhost --port 5000 --reload

Then go to http://localhost:5000.





