Skip to content

UtkarshPrajapati/SQL-Sense

Repository files navigation

🤖 DataFlow: Your Intelligent Database Assistant

Python Version Framework License Made with Love PRs Welcome

DataFlow bridges the gap between humans and relational data. Ask questions in plain English and instantly receive optimized SQL, live results, and AI-generated insights – all without leaving the browser. ✨


📚 Table of Contents

  1. Why DataFlow?
  2. Key Features
  3. Project Showcase
  4. Technologies and Core Libraries
  5. Setup and Installation
  6. How to Use
  7. API Endpoints
  8. Troubleshooting & FAQ
  9. Project Structure
  10. Contributing
  11. License
  12. Acknowledgements

❓ Why DataFlow?

Traditional SQL clients are great at running queries—but they assume you already know SQL and your schema inside-out. DataFlow removes that barrier:

  • No more context-switching. Ask questions in plain language and stay focused on your analysis.
  • Instant productivity. New teammates or non-technical stakeholders can explore data without a crash-course in SQL.
  • Better insights, faster. AI-generated summaries highlight trends you might miss in raw tables.
  • Safety first. Potentially destructive queries are intercepted and require explicit confirmation.

✨ Key Features

  • 🤖 Natural Language to SQL: Ask questions in English; get optimized SQL queries in return.
  • 🚀 Direct SQL Execution: A /run command to execute raw SQL queries for power users.
  • 📈 AI-Powered Insights: Automatically generates summaries and suggests relevant follow-up questions from query results.
  • 💡 AI-Powered Troubleshooting: When a query fails, the AI provides a plain-English explanation of the error and suggests a fix based on your database schema.
  • 🛡️ Advanced Security: Intercepts potentially harmful queries. Data-modifying queries (e.g., UPDATE, INSERT) require user confirmation, while structure-altering queries (e.g., DROP, ALTER) are blocked entirely.
  • 👁️ Dynamic Schema Viewer: An interactive, collapsible sidebar displays your database schemas, tables, and columns in real-time.
  • ⚡ Interactive Querying: Inline Run Query buttons appear next to SQL code blocks, allowing one-click execution without obscuring the query text.
  • ⚙️ On-the-Fly Configuration: Update database credentials and API keys from the UI without needing to restart the server. Your settings are securely saved in a local .env file.
  • 💬 Session-Based History: Chat history is tied to your browser session, providing a persistent and private workspace. Start a new chat anytime.
  • 🎨 Modern Material UI: A sleek, responsive, and user-friendly chat UI built with TailwindCSS and inspired by Material Design, featuring toast notifications and a seamless user experience.

🚀 Project Showcase

A visual tour of DataFlow, from its architecture to its user interface.

🏛️ System Architecture & Workflow (Click to Expand)

1. High-Level System Architecture
The main components of the system: UI, FastAPI Backend, SQL Database, and the Gemini LLM.

High-Level System Architecture

2. Application Workflow
From user prompt to AI-generated SQL, data retrieval, and final insights.

SQL Assistant Workflow

🗄️ Database Schema (Click to Expand)

Employee & Salary Schema
ER Diagram for the `Employees` and `Salaries` tables.

Employee and Salary ER Diagram

Product Schema
ER Diagram for the `Product` table.

Product ER Diagram

💻 User Interface (Click to Expand)

Main Chat Interface
The primary UI where users interact with the SQL agent.

Main Chat Interface

Query Results & Insights
An example of the application returning query results, generated SQL, AI-driven insights, and AI-powered troubleshooting for errors.

Query Results and Insights

Database Schema Viewer
The collapsible sidebar allows users to browse the connected database schemas and tables.

Database Schema Viewer


🛠️ Technologies and Core Libraries

Python
Python
Fast API
FastAPI
Google Gemini
Google Gemini
MySQL
MySQL
TailwindCSS
TailwindCSS
HTML5
HTML5

Backend: Python, FastAPI, Uvicorn, Google GenAI API, MySQL Connector, Pydantic, python-dotenv, sqlparse, fastapi-sessions

Frontend: HTML, TailwindCSS, Marked.js, DOMPurify, Lucide Icons


⚙️ Setup and Installation

Follow these steps to get DataFlow running on your local machine.

1. Prerequisites

  • Python 3.11+
  • Git
  • An active MySQL database service.

2. Clone the Repository

git clone https://github.com/UtkarshPrajapati/SQL-Sense.git
cd SQL-Sense

3. Set Up Virtual Environment

It's highly recommended to use a virtual environment to manage project dependencies.

# Create a virtual environment
python -m venv venv

# Activate it
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

4. Configure Environment Variables

It is not strictly necessary to create a .env file before starting, but it is recommended for providing your database credentials and Gemini API key.

How it works:

  • No .env file: The application will start with default settings (attempting to connect to MySQL on localhost with user root). You can then use the web UI's Config panel to enter your credentials, which will automatically create a .env file for you.
  • With a .env file: You can create a file named .env in the project root to pre-configure the application.
  1. Create a file named .env in the root of the project (optional).
  2. Add the keys you need. The application will use defaults for any keys that are not provided.
Example `.env` structure (Click to Expand)
# MySQL Database Configuration
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=your_mysql_password

# Google Gemini API Configuration
GEMINI_API_KEY=your_gemini_api_key

# Session Management (Optional, will be auto-generated)
SESSION_SECRET_KEY=your_super_secret_key_for_sessions

5. Install Dependencies

pip install -r requirements.txt

6. (Optional) Prepare the Sample Database

This step is only necessary if you want to use the pre-defined sample data to test the application. If you intend to connect to your own existing database, you can skip this.

Run the provided script to create the necessary tables and populate them with realistic sample data.

python gen-data.py

This script will create two databases by default:

  1. SQLLLM: Contains employees and salaries tables.
  2. StoreDB: Contains a products table.

It will then populate these tables with sample data.

7. Run the Application

You're all set! Start the FastAPI server.

python sql_assistant.py

The application will be live at http://127.0.0.1:6969.


📖 How to Use

  1. Open the Web Interface: Navigate to http://127.0.0.1:6969 in your web browser.
  2. Configure Credentials: Click the "Config" button to enter your MySQL and Gemini API Key details. They will be saved for future sessions.
  3. View Schema: Click the "View Schema" button to see the tables and columns the AI is aware of.
  4. Ask a Question: Type a question in plain English, like show me all employees and their salaries.
  5. Execute Direct SQL: For precise control, use the /run command followed by a SQL query. For example: /run SELECT product_name, price FROM products WHERE price > 50;
  6. Review Results: The application will display the generated SQL, the data results in a table, and a summary of insights derived by the AI.
  7. One-Click Execution: If the assistant suggests an SQL query, click the "Run Query" button that appears over the code block to execute it immediately.
  8. Start a New Chat: Click the "New Chat" button to clear the conversation and start fresh.

🔗 API Endpoints

Method Endpoint Description
GET / Serves the index.html single-page application and manages session creation.
GET /schema Returns JSON containing databases, tables, and columns the assistant can access.
GET /config_status Returns the public configuration status (e.g., host, user, and whether keys are set).
POST /config Body: { "mysql_host": "...", "mysql_user": "...", "mysql_password": "...", "gemini_api_key": "..." } – Updates connection credentials and tests them. No restart needed.
POST /chat Body: { "message": "<natural-language question or /run <SQL>>" } – Main interaction endpoint: accepts NL queries or /run SQL commands, returns results/insights.
POST /execute_confirmed_sql Body: { "query": "<SQL previously flagged for confirmation>" } – Executes DML queries that the user has reviewed and approved.
POST /reset_chat Clears the chat history for the current user session.

All responses are JSON and follow the shape documented in the code. Unhandled errors are returned with appropriate HTTP status codes.


🛟 Troubleshooting & FAQ

The server starts but `/schema` returns an empty list

Cause: The MySQL credentials in your .env file don't have permission to see user databases, or no user databases exist.

Fix:

  1. Verify MYSQL_USER / MYSQL_PASSWORD in .env or via the UI Config panel.
  2. Check that your user has at least SELECT privilege on the target databases.
  3. Use the /config endpoint (or restart the app) after updating credentials.
Gemini replies with "Error: Gemini API not configured"

The GEMINI_API_KEY environment variable is missing or invalid.

  • Obtain an API key from Google AI Studio.
  • Add it to your .env file and/or update via the /config endpoint.
  • Restart the backend (or let /config re-initialize the key).
"Client does not support authentication protocol" MySQL error

Your MySQL server may be using the newer caching_sha2_password plugin while the connector expects mysql_native_password. The application attempts to use mysql_native_password by default. If this error still occurs, you may need to update your user configuration in MySQL.

ALTER USER 'root'@'%' IDENTIFIED WITH mysql_native_password BY 'your_password';
FLUSH PRIVILEGES;

Alternatively, create a dedicated read-only user for DataFlow with compatible auth.

The UI shows a loading spinner that never stops after I submit a question

Check the backend logs; the LLM may be taking longer than expected or returning a safety block. Increase the timeout on your HTTP client if you've reverse-proxied the API.


📂 Project Structure

.
├── .env.example        # Environment variable template
├── .gitignore          # Files to ignore for git
├── README.md           # This file
├── assets              # Images and architectural diagrams
├── gen-data.py         # Generates and populates the database
├── index.html          # Main frontend file
├── requirements.txt    # Python dependencies
├── sql_assistant.py    # FastAPI backend logic
├── static              # Static assets for the logo
└── venv                # Virtual environment folder

🤝 Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📜 License

Distributed under the MIT License.


👏 Acknowledgements

  • The amazing open-source community.
  • The teams behind FastAPI, Google Gemini, and Uvicorn.
  • Icons and visuals from Lucide Icons and SVG Repo.

About

🤖 Natural language SQL assistant using LLMs and Gemini API for intelligent database querying.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Contributors 3

  •  
  •  
  •