An LLM-based chatbot that solves math problems, explains its reasoning clearly, and self-trains using reinforcement learning. It fine-tunes Mistral v0.3 (7B) on the MathQA-40K dataset, uses the Unsloth framework to improve model performance and shorten training time, and applies techniques from a Google Mind paper to improve the model's accuracy and performance.

MinhHung7/MathSolver-LLM-based-system

🧠 Mistral Fine-Tune for Math Solver using Unsloth, Gradio & OpenAI API

This project shows how to fine-tune the Mistral language model with the Unsloth library to build an interactive math problem solver. It combines several components to improve the user experience and the accuracy of the answers:

  • 🔧 Mistral Fine-Tuning with Unsloth for step-by-step mathematical reasoning.
  • 🌐 Gradio Interface for easy, browser-based user interaction.
  • 🧠 OpenAI API Integration to:
    • 🖼️ Extract math problems from uploaded images (OCR + interpretation).
    • ✅ Validate and correct Mistral's output if one or more of the generated answers are incorrect.
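
The validate-and-correct step in the last bullet can be sketched as below. This is a minimal illustration, not the repository's code: `Verdict`, `solve_with_validation`, `fake_solve`, and `fake_check` are hypothetical names, and the solver and checker are passed in as plain callables so the sketch runs without a model or an API key. In a real setup `solve` would wrap the fine-tuned Mistral model and `check` would wrap an OpenAI API call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Result of checking a draft solution (hypothetical structure)."""
    is_correct: bool
    corrected: Optional[str] = None  # replacement solution when the draft is wrong


def solve_with_validation(
    problem: str,
    solve: Callable[[str], str],
    check: Callable[[str, str], Verdict],
) -> str:
    """Return the draft if the checker accepts it, otherwise the correction."""
    draft = solve(problem)           # stands in for the fine-tuned model
    verdict = check(problem, draft)  # stands in for the validating API call
    if verdict.is_correct or verdict.corrected is None:
        return draft
    return verdict.corrected


# Stand-in callables so the sketch runs offline.
def fake_solve(problem: str) -> str:
    return "2 + 2 = 5"


def fake_check(problem: str, draft: str) -> Verdict:
    if draft.endswith("= 4"):
        return Verdict(is_correct=True)
    return Verdict(is_correct=False, corrected="2 + 2 = 4")


print(solve_with_validation("What is 2 + 2?", fake_solve, fake_check))
# prints "2 + 2 = 4"
```

Falling back to the draft when the checker offers no correction is a design assumption; a stricter variant could raise an error or retry the solver instead.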

🚀 Project Overview

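Large Language Models (LLMs) are increasingly used in educational tools, especially for solving math problems. However, base models often struggle with step-by-step mathematical reasoning. This project addresses that by fine-tuning Mistral for step-by-step reasoning and by using the OpenAI API to extract problems from images and to validate the model's output.

🧠 Technologies Used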

  • Mistral — Lightweight, high-performance LLM.
  • Unsloth — Memory-optimized library for fast fine-tuning with LoRA.
  • Gradio — Web-based UI for testing and deployment.
  • OpenAI API — Used for image-to-text (problem extraction) and output validation.
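
To give a rough sense of why LoRA fine-tuning is cheap, the sketch below counts trainable parameters for one weight matrix. LoRA freezes the original `d_out x d_in` matrix and trains two low-rank factors of shapes `d_out x r` and `r x d_in`. The layer size (4096) and rank (16) are illustrative assumptions, not values taken from this project, and the function names are hypothetical.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes only (assumed, not read from the project's config).
d, r = 4096, 16
ratio = lora_params(d, d, r) / full_params(d, d)

print(lora_params(d, d, r))  # 131072
print(full_params(d, d))     # 16777216
print(f"{ratio:.2%}")        # 0.78%
```

Under these assumed sizes, the adapter trains under 1% of the parameters of the matrix it adapts, which is where most of the memory savings come from.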

Training Flow

The training flow is shown below:

[Image: Training Flow]
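
One step that supervised fine-tuning for step-by-step reasoning typically needs is turning each dataset record into a single training string. The sketch below is an assumption about what that could look like, not the project's actual template: the field names (`problem`, `rationale`, `answer`), the section headers, and `format_example` are all hypothetical. The end-of-sequence token defaults to `</s>`; in practice it is safer to read it from the tokenizer.

```python
PROMPT_TEMPLATE = (
    "### Problem:\n{problem}\n\n"
    "### Step-by-step solution:\n{rationale}\n\n"
    "### Final answer:\n{answer}"
)


def format_example(record: dict, eos_token: str = "</s>") -> str:
    """Build one training string from a record with hypothetical field names."""
    text = PROMPT_TEMPLATE.format(
        problem=record["problem"].strip(),
        rationale=record["rationale"].strip(),
        answer=record["answer"].strip(),
    )
    # Appending EOS teaches the model where a solution ends.
    return text + eos_token


# Made-up record for illustration; it is not taken from the dataset.
sample = {
    "problem": "A train travels 120 km in 2 hours. What is its average speed?",
    "rationale": "Speed = distance / time = 120 / 2 = 60.",
    "answer": "60 km/h",
}

print(format_example(sample))
```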

Inference Flow

The inference flow is shown below:

[Image: Inference Flow]
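
A minimal sketch of how typed and image input could be routed into one Gradio handler follows. It makes several assumptions: `make_handler` and `build_ui` are hypothetical names, `extract` stands in for the OpenAI-based image-to-text step, `solve` stands in for the fine-tuned model, and an uploaded image takes priority over typed text. Gradio is imported lazily so the handler logic can be exercised without it installed.

```python
from typing import Callable, Optional


def make_handler(
    extract: Callable[[str], str],
    solve: Callable[[str], str],
) -> Callable[[Optional[str], Optional[str]], str]:
    """Build a handler that accepts typed text and/or an image file path."""

    def handle(problem_text: Optional[str], image_path: Optional[str]) -> str:
        if image_path:  # assumption: an uploaded image overrides typed text
            problem_text = extract(image_path)
        if not problem_text or not problem_text.strip():
            return "Please type a problem or upload an image."
        return solve(problem_text.strip())

    return handle


def build_ui(handler):
    """Wrap the handler in a Gradio interface (requires `pip install gradio`)."""
    import gradio as gr  # lazy import: only needed when serving the UI

    return gr.Interface(
        fn=handler,
        inputs=[
            gr.Textbox(label="Math problem"),
            gr.Image(type="filepath", label="Or upload an image"),
        ],
        outputs=gr.Textbox(label="Solution"),
    )


# Stand-in callables so the routing can be tried offline.
handler = make_handler(
    extract=lambda path: "What is 3 * 4?",
    solve=lambda problem: f"Solving: {problem}",
)

print(handler("1 + 1", None))  # prints "Solving: 1 + 1"

# To serve the UI locally: build_ui(handler).launch()
```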
