Chat with your PDF: Seamlessly interact with documents using Amazon Bedrock, RAG, S3, LangChain, and Streamlit for intelligent, responsive conversations.
PDFPal is a web application designed for processing PDF files. It offers two main components:
- Admin Interface: For processing PDFs, creating a vector store, and uploading it to Amazon S3.
- User Interface: For interacting with the processed PDFs and vector store.
Follow the instructions below to set up and run both the Admin and User components of the application.
Prerequisites:
- Python 3.11 or later (if not using Docker)
- Docker (if using Docker)
- An AWS account and an Amazon S3 bucket for storing vector store files
- AWS CLI installed for configuring AWS credentials
- Configure AWS CLI:
Make sure the AWS CLI is installed, then run the following command to set up your AWS credentials:
aws configure
You will need to provide your AWS Access Key ID, Secret Access Key, region, and output format.
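As a quick sanity check after `aws configure`, the sketch below reads the shared credentials file and reports whether a profile defines both key fields. It assumes the AWS CLI's default file location (`~/.aws/credentials`) and the `default` profile. The function names are hypothetical and not part of PDFPal, and credentials supplied through environment variables or other mechanisms would not be detected this way.

```python
import configparser
from pathlib import Path


def profile_has_keys(credentials_path, profile="default"):
    """Return True if the profile defines both access key fields."""
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist, so a missing file
    # simply yields no profiles.
    parser.read(credentials_path)
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return bool(section.get("aws_access_key_id")) and bool(
        section.get("aws_secret_access_key")
    )


def default_credentials_ok():
    # Assumed default location written by `aws configure`.
    return profile_has_keys(Path.home() / ".aws" / "credentials")
```

Calling `default_credentials_ok()` only confirms that the fields are present; it does not verify that the keys are valid.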
- Clone the Repository:
git clone https://github.com/your-repository-url
cd your-repository-directory
- Install Dependencies (if not using Docker):
Create a requirements.txt file with the following content:
boto3
streamlit
faiss-cpu
langchain
langchain-community
Then, install the dependencies using:
pip install -r requirements.txt
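To confirm the installation worked, a small check like the following can list any modules that are still missing. It is a sketch, not part of PDFPal: `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and the list simply mirrors the requirements above using each package's import name.

```python
import importlib.util

# Import names differ from some package names: faiss-cpu is imported as
# "faiss" and langchain-community as "langchain_community".
REQUIRED_MODULES = ["boto3", "streamlit", "faiss", "langchain", "langchain_community"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Running `print(missing_modules(REQUIRED_MODULES))` should print an empty list when every dependency is installed.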
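- Build Docker Images (if using Docker):
Run each build command from the directory that contains that component's Dockerfile. Building both tags from the same directory and Dockerfile produces two identical images.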
For the Admin Interface:
docker build -t pdfpal-admin .
For the User Interface:
docker build -t pdfpal-user .
- Run Docker Containers:
For the Admin Interface (accessible at http://localhost:8083):
docker run -p 8083:8083 pdfpal-admin
For the User Interface (accessible at http://localhost:8084):
docker run -p 8084:8084 pdfpal-user
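Once the containers are started, a short script can confirm that something is listening on the published ports. This is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper, and an open port only shows that a process accepted the connection, not that the app is fully healthy.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("localhost", 8083)` checks the Admin Interface and `port_is_open("localhost", 8084)` checks the User Interface.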
Admin Interface:
- Upload a PDF: Use the file uploader widget to select and upload a PDF file.
- Process PDF: The application will process the PDF, split its content into chunks, create a vector store from them, and upload the vector store to S3.
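The splitting step can be pictured with the sketch below, which cuts text into fixed-size character chunks with some overlap. It is a pure-Python stand-in, not PDFPal's actual code: this README does not say which splitter or chunk sizes the application uses, so `split_text`, `chunk_size`, and `overlap` are assumptions chosen for illustration.

```python
def split_text(text, chunk_size=1000, overlap=200):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that text near a
    chunk boundary keeps some of its surrounding context.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks
```

Each chunk would then be embedded and added to the vector store. Smaller chunks tend to give more precise retrieval at the cost of less context per chunk.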
User Interface:
- Interact with Processed PDFs: Ask questions about the processed PDFs. The application answers using the vector store created by the Admin Interface.
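The retrieval half of RAG can be illustrated with the brute-force sketch below, which ranks stored chunks by cosine similarity to a query vector. It is only a conceptual stand-in for a FAISS index lookup: the function names are hypothetical, the similarity metric is an assumption, and in a real setup the vectors would come from an embedding model rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, indexed_chunks, k=3):
    """Return the k chunk texts whose vectors are most similar to the query.

    indexed_chunks is a list of (text, vector) pairs.
    """
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

The retrieved chunks are what a RAG application passes to the language model as context alongside the user's question.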
Troubleshooting:
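- No Response from App: Ensure the application is running by checking the terminal output, and confirm you are using the right port (8083 for the Admin Interface, 8084 for the User Interface).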
- File Upload Issues: Make sure the file is in PDF format.
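- S3 Errors: Verify AWS credentials and S3 bucket permissions. When running in Docker, note that credentials configured on the host with aws configure are not visible inside a container unless you pass them in.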
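When diagnosing S3 errors, it can help to test bucket access directly. The sketch below wraps boto3's `head_bucket` call and turns the common failure codes into readable hints. The helper names `describe_bucket_access` and `check_bucket` are hypothetical, and the bucket name is whatever you configured for PDFPal.

```python
def describe_bucket_access(s3_client, bucket_name):
    """Classify the result of a HeadBucket call as a short status string."""
    try:
        s3_client.head_bucket(Bucket=bucket_name)
        return "ok"
    except Exception as exc:  # botocore raises ClientError for HTTP errors
        # ClientError carries the service response; other exceptions
        # (for example, missing credentials) do not.
        response = getattr(exc, "response", None) or {}
        code = str(response.get("Error", {}).get("Code", ""))
        if code in ("403", "AccessDenied"):
            return "access denied: check IAM permissions for the bucket"
        if code in ("404", "NoSuchBucket"):
            return "bucket not found: check the bucket name and region"
        return f"error: {exc}"


def check_bucket(bucket_name):
    import boto3  # imported here so the helper above has no hard dependency

    return describe_bucket_access(boto3.client("s3"), bucket_name)
```

Calling `check_bucket("your-bucket-name")` from the same environment that runs the app shows whether the credentials it sees can reach the bucket.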