Releases: abgulati/LARS
v2.0-beta6: Major HF-Waitress LLM Server Update
- HF-Waitress: /completions_stream now implements a custom TextStreamer that redirects only its output to the stream buffer, while STDOUT remains unmodified. This allows other non-blocked routes and methods to execute and print to STDOUT in parallel without interfering with the stream
- CSS separated into a dedicated file
- Minor QoL changes
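The core idea behind the custom streamer above is to push finalized text into a per-request buffer instead of printing it. Here is a minimal, dependency-free sketch of that pattern; the class and method names are illustrative (modelled on transformers' `TextStreamer` callback interface), not HF-Waitress's actual code:

```python
from queue import Queue

class BufferedStreamer:
    """Sketch of a streamer that redirects generated text into a
    per-request queue rather than printing to STDOUT, leaving STDOUT
    free for other routes running in parallel. Names are illustrative,
    not the project's actual implementation."""

    def __init__(self):
        self.buffer = Queue()  # per-request stream buffer

    def on_finalized_text(self, text: str, stream_end: bool = False):
        # Redirect output into the buffer instead of printing it
        self.buffer.put(text)
        if stream_end:
            self.buffer.put(None)  # sentinel: generation finished

    def drain(self):
        # Generator a streaming route can iterate over and yield to the client
        while True:
            chunk = self.buffer.get()
            if chunk is None:
                break
            yield chunk
```

A streaming route can then `yield from streamer.drain()` while the generation thread calls `on_finalized_text`, so only the stream's own output reaches the response body.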
Full Changelog: v2.0-beta5...v2.0-beta6
v2.0-beta5: UI Enhancements
- New font-family, glassmorphism and title bar
Full Changelog: v2.0-beta4...v2.0-beta5
v2.0-beta4: HQQ Fix and Minor Refinements
- BUG FIX: HQQ quantization would error out if torch.dtype (dataType) was set to auto; it now force-sets torch.bfloat16
- BUG FIX: The Add-new-LLM button now re-displays when the HF-Waitress LLM list is closed and re-opened
- Minor response-formatting adjustment
Full Changelog: v2.0-beta3...v2.0-beta4
v2.0-beta3
- Fixed HF-Waitress streaming-response formatting!
- Improved app load times via tuned server health-check intervals
- Minor performance improvement to HF-Waitress streaming output
- Minor refinements to HF-Waitress server status outputs
Full Changelog: v2.0-beta2...v2.0-beta3
v2.0-beta2: Enhanced HF-Waitress LLM Management Features, Error-Reporting Refinements and Bug Fixes
- Enhanced HF-Waitress LLM Management: Add new model_ids, search-filter & sort the list of LLMs as well as delete LLM IDs from the HF-Waitress LLM dropdown list
- HF-Waitress server health-check reporting improvements
- Various bug fixes: reference to `index_dir` removed, document_records SQL-DB now correctly created on the very first run, and troublesome test-prints removed from the document-chunking operation
Full Changelog: v2.0-beta1...v2.0-beta2
v2.0-beta1: New LLM Server -- HF-Waitress!
HF-Waitress is a powerful and flexible server application for deploying and interacting with HuggingFace Transformer models. It simplifies the process of running open-source Large Language Models (LLMs) locally on-device, addressing common pain points in model deployment and usage.
This server enables loading HF-Transformer and AWQ-quantized models directly off the hub, while providing on-the-fly quantization via BitsAndBytes, HQQ and Quanto for the former. It negates the need to manually download any model yourself, working simply off the model's name instead. It requires no setup, and provides concurrency and streaming responses, all from within a single, easily-portable, platform-agnostic Python script.
For a full list of features see: https://github.com/abgulati/hf-waitress
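Loading a model by name with on-the-fly quantization is the pattern HF-Waitress automates. The fragment below is a minimal configuration sketch of that approach using the transformers and bitsandbytes libraries (both assumed installed); the model id is just an example, not a project default:

```python
# Sketch: load an HF-Transformers model straight off the hub by name,
# quantizing on the fly -- no manual weight download required.
# Assumes `torch`, `transformers` and `bitsandbytes` are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-3-mini-4k-instruct"  # example hub id

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit at load time
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for matmuls
)

# from_pretrained fetches and caches the weights by name
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=quant_config)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```

HF-Waitress wraps this kind of loading behind its API, adding HQQ and Quanto as alternative quantizers; see its repository for the supported options.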
LARS is now far easier to deploy and get working on the very first run, as users no longer need to manually download and place their LLMs.
Check out the updated Dependencies, Installation and Usage Instructions in the README
Note: containers are not yet updated; they will most likely be updated in the following week.
Full Changelog: v1.9.1...v2.0-beta1
v1.9.1 - Re-ranker Robustness & Minor UI Tweak
- BUG FIX: Re-ranking is now bypassed when do_rag=False, so the empty document list no longer produces an error!
- Minor UI change: Adjusted max-width of the Settings modal to 75% for better use of available screen space
Full Changelog: v1.9...v1.9.1
v1.9 - Vector Re-Ranking & No More Whoosh
- Custom document chunker appends page-number data as metadata to chunks stored in the vectorDB
- LLM can now supply specific document names and page numbers within the response itself!
- Re-ranking and filtering applied via SentenceTransformer('all-MiniLM-L6-v2') to the vectorDB similarity search results for better contextual accuracy
- Whoosh indexing no longer necessary - far simplified book-keeping and no overhead for page-number searches at inference time
- Page number accuracy significantly increased as a result of all the above
- Default system-prompt template now instructs the LLM to include document names and page numbers whenever additional context is provided, actual output dependent on ability of the specific LLM used
- BUG FIX: PDF tabs in the document-viewer in the response window did not open properly for consecutive questions and on chat-history load. FIXED.
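The re-ranking step above re-scores vectorDB hits against the query before they reach the LLM. The sketch below shows the idea with plain cosine similarity over toy vectors; in LARS the embeddings come from SentenceTransformer('all-MiniLM-L6-v2'), which is omitted here to keep the example dependency-free:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rerank(query_vec, candidates, threshold=0.0):
    """Re-score retrieved chunks against the query and sort best-first,
    dropping anything below the threshold. `candidates` is a list of
    (chunk_text, embedding) pairs; the embeddings here are toy vectors
    standing in for real sentence embeddings."""
    scored = [(cosine(query_vec, emb), text) for text, emb in candidates]
    return [(s, t) for s, t in sorted(scored, reverse=True) if s >= threshold]
```

Filtering on a similarity threshold after re-ranking is what trims weakly-related chunks from the context handed to the LLM.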
Full Changelog: v1.8...v1.9
v1.8
MAJOR UPDATE:
- Google Drive Integration complete! Downloads files and folders recursively. Filtering, sorting and queued-loading of Google Drive docs is now available via the UI
- Improved highlighting: Implemented fuzzy-search logic, replacing exact matching, resulting in expanded highlighting on pages
- Improved RAG: Increased the cosine-similarity search threshold to 80% for more stringent and accurate matching, and sources data is now passed to the LLM for improved response quality
- Improved handling of images for citations: image extraction is now skipped for scanned docs
- Clearer document naming in citations: the unique ID of the highlighted document is no longer attached to the document name in the 'Refer to the following documents' citations block
- BUG FIX: When using the free-tier of the AzureCV OCR service, it will handle UsageLimitExceeded errors even when submitting multiple documents back-to-back, auto-waiting and resuming correctly
- BUG FIX: handle_api_error events will now actually return to the front-end!
- Refactored process_new_file method into smaller blocks that are now shared with the GoogleDrive loader and can be used by other integrations in the future too
- Increased chunk size to 500 and removed '250' from the name of the SBERT VectorDB created
- Cleaned up print and newline statements
- Improvements to accuracy and relevance of page numbers and doc names cited in response, further refinements on-going
- Replaced the Whoosh indexing search operator, from the default AND to OR
- HF-Waitress local-LLM server integration begins!
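The fuzzy-search highlighting mentioned above handles the common case where PDF text extraction drifts slightly from the stored chunk text, so exact substring matching fails. Here is one way to sketch the idea using only the stdlib's difflib; this is an illustration of the technique, not necessarily the matcher LARS uses:

```python
from difflib import SequenceMatcher

def best_fuzzy_span(page_text, snippet, min_ratio=0.6):
    """Locate the region of `page_text` best matching `snippet`, even
    when extraction artifacts prevent an exact match. Returns a
    (start, end) index pair, or None if nothing similar enough exists.
    Function name and approach are illustrative."""
    m = SequenceMatcher(None, page_text, snippet, autojunk=False)
    match = m.find_longest_match(0, len(page_text), 0, len(snippet))
    if match.size == 0:
        return None
    # Expand around the anchor match to roughly the snippet's length
    start = max(0, match.a - match.b)
    end = min(len(page_text), start + len(snippet))
    candidate = page_text[start:end]
    ratio = SequenceMatcher(None, candidate, snippet).ratio()
    return (start, end) if ratio >= min_ratio else None
```

Because the match is scored by similarity ratio rather than equality, a highlight can still land on the right span even when a character or two differs between the extracted page text and the chunk.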
Full Changelog: v1.7...v1.8
v1.7
- New models supported: Google Gemma2, DeepSeek V2, Llama-3.1
- Revamped Docker builds: new dockerfiles
- Pre-built images shared
- Various bug-fixes and enhancements
Full Changelog: v1.6...v1.7