
feat(blog): Add new post on llama-cpp-python and instructor library usage #434

Merged
merged 6 commits into from
Feb 12, 2024

Conversation

jxnl
Owner

@jxnl jxnl commented Feb 12, 2024

Ellipsis 🚀 This PR description was created by Ellipsis for commit bf826c3.

Summary:

This PR adds a new blog post discussing the use of llama-cpp-python for structured outputs and the enhancement of create calls with the instructor library, including a Python code example.

Key points:

  • Added a new blog post in /docs/blog/posts/llama-cpp-python.md
  • The post discusses using llama-cpp-python for structured outputs
  • It also covers enhancing create calls with instructor library
  • Includes a Python code example demonstrating these features

Generated with ❤️ by ellipsis.dev

@ellipsis-dev ellipsis-dev bot changed the title ... feat(blog): Add new post on llama-cpp-python and instructor library usage Feb 12, 2024
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Looks good to me! Reviewed entire PR up to commit bf826c3

Reviewed 121 lines of code across 1 file in 1 minute and 3 seconds.

See details
  • Skipped files: 0 (please contact us to request support for these files)
  • Confidence threshold: 85%
  • Drafted 0 additional comments.
  • Workflow ID: wflow_b0ZkUnb5GvB5ybuj

Something look wrong? You can customize Ellipsis by editing the ellipsis.yaml for this repository.

Generated with ❤️ by ellipsis.dev

docs/blog/posts/llama-cpp-python.md (outdated; resolved)

Recently, llama-cpp-python added support for structured outputs via JSON schema. This is a time-saving alternative to extensive prompt engineering and reliably yields structured outputs.

In this example we'll cover a more advanced use case: using `JSON_SCHEMA` mode to stream out partial models. To learn more about partial streaming, check out [partial streaming](../../concepts/partial.md).
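The core idea behind partial streaming is repairing an incomplete JSON prefix so intermediate objects can be parsed while tokens are still arriving. A minimal sketch of that idea, using only the standard library (this is an illustration, not instructor's actual implementation; `complete_partial` is a hypothetical helper, and instructor returns typed partial Pydantic models rather than plain dicts):

```python
import json


def complete_partial(buf: str):
    """Naively close an unfinished JSON object so intermediate
    states can be parsed mid-stream. Handles only the simple
    cases of an unterminated string and unclosed braces."""
    fixed = buf
    if fixed.count('"') % 2 == 1:  # unterminated string literal
        fixed += '"'
    fixed += "}" * (fixed.count("{") - fixed.count("}"))
    try:
        return json.loads(fixed)
    except json.JSONDecodeError:
        return None


# Simulate chunks arriving from a streaming completion.
stream = ['{"name": "Ja', 'son", "age": 3', '0}']
buf = ""
for chunk in stream:
    buf += chunk
    print(complete_partial(buf))
# First iteration yields {'name': 'Ja'}; each later chunk refines the object.
```

Each intermediate object can be rendered immediately, which is what makes partial streaming useful for responsive UIs.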
Contributor
In this example we'll cover a more advanced use case of `JSON_SCHEMA` mode to stream out partial models. To learn more about partial streaming, check out partial streaming.

```python
console.print(obj)
```

1. We use `LlamaPromptLookupDecoding` to speed up structured output generation via speculative decoding. The draft model proposes candidate tokens during generation; 10 is good for GPU, 2 is good for CPU.
Contributor
We use `LlamaPromptLookupDecoding` to speed up structured output generation using speculative decoding. The draft model generates candidate tokens during generation; 10 is good for GPU, 2 is good for CPU.
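Prompt lookup decoding drafts candidate tokens by matching the tail of the generated text against the prompt and copying whatever followed the match; the model then verifies the draft in a single forward pass, accepting the tokens that agree. A toy sketch of the matching step (operating on words for clarity; the real `LlamaPromptLookupDecoding` in llama-cpp-python works on token ids, and `prompt_lookup_draft` here is a hypothetical helper):

```python
def prompt_lookup_draft(prompt_tokens, generated, ngram_size=2, num_pred=3):
    """Propose draft tokens by finding the last n-gram of the generated
    text inside the prompt and copying the tokens that followed it.
    Returns [] when no match exists, falling back to normal decoding."""
    if len(generated) < ngram_size:
        return []
    tail = generated[-ngram_size:]
    for i in range(len(prompt_tokens) - ngram_size):
        if prompt_tokens[i:i + ngram_size] == tail:
            return prompt_tokens[i + ngram_size:i + ngram_size + num_pred]
    return []


prompt = "the quick brown fox jumps over the lazy dog".split()
generated = "we saw the quick".split()
print(prompt_lookup_draft(prompt, generated))  # → ['brown', 'fox', 'jumps']
```

This works well for extraction-style tasks, where the output largely copies spans of the prompt, which is exactly the structured-output setting of this post.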

@jxnl jxnl merged commit 1ddb147 into main Feb 12, 2024
11 of 12 checks passed
@jxnl jxnl deleted the llama-cpp-blog branch February 12, 2024 22:02