Introducing Structured Outputs #939

ivanleomk · 2024-08-20T14:37:46Z

I tried a more explicit title - but we can also do something like Announcing support for Structured Output

🚀	This description was created by Ellipsis for commit `1e49b66`

Summary:

This PR adds a blog post comparing OpenAI's Structured Outputs with the instructor tool, addressing challenges and showcasing solutions with examples.

Key points:

Added blog post docs/blog/posts/introducing-structured-outputs.md on OpenAI's Structured Outputs.
Updated author avatar in docs/blog/.authors.yml.
Replaced Mode.STRUCTURED_OUTPUTS with Mode.TOOLS_STRICT in examples.
Discussed challenges: limited validation, streaming issues, latency spikes.
Included pydantic code examples for validation.
Benchmarked against instructor tool.
Highlighted instructor features: automatic validation, retries, real-time streaming, provider-agnostic API.
Demonstrated streaming and partial data extraction.

Generated with ❤️ by ellipsis.dev

ellipsis-dev

👍 Looks good to me! Reviewed everything up to e1eb60c in 11 seconds

More details

Looked at 489 lines of code in 2 files
Skipped 1 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. docs/blog/posts/introducing-structured-outputs.md:4

Draft comment:
The slug is-instructor-dead is misleading and unrelated to the content of the blog post. Consider changing it to something more relevant, like introducing-structured-outputs.
Reason this comment was not posted:
Confidence changes required: 80%
The PR introduces a new blog post, but the slug in the front matter is misleading and unrelated to the content.

2. docs/blog/posts/introducing-structured-outputs.md:1

Draft comment:
Ensure this new markdown file is added to mkdocs.yml for proper documentation inclusion.
Reason this comment was not posted:
Confidence changes required: 80%
The new markdown file should be added to mkdocs.yml for documentation consistency.

Workflow ID: wflow_SrIsJn2iqFKVC6aJ

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

cloudflare-workers-and-pages · 2024-08-20T14:39:01Z

Deploying instructor with Cloudflare Pages

Latest commit:	`1e49b66`
Status:	✅ Deploy successful!
Preview URL:	https://2336d6ba.instructor.pages.dev
Branch Preview URL:	https://introducing-structured-outpu.instructor.pages.dev

View logs

jxnl · 2024-08-20T15:58:01Z

change the title to "should i be using structured outputs"

and answer the question directly in the intro


# Is Instructor Dead?

## What's Open AI's Structured Output mode all about?

OpenAI's new Structured Output mode is a huge step change for developers building complex workflows. Given an arbitrary JSON Schema, Structured Output ensures that the response matches the schema exactly.

Here's a basic example.

to something like

# Two challenges with OpenAI's Structured outputs

With guaranteed schema adherence, outputs always conform to your defined Pydantic model, eliminating type mismatches and missing fields. However, while Structured Outputs solve many common issues, two key challenges emerge when building more sophisticated applications

1. Limited capabilities for reasking validations 
2. Limited Capabilities for streaming structured data

## What is Structured Output mode?

OpenAI's new Structured Output mode is a huge step change for developers building complex workflows. Given an arbitrary JSON Schema, Structured Output ensures that the response matches the schema exactly.

Here's a basic example.

jxnl · 2024-08-20T16:01:42Z

but in your body you mention more than streaming and validation, so you need to forshadow more.

validation: ...
streaming: ...
latency: ...

Don't assume someone is going to read it without knowing what they get out of the article.

jxnl · 2024-08-20T16:02:46Z

remove the citations section

docs/blog/posts/introducing-structured-outputs.md

jxnl · 2024-08-20T16:03:24Z

docs/blog/posts/introducing-structured-outputs.md

+#> name='Jason' age=25
+```
+
+With guaranteed schema adherence, outputs always conform to your defined Pydantic model, eliminating type mismatches and missing fields. However, while Structured Outputs solve many common issues, two key challenges emerge when building more sophisticated applications - that of Validation and Streaming.


move this to top but theres more issues than these two, like latency, and also use a list

docs/blog/posts/introducing-structured-outputs.md

jxnl · 2024-08-20T16:04:26Z

docs/blog/posts/introducing-structured-outputs.md

+            #> {"name":"Jason","age":25}
+```
+
+## Should you be using Structured Output mode?


you ask should i use it, but then you share latency metrics,

it should just be a section on latency

docs/blog/posts/introducing-structured-outputs.md

jxnl · 2024-08-20T16:07:23Z

docs/blog/posts/introducing-structured-outputs.md

+
+## What's Open AI's Structured Output mode all about?
+
+OpenAI's new Structured Output mode is a huge step change for developers building complex workflows. Given an arbitrary JSON Schema, Structured Output ensures that the response matches the schema exactly.


mention that we were referenced to the release article

link to the release article

jxnl · 2024-08-20T16:07:33Z

docs/blog/posts/introducing-structured-outputs.md

+
+### Streaming
+
+Streaming with Structured Outputs is supported but a challenging endeavour. There's no built-in partial validation and you need to manually parse the generated response while simultaneously having to now use a context manager to access the generated values.


link to docs

I added streaming links at the bottom portion instead when we mention instructor

docs/blog/posts/introducing-structured-outputs.md

jxnl · 2024-08-20T16:08:33Z

personally don't care if its 'Announcing support for Structured Output' it benefits no one,

"should i use structured outputs vs instructor" answers a qustion peopel have in their head.

ellipsis-dev

👍 Looks good to me! Incremental review on b3d4dec in 38 seconds

More details

Looked at 231 lines of code in 7 files
Skipped 0 files when reviewing.
Skipped posting 12 drafted comments based on config settings.

1. instructor/client.py:417

Draft comment:
The mode STRUCTURED_OUTPUTS has been changed to TOOLS_STRICT. Ensure this change is consistent across the codebase and that all references to STRUCTURED_OUTPUTS are updated accordingly.
Reason this comment was not posted:
Confidence changes required: 80%
The PR changes the mode from STRUCTURED_OUTPUTS to TOOLS_STRICT in multiple places. This change should be consistent across the codebase. The change in instructor/client.py is consistent with the changes in other files.

2. instructor/dsl/iterable.py:97

Draft comment:
The mode STRUCTURED_OUTPUTS has been changed to TOOLS_STRICT. Ensure this change is consistent across the codebase and that all references to STRUCTURED_OUTPUTS are updated accordingly.
Reason this comment was not posted:
Confidence changes required: 80%
The PR changes the mode from STRUCTURED_OUTPUTS to TOOLS_STRICT in multiple places. This change should be consistent across the codebase. The change in instructor/dsl/iterable.py is consistent with the changes in other files.

3. instructor/dsl/partial.py:173

Draft comment:
The mode STRUCTURED_OUTPUTS has been changed to TOOLS_STRICT. Ensure this change is consistent across the codebase and that all references to STRUCTURED_OUTPUTS are updated accordingly.
Reason this comment was not posted:
Confidence changes required: 80%
The PR changes the mode from STRUCTURED_OUTPUTS to TOOLS_STRICT in multiple places. This change should be consistent across the codebase. The change in instructor/dsl/partial.py is consistent with the changes in other files.

4. instructor/function_calls.py:268

Draft comment:
The mode STRUCTURED_OUTPUTS has been changed to TOOLS_STRICT. Ensure this change is consistent across the codebase and that all references to STRUCTURED_OUTPUTS are updated accordingly.
Reason this comment was not posted:
Confidence changes required: 80%
The PR changes the mode from STRUCTURED_OUTPUTS to TOOLS_STRICT in multiple places. This change should be consistent across the codebase. The change in instructor/function_calls.py is consistent with the changes in other files.

5. instructor/process_response.py:253

Draft comment:
The mode STRUCTURED_OUTPUTS has been changed to TOOLS_STRICT. Ensure this change is consistent across the codebase and that all references to STRUCTURED_OUTPUTS are updated accordingly.
Reason this comment was not posted:
Confidence changes required: 80%
The PR changes the mode from STRUCTURED_OUTPUTS to TOOLS_STRICT in multiple places. This change should be consistent across the codebase. The change in instructor/process_response.py is consistent with the changes in other files.

6. instructor/retry.py:111

Draft comment:
The mode STRUCTURED_OUTPUTS has been changed to TOOLS_STRICT. Ensure this change is consistent across the codebase and that all references to STRUCTURED_OUTPUTS are updated accordingly.
Reason this comment was not posted:
Confidence changes required: 80%
The PR changes the mode from STRUCTURED_OUTPUTS to TOOLS_STRICT in multiple places. This change should be consistent across the codebase. The change in instructor/retry.py is consistent with the changes in other files.

7. instructor/mode.py:22

Draft comment:
If TOOLS_STRICT is a new mode replacing STRUCTURED_OUTPUTS, ensure that documentation and tests are updated accordingly.
Reason this comment was not posted:
Confidence changes required: 80%
The PR introduces a new mode TOOLS_STRICT replacing STRUCTURED_OUTPUTS. This change should be reflected in the documentation and tests.

8. instructor/client.py:68

Draft comment:
The create method overloads have been modified. Ensure that the documentation is updated to reflect these changes.
Reason this comment was not posted:
Confidence changes required: 80%
The create method overloads have been modified, but the changes are not documented. This is a library code change, so documentation should be updated.

9. instructor/client.py:132

Draft comment:
The create_partial method overloads have been modified. Ensure that the documentation is updated to reflect these changes.
Reason this comment was not posted:
Confidence changes required: 80%
The create_partial method overloads have been modified, but the changes are not documented. This is a library code change, so documentation should be updated.

10. instructor/client.py:188

Draft comment:
The create_iterable method overloads have been modified. Ensure that the documentation is updated to reflect these changes.
Reason this comment was not posted:
Confidence changes required: 80%
The create_iterable method overloads have been modified, but the changes are not documented. This is a library code change, so documentation should be updated.

11. instructor/client.py:232

Draft comment:
The create_with_completion method overloads have been modified. Ensure that the documentation is updated to reflect these changes.
Reason this comment was not posted:
Confidence changes required: 80%
The create_with_completion method overloads have been modified, but the changes are not documented. This is a library code change, so documentation should be updated.

12. instructor/client.py:447

Draft comment:
The from_litellm function has been modified. Ensure that the documentation is updated to reflect these changes.
Reason this comment was not posted:
Confidence changes required: 80%
The from_litellm function has been modified, but the changes are not documented. This is a library code change, so documentation should be updated.

Workflow ID: wflow_QbK0th4AZ1Hxsg9H

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

❌ Changes requested. Incremental review on e2dda01 in 23 seconds

More details

Looked at 437 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. docs/blog/posts/introducing-structured-outputs.md:176

Draft comment:
Typo in 'targetted'. Consider changing it to 'targeted'.

This built-in retry logic allows for targeted correction to the generated response, ensuring that outputs are not only consistent with your schema but also correct for your use-case. This is invaluable in building reliable LLM systems.

Reason this comment was not posted:
Confidence changes required: 10%
The blog post contains a typo in the word 'targetted'.

2. docs/blog/posts/introducing-structured-outputs.md:284

Draft comment:
Typo in 'swtich'. Consider changing it to 'switch'.

For example, the switch from OpenAI to Anthropic requires only three adjustments

Reason this comment was not posted:
Confidence changes required: 10%
The blog post contains a typo in the word 'swtich'.

Workflow ID: wflow_a7twIC67RZiZTyjh

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-08-21T14:44:01Z

docs/blog/posts/introducing-structured-outputs.md

+  - OpenAI
+authors:
+  - ivanleomk
+---


Since this is a new markdown file, ensure it's added to mkdocs.yml for proper documentation inclusion.

ellipsis-dev

👍 Looks good to me! Incremental review on 322ab5e in 25 seconds

More details

Looked at 437 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. docs/blog/posts/introducing-structured-outputs.md:71

Draft comment:
Typo: 'targetted' should be 'targeted'.

This leaves developers without the means to implement retry logic so that the LLM can provide a targeted correction and regenerate its response.

Reason this comment was not posted:
Confidence changes required: 10%
The term 'targetted' is misspelled and should be corrected for clarity and professionalism.

2. docs/blog/posts/introducing-structured-outputs.md:126

Draft comment:
Typo: 'satisfication' should be 'satisfaction'.

potentially impacting the overall user satisfaction and retention rates.

Reason this comment was not posted:
Confidence changes required: 10%
The word 'satisfication' is misspelled and should be corrected to 'satisfaction'.

3. docs/blog/posts/introducing-structured-outputs.md:284

Draft comment:
Typo: 'swtich' should be 'switch'.

For example, the switch from OpenAI to Anthropic requires only three adjustments

Reason this comment was not posted:
Confidence changes required: 10%
The word 'swtich' is misspelled and should be corrected to 'switch'.

Workflow ID: wflow_BhCLvKKLw9rRJ0hk

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 322ab5e in 17 seconds

More details

Looked at 437 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 4 drafted comments based on config settings.

1. docs/blog/posts/introducing-structured-outputs.md:71

Draft comment:
Typo: 'targetted' should be 'targeted'.
Reason this comment was not posted:
Confidence changes required: 10%
The term 'targetted' is misspelled and should be corrected to 'targeted'.

2. docs/blog/posts/introducing-structured-outputs.md:176

Draft comment:
Typo: 'corect' should be 'correct'.
Reason this comment was not posted:
Confidence changes required: 10%
The term 'corect' is misspelled and should be corrected to 'correct'.

3. docs/blog/posts/introducing-structured-outputs.md:284

Draft comment:
Typo: 'swtich' should be 'switch'.
Reason this comment was not posted:
Confidence changes required: 10%
The word 'swtich' is misspelled and should be corrected to 'switch'.

4. docs/blog/posts/introducing-structured-outputs.md:9

Draft comment:
Since this is a new blog post, ensure it is added to mkdocs.yml for proper documentation.
Reason this comment was not posted:
Confidence changes required: 80%
The document is a new blog post, so it should be added to mkdocs.yml for proper documentation.

Workflow ID: wflow_BhCLvKKLw9rRJ0hk

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 378f7fa in 1 minute and 5 seconds

More details

Looked at 39 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 4 drafted comments based on config settings.

1. docs/blog/posts/introducing-structured-outputs.md:21

Draft comment:
Add a period at the end of the sentence for consistency.

1. **Limited Validation And Retry Logic**: Structured Outputs ensure adherence to the schema but not useful content. You might get perfectly formatted yet unhelpful responses.

Reason this comment was not posted:
Confidence changes required: 50%
The blog post contains several instances where sentences are missing periods at the end. This is a grammatical issue that should be corrected for consistency and professionalism.

2. docs/blog/posts/introducing-structured-outputs.md:22

Draft comment:
Add a period at the end of the sentence for consistency.

2. **Streaming Challenges**: Parsing raw JSON objects from streamed responses with the sdk is error-prone and inefficient.

Reason this comment was not posted:
Confidence changes required: 50%
The blog post contains several instances where sentences are missing periods at the end. This is a grammatical issue that should be corrected for consistency and professionalism.

3. docs/blog/posts/introducing-structured-outputs.md:23

Draft comment:
Add a period at the end of the sentence for consistency.

3. **Unpredictable Latency Issues** : Structured Outputs suffers from random latency spikes that might result in an almost 20x increase in response time.

Reason this comment was not posted:
Confidence changes required: 50%
The blog post contains several instances where sentences are missing periods at the end. This is a grammatical issue that should be corrected for consistency and professionalism.

4. docs/blog/posts/introducing-structured-outputs.md:18

Draft comment:
Ensure this new blog post is added to the mkdocs.yml file for proper documentation inclusion.
Reason this comment was not posted:
Confidence changes required: 80%
The blog post is a new addition and should be included in the mkdocs.yml file for documentation.

Workflow ID: wflow_LqRNtfeHVX8MSe5U

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

jxnl

going to approve merge any time, but you should also mention 'vendor lock in' as one of the challenges.

ellipsis-dev

❌ Changes requested. Incremental review on 1e49b66 in 25 seconds

More details

Looked at 15 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_T51vXKcMq76QInGk

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-08-22T04:39:19Z

docs/blog/posts/introducing-structured-outputs.md

+But before you do so, three key challenges remain:
+
+1. **Limited Validation And Retry Logic**: Structured Outputs ensure adherence to the schema but not useful content. You might get perfectly formatted yet unhelpful responses
+2. **Streaming Challenges**: Parsing raw JSON objects from streamed responses with the sdk is error-prone and inefficient


Ensure this new blog post is added to the mkdocs.yml file for documentation.

Added a new blog article draft

e1eb60c

ellipsis-dev bot reviewed Aug 20, 2024

View reviewed changes

jxnl requested changes Aug 20, 2024

View reviewed changes

ellipsis-dev bot reviewed Aug 21, 2024

View reviewed changes

ivanleomk force-pushed the introducing-structured-outputs branch from b3d4dec to e1eb60c Compare August 21, 2024 02:01

ellipsis-dev bot reviewed Aug 21, 2024

View reviewed changes

Fixed up new structured output article changes

322ab5e

ivanleomk force-pushed the introducing-structured-outputs branch from e2dda01 to 322ab5e Compare August 21, 2024 14:46

ellipsis-dev bot reviewed Aug 21, 2024

View reviewed changes

Fixed up the code annotations

378f7fa

ellipsis-dev bot reviewed Aug 21, 2024

View reviewed changes

ivanleomk requested a review from jxnl August 21, 2024 15:15

jxnl approved these changes Aug 21, 2024

View reviewed changes

Fixed up vendor lock in content

1e49b66

ellipsis-dev bot reviewed Aug 22, 2024

View reviewed changes

ivanleomk merged commit 30f4e2d into structured-output-v2 Aug 22, 2024
6 of 7 checks passed

ivanleomk deleted the introducing-structured-outputs branch August 22, 2024 04:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introducing Structured Outputs #939

Introducing Structured Outputs #939

ivanleomk commented Aug 20, 2024 •

edited by ellipsis-dev bot

Loading

ellipsis-dev bot left a comment

cloudflare-workers-and-pages bot commented Aug 20, 2024 •

edited

Loading

jxnl commented Aug 20, 2024

jxnl commented Aug 20, 2024

jxnl commented Aug 20, 2024

jxnl Aug 20, 2024

jxnl Aug 20, 2024

jxnl Aug 20, 2024

jxnl Aug 20, 2024

ivanleomk Aug 21, 2024

jxnl commented Aug 20, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot Aug 21, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

jxnl left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot Aug 22, 2024


		## What's Open AI's Structured Output mode all about?

		OpenAI's new Structured Output mode is a huge step change for developers building complex workflows. Given an arbitrary JSON Schema, Structured Output ensures that the response matches the schema exactly.


		### Streaming

		Streaming with Structured Outputs is supported but a challenging endeavour. There's no built-in partial validation and you need to manually parse the generated response while simultaneously having to now use a context manager to access the generated values.

Introducing Structured Outputs #939

Introducing Structured Outputs #939

Conversation

ivanleomk commented Aug 20, 2024 • edited by ellipsis-dev bot Loading

Summary:

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

cloudflare-workers-and-pages bot commented Aug 20, 2024 • edited Loading

Deploying instructor with Cloudflare Pages

jxnl commented Aug 20, 2024

jxnl commented Aug 20, 2024

jxnl commented Aug 20, 2024

jxnl Aug 20, 2024

Choose a reason for hiding this comment

jxnl Aug 20, 2024

Choose a reason for hiding this comment

jxnl Aug 20, 2024

Choose a reason for hiding this comment

jxnl Aug 20, 2024

Choose a reason for hiding this comment

ivanleomk Aug 21, 2024

Choose a reason for hiding this comment

jxnl commented Aug 20, 2024

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Aug 21, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

jxnl left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Aug 22, 2024

Choose a reason for hiding this comment

ivanleomk commented Aug 20, 2024 •

edited by ellipsis-dev bot

Loading

cloudflare-workers-and-pages bot commented Aug 20, 2024 •

edited

Loading