Add tutorial for RAG of OpenAI and Bedrock #2975
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,197 @@ | ||
# Topic | ||
|
||
This tutorial explains how to use conversational search with the Bedrock Claude2 model. For more information, see [Conversational search](https://opensearch.org/docs/latest/search-plugins/conversational-search/). | ||
|
||
Note: Replace the placeholders that start with `your_` with your own values. | ||
The Claude3 model is not supported yet. | ||
|
||
Another way to build RAG/conversational search is to use the Agent Framework; see [RAG_with_conversational_flow_agent](../agent_framework/RAG_with_conversational_flow_agent.md). | ||
|
||
# Steps | ||
|
||
## 0. Preparation | ||
|
||
Ingest test data: | ||
``` | ||
POST _bulk | ||
{"index": {"_index": "qa_demo", "_id": "1"}} | ||
{"text": "Chart and table of population level and growth rate for the Ogden-Layton metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\nThe current metro area population of Ogden-Layton in 2023 is 750,000, a 1.63% increase from 2022.\nThe metro area population of Ogden-Layton in 2022 was 738,000, a 1.79% increase from 2021.\nThe metro area population of Ogden-Layton in 2021 was 725,000, a 1.97% increase from 2020.\nThe metro area population of Ogden-Layton in 2020 was 711,000, a 2.16% increase from 2019."} | ||
{"index": {"_index": "qa_demo", "_id": "2"}} | ||
{"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019."} | ||
{"index": {"_index": "qa_demo", "_id": "3"}} | ||
{"text": "Chart and table of population level and growth rate for the Chicago metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Chicago in 2023 is 8,937,000, a 0.4% increase from 2022.\\nThe metro area population of Chicago in 2022 was 8,901,000, a 0.27% increase from 2021.\\nThe metro area population of Chicago in 2021 was 8,877,000, a 0.14% increase from 2020.\\nThe metro area population of Chicago in 2020 was 8,865,000, a 0.03% increase from 2019."} | ||
{"index": {"_index": "qa_demo", "_id": "4"}} | ||
{"text": "Chart and table of population level and growth rate for the Miami metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Miami in 2023 is 6,265,000, a 0.8% increase from 2022.\\nThe metro area population of Miami in 2022 was 6,215,000, a 0.78% increase from 2021.\\nThe metro area population of Miami in 2021 was 6,167,000, a 0.74% increase from 2020.\\nThe metro area population of Miami in 2020 was 6,122,000, a 0.71% increase from 2019."} | ||
{"index": {"_index": "qa_demo", "_id": "5"}} | ||
{"text": "Chart and table of population level and growth rate for the Austin metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Austin in 2023 is 2,228,000, a 2.39% increase from 2022.\\nThe metro area population of Austin in 2022 was 2,176,000, a 2.79% increase from 2021.\\nThe metro area population of Austin in 2021 was 2,117,000, a 3.12% increase from 2020.\\nThe metro area population of Austin in 2020 was 2,053,000, a 3.43% increase from 2019."} | ||
{"index": {"_index": "qa_demo", "_id": "6"}} | ||
{"text": "Chart and table of population level and growth rate for the Seattle metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of Seattle in 2023 is 3,519,000, a 0.86% increase from 2022.\\nThe metro area population of Seattle in 2022 was 3,489,000, a 0.81% increase from 2021.\\nThe metro area population of Seattle in 2021 was 3,461,000, a 0.82% increase from 2020.\\nThe metro area population of Seattle in 2020 was 3,433,000, a 0.79% increase from 2019."} | ||
|
||
``` | ||
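The `_bulk` body above alternates an action line with a document line. As a minimal sketch of how such a body could be generated rather than typed by hand, the Python below builds the same newline-delimited JSON. The helper name `build_bulk_body` is hypothetical and not part of OpenSearch, and the two documents are shortened to one sentence each from the test data.

```python
import json


def build_bulk_body(index, docs):
    """Build an NDJSON _bulk body: one action line, then one source line, per document."""
    lines = []
    for doc_id, text in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({"text": text}))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Shortened excerpts of documents 1 and 2 from the test data.
docs = {
    "1": "The current metro area population of Ogden-Layton in 2023 is 750,000, a 1.63% increase from 2022.",
    "2": "The current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.",
}

bulk_body = build_bulk_body("qa_demo", docs)
print(bulk_body)
```

Using `json.dumps` for each line keeps escaping consistent, so a newline inside a document is always written as `\n` in the JSON rather than being escaped by hand.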
|
||
## 1. Create connector and model | ||
|
||
1. Create a connector for the Bedrock Claude2 model: | ||
|
||
Follow [this blueprint](https://github.com/opensearch-project/ml-commons/blob/2.x/docs/remote_inference_blueprints/bedrock_connector_anthropic_claude_blueprint.md) | ||
|
||
``` | ||
POST _plugins/_ml/connectors/_create | ||
{ | ||
"name": "Bedrock Claude2", | ||
"description": "Connector for Bedrock Claude2", | ||
"version": 1, | ||
"protocol": "aws_sigv4", | ||
"credential": { | ||
"access_key": "your_access_key", | ||
"secret_key": "your_secret_key", | ||
"session_token": "your_session_token" | ||
}, | ||
"parameters": { | ||
"region": "your_aws_region", | ||
"service_name": "bedrock", | ||
"model": "anthropic.claude-v2" | ||
}, | ||
"actions": [ | ||
{ | ||
"action_type": "predict", | ||
"method": "POST", | ||
"headers": { | ||
"content-type": "application/json" | ||
}, | ||
"url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke", | ||
"request_body": "{\"prompt\":\"\\n\\nHuman: ${parameters.inputs}\\n\\nAssistant:\",\"max_tokens_to_sample\":300,\"temperature\":0.5,\"top_k\":250,\"top_p\":1,\"stop_sequences\":[\"\\\\n\\\\nHuman:\"]}" | ||
} | ||
] | ||
} | ||
``` | ||
|
||
Note the connector ID; you will use it to create the model. | ||
|
||
2. Create model: | ||
``` | ||
POST /_plugins/_ml/models/_register?deploy=true | ||
{ | ||
"name": "Bedrock Claude2 model", | ||
"function_name": "remote", | ||
"description": "Bedrock Claude2 model", | ||
"connector_id": "your_connector_id" | ||
} | ||
``` | ||
|
||
Note the model ID; you will use it in the following steps. | ||
|
||
3. Test the model: | ||
``` | ||
POST /_plugins/_ml/models/your_model_id/_predict | ||
{ | ||
"parameters": { | ||
"inputs": "Who won the world series in 2020?" | ||
} | ||
} | ||
``` | ||
Sample response: | ||
``` | ||
{ | ||
"inference_results": [ | ||
{ | ||
"output": [ | ||
{ | ||
"name": "response", | ||
"dataAsMap": { | ||
"type": "completion", | ||
"completion": " The Los Angeles Dodgers won the 2020 World Series, defeating the Tampa Bay Rays 4 games to 2. The World Series was played at a neutral site in Arlington, Texas due to the COVID-19 pandemic. It was the Dodgers' first World Series championship since 1988.", | ||
"stop_reason": "stop_sequence", | ||
"stop": "\n\nHuman:" | ||
} | ||
} | ||
], | ||
"status_code": 200 | ||
} | ||
] | ||
} | ||
``` | ||
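The generated text sits under `inference_results[].output[].dataAsMap.completion` in the sample response. The Python sketch below shows one way a client might pull it out, assuming the response has already been parsed into a dictionary. The function name `extract_completion` is hypothetical, and the sample is a shortened copy of the response above.

```python
def extract_completion(predict_response):
    """Return the stripped completion text from a _predict response, or None if absent."""
    for result in predict_response.get("inference_results", []):
        for output in result.get("output", []):
            data = output.get("dataAsMap", {})
            if "completion" in data:
                # Claude2 completions start with a leading space, so strip it.
                return data["completion"].strip()
    return None


# Shortened copy of the sample response shown above.
sample = {
    "inference_results": [
        {
            "output": [
                {
                    "name": "response",
                    "dataAsMap": {
                        "type": "completion",
                        "completion": " The Los Angeles Dodgers won the 2020 World Series, defeating the Tampa Bay Rays 4 games to 2.",
                        "stop_reason": "stop_sequence",
                        "stop": "\n\nHuman:",
                    },
                }
            ],
            "status_code": 200,
        }
    ]
}

print(extract_completion(sample))
```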
|
||
## 2. Conversational search | ||
|
||
### 2.1 Create pipeline | ||
``` | ||
PUT /_search/pipeline/my-conversation-search-pipeline-claude2 | ||
{ | ||
"response_processors": [ | ||
{ | ||
"retrieval_augmented_generation": { | ||
"tag": "Demo pipeline", | ||
"description": "Demo pipeline Using Bedrock Claude2", | ||
"model_id": "your_model_id_created_in_step1", | ||
"context_field_list": [ | ||
"text" | ||
], | ||
"system_prompt": "You are a helpful assistant", | ||
"user_instructions": "Generate a concise and informative answer in less than 100 words for the given question" | ||
} | ||
} | ||
] | ||
} | ||
|
||
``` | ||
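The pipeline body depends on the model ID created in step 1, so an unreplaced `your_` placeholder is an easy mistake to make. The Python sketch below assembles the same body and rejects a placeholder ID before anything is sent. The helper `build_rag_pipeline` and the ID `example_model_id` are hypothetical, and the field names are copied from the request above.

```python
import json


def build_rag_pipeline(model_id, context_fields):
    """Build the search pipeline body for the retrieval_augmented_generation processor."""
    if model_id.startswith("your_"):
        # Catch the tutorial placeholder before it reaches the cluster.
        raise ValueError("Replace the placeholder model ID with the ID from step 1")
    return {
        "response_processors": [
            {
                "retrieval_augmented_generation": {
                    "tag": "Demo pipeline",
                    "description": "Demo pipeline Using Bedrock Claude2",
                    "model_id": model_id,
                    "context_field_list": list(context_fields),
                    "system_prompt": "You are a helpful assistant",
                    "user_instructions": "Generate a concise and informative answer in less than 100 words for the given question",
                }
            }
        ]
    }


# "example_model_id" is a made-up value for illustration only.
pipeline_body = build_rag_pipeline("example_model_id", ["text"])
print(json.dumps(pipeline_body, indent=2))
```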
|
||
### 2.2 Search | ||
Conversational search has some extra parameters you specify in `generative_qa_parameters`: | ||
``` | ||
GET /qa_demo/_search?search_pipeline=my-conversation-search-pipeline-claude2 | ||
{ | ||
"query": { | ||
"match": { | ||
"text": "What's the population increase of New York City from 2021 to 2023?" | ||
} | ||
}, | ||
"size": 1, | ||
"_source": [ | ||
"text" | ||
], | ||
"ext": { | ||
"generative_qa_parameters": { | ||
"llm_model": "bedrock/claude", | ||
"llm_question": "What's the population increase of New York City from 2021 to 2023?", | ||
Review comment: Since this is a conversational search, can you try a follow-up question to demonstrate that this is an ongoing conversation? For example: "Based on the population increase in New York, is it the fastest-growing city in the world?", or another follow-up question.
Reply: Makes sense, will add a follow-up question. |
||
"context_size": 5, | ||
"timeout": 15 | ||
} | ||
} | ||
} | ||
``` | ||
Sample response: | ||
``` | ||
{ | ||
"took": 1, | ||
"timed_out": false, | ||
"_shards": { | ||
"total": 1, | ||
"successful": 1, | ||
"skipped": 0, | ||
"failed": 0 | ||
}, | ||
"hits": { | ||
"total": { | ||
"value": 6, | ||
"relation": "eq" | ||
}, | ||
"max_score": 9.042081, | ||
"hits": [ | ||
{ | ||
"_index": "qa_demo", | ||
"_id": "2", | ||
"_score": 9.042081, | ||
"_source": { | ||
"text": "Chart and table of population level and growth rate for the New York City metro area from 1950 to 2023. United Nations population projections are also included through the year 2035.\\nThe current metro area population of New York City in 2023 is 18,937,000, a 0.37% increase from 2022.\\nThe metro area population of New York City in 2022 was 18,867,000, a 0.23% increase from 2021.\\nThe metro area population of New York City in 2021 was 18,823,000, a 0.1% increase from 2020.\\nThe metro area population of New York City in 2020 was 18,804,000, a 0.01% decline from 2019." | ||
} | ||
} | ||
] | ||
}, | ||
"ext": { | ||
"retrieval_augmented_generation": { | ||
"answer": " The population of the New York City metro area increased by 114,000 people from 2021 to 2023, going from 18,823,000 in 2021 to 18,937,000 in 2023. This represents a 0.6% increase over the two year period." | ||
} | ||
} | ||
} | ||
``` |
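The generated answer is returned under `ext.retrieval_augmented_generation.answer`, and its figures can be checked against the retrieved document. The Python sketch below reads the answer from a parsed response and recomputes the increase from the 2021 and 2023 populations in document 2. The function name `extract_answer` is hypothetical, and the sample keeps only the `ext` part of the response above.

```python
def extract_answer(search_response):
    """Return the generated answer from a conversational search response, or None."""
    rag = search_response.get("ext", {}).get("retrieval_augmented_generation", {})
    answer = rag.get("answer")
    return answer.strip() if answer is not None else None


# Only the "ext" section of the sample response is reproduced here.
sample_response = {
    "ext": {
        "retrieval_augmented_generation": {
            "answer": " The population of the New York City metro area increased by 114,000 people from 2021 to 2023, going from 18,823,000 in 2021 to 18,937,000 in 2023. This represents a 0.6% increase over the two year period."
        }
    }
}
answer = extract_answer(sample_response)

# Figures taken from the retrieved document (_id 2).
pop_2021 = 18_823_000
pop_2023 = 18_937_000
increase = pop_2023 - pop_2021
percent = increase / pop_2021 * 100

print(answer)
print(f"{increase:,} people, {percent:.1f}%")  # 114,000 people, 0.6%
```

The recomputed values match the model's answer, which is a quick way to confirm that the answer is grounded in the retrieved context rather than invented.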
Review comment: Is it a required parameter here? Why don't we use the content of the query?
Reply: Yes, `llm_question` is mandatory; see https://opensearch.org/docs/latest/search-plugins/conversational-search/. This is the feature's design: some queries may have no text content, or the query may differ from the question.