
Added Llama-2-13b-Chat plugin and modified README.md #5

Merged: 12 commits into aws-samples:develop on Sep 21, 2023

Conversation

aaron-sim

Hi,

I have created a plugin that uses the Llama-2-13b-chat model (it requires the model to be deployed via SageMaker JumpStart).

I added a folder for the llama-2-13b-chat-llm plugin and updated README.md with the Llama-related information.

@rstrahan (Contributor) left a comment:

This is great. I left a few comments; if you can address them, we can get this merged quickly. Thanks so much!

README.md Outdated
@@ -10,6 +10,7 @@ The directions below explain how to build and deploy the plugins. For more infor
1. AI21 LLM: Uses AI21's Jurassic model API - requires an AI21 account with an API Key
2. Anthropic LLM: Uses Anthropic's Claude model API - requires an Anthropic account with an API Key
3. Amazon Bedrock Embeddings and LLM: Uses Amazon Bedrock service API (preview) - requires access to Amazon Bedrock service (currently in private preview)
4. Llama 2 13b Chat LLM: Uses the Llama 2 13b Chat model - requires the Llama-2-chat model to be deployed via SageMaker JumpStart.
@rstrahan (Contributor):

Add a reference link pointing users to a guide on how to install the model in SageMaker JumpStart.

@aaron-sim (Author):

Addressed in the 2nd commit.

import io
from typing import Dict

# TEMPERATURE = os.environ.get("TEMPERATURE", 1e-10)
@rstrahan (Contributor):

Remove commented-out code throughout, if unused, for cleanliness.
Presumably these settings are supplied via model parameters in the event, not local function environment variables.
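For illustration, reading the settings from the event's model parameters might look like the sketch below (the key names and defaults here are assumptions, not taken from the plugin code):

# Minimal sketch: take inference settings from the event payload,
# falling back to defaults, instead of Lambda environment variables.
DEFAULTS = {"temperature": 1e-10, "max_new_tokens": 1024}

def get_model_params(event: dict) -> dict:
    params = dict(DEFAULTS)
    # "parameters" is a hypothetical event key for this sketch.
    params.update(event.get("parameters") or {})
    return params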

@aaron-sim (Author):

Addressed in the 2nd commit.

# MAX_NEW_TOKENS = os.environ.get("MAX_NEW_TOKENS", 1024) # max number of tokens to generate in the output

# grab environment variables
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
@rstrahan (Contributor):

Can you rename this to SAGEMAKER_ENDPOINT_NAME so it doesn't get confused with the ENDPOINT_URL used in the other functions?
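With the rename applied, the handler's invocation code might look roughly like this sketch (the request payload schema is an assumption for illustration; JumpStart Llama 2 chat endpoints expect their own specific format):

import json
import os

import boto3

# Renamed from ENDPOINT_NAME per the review, to avoid confusion with ENDPOINT_URL.
SAGEMAKER_ENDPOINT_NAME = os.environ["SAGEMAKER_ENDPOINT_NAME"]

sagemaker_runtime = boto3.client("sagemaker-runtime")

def invoke_llama(prompt: str, parameters: dict) -> str:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=SAGEMAKER_ENDPOINT_NAME,
        ContentType="application/json",
        # JumpStart Llama 2 endpoints require explicit EULA acceptance.
        CustomAttributes="accept_eula=true",
        Body=json.dumps({"inputs": prompt, "parameters": parameters}),
    )
    return response["Body"].read().decode("utf-8")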

@aaron-sim (Author):

Addressed in the 2nd commit.

Description: QnABot on AWS LLM Plugin for Llama-2-13b-chat

Parameters:

@rstrahan (Contributor):

I think we need parameters to identify the (existing)

  • SageMakerEndpointName, and
  • SageMakerEndpointArn

and more helpful text in the parameter descriptions to make it very clear how users should provision the endpoint (e.g. the JumpStart docs) and obtain the Name and ARN. A rough sketch follows this comment.
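For illustration only, the suggested parameters could look like this (the descriptions are assumptions, not the merged wording):

Parameters:
  SageMakerEndpointName:
    Type: String
    Description: >-
      Name of your existing SageMaker inference endpoint running the
      Llama-2-13b-chat model. Deploy the model via SageMaker JumpStart
      first, then copy the endpoint name from the SageMaker console.
  SageMakerEndpointArn:
    Type: String
    Description: >-
      ARN of the same endpoint, used to scope the Lambda invoke
      permission to least privilege.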

@aaron-sim (Author):

Addressed in the 2nd commit.

Action:
- "sagemaker:InvokeEndpoint"
Resource:
- "*"
@rstrahan (Contributor):

Scope this to least privilege. The Lambda only needs to invoke the specific endpoint supplied as a parameter, e.g. SageMakerEndpointArn.
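A sketch of the scoped statement, assuming a SageMakerEndpointArn parameter as suggested above:

Action:
  - "sagemaker:InvokeEndpoint"
Resource:
  - !Ref SageMakerEndpointArn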

@aaron-sim (Author):

Addressed in the 2nd commit.


Parameters:

LLMModel:
@rstrahan (Contributor):

Is this meant to be the SageMaker endpoint name? If so, let's rename it to SageMakerEndpointName and provide a more complete description.
Also add one for the ARN, so you can scope the invoke policy properly.

@aaron-sim (Author):

Addressed in the 2nd commit.

Runtime: python3.10
Environment:
Variables:
ENDPOINT_NAME: !Ref LLMModel
@rstrahan (Contributor):

As noted above, the parameter name is confusing, since you want the endpoint name, not the model. Call it SageMakerEndpointName, as suggested earlier.
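With both renames applied, the wiring would look roughly like this sketch:

Runtime: python3.10
Environment:
  Variables:
    SAGEMAKER_ENDPOINT_NAME: !Ref SageMakerEndpointName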

@aaron-sim (Author):

Addressed in the 2nd commit.

@@ -0,0 +1,112 @@
AWSTemplateFormatVersion: "2010-09-09"
Description: QnABot on AWS LLM Plugin for Llama-2-13b-chat
@rstrahan (Contributor):

Mention in the description that the model is deployed using SageMaker JumpStart (and add a link).

@aaron-sim (Author):

Addressed in the 2nd commit.

… construct the ARN using the SageMakerEndpointName parameter
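Per that commit message, the invoke permission presumably ends up scoped to an ARN built from the endpoint name. A sketch of that pattern (not the exact merged template; it assumes the endpoint lives in the deploying account and region):

Action:
  - "sagemaker:InvokeEndpoint"
Resource:
  - !Sub "arn:${AWS::Partition}:sagemaker:${AWS::Region}:${AWS::AccountId}:endpoint/${SageMakerEndpointName}"
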
@rstrahan changed the base branch from main to develop on Sep 21, 2023
@rstrahan merged commit aaaf41b into aws-samples:develop on Sep 21, 2023
@aaron-sim deleted the llama2_plugin branch on Oct 10, 2023