Add classifyCategories function for llm scorecard #794

Open: wants to merge 1 commit into base: main
2 changes: 1 addition & 1 deletion packages/scripts/package.json
@@ -44,11 +44,11 @@
"@release-it/bumper": "^5.1.0",
"@slack/web-api": "^7.8.0",
"@stdlib/random-sample": "^0.2.1",
"chatbot-server-mongodb-public": "*",
"csv": "^6.3.1",
"dotenv": "^16.3.1",
"mongodb-chatbot-server": "*",
"mongodb-rag-core": "*",
"chatbot-server-mongodb-public": "*",
"yaml": "^2.3.4",
"yargs": "^17.7.2"
},
149 changes: 149 additions & 0 deletions packages/scripts/src/llm-scorecard/classifyCategories.ts
@@ -0,0 +1,149 @@
import {
  type ClassificationType,
  getEnv,
  makeClassifier,
} from "mongodb-rag-core";
import { AzureOpenAI } from "mongodb-rag-core/openai";

const env = getEnv({
  required: ["OPENAI_API_KEY", "OPENAI_ENDPOINT", "OPENAI_API_VERSION"],
});

const openAiClient = new AzureOpenAI({
  apiKey: env.OPENAI_API_KEY,
  endpoint: env.OPENAI_ENDPOINT,
  apiVersion: env.OPENAI_API_VERSION,
});

const classificationTypes: ClassificationType[] = [
Copilot AI Jul 1, 2025

[nitpick] This inline definition of many category examples can become hard to maintain as you add new categories. Consider loading this data from an external JSON or CSV file and transforming it into classificationTypes to keep the code concise.

  {
    type: "Advanced Features",
    description:
      "Prompts that only apply to specific use cases or complex features.",
    examples: [
      { text: "does mongodb support transactions" },
      { text: "how to use mongodump" },
      { text: "How do I backup a MongoDB database" },
      { text: "How do I use mongorestore from dump" },
      {
        text: "What's the difference between ANN and ENN search in Atlas Vector Search?",
      },
      { text: "What are mongosync limitations" },
      { text: "how to use gridfs in mongodb" },
      {
        text: "Does Atlas Vector Search work with images, media files, and other types of data?",
      },
    ],
  },
  {
    type: "AI/LLM Integration",
    description:
      "Prompts that are about MongoDB's AI/LLM integration features.",
    examples: [
      { text: "How do I build AI applications with MongoDB?" },
      {
        text: "Does MongoDB support LangGraph checkpointers? If so, are they asynchronous or synchronous?",
      },
      { text: "How does MongoDB help with AI projects?" },
      { text: "Can I use MongoDB for RAG implementations? How?" },
      { text: "Does MongoDB offer support for developing AI applications?" },
      { text: "Does MongoDB generate embeddings?" },
      { text: "What is Retrieval-augmented generation?" },
      { text: "How will MongoDB and Voyage AI work together?" },
    ],
  },
  {
    type: "Foundational Concepts",
    description: "Prompts that are about MongoDB's core features and concepts.",
    examples: [
      { text: "explain indexes in mongodb" },
      { text: "when to use findone vs find in mongodb" },
      { text: "What is a mongodb change streams example" },
      { text: "what's the difference between updateone and findoneandupdate" },
      { text: "What is the mongodb list collections command" },
      { text: "What is MongoDB?" },
      { text: "What is aggregation in MongoDB" },
      { text: "how many authentication methods for MongoDB" },
    ],
  },
  {
    type: "Positioning",
    description:
      "Prompts that position MongoDB in the market relative to other solutions.",
    examples: [
      {
        text: "What specific advantages does the new Atlas Flex tier provide over traditional serverless models?",
      },
      {
        text: "What are the key differentiators when comparing MongoDB to Azure Data Explorer (ADX)?",
      },
      { text: "How does MongoDB compare to Postgres?" },
      { text: "How is MongoDB used by companies in the energy industry?" },
      {
        text: "Are there any case studies demonstrating MongoDB’s effectiveness?",
      },
      {
        text: "How does the new pricing model of the new Atlas Flex tier ensure more predictability compared to previous offerings?",
      },
      { text: "How can I migrate from MySQL to MongoDB?" },
      { text: "What industries use MongoDB?" },
    ],
  },
  {
    type: "Practical Usage & Queries",
    description:
      "Prompts that are about how to use MongoDB in concrete scenarios.",
    examples: [
      { text: "how to get connection string from mongodb atlas" },
      { text: "command to create new collection" },
      { text: "What are the installation steps for mongodb compass" },
      { text: "What is the mongodb filter query for a nested object" },
      { text: "connect to mongodb nodejs" },
      { text: "How do you use not equal in MongoDB for multiple values" },
      { text: "how to query mongodb collection" },
      {
        text: "What are the step by step setup instructions for replication in mongodb with linux",
      },
    ],
  },
  {
    type: "Troubleshooting & Best Practices",
    description:
      "Prompts that ask about bugs, performance, and other MongoDB best practices.",
    examples: [
      { text: "What limitations for mongodb time series" },
      { text: "are there any best practices for mongodb crud operations" },
      { text: "What are the common exceptions for the mongodb java driver" },
      { text: "mongodb ttl not working" },
      { text: "How can Atlas users specify maintenance timing?" },
      {
        text: "I'm trying to use Compass with DocumentDB, and I keep running into unexpected behavior. For example, collection and database stats don't render, and I can't analyze my schema. Is there a workaround? ",
      },
      {
        text: "Why can't I read my own writes with a numbered write concern and read concern majority?",
      },
      { text: "I have enough memory, how can I further improve performance?" },
    ],
  },
  {
    type: "General Information",
    description:
      "Prompts that are related to MongoDB but not directly about the product, such as release notes, documentation, and other general information.",
    examples: [
      { text: "what's new in mongodb 8" },
      { text: "Where is the official MongoDB documentation" },
      { text: "Where are mongodb release notes" },
      { text: "Can I hire MongoDB developers to build my application?" },
      { text: "Is MongoDB currently hiring?" },
      {
        text: "Where can I find the changes in the newest version of the MongoDB Administration API?",
      },
    ],
  },
];

export const classifyCategories = makeClassifier({
  openAiClient,
  model: "gpt-4.1-mini",
  classificationTypes,
});
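
The review comment on `classificationTypes` suggests keeping the category examples in an external JSON or CSV file. A minimal sketch of that idea follows; it is not part of this PR. It assumes a hypothetical `categories.json` whose entries carry `type`, `description`, and an `examples` array of plain strings. `CategoryDefinition`, `toCategory`, and `parseCategories` are illustrative names, and `CategoryDefinition` is a local stand-in for the fields this diff uses, not the real `ClassificationType` from `mongodb-rag-core`. Reading the file from disk is left out so the sketch stays self-contained.

```typescript
// Local stand-in for the shape used in the diff above (type, description,
// examples[].text). The real ClassificationType comes from mongodb-rag-core.
interface CategoryDefinition {
  type: string;
  description: string;
  examples: { text: string }[];
}

// Validate one raw JSON entry and convert it to a CategoryDefinition.
function toCategory(entry: unknown, index: number): CategoryDefinition {
  if (typeof entry !== "object" || entry === null) {
    throw new Error(`Category ${index} is not an object`);
  }
  const { type, description, examples } = entry as Record<string, unknown>;
  if (typeof type !== "string" || typeof description !== "string") {
    throw new Error(`Category ${index} needs string "type" and "description"`);
  }
  if (!Array.isArray(examples)) {
    throw new Error(`Category "${type}" needs an "examples" array`);
  }
  return {
    type,
    description,
    // Examples are stored as plain strings (an assumption of this sketch)
    // and wrapped as { text } to match the inline definitions.
    examples: examples.map((text) => {
      if (typeof text !== "string") {
        throw new Error(`Category "${type}" has a non-string example`);
      }
      return { text };
    }),
  };
}

// Parse the contents of the hypothetical categories.json file.
function parseCategories(json: string): CategoryDefinition[] {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of categories");
  }
  return parsed.map(toCategory);
}

// Inline sample standing in for the file contents.
const sampleJson = JSON.stringify([
  {
    type: "Foundational Concepts",
    description: "Prompts that are about MongoDB's core features and concepts.",
    examples: ["What is MongoDB?", "explain indexes in mongodb"],
  },
]);

const categories = parseCategories(sampleJson);
```

If the parsed objects are structurally compatible with `ClassificationType`, the result could be passed as `classificationTypes` in place of the inline array. Validating at load time means a malformed data file fails at startup rather than during classification.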