Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add datasets #215

Merged
merged 15 commits into from
Sep 27, 2024
Merged

Add datasets #215

merged 15 commits into from
Sep 27, 2024

Conversation

adamnolte
Copy link
Contributor

@adamnolte adamnolte commented Sep 26, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a command-line interface for dataset-related commands.
    • Added a method to retrieve datasets with specific schema versions and optional revisions.
    • Implemented functionality for autogenerating TypeScript interfaces based on dataset schemas.
    • Enhanced error handling in command-line scripts for improved robustness.
  • Bug Fixes

    • Updated ESLint configuration to improve linting for TypeScript files.
  • Documentation

    • Added tests to validate the functionality of autogenerated dataset types.
  • Chores

    • Expanded build configuration to include additional dataset modules.

Copy link

coderabbitai bot commented Sep 26, 2024

Warning

Rate limit exceeded

@adamnolte has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 24 minutes and 26 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Files that changed from the base of the PR and between 09ece1f and 8aed7f5.

📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough
📝 Walkthrough

Walkthrough

The pull request introduces several modifications, including the addition of a new command-line interface for dataset management, enhancements to the ESLint configuration for TypeScript files, and the introduction of new types and interfaces for dataset schemas. A new script (datasets-cli.js) is created to facilitate dataset-related commands, while TypeScript interfaces are autogenerated based on schemas retrieved from an API. Additionally, new tests are added to validate the functionality of these changes.

Changes

Files Change Summary
.lintstagedrc.yaml Updated ESLint command for TypeScript files to include --no-warn-ignored.
bin/datasets-cli.js Introduced a new CLI script for executing dataset-related commands.
bin/prompts-cli.js Added error handling to the main function for robustness.
eslint.config.mjs Added 'src/datasets/autogenerated.ts' to the ESLint configuration for linting.
package.json Added new CLI script entry for datasets-cli.js.
src/client.ts Added getDataset method to AutoblocksAPIClient class for retrieving datasets with specific schema versions.
src/datasets-cli/index.ts Introduced functionality for autogenerating TypeScript interfaces from dataset schemas.
src/datasets-cli/types.ts Defined schemas for dataset properties using Zod, including PropertyTypesEnum and various property schemas.
src/datasets/autogenerated.ts Created interfaces and types for managing datasets and their schema versions.
src/datasets/index.ts Re-exported types from ./autogenerated for easier access across modules.
src/index.ts Exported new dataset-related types for broader usage.
test/datasets-cli/index.spec.ts Added unit tests for validating autogenerated dataset types.
tsup.config.js Updated build configuration to include new source files for datasets and CLI functionality.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant API
    participant FileSystem

    User->>CLI: Execute command (e.g., generate)
    CLI->>API: Fetch dataset schemas
    API-->>CLI: Return dataset schemas
    CLI->>FileSystem: Generate TypeScript interfaces
    FileSystem-->>CLI: Save generated interfaces
    CLI-->>User: Output success message
Loading

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (6)
src/datasets/index.ts (1)

1-5: LGTM! Consider adding a brief comment for clarity.

The re-export of types with cleaner aliases is a good practice. It provides a clear, centralized point for importing dataset-related types and abstracts away the auto-generation details from the rest of the codebase.

Consider adding a brief comment above the export statement to explain the purpose of this file and the origin of these types. For example:

// Re-export auto-generated dataset types with clean aliases
export type {
  __Autogenerated__DatasetName as DatasetName,
  __Autogenerated__DatasetSchemaVersion as DatasetSchemaVersion,
  __Autogenerated__DatasetItem as DatasetItem,
} from './autogenerated';

This comment would help other developers quickly understand the file's purpose and the source of these types.

bin/datasets-cli.js (2)

5-11: LGTM with a minor suggestion: Consider handling the no-argument case

The main function is well-structured and handles the 'generate' command appropriately. The error message for unknown commands is clear and informative.

Consider adding a check for the case where no arguments are provided. This could improve the user experience by providing guidance on how to use the CLI. Here's a suggested modification:

 async function main() {
-  if (process.argv[2] === 'generate') {
+  if (process.argv.length < 3) {
+    console.error('Usage: datasets-cli <command>');
+    console.error('Available commands: generate');
+  } else if (process.argv[2] === 'generate') {
     await run();
   } else {
     console.error(`Unknown command: ${process.argv.slice(2).join(' ')}`);
   }
 }

1-13: Suggestion: Consider a more robust command handling structure

While the current implementation works for a single command, it may become harder to maintain as more commands are added. Additionally, it's important to ensure that the run function is properly secured against potential malicious input.

  1. Consider implementing a command pattern or using a CLI framework like Commander.js for better scalability and maintainability as more commands are added.

  2. Ensure that the run function in the imported module performs proper input validation and sanitization to prevent potential security risks.

Here's a conceptual example of how you might structure this using a command pattern:

#!/usr/bin/env node

const { generate } = require('../dist/datasets-cli');

const commands = {
  generate: generate,
  // Add more commands here as needed
};

async function main() {
  const command = process.argv[2];
  if (!command) {
    console.error('Usage: datasets-cli <command>');
    console.error('Available commands:', Object.keys(commands).join(', '));
    process.exit(1);
  }

  const runCommand = commands[command];
  if (runCommand) {
    try {
      await runCommand();
    } catch (error) {
      console.error(`Error executing ${command}:`, error);
      process.exit(1);
    }
  } else {
    console.error(`Unknown command: ${command}`);
    process.exit(1);
  }
}

main().catch(error => {
  console.error('An unexpected error occurred:', error);
  process.exit(1);
});

This structure allows for easier addition of new commands and provides better error handling and user feedback.

src/index.ts (1)

14-18: LGTM! Consider adding JSDoc comments for the new types.

The addition of these new type exports from the './datasets' module aligns well with the PR objective of adding datasets. The exports are consistent with the existing style in the file and will enhance type safety for dataset-related operations throughout the project.

To improve code documentation, consider adding JSDoc comments for these new types. This would provide more context about their purpose and usage. For example:

export type {
  /** Represents the name of a dataset */
  DatasetName,
  /** Represents the schema version of a dataset */
  DatasetSchemaVersion,
  /** Represents an item within a dataset */
  DatasetItem,
} from './datasets';
test/datasets-cli/index.spec.ts (2)

10-94: LGTM: Comprehensive test data setup with a suggestion.

The test data covers various scenarios, including multiple schema versions, different property types, and an empty schema. This comprehensive setup should effectively test the autogeneration functionality.

Consider extracting the test data into a separate constant or fixture to improve readability and maintainability. This would make it easier to update or extend the test data in the future.

const TEST_DATASETS: ParsedDataset[] = [
  // ... (existing dataset definitions)
];

// In the test case
const autogenerated = config?.generate({
  symbolName,
  datasets: TEST_DATASETS,
});

96-122: LGTM: Thorough assertion with a suggestion for improvement.

The test assertion thoroughly checks the autogenerated output against the expected TypeScript interface definitions. The expected output correctly covers all datasets and their schema versions, including proper handling of optional properties and union types.

To improve maintainability and reduce the risk of formatting-related test failures, consider using a snapshot test instead of a hardcoded string comparison. This approach would make it easier to update the expected output when the autogeneration logic changes:

expect(autogenerated).toMatchSnapshot();

This change would require updating the test framework to support snapshot testing if it's not already configured.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 2f8b8e7 and b10b6b0.

📒 Files selected for processing (12)
  • .lintstagedrc.yaml (1 hunks)
  • bin/datasets-cli.js (1 hunks)
  • eslint.config.mjs (1 hunks)
  • package.json (1 hunks)
  • src/client.ts (2 hunks)
  • src/datasets-cli/index.ts (1 hunks)
  • src/datasets-cli/types.ts (1 hunks)
  • src/datasets/autogenerated.ts (1 hunks)
  • src/datasets/index.ts (1 hunks)
  • src/index.ts (1 hunks)
  • test/datasets-cli/index.spec.ts (1 hunks)
  • tsup.config.js (1 hunks)
🧰 Additional context used
🪛 Biome
src/datasets/autogenerated.ts

[error] 4-5: An empty interface is equivalent to {}.

Safe fix: Use a type alias instead.

(lint/suspicious/noEmptyInterface)

🔇 Additional comments (13)
.lintstagedrc.yaml (1)

4-4: Consider the implications of adding --no-warn-ignored flag

The addition of the --no-warn-ignored flag to the ESLint command for TypeScript files will suppress warnings about ignored files. While this can reduce noise in the linting output, especially in large projects, it might also hide potentially important information.

Consider the following points:

  1. Ensure that this change aligns with your team's preferences for linting output verbosity.
  2. Be aware that this might make it less obvious which files are being ignored by ESLint.
  3. For consistency, you might want to apply this flag to all file types, not just TypeScript files, if it's deemed beneficial for the project.

To understand the impact of this change, you can run the following command to see which files are currently being ignored by ESLint:

This will help in assessing whether suppressing ignored file warnings is appropriate for your project.

bin/datasets-cli.js (1)

1-3: LGTM: Proper setup for a Node.js CLI script

The shebang and import statement are correctly implemented for a Node.js command-line interface script. The use of a compiled version of the CLI (from '../dist/datasets-cli') is a good practice, especially if the main logic is written in TypeScript.

tsup.config.js (1)

10-11: LGTM! Verify the existence of the new modules.

The additions to the entry array are consistent with the existing pattern and will include the new datasets modules in the build process. This change looks good and aligns with the PR objectives of adding datasets functionality.

To ensure the new modules exist, please run the following script:

This script will help confirm that the new modules have been properly created and contain the expected content.

✅ Verification successful

Verified! The new modules exist and are properly included in the build process.

The additions to the entry array in tsup.config.js correctly include 'src/datasets/index.ts' and 'src/datasets-cli/index.ts'. Both modules are present and contain the expected content, ensuring they will be incorporated into the build as intended.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence of the new dataset modules

# Test: Check if the new module files exist
fd -t f "index.ts" src/datasets src/datasets-cli

# Test: Verify the content of these files (optional, but recommended)
fd -t f "index.ts" src/datasets src/datasets-cli -x cat

Length of output: 6662

src/datasets/autogenerated.ts (5)

7-8: LGTM: __Autogenerated__DatasetName type definition

The __Autogenerated__DatasetName type is well-defined, ensuring that dataset names are both keys of the autogenerated types and strings. This aligns with the intended purpose described in the comments.


9-11: LGTM: Well-structured generic type for dataset schema versions

The __Autogenerated__DatasetSchemaVersion generic type is well-defined and provides strong type safety. It ensures that:

  1. Only valid dataset names (T extends __Autogenerated__DatasetName) can be used.
  2. The returned type is both a key of the specific dataset's schema versions and a string.

This structure will help prevent errors when working with different datasets and their schema versions.


12-15: LGTM: Excellent type-safe definition for dataset items

The __Autogenerated__DatasetItem generic type is well-structured and provides strong type safety. It ensures that:

  1. Only valid dataset names (T extends __Autogenerated__DatasetName) can be used.
  2. Only valid schema versions (U extends __Autogenerated__DatasetSchemaVersion) for the given dataset can be used.
  3. The returned type is the specific item type for the given dataset and schema version.

This structure will help prevent errors and provide excellent autocomplete support when working with different datasets, their schema versions, and the corresponding item types.


1-15: Overall: Well-structured foundation for type-safe dataset management

This file provides an excellent TypeScript structure for managing datasets with different schema versions. Key points:

  1. The use of generics and type constraints ensures type safety when working with datasets.
  2. The structure will facilitate autocomplete and error prevention in IDEs.
  3. The design accommodates future autogeneration of dataset types.

The only potential issue is the empty interface, which should be verified as necessary for the autogeneration process. If it is required, consider adding a comment to explain its purpose and suppress static analysis warnings.

Great job on setting up this type-safe foundation for dataset management!

🧰 Tools
🪛 Biome

[error] 4-5: An empty interface is equivalent to {}.

Safe fix: Use a type alias instead.

(lint/suspicious/noEmptyInterface)


1-5: Consider the implications of using an empty interface

The empty interface __Autogenerated__DatasetsTypes serves as a placeholder for future autogenerated content. While the static analysis tool suggests using a type alias instead, changing this might interfere with the autogeneration process.

To ensure this approach aligns with the autogeneration strategy, please verify:

  1. The CLI command that will populate this interface.
  2. Whether using an empty interface is necessary for the autogeneration process.

If the empty interface is indeed required, consider adding a comment explaining why it's necessary despite being empty, to suppress future static analysis warnings.

✅ Verification successful

Verified: The empty interface __Autogenerated__DatasetsTypes is appropriately used within the codebase and testing framework.

  • References found in both source and test files confirm its necessity for the autogeneration process and testing.
  • Changing it to a type alias may disrupt existing functionality.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for references to __Autogenerated__DatasetsTypes to understand its usage

rg --type typescript "__Autogenerated__DatasetsTypes"

Length of output: 88


Script:

#!/bin/bash
# Description: Search for references to __Autogenerated__DatasetsTypes using the correct file type

rg --type ts "__Autogenerated__DatasetsTypes"

Length of output: 690

🧰 Tools
🪛 Biome

[error] 4-5: An empty interface is equivalent to {}.

Safe fix: Use a type alias instead.

(lint/suspicious/noEmptyInterface)

eslint.config.mjs (1)

26-26: LGTM: Appropriate addition to ESLint ignore list

The addition of 'src/datasets/autogenerated.ts' to the ESLint ignore list is appropriate and consistent with the existing pattern for auto-generated files. This change ensures that ESLint will not attempt to lint this auto-generated file, which is a good practice to avoid unnecessary linting errors or warnings on code that is not meant to be manually edited.

This change aligns well with the PR objective of adding datasets and the mentioned auto-generation of TypeScript interfaces based on schemas.

test/datasets-cli/index.spec.ts (2)

1-6: LGTM: Imports are appropriate and well-structured.

The import statements are correctly bringing in the necessary components from the datasets-cli module for the test suite.


7-9: LGTM: Well-structured test suite.

The test suite is properly organized using describe blocks, and the test case description is clear and concise.

src/datasets-cli/index.ts (1)

97-128: Ensure robust symbol detection in determineStartAndEndIdx

The determineStartAndEndIdx function searches for symbol appearances using exact string matches. This approach might fail if there are minor formatting differences, such as varying whitespace or line endings, which could cause the autogeneration to fail.

Consider using regular expressions or normalizing the content to improve the robustness of symbol detection.

Run the following script to verify if the function correctly detects symbols regardless of formatting:

This script uses ripgrep to search for the interface declaration across the codebase, accommodating different formatting styles.

src/client.ts (1)

267-280: Method signature is correctly defined

The getDataset method's signature is well-defined with appropriate generic type parameters, ensuring type safety when retrieving datasets with specific schema versions and optional revisions.

}
}

main();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Suggestion: Add error handling for the main function call

While the current implementation works, it's a good practice to handle potential errors that might be thrown by the async main function.

Consider wrapping the main() call in a try-catch block to handle any uncaught errors. This will prevent the script from crashing silently if an unexpected error occurs. Here's a suggested modification:

-main();
+main().catch(error => {
+  console.error('An unexpected error occurred:', error);
+  process.exit(1);
+});

This change will log any uncaught errors and exit the process with a non-zero status code, which is useful for scripting and CI/CD pipelines.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
main();
main().catch(error => {
console.error('An unexpected error occurred:', error);
process.exit(1);
});

Comment on lines +6 to +7
"prompts": "./bin/prompts-cli.js",
"datasets": "./bin/datasets-cli.js"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Action Required: Address Executable Permissions and Potential Circular Dependency in datasets-cli.js.

  • The datasets-cli.js file is not executable. Please set the appropriate permissions using:
    chmod +x ./bin/datasets-cli.js
  • A potential circular dependency has been detected in datasets-cli.js. Review and refactor the code to resolve this issue.
🔗 Analysis chain

LGTM! New datasets CLI added successfully.

The addition of the "datasets" CLI script in the bin section is correct and aligns with the PR objectives. This change expands the package's functionality by providing a new command-line interface for dataset management.

To ensure the new CLI script exists and is executable, please run the following verification script:

As a follow-up:

  1. Ensure that the documentation is updated to include information about this new CLI command.
  2. Consider adding tests to cover the functionality of the new datasets CLI.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the existence and executability of the new datasets CLI script

# Test 1: Check if the file exists
if [ -f "./bin/datasets-cli.js" ]; then
    echo "datasets-cli.js file exists."
else
    echo "Error: datasets-cli.js file not found in the ./bin directory."
    exit 1
fi

# Test 2: Check if the file is executable
if [ -x "./bin/datasets-cli.js" ]; then
    echo "datasets-cli.js is executable."
else
    echo "Warning: datasets-cli.js is not executable. Consider running 'chmod +x ./bin/datasets-cli.js'."
fi

# Test 3: Check for potential circular dependencies
if grep -q "require.*datasets-cli" "./bin/datasets-cli.js"; then
    echo "Warning: Potential circular dependency detected in datasets-cli.js"
fi

echo "Verification complete."

Length of output: 595

Comment on lines +28 to +32
const zBooleanPropertySchema = zCommonFields.extend({
type: z.literal(PropertyTypesEnum.Boolean),
defaultValue: z.boolean().optional(),
required: z.literal(true),
});
Copy link

@coderabbitai coderabbitai bot Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Confirm if 'required' being always true for Boolean properties is intentional

In zBooleanPropertySchema, the required field is set to z.literal(true), which forces all boolean properties to be required. If the intention is to allow optional boolean properties, consider using z.boolean().default(false) like in zCommonFields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confirming since it will either be true or false and always set

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamnolte Understood, thanks for the clarification.

(^・ω・^ )


✏️ Learnings added
Learnt from: adamnolte
PR: autoblocksai/javascript-sdk#215
File: src/datasets-cli/types.ts:28-32
Timestamp: 2024-09-26T22:18:12.195Z
Learning: In `zBooleanPropertySchema`, the `required` field is set to `z.literal(true)` because Boolean properties are always required, since they will either be `true` or `false` and always set.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +95 to +101
export interface ParsedDataset {
name: string;
schemaVersions: {
schema: DatasetSchema;
version: number;
}[];
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider defining a separate interface for schema versions to improve clarity

In the ParsedDataset interface, the schemaVersions array contains inline objects with schema and version properties. Defining a separate interface for schema versions can enhance readability and reusability.

Apply this refactor:

export interface SchemaVersion {
  schema: DatasetSchema;
  version: number;
}

export interface ParsedDataset {
  name: string;
  schemaVersions: SchemaVersion[];
}

Comment on lines +133 to +140
const resp = await fetch(`${API_ENDPOINT}/datasets/schemas`, {
method: 'GET',
headers: {
...AUTOBLOCKS_HEADERS,
Authorization: `Bearer ${args.apiKey}`,
},
});
return await resp.json();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Handle HTTP errors when fetching dataset schemas

In the getAllDatasetSchemasFromAPI function, the response from the API is not checked for success before attempting to parse the JSON. If the API returns a non-200 status code, this could result in unhandled exceptions or misleading error messages.

Consider checking resp.ok and handling non-success status codes appropriately.

 const resp = await fetch(`${API_ENDPOINT}/datasets/schemas`, {
   method: 'GET',
   headers: {
     ...AUTOBLOCKS_HEADERS,
     Authorization: `Bearer ${args.apiKey}`,
   },
 });
+if (!resp.ok) {
+  throw new Error(`Failed to fetch dataset schemas: ${resp.status} ${resp.statusText}`);
+}
 return await resp.json();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const resp = await fetch(`${API_ENDPOINT}/datasets/schemas`, {
method: 'GET',
headers: {
...AUTOBLOCKS_HEADERS,
Authorization: `Bearer ${args.apiKey}`,
},
});
return await resp.json();
const resp = await fetch(`${API_ENDPOINT}/datasets/schemas`, {
method: 'GET',
headers: {
...AUTOBLOCKS_HEADERS,
Authorization: `Bearer ${args.apiKey}`,
},
});
if (!resp.ok) {
throw new Error(`Failed to fetch dataset schemas: ${resp.status} ${resp.statusText}`);
}
return await resp.json();

Comment on lines 51 to 66
function makeTypeForProperty(args: { property: PropertySchema }): string {
switch (args.property.type) {
case PropertyTypesEnum.String:
return 'string';
case PropertyTypesEnum.Number:
return 'number';
case PropertyTypesEnum.Boolean:
return 'boolean';
case PropertyTypesEnum.Select:
return args.property.options.map((option) => `'${option}'`).join(' | ');
case PropertyTypesEnum.MultiSelect:
return `(${args.property.options.map((option) => `'${option}'`).join(' | ')})[]`;
case PropertyTypesEnum.ValidJSON:
return 'Record<string, unknown>';
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add a default case in makeTypeForProperty to handle unexpected property types

In the makeTypeForProperty function, the switch statement does not include a default case. If args.property.type is not one of the handled PropertyTypesEnum values, the function will return undefined, which may lead to runtime errors or unexpected behavior.

Consider adding a default case to handle unexpected types, either by throwing an error or returning a generic type.

 function makeTypeForProperty(args: { property: PropertySchema }): string {
   switch (args.property.type) {
     case PropertyTypesEnum.String:
       return 'string';
     case PropertyTypesEnum.Number:
       return 'number';
     case PropertyTypesEnum.Boolean:
       return 'boolean';
     case PropertyTypesEnum.Select:
       return args.property.options.map((option) => `'${option}'`).join(' | ');
     case PropertyTypesEnum.MultiSelect:
       return `(${args.property.options.map((option) => `'${option}'`).join(' | ')})[]`;
     case PropertyTypesEnum.ValidJSON:
       return 'Record<string, unknown>';
+    default:
+      throw new Error(`Unhandled property type: ${args.property.type}`);
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
function makeTypeForProperty(args: { property: PropertySchema }): string {
switch (args.property.type) {
case PropertyTypesEnum.String:
return 'string';
case PropertyTypesEnum.Number:
return 'number';
case PropertyTypesEnum.Boolean:
return 'boolean';
case PropertyTypesEnum.Select:
return args.property.options.map((option) => `'${option}'`).join(' | ');
case PropertyTypesEnum.MultiSelect:
return `(${args.property.options.map((option) => `'${option}'`).join(' | ')})[]`;
case PropertyTypesEnum.ValidJSON:
return 'Record<string, unknown>';
}
}
function makeTypeForProperty(args: { property: PropertySchema }): string {
switch (args.property.type) {
case PropertyTypesEnum.String:
return 'string';
case PropertyTypesEnum.Number:
return 'number';
case PropertyTypesEnum.Boolean:
return 'boolean';
case PropertyTypesEnum.Select:
return args.property.options.map((option) => `'${option}'`).join(' | ');
case PropertyTypesEnum.MultiSelect:
return `(${args.property.options.map((option) => `'${option}'`).join(' | ')})[]`;
case PropertyTypesEnum.ValidJSON:
return 'Record<string, unknown>';
default:
throw new Error(`Unhandled property type: ${args.property.type}`);
}
}

Comment on lines +29 to +46
let generated = `interface ${args.symbolName} {`; // start of definition

args.datasets.forEach((dataset) => {
generated += `\n '${dataset.name}': {`; // start of dataset definition

dataset.schemaVersions.forEach((schemaVersion) => {
generated += `\n '${schemaVersion.version}': {`; // start of schema version definition
schemaVersion.schema.forEach((property) => {
generated += `\n '${property.name}'${property.required ? '' : '?'}: ${makeTypeForProperty({ property })};`;
});
generated += '\n };'; // end of schema version definition
});

generated += '\n };'; // end of dataset definition
});

generated += '\n}'; // end of definition
return generated;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor string concatenation in generate function for readability and efficiency

In the generate function, string concatenation is used extensively to build the generated interface definition. This approach can reduce code readability and may have performance implications with large datasets.

Consider using an array to collect the lines and then join them at the end. This enhances readability and maintains performance.

 generate: (args) => {
-  let generated = `interface ${args.symbolName} {`; // start of definition
+  const lines = [`interface ${args.symbolName} {`]; // start of definition

   args.datasets.forEach((dataset) => {
-    generated += `\n  '${dataset.name}': {`; // start of dataset definition
+    lines.push(`  '${dataset.name}': {`); // start of dataset definition

     dataset.schemaVersions.forEach((schemaVersion) => {
-      generated += `\n    '${schemaVersion.version}': {`; // start of schema version definition
+      lines.push(`    '${schemaVersion.version}': {`); // start of schema version definition

       schemaVersion.schema.forEach((property) => {
-        generated += `\n      '${property.name}'${property.required ? '' : '?'}: ${makeTypeForProperty({ property })};`;
+        lines.push(`      '${property.name}'${property.required ? '' : '?'}: ${makeTypeForProperty({ property })};`);
       });
-      generated += '\n    };'; // end of schema version definition
+      lines.push('    };'); // end of schema version definition
     });

-    generated += '\n  };'; // end of dataset definition
+    lines.push('  };'); // end of dataset definition
   });

-  generated += '\n}'; // end of definition
-  return generated;
+  lines.push('}');
+  return lines.join('\n');
 },
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let generated = `interface ${args.symbolName} {`; // start of definition
args.datasets.forEach((dataset) => {
generated += `\n '${dataset.name}': {`; // start of dataset definition
dataset.schemaVersions.forEach((schemaVersion) => {
generated += `\n '${schemaVersion.version}': {`; // start of schema version definition
schemaVersion.schema.forEach((property) => {
generated += `\n '${property.name}'${property.required ? '' : '?'}: ${makeTypeForProperty({ property })};`;
});
generated += '\n };'; // end of schema version definition
});
generated += '\n };'; // end of dataset definition
});
generated += '\n}'; // end of definition
return generated;
const lines = [`interface ${args.symbolName} {`]; // start of definition
args.datasets.forEach((dataset) => {
lines.push(` '${dataset.name}': {`); // start of dataset definition
dataset.schemaVersions.forEach((schemaVersion) => {
lines.push(` '${schemaVersion.version}': {`); // start of schema version definition
schemaVersion.schema.forEach((property) => {
lines.push(` '${property.name}'${property.required ? '' : '?'}: ${makeTypeForProperty({ property })};`);
});
lines.push(' };'); // end of schema version definition
});
lines.push(' };'); // end of dataset definition
});
lines.push('}');
return lines.join('\n');

Comment on lines +281 to +294
const encodedName = encodeURIComponent(args.name);
const encodedSchemaVersion = encodeURIComponent(args.schemaVersion);
if (args.revisionId) {
return this.get(
`/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}/revisions/${encodeURIComponent(
args.revisionId,
)}`,
);
}

return this.get(
`/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}`,
);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor to reduce code duplication

The if block and the subsequent return statements contain duplicated code for constructing the URL and calling this.get(). Consider refactoring to consolidate the URL construction and have a single return statement.

Apply this diff to simplify the method:

 const encodedName = encodeURIComponent(args.name);
 const encodedSchemaVersion = encodeURIComponent(args.schemaVersion);
+let path = `/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}`;
+if (args.revisionId) {
+  const encodedRevisionId = encodeURIComponent(args.revisionId);
+  path += `/revisions/${encodedRevisionId}`;
+}
 
-if (args.revisionId) {
-  return this.get(
-    `/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}/revisions/${encodeURIComponent(
-      args.revisionId,
-    )}`,
-  );
-}
-
-return this.get(
-  `/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}`,
-);
+return this.get(path);

This refactor reduces code duplication and improves readability.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const encodedName = encodeURIComponent(args.name);
const encodedSchemaVersion = encodeURIComponent(args.schemaVersion);
if (args.revisionId) {
return this.get(
`/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}/revisions/${encodeURIComponent(
args.revisionId,
)}`,
);
}
return this.get(
`/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}`,
);
}
const encodedName = encodeURIComponent(args.name);
const encodedSchemaVersion = encodeURIComponent(args.schemaVersion);
let path = `/datasets/${encodedName}/schema-versions/${encodedSchemaVersion}`;
if (args.revisionId) {
const encodedRevisionId = encodeURIComponent(args.revisionId);
path += `/revisions/${encodedRevisionId}`;
}
return this.get(path);

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
bin/prompts-cli.js (2)

13-16: Approve changes with a minor suggestion for improvement

The addition of error handling for the main() function is a good practice. It ensures that unexpected errors are caught, logged, and the process exits with an appropriate status code.

Consider enhancing the error logging to provide more context:

 main().catch((error) => {
-  console.error('An unexpected error occurred: ', error);
+  console.error('An unexpected error occurred while running the prompts CLI:', error);
   process.exit(1);
 });

This change provides more specific information about where the error occurred, which can be helpful for debugging.


Line range hint 1-16: Suggest improvements for command handling and user experience

While the current implementation works, consider the following improvements to enhance the CLI's functionality and user experience:

  1. Add error handling for the 'generate' command:
async function main() {
  if (process.argv[2] === 'generate') {
    try {
      await run();
    } catch (error) {
      console.error('Error during generate command:', error);
      process.exit(1);
    }
  } else {
    console.error(`Unknown command: ${process.argv.slice(2).join(' ')}`);
    process.exit(1);
  }
}
  1. Implement a help command to display available options:
function printHelp() {
  console.log('Available commands:');
  console.log('  generate    Generate prompts');
  console.log('  help        Display this help message');
}

async function main() {
  const command = process.argv[2];
  switch (command) {
    case 'generate':
      try {
        await run();
      } catch (error) {
        console.error('Error during generate command:', error);
        process.exit(1);
      }
      break;
    case 'help':
      printHelp();
      break;
    default:
      console.error(`Unknown command: ${command}`);
      printHelp();
      process.exit(1);
  }
}

These changes will make the CLI more robust and user-friendly.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between b10b6b0 and 09ece1f.

📒 Files selected for processing (4)
  • bin/datasets-cli.js (1 hunks)
  • bin/prompts-cli.js (1 hunks)
  • src/datasets-cli/index.ts (1 hunks)
  • test/datasets-cli/index.spec.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • bin/datasets-cli.js
  • src/datasets-cli/index.ts
  • test/datasets-cli/index.spec.ts

@adamnolte adamnolte merged commit e86368e into main Sep 27, 2024
4 checks passed
@adamnolte adamnolte deleted the adam/epd-1054-js-sdk-updates-for-datasets branch September 27, 2024 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants