Connector Validator

Anthony Virtuoso edited this page Nov 16, 2019 · 2 revisions

One of the most challenging aspects of integrating systems (in this case, our connector and Athena) is testing how they will work together. Lambda captures logging from our connector in CloudWatch Logs, but we've also tried to provide tools that streamline detecting and correcting common semantic and logical issues with your custom connector. By running Athena's connector validation tool, you can simulate how Athena will interact with your Lambda function and get access to diagnostic information that would normally be available only within Athena or would require you to add extra diagnostics to your connector.

The Connector Validator emulates the calls that Athena makes to your Lambda function as part of executing a select * query, with an optional where clause, against a table exposed by your connector. The goal of this tool is to help you troubleshoot connectors by giving you visibility into what Athena would see. You can run this tool by using the helper script in the tools directory of this repository. Usage details can be found below.
usage: ./validate_connector.sh --lambda-func lambda_func [--catalog catalog] [--schema schema [--table table
                               [--constraints constraints]]] [--planning-only] [--help]
 -c,--constraints <arg>   A comma-separated list of field/value pair constraints to be applied when reading metadata and records from the Lambda
                          function to be validated
 -f,--lambda-func <arg>   The name of the Lambda function to be validated. Uses your configured default AWS region.
 -h,--help                Prints usage information.
 -p,--planning-only       If this option is set, then the validator will not attempt to read any records after calling GetSplits.
 -r,--record-func <arg>   The name of the Lambda function to be used to read data records. If not provided, this defaults to the value provided for
                          lambda-func. Uses your configured default AWS region.
 -s,--schema <arg>        The schema name to be used when validating the Lambda function. If not provided, a random existing schema will be chosen.
 -t,--table <arg>         The table name to be used when validating the Lambda function. If not provided, a random existing table will be chosen.
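
As a rough illustration of the options above, the following sketch assembles a planning-only validation run against one specific table. The function, schema, and table names are hypothetical placeholders, not values from this repository, and the script is assumed to be invoked from the tools directory. The --constraints option is left out because its exact value syntax is not shown in the usage text.

```shell
#!/bin/sh
# Hypothetical names: replace with your own Lambda function, schema, and table.
LAMBDA_FUNC="my-connector-function"
SCHEMA="my_schema"
TABLE="my_table"

# Build the argument list using only flags documented in the usage text.
# --planning-only stops the validator after GetSplits, so no records are read.
set -- --lambda-func "$LAMBDA_FUNC" --schema "$SCHEMA" --table "$TABLE" --planning-only

# Show the command that will be run.
VALIDATOR_CMD="./validate_connector.sh $*"
echo "$VALIDATOR_CMD"

# Run the validator only if the helper script is present in this directory.
if [ -x ./validate_connector.sh ]; then
  ./validate_connector.sh "$@"
fi
```

Dropping --planning-only from the argument list should make the validator go on to read records after planning, and omitting --schema or --table lets the tool pick a random existing one, as described in the option list.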