lambdatization (l12n)

License: MIT

The goal of this project is to assess which query engines can realistically run inside cloud functions (in particular AWS Lambda) and to get a first sense of their performance in this highly constrained environment.

📈 Explore the results

We want to provide an accurate and interactive representation of our experimental results. We believe that this is best achieved through open interactive dashboards. This is still a work in progress, so feel free to play with it and give us your feedback!

🔨 Lambdatize yourself

The l12n-shell

The l12n-shell provides a way to run all commands in an isolated Docker environment. It is not strictly necessary, but it simplifies collaboration on the project. To set it up:

  • install a recent version (v20+) of Docker, which is the only dependency
  • clone this repository:
    • git clone https://github.com/cloudfuse-io/lambdatization
  • add the l12n-shell to your path (optional)
    • sudo ln -s $(pwd)/lambdatization/l12n-shell /usr/local/bin/l12n-shell
  • run L12N_BUILD=1 l12n-shell:
    • the L12N_BUILD environment variable indicates to the l12n-shell script that it needs to build the image.
    • l12n-shell operates in the current directory to:
      • look for a .env file to source configurations from (see configuration section below).
      • store the terraform state if the local backend is used.
      • store the terraform data, i.e. the cache data generated by terraform init.
    • the l12n-shell without any argument runs an interactive bash terminal in the CLI container. Note that the .env file is loaded only once when the l12n-shell is started.
    • l12n-shell cmd and echo "cmd" | l12n-shell both run cmd in the l12n-shell.

Note:

  • l12n-shell only supports amd64 for now
  • it is actively tested on Linux only

Configurations

General

l12n-shell can be configured through environment variables or a .env file in the current directory:

  • L12N_PLUGINS is a comma-separated list of plugins to activate
  • L12N_AWS_REGION is the region where the stack should run

AWS

You can also provide the usual AWS variables:

  • AWS_PROFILE
  • AWS_SHARED_CREDENTIALS_FILE
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY

You might also want to check your "Concurrent executions" quota for Lambda in your AWS account and request an increase if required.

Terraform Cloud

If you want to use Terraform Cloud as a backend instead of local, set TF_STATE_BACKEND=cloud. You should then also configure:

  • TF_ORGANIZATION, the name of an existing organization in your Terraform Cloud account.
  • TF_API_TOKEN, a Terraform Cloud user token.
  • TF_WORKSPACE_PREFIX, a prefix shared by all workspaces. It should contain only alphanumeric or - characters (e.g. TF_WORKSPACE_PREFIX=l12n-dev-).
  • Add the tfcloud plugin to the L12N_PLUGINS list to enable the l12n tfcloud.config command. This will help you automatically configure the workspaces for all your active plugins with the right settings and credentials.

Note: environment variables take precedence over the .env file.
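
As an illustration, a .env file combining the settings above might look like the sketch below. Every value is a placeholder chosen for the example (the region, profile name, organization, and token are assumptions, not project defaults), and tfcloud is the only plugin name taken from this README.

```shell
# Example .env for l12n-shell. All values are placeholders, not project defaults.

# General: comma-separated list of plugins, and the AWS region for the stack
L12N_PLUGINS=tfcloud
L12N_AWS_REGION=eu-west-1

# AWS credentials: a named profile here, access keys would work as well
AWS_PROFILE=my-profile

# Terraform Cloud backend (omit this group to keep the local backend)
TF_STATE_BACKEND=cloud
TF_ORGANIZATION=my-org
TF_API_TOKEN=replace-with-your-user-token
TF_WORKSPACE_PREFIX=l12n-dev-
```

Since environment variables take precedence over the file, starting the shell as L12N_AWS_REGION=us-east-1 l12n-shell should override the region for that session only.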

Tracing backend

For better analysis of the proxying components, you can set up any observability backend compatible with the OpenTelemetry Protocol (OTLP) over HTTP. We particularly recommend Grafana Cloud, which has a generous free tier and a nice interface.

L12N_CHAPPY_OPENTELEMETRY_URL=https://otlp-gateway-{$grafana_region}.grafana.net/otlp/v1/traces
L12N_CHAPPY_OPENTELEMETRY_AUTHORIZATION="Basic {echo -n "$instance_id:$api_key" | base64}"

Where:

  • grafana_region is the region of your Grafana Cloud instance, e.g. prod-us-east-0
  • instance_id can be obtained from the detail page of your Grafana Cloud instance
  • api_key is a Grafana Cloud API key with the MetricsPublisher role
  • the value after Basic is the base64 encoding of instance_id:api_key, i.e. the two values above joined by a colon
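
One way to produce these two lines is sketched below. The instance_id and api_key values are made-up placeholders, and printf is used instead of echo -n purely for portability across shells.

```shell
# Placeholder values: replace them with those of your own Grafana Cloud instance.
grafana_region="prod-us-east-0"
instance_id="123"
api_key="abc"

# Base64-encode "instance_id:api_key". printf adds no trailing newline, and
# tr removes the line wraps that some base64 implementations insert.
token=$(printf '%s' "${instance_id}:${api_key}" | base64 | tr -d '\n')

# Print both variables in .env format, ready to paste.
printf 'L12N_CHAPPY_OPENTELEMETRY_URL=%s\n' \
  "https://otlp-gateway-${grafana_region}.grafana.net/otlp/v1/traces"
printf 'L12N_CHAPPY_OPENTELEMETRY_AUTHORIZATION="Basic %s"\n' "${token}"
```

With the placeholder values above, the token is MTIzOmFiYw==.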

You can also try out Aspecto (https://www.aspecto.io/), which has similar capabilities and a very easy setup.

L12N_CHAPPY_OPENTELEMETRY_URL=https://otelcol.aspecto.io/v1/traces
L12N_CHAPPY_OPENTELEMETRY_AUTHORIZATION=aspecto_key

The l12n CLI

Inside the l12n-shell, you can use the following commands:

  • l12n -h to see all the available commands
  • l12n deploy -a will run the terraform scripts and deploy the necessary resources (buckets, functions, roles...)
  • l12n destroy -a to tear down the infrastructure and clean up your AWS account
  • l12n dockerized -e engine_name runs a preconfigured query in the dockerized version of the specified engine locally. It requires the core module to be deployed to have access to the data
  • l12n run-lambda -e engine_name -c sql_query runs the specified SQL query on the given engine
    • you can also run pre-configured queries using the examples. Run l12n -h to see the list of examples.
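
The commands above can be chained into a deploy, query, and teardown cycle. The helper below is a hypothetical sketch rather than part of the l12n CLI: the function name and its two arguments are assumptions, and it is meant to be run inside the l12n-shell, where the l12n command is available.

```shell
# Hypothetical helper: deploy, run one query on one engine, then tear down.
l12n_round_trip() {
  engine="$1"   # engine name, passed to -e
  query="$2"    # SQL text, passed to -c

  # Create the resources (buckets, functions, roles...); stop if this fails.
  l12n deploy -a || return 1

  # Run the query and remember whether it succeeded.
  l12n run-lambda -e "$engine" -c "$query"
  status=$?

  # Always clean up the AWS account, even if the query failed.
  l12n destroy -a

  return "$status"
}
```

A call would look like l12n_round_trip engine_name "SELECT 1", where engine_name stands for an engine you have activated. Tearing down unconditionally is a design choice made here to avoid leaving resources behind after a failed query.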

About the stack

Infrastructure is managed by Terraform.

We use Terragrunt to:

  • DRY the Terraform config
  • Manage dependencies between modules and allow a plugin based structure.

We are actively monitoring CDK for Terraform and plan to migrate the infrastructure scripts once the tool becomes sufficiently mature (e.g. reaches v1).

Contribute