Mastering lambdas with Poetry

Feb 07, 2022

Any project that relies heavily on lambda functions inevitably ends up with a monorepo containing hundreds of them. Each of these functions is really an application in its own right, with its own tests, dependencies and code owners, and potentially its own runtime version.

The AWS tutorial approach to building and deploying lambdas does not work in a clean deployment setting. The hard requirements I have for a system are:

It must be able to run with no human intervention in a CI pipeline.
It must provide some protection against issues in the supply-chain.
It must provide repeatable builds.

Poetry solves these problems

While it’s certainly not its primary use-case, Poetry is a powerful tool for solving some of these problems. Its main purpose is to allow developers to package code and manage dependencies with an eye to pushing to a repository like the Python Package Index. We want the packaging and the dependency management; they just need some tweaks to work for Lambda.

Making Poetry work for Lambda

I’m not going to tackle how to install Poetry, which is a moving target, or how to configure Pyenv. What follows are the specific deviations from a standard setup that make Poetry work for Lambda.

Structure

There are a couple of tweaks that you need to make for Poetry to work with Lambda. Probably the most painful one with regard to code cleanliness is structural. Lambda expects the handler function to be in the root of your zip file, whereas Poetry by default will put it in a directory.

[Figure: the standard Poetry project structure compared with the Lambda Poetry project structure]
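As a point of reference, the entry point Lambda looks for is just a module-level function that takes an event and a context. The sketch below assumes a hypothetical handler.py and a made-up event shape; neither comes from the project described here.

```python
import json


def handler(event, context):
    """Hypothetical Lambda entry point: return a greeting for the name in the event."""
    # "name" is an assumed key, used purely for illustration.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

With a file like this, the function's handler setting would be handler.handler (module name, then function name), which is why the module has to be importable from the root of the zip.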

To reflect the new setup we need to also update the pyproject.toml file to ensure Poetry knows which modules to include.

[tool.poetry]
name = "lambda"
version = "0.1.0"
description = ""
authors = ["Daniel Bowman <[email protected]>"]
packages = [
  { include = "lambda" },
  { include = "lib" }
]

[tool.poetry.dependencies]
python = "^3.9"

[tool.poetry.dev-dependencies]
pytest = "^5.2"
boto3 = "1.18.55"
botocore = "1.21.55"
pytest-mock = "^3.7.0"
moto = "^3.0.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

Dev dependencies

When you deploy a Lambda function there are a couple of dependencies you don’t need to include. Boto3 and botocore are included in the Lambda Python runtimes by default, as they’re essential to working with any AWS service. The current versions are listed in the Lambda runtime documentation.

For local development you will still want these to be available for static analysis. To make this happen they should be included as dev dependencies.

poetry add --dev boto3==1.18.55 botocore==1.21.55

Building the package

The process of building the package is where we really start to deviate from standard Poetry processes. The first step is reasonably normal: we’re going to create a wheel package.

poetry build --format wheel

With the wheel in place we now need to create a directory that contains the function code and all of its dependencies. In the build environment I work with we need to specify an old platform for lambdas that will run on Python 3.7, so that pip pulls in binaries linked against an older version of libc. For Python 3.8 or above you should use a more modern platform.

poetry run pip install --upgrade --only-binary :all: --platform manylinux2010_x86_64 --target package dist/*.whl
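Once pip has populated the package directory, it can be worth confirming that boto3 and botocore did not arrive as transitive runtime dependencies. The following is a minimal sketch, assuming the directory layout that pip's --target option produces (one top-level directory per importable package); the function name is hypothetical.

```python
from pathlib import Path

# Packages the Lambda Python runtime already provides, per the section above.
RUNTIME_PROVIDED = {"boto3", "botocore"}


def bundled_runtime_packages(package_dir: str) -> list[str]:
    """Return the runtime-provided packages found at the top level of package_dir."""
    top_level = {p.name for p in Path(package_dir).iterdir() if p.is_dir()}
    return sorted(RUNTIME_PROVIDED & top_level)
```

A pipeline step could call bundled_runtime_packages("package") and fail the build when the result is not empty.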

Once all the dependencies have been brought in, there are two steps to build the zip, both run from inside the package directory. First we reset all of the file modification times. Next we build the zip, setting flags that remove some extra information. You could just do a zip with no flags, but we want to be able to build the same package repeatedly and get the same file hash each time.

cd package
find . -exec touch -t 202201010000.00 {} \;
zip -X --no-dir-entries --recurse-paths ../artifact.zip .
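The same idea can be sketched in Python with the standard library's zipfile module. This is not the approach used in this post, just an illustration of what makes an archive reproducible: a fixed timestamp, a stable entry order, no directory entries and fixed permissions. The function name and the 0o644 mode are assumptions; the fixed mode would drop executable bits, so adjust it if your package ships scripts. The output path should sit outside the source directory so the archive does not include itself.

```python
import hashlib
import zipfile
from pathlib import Path

# Fixed timestamp for every entry (2022-01-01 00:00:00, as in the touch command).
FIXED_DATE_TIME = (2022, 1, 1, 0, 0, 0)


def build_deterministic_zip(source_dir: str, zip_path: str) -> str:
    """Zip every file under source_dir and return the archive's SHA-256 hex digest."""
    root = Path(source_dir)
    # Sort paths so entry order does not depend on filesystem listing order.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    with zipfile.ZipFile(zip_path, "w") as archive:
        for path in files:
            info = zipfile.ZipInfo(path.relative_to(root).as_posix(), date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            # Assumed fixed mode (rw-r--r--) so local permissions do not leak in.
            info.external_attr = 0o644 << 16
            archive.writestr(info, path.read_bytes())
    return hashlib.sha256(Path(zip_path).read_bytes()).hexdigest()
```

Building the same directory twice on the same platform should then give the same digest, regardless of the modification times on disk.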

Using Poetry in the pipeline

When you have hundreds of Lambda functions to deploy, being able to do it in a pipeline is essential. I use a Docker container with Pyenv, Poetry and any relevant Python versions pre-installed to make it reasonably simple. Then, assuming your lambdas are structured something like:

lambda_functions/
  function_1/
  function_2/
  ...

You just need to have a script that looks something like:

find "lambda_functions/$1" -name "pyproject.toml" -type f -exec dirname "{}" \; \
  | xargs -r -L1 sh -c "cd \"\$0\" \
  && poetry env use -- $(which python) \
  && poetry build --format wheel \
  && poetry run pip install --quiet --upgrade --only-binary :all: --platform manylinux2010_x86_64 --target package/ dist/*.whl \
  && cd package \
  && find . -exec touch -t 202201010000.00 {} \; \
  && zip -X --no-dir-entries --quiet --recurse-paths ../artifact.zip ."
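If the shell quoting becomes hard to maintain, the discovery and build steps could also be driven from Python. The following is a rough sketch under a few assumptions: the function names are hypothetical, sys.executable stands in for $(which python), and the touch and zip steps are left out for brevity.

```python
import subprocess
import sys
from pathlib import Path


def find_function_dirs(root: str) -> list[Path]:
    """Return every directory under root that contains a pyproject.toml, sorted."""
    return sorted(p.parent.resolve() for p in Path(root).rglob("pyproject.toml") if p.is_file())


def build_function(function_dir: Path, python: str = sys.executable) -> None:
    """Run the same Poetry and pip steps as the shell script for one function."""

    def run(*cmd: str) -> None:
        # check=True stops at the first failing step, like the && chain.
        subprocess.run(cmd, cwd=function_dir, check=True)

    run("poetry", "env", "use", "--", python)
    run("poetry", "build", "--format", "wheel")
    # Expand dist/*.whl here, since no shell is involved to do the globbing.
    wheels = sorted(str(p.resolve()) for p in (function_dir / "dist").glob("*.whl"))
    run("poetry", "run", "pip", "install", "--quiet", "--upgrade",
        "--only-binary", ":all:", "--platform", "manylinux2010_x86_64",
        "--target", "package/", *wheels)
```

A driver would loop over find_function_dirs("lambda_functions") and call build_function on each result, which also makes it easy to report which function failed.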

That will leave you with an artifact.zip in each Lambda function’s folder that you can push into AWS with whatever deployment tool you’re using.

Testing and linting scripts would look very similar, e.g. for testing:

find "lambda_functions/$1" -name "pyproject.toml" -type f -exec dirname "{}" \; \
  | xargs -r -L1 sh -c "cd \"\$0\" \
  && poetry env use -- $(which python) \
  && poetry install \
  && poetry run pytest"