This document covers the technical steps for creating, releasing, suspending, and removing community providers. For governance policies, lifecycle stages, and health metrics, see Provider Governance.
Before proposing a new provider, review the acceptance process and ensure you have:
- At least two individuals willing to serve as stewards
- Sponsorship from at least one existing Airflow Committer (if none of the stewards are a committer)
- A commitment to meeting the incubation health metrics within 6 months
- A plan to participate in quarterly governance updates on the devlist
Include this information in your [DISCUSSION] thread when proposing a new provider.
This section gathers the necessary steps to create a new community provider and also guidelines for updating existing ones. Providers may have distinctions not covered in this guide. The sequence described was designed to meet the most linear flow possible in order to develop a new provider.
We recommend looking at an existing provider similar to yours for reference on tests, dependencies, and structure.
First, set up your local development environment. See
Contributors Quick Start
if you have not done so yet. We recommend using breeze to develop locally, as it provides an
environment similar to the one used by GitHub CI workflows.
./breezeThis sets up Docker containers that mount your local code to internal volumes. Changes made in your IDE are immediately available inside the container, allowing tests to be run quickly.
In this guide, <PROVIDER> is used as a placeholder for your provider name.
Below is the directory structure a provider may need. Not all providers require all components --- the folders are optional: example_dags, hooks, links, logs, notifications, operators, secrets, sensors, transfers, triggers (and the list changes continuously).
GIT apache/airflow/
└── providers/
└── <PROVIDER>/
├── pyproject.toml
├── provider.yaml
├── src/
│ └── airflow/
│ └── providers/<PROVIDER>/
│ ├── __init__.py
│ ├── executors/
│ │ ├── __init__.py
│ │ └── *.py
│ ├── hooks/
│ │ ├── __init__.py
│ │ └── *.py
│ ├── notifications/
│ │ ├── __init__.py
│ │ └── *.py
│ ├── operators/
│ │ ├── __init__.py
│ │ └── *.py
│ ├── transfers/
│ │ ├── __init__.py
│ │ └── *.py
│ └── triggers/
│ ├── __init__.py
│ └── *.py
└── tests/
├── unit/
│ └── <PROVIDER>/
│ ├── __init__.py
│ ├── executors/
│ │ ├── __init__.py
│ │ └── test_*.py
│ ├── hooks/
│ │ ├── __init__.py
│ │ └── test_*.py
│ ├── notifications/
│ │ ├── __init__.py
│ │ └── test_*.py
│ ├── operators/
│ │ ├── __init__.py
│ │ └── test_*.py
│ ├── transfers/
│ │ ├── __init__.py
│ │ └── test_*.py
│ └── triggers/
│ ├── __init__.py
│ └── test_*.py
├── integration/<PROVIDER>/
│ ├── __init__.py
│ └── test_integration_*.py
└── system/<PROVIDER>/
├── __init__.py
└── example_*.pyCreate unit tests for each component of your provider. Example of running unit tests inside breeze:
[Breeze:3.10.19] root@fafd8d630e46:/opt/airflow# python -m pytest providers/<PROVIDER>/tests/<PROVIDER>/hook/test_*.pyAn important part of building a new provider is the documentation. Some steps for documentation
occur automatically by prek --- see
Installing prek guide.
Key files in the Airflow source tree that affect providers:
- The
pyproject.tomlin root Airflow folder is automatically generated based on content ofprovider.yamlfile in each provider whenprek hookis run. - Files such as
extra-packages-ref.rstshould be manually updated because they are manually formatted for better layout andprek hookwill just verify if the information about provider is updated there. - Files like
commit.rstandCHANGELOGare automatically updated bybreeze release-managementcommand by release manager when providers are released.
Documentation file structure:
├── pyproject.toml
└── providers/<PROVIDER>/src/airflow/providers/
├── provider.yaml
├── pyproject.toml
├── CHANGELOG.rst
│
└── docs/
├── integration-logos
│ └── <PROVIDER>.png
├── index.rst
├── commits.rst
├── connections.rst
└── operators/
└── <PROVIDER>.rstIf your provider's name is not a common English word, add it to docs/spelling_wordlist.txt.
Add your provider dependencies into provider.yaml under dependencies key.
If your provider doesn't have any dependency add an empty list.
In the docs/apache-airflow-providers-<PROVIDER>/connections.rst:
- add information how to configure connection for your provider.
In the provider's docs/operators/<PROVIDER>.rst add information how to use the Operator.
It's important to add examples and additional information if your Operator has extra-parameters.
.. _howto/operator:NewProviderOperator:
NewProviderOperator
===================
Use the :class:`~airflow.providers.<PROVIDER>.operators.NewProviderOperator` to do something
amazing with Airflow!
Using the Operator
^^^^^^^^^^^^^^^^^^
The NewProviderOperator requires a ``connection_id`` and this other awesome parameter.
You can see an example below:
.. exampleinclude:: /../../<PROVIDER>/example_dags/example_<PROVIDER>.py
:language: python
:start-after: [START howto_operator_<PROVIDER>]
:end-before: [END howto_operator_<PROVIDER>]Copy from another, similar provider the docs: docs/*.rst. At least these docs should be present:
- security.rst
- changelog.rst
- commits.rst
- index.rst
- installing-providers-from-sources.rst
- configurations-ref.rst - if your provider has
configelement in provider.yaml with configuration options specific for your provider
Make sure to update/add all information that are specific for the new provider.
In the providers/<PROVIDER>/src/airflow/providers/<PROVIDER>/provider.yaml add information of
your provider:
package-name: apache-airflow-providers-<PROVIDER>
name: <PROVIDER>
description: |
`<PROVIDER> <https://example.io/>`__
versions:
- 1.0.0
integrations:
- integration-name: <PROVIDER>
external-doc-url: https://www.example.io/
logo: /docs/integration-logos/<PROVIDER>.png
how-to-guide:
- /docs/apache-airflow-providers-<PROVIDER>/operators/<PROVIDER>.rst
tags: [service]
operators:
- integration-name: <PROVIDER>
python-modules:
- airflow.providers.<PROVIDER>.operators.<PROVIDER>
hooks:
- integration-name: <PROVIDER>
python-modules:
- airflow.providers.<PROVIDER>.hooks.<PROVIDER>
sensors:
- integration-name: <PROVIDER>
python-modules:
- airflow.providers.<PROVIDER>.sensors.<PROVIDER>
connection-types:
- hook-class-name: airflow.providers.<PROVIDER>.hooks.<PROVIDER>.NewProviderHook
- connection-type: provider-connection-typeAfter creating and updating the files, build the documentation locally. The first command builds your provider's documentation, the second ensures the main Airflow documentation is also working:
breeze build-docs <provider id>
breeze build-docs apache-airflowSometimes providers need to have different variants for different versions of Airflow. This is done by:
- copying
version_compat.pyfrom one of the providers that already have conditional variants to the root package of the provider you are working on - importing the
AIRFLOW_V_X_Y_PLUSthat you need from that importedversion_compat.pyfile.
The main reasons we are doing it in this way:
- checking version >= in Python has a non-obvious problem that the pre-release version is always considered
lower than the final version. This is why we are using
AIRFLOW_V_X_Y_PLUSto check for the version that is greater or equal to the version we are checking against - because we want the RC candidates to be considered as equal to the final version (because those RC candidates already contain the feature that is added in the final version). - We do not want to add dependencies to another provider (say
common.compat) without strong need - Even if the code is duplicated, it is just one
version_compat.pyfile that is wholly copied and it is not a big deal to maintain it. - There is a potential risk of one provider importing the same
AIRFLOW_V_X_Y_PLUSfrom another provider (and introduce accidental dependency) or from test code (which should not happen), but we are preventing it via prek hookcheck-imports-in-providersthat will fail if theversion_compatmodule is imported from another provider or from test code.
When releasing providers for the first time, you need to release them in state not-ready.
This will make it available for release management commands, but it will not be added to airflow's
preinstalled providers list - allowing Airflow in main CI builds to be built without expecting the
provider to be available in PyPI.
You need to add --include-not-ready-providers if you want to add them to the list of providers
considered by the release management commands.
As soon as the provider is released, you should update the provider to state: ready.
Sometimes we might want to release provider for previous MAJOR when new release is already
released (or bumped in main). This is done by releasing them from providers-<PROVIDER>/vX-Y branch
- for example providers-fab/v1-5 can be used to release the 1.5.2 when 2.0.0 is already being
released or voted on.
The release process looks like usual, the only difference is that the specific branch is used to release the provider and update all documentation, the changes and cherry-picking should be targeting that branch.
We regularly bump min Airflow version for all providers we release. This bump is done according to our Provider policies and it is only applied to non-suspended/removed providers. We are running basic import compatibility checks in our CI and the compatibility checks should be updated when min Airflow version is updated.
Details on how this should be done are described in Provider policies
As of April 2023, we have the possibility to suspend individual providers, so that they are not holding back dependencies for Airflow and other providers. The criteria and process for suspending providers is described in Suspending releases for providers.
Suspending a provider is done by setting state: suspended in the provider.yaml of the
provider. This should be followed by committing the change and either automatically or manually running
prek hooks that will either update derived configuration files or ask you to update them manually.
Note that you might need to run prek several times until all the static checks pass,
because modification from one prek hook might impact other prek hooks.
If you have prek installed, it will be run automatically on commit. If you want to run it
manually after commit, you can run it via prek --last-commit some of the tests might fail
because suspension of the provider might cause changes in the dependencies, so if you see errors about
missing dependencies imports, non-usable classes etc., you will need to build the CI image locally
via breeze build-image --python 3.9 --upgrade-to-newer-dependencies after the first prek
and then run the static checks again.
If you want to be absolutely sure to run all static checks you can always do this via
prek --all-files.
Some of the manual modifications you will have to do (in both cases prek will guide you on what to do):
- You will have to run
breeze setup regenerate-command-imagesto regenerate breeze help files - you will need to update
extra-packages-ref.rstand in some cases - when mentioned there explicitly -pyproject.tomlto remove the provider from list of dependencies.
What happens under-the-hood as a result, is that pyproject.toml file is updated with
the information about available providers and their dependencies and it is used by our tooling to
exclude suspended providers from all relevant parts of the build and CI system (such as building CI image
with dependencies, building documentation, running tests, etc.)
The steps above are usually enough for most standalone providers. However, extra steps might be needed for providers that are used by other providers, or that are part of the default PROD Dockerfile.
Test collection failures
Most tests for the suspended provider will be automatically excluded by pytest collection. However,
if another provider depends on the suspended one, its tests might fail to be collected. In such cases,
add pytest.importorskip at the top of the failing test module.
Example failing collection after google provider has been suspended:
_____ ERROR collecting providers/apache/beam/tests/apache/beam/operators/test_beam.py ______
ImportError while importing test module '/opt/airflow/providers/apache/beam/tests/apache/beam/operators/test_beam.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.8/importlib/__init__.py:127: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
providers/apache/beam/tests/apache/beam/operators/test_beam.py:25: in <module>
from airflow.providers.apache.beam.operators.beam import (
airflow/providers/apache/beam/operators/beam.py:35: in <module>
from airflow.providers.google.cloud.hooks.dataflow import (
airflow/providers/google/cloud/hooks/dataflow.py:32: in <module>
from google.cloud.dataflow_v1beta3 import GetJobRequest, Job, JobState, JobsV1Beta3AsyncClient, JobView
E ModuleNotFoundError: No module named 'google.cloud.dataflow_v1beta3'
_ ERROR collecting providers/microsoft/azure/tests/microsoft/azure/transfers/test_azure_blob_to_gcs.py _Fix by adding this line at the top of the failing test module:
pytest.importorskip("apache.airflow.providers.google")Import failures in provider verification
Some providers might unconditionally import the suspended provider and fail during the provider verification step in CI. Turn these imports into conditional imports.
Example when import fails after amazon provider has been suspended:
Traceback (most recent call last):
File "/opt/airflow/scripts/in_container/verify_providers.py", line 266, in import_all_classes
_module = importlib.import_module(modinfo.name)
...
File "/usr/local/lib/python3.8/site-packages/airflow/providers/mysql/transfers/s3_to_mysql.py", line 23, in <module>
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
ModuleNotFoundError: No module named 'airflow.providers.amazon'or:
Error: The ``airflow.providers.microsoft.azure.transfers.azure_blob_to_gcs`` object in transfers list in
airflow/providers/microsoft/azure/provider.yaml does not exist or is not a module:
No module named 'gcloud.aio.storage'Fix by turning the feature into an optional provider feature:
try:
from airflow.providers.amazon.aws.hooks.s3 import S3Hook
except ImportError as e:
from airflow.exceptions import AirflowOptionalProviderFeatureException
raise AirflowOptionalProviderFeatureException(e)Other potential impacts
- If the suspended provider is part of the default Dockerfile, you might want to update the tests for
PROD docker image in
docker-tests/tests/docker_tests/test_prod_image.py. - Some suspended providers might also fail
breezeunit tests that expect a fixed set of providers. Those tests should be adjusted (but this is not very likely, because the tests use only the most common providers that we will not be likely to suspend).
Resuming providers is done by reverting the original change that suspended it. In case there are changes needed to fix problems in the reverted provider, our CI will detect them and you will have to fix them as part of the PR reverting the suspension.
When removing providers from Airflow code, we need to make one last release where we mark the provider as removed - in documentation and in description of the PyPI package. For the criteria and process, see Removing community providers.
The release manager has to add state: removed in the provider yaml file and include the provider in
the next wave of the providers (and then remove all the code and documentation related to the provider).
The state: removed flag will cause the provider to be available for the following commands (note that
such provider has to be explicitly added as selected to the package - it will not be included in
the available list of providers or when documentation is built unless --include-removed-providers
flag is used):
breeze build-docsbreeze release-management prepare-provider-documentationbreeze release-management prepare-provider-distributionsbreeze release-management publish-docs
For all those commands, release manager needs to specify --include-removed-providers when all providers
are built or must add the provider id explicitly during the release process.
Except the changelog that needs to be maintained manually, all other documentation (main page of the provider
documentation, PyPI README), will be automatically updated to include removal notice.