Using Terraform to deploy Flink App or Statement¶
Most of the reference information can be found in the official Confluent Terraform provider documentation and in its deployment examples.
There are two approaches to managing Confluent Platform Kafka clusters in the same Terraform workspace:
- Manage multiple clusters
- Manage a single Kafka cluster
The Terraform provider also supports Flink statement deployments on Confluent Cloud.
Pre-requisites¶
- If not already done, create a Confluent Cloud API key with a secret for your user or for a service account (see confluent.cloud/settings/api-keys). Your user needs the OrganizationAdmin role. For production, do not use user keys; prefer service account keys.
- If not already done, create a service account for the Terraform runner. Assign the OrganizationAdmin role to this service account by following this guide.
- To see the existing keys, use the command:
- Export as environment variables:
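The last two steps can be sketched as the shell snippet below. It assumes the Confluent CLI is installed and logged in, and that the configuration relies on the provider reading CONFLUENT_CLOUD_API_KEY and CONFLUENT_CLOUD_API_SECRET from the environment. The key and secret values are placeholders. If the Terraform configuration declares its own input variables instead, the matching TF_VAR_ names would be needed.

```shell
# List the API keys visible to the logged-in user (needs the Confluent CLI).
if command -v confluent >/dev/null 2>&1; then
  confluent api-key list
fi

# Placeholder values (assumption): replace with your own Cloud API key and secret.
export CONFLUENT_CLOUD_API_KEY="<cloud-api-key>"
export CONFLUENT_CLOUD_API_SECRET="<cloud-api-secret>"
```

Keeping the secret in the environment rather than in a .tf or .tfvars file avoids committing it to the repository.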
Infrastructure¶
Kafka¶
See the product documentation to create a Kafka cluster with Terraform. The basic cluster sample project describes the needed steps, but it is recommended to use a standard Kafka cluster with RBAC access control.
This repository includes a demo with an IaC definition in the deployment cc-terraform folder, which defines the following components:
- Confluent Cloud Environment
- A service account to manage the environment: env_manager, with the EnvironmentAdmin role and API keys
- A Kafka cluster in a single AZ (for demo), with service accounts for app-manager, producer and consumer apps
- A schema registry with API keys to access the registry at runtime
Compute pool¶
A Flink compute pool can also be configured with Terraform; see this example in the Terraform Confluent quickstart repository.
The flink.tf in deployment cc-terraform defines the following components:
- A Flink compute pool, with two service accounts: one for Flink app management and one for developing Flink statements.
Deploy the configuration¶
- Use the usual Terraform commands:
- A 401 error when accessing Confluent Cloud points to a problem with the API key set in the environment variables.
- The outputs of this configuration are needed by other deployments, such as the Flink statement ones. They can be retrieved at any time with the terraform output command.
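The deployment steps above can be sketched as two small shell helpers. The function names deploy_infra and show_outputs, the TF_BIN override and the tfplan file name are hypothetical and only serve the illustration; the underlying commands are the standard terraform init, plan, apply and output. Run them from the folder that holds the .tf files.

```shell
# Hypothetical helper wrapping the usual Terraform workflow.
# TF_BIN defaults to terraform; set TF_BIN=echo to print the steps without running them.
deploy_infra() {
  tf="${TF_BIN:-terraform}"
  "$tf" init &&                # download the Confluent provider and initialise the state
  "$tf" plan -out=tfplan &&    # preview the changes and save them to a plan file
  "$tf" apply tfplan           # apply exactly the saved plan
}

# Hypothetical helper printing the outputs of the last applied configuration.
show_outputs() {
  "${TF_BIN:-terraform}" output
}
```

Calling deploy_infra and then show_outputs deploys the configuration and prints the values that the Flink statement deployments need. Saving the plan to a file before applying keeps the applied changes identical to the ones that were reviewed.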
Deploy a Flink statement¶
Flink statements can be deployed using Terraform. DDL is the easier part. Automating DML statements is more complex: the first deployment on a fresh environment is simple, but the challenges come when doing a canary deployment over existing running statements to upgrade their logic. For that, see the statement life cycle chapter in the cookbook.
resource "confluent_flink_statement" "ddl-dim_tenant-v1" {
  # Catalog (environment) and database (Kafka cluster) the statement runs against
  properties = {
    "sql.current-catalog"  = var.current_catalog
    "sql.current-database" = var.current_database
  }
  # DDL text loaded from a file inside this module
  statement      = file("${path.module}/sql-scripts/ddl.dim_tenant.sql")
  statement_name = "ddl-dim-tenant-v1"
  stopped        = false # change when updating the statement
}
The statement attribute can reference a file, or be a string containing the statement content.