User Defined Functions

User-defined functions (UDFs) extend SQL with frequently used logic, custom programs, and integrations. They can be implemented in Java or PyFlink.

See the Confluent documentation on UDFs, the Confluent git repo with a sample UDF, and my new repository for the UDF catalog.

UDF Catalog

This repository includes the following UDFs:

  • Geo Distance: uses the Haversine formula to compute the distance between two points on Earth. It requires the latitude and longitude of both points.
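
To make the Haversine computation concrete, here is a minimal plain-Java sketch. It is not the code from the repository: the class name `HaversineSketch`, the method name `distanceKm`, the mean Earth radius of 6371 km, and the choice of kilometers as the unit are all assumptions made for illustration. In an actual Flink UDF, this logic would typically sit in the `eval` method of a class extending `ScalarFunction`.

```java
// Hypothetical illustration only; the real GeoDistanceFunction may differ.
final class HaversineSketch {

    // Assumed mean Earth radius in kilometers.
    private static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance in kilometers between two points given in decimal degrees.
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);

        // Haversine term: a = sin^2(dPhi/2) + cos(phi1) * cos(phi2) * sin^2(dLambda/2)
        double sinHalfDPhi = Math.sin(dPhi / 2.0);
        double sinHalfDLambda = Math.sin(dLambda / 2.0);
        double a = sinHalfDPhi * sinHalfDPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLambda * sinHalfDLambda;

        // Central angle, using atan2 for numerical stability.
        double c = 2.0 * Math.atan2(Math.sqrt(a), Math.sqrt(1.0 - a));
        return EARTH_RADIUS_KM * c;
    }

    public static void main(String[] args) {
        // Approximate city-center coordinates for Paris and London.
        double d = distanceKm(48.8566, 2.3522, 51.5074, -0.1278);
        System.out.printf("Paris to London: %.1f km%n", d);
    }
}
```

The Haversine formula treats the Earth as a sphere, so the result is an approximation of the real geodesic distance.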

Deploying to Confluent Cloud

  • Get the FlinkDeveloper RBAC role so that you can manage workspaces and artifacts.
  • Use the Confluent CLI to upload the jar file. Example for GEO_DISTANCE:

    confluent environment list
    # then in your environment
    confluent flink artifact create geo_distance --artifact-file target/geo-distance-udf-1.0-0.jar --cloud aws --region us-west-2 --environment env-nk...
    

    +--------------------+--------------+
    | ID                 | cfa-nx6wjz   |
    | Name               | geo_distance |
    | Version            | ver-nxnnnd   |
    | Cloud              | aws          |
    | Region             | us-west-2    |
    | Environment        | env-nknqp3   |
    | Content Format     | JAR          |
    | Description        |              |
    | Documentation Link |              |
    +--------------------+--------------+
    

    The uploaded artifact is also visible in the Artifacts menu.

  • UDFs are registered inside a Flink database:

    CREATE FUNCTION GEO_DISTANCE
    AS
    'io.confluent.udf.GeoDistanceFunction'
    USING JAR 'confluent-artifact://cfa-...';
    

  • Use the function to compute distance between Paris and London: