Process Table Function¶

Apache Flink PTF and Confluent Cloud PTFs for flink SQL, deployed as UDF. It supports N rows to M rows semantics.

Features¶

Declare and implement State object, that will be persisted by flink, by partition key.
The TableAPI or SQL most powerful function API. Supports Stateless or Stateful processing.

Access to Flink’s managed state, event-time and timer services, and underlying table changelogs

public static class CountState {
    public int count = 0;
}

public void eval(
    @StateHint CountState state,
    @ArgumentHint(SET_SEMANTIC_TABLE) Row input
) {
    state.count++;
    ...

The increment is per partition key. Once deployed it is used as:

SELECT *
FROM EventCounter(
    input => TABLE examples.marketplace.clicks PARTITION BY user_id,
    uid => 'event-counter-v1'
);

Use Cases¶

PTFs unlock use cases that can’t be expressed in a declarable way with either SQL or the Table API implementations. It serves a similar purposes compared to the ProcessFunction in Apache Flink’s Datastream API, giving primitives for handling the most common building blocks for stateful processing applications: events, state and timers.

Apply transformations on each row of a table.
Logically partition the table into distinct sets and apply transformations per set.
Store seen events for repeated access.
Continue the processing at a later point in time enabling waiting, synchronization, or timeouts.
Buffer and aggregate events using complex state machines or rule-based conditional logic.

Process Table Function¶

Features¶

Use Cases¶

Sources¶