Process Table Function¶
Apache Flink PTF and Confluent Cloud PTFs for flink SQL, deployed as UDF. It supports N rows to M rows semantics.
Features¶
- Declare and implement State object, that will be persisted by flink, by partition key.
- The TableAPI or SQL most powerful function API. Supports Stateless or Stateful processing.
-
Access to Flink’s managed state, event-time and timer services, and underlying table changelogs
public static class CountState { public int count = 0; } public void eval( @StateHint CountState state, @ArgumentHint(SET_SEMANTIC_TABLE) Row input ) { state.count++; ...The increment is per partition key. Once deployed it is used as:
Use Cases¶
PTFs unlock use cases that can’t be expressed in a declarable way with either SQL or the Table API implementations. It serves a similar purposes compared to the ProcessFunction in Apache Flink’s Datastream API, giving primitives for handling the most common building blocks for stateful processing applications: events, state and timers.
- Apply transformations on each row of a table.
- Logically partition the table into distinct sets and apply transformations per set.
- Store seen events for repeated access.
- Continue the processing at a later point in time enabling waiting, synchronization, or timeouts.
- Buffer and aggregate events using complex state machines or rule-based conditional logic.