Pipe and Filter Pattern
Description
The pipe and filter pattern is loosely based on Unix pipes. It is a software architecture pattern that processes data through a series of filters connected by pipes. Each filter processes the data in some way and passes it to the next filter through a pipe. The filters are independent of each other and can be combined in different ways to create different processing pipelines.
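As a minimal sketch of the idea above, the following Python models each filter as a function that takes a stream and returns a new stream, and models the pipe as the iterator handed from one filter to the next. The filter names (strip_blank, to_upper, number) and the pipeline helper are hypothetical and chosen only for illustration.

```python
from functools import reduce

def strip_blank(lines):
    """Filter: drop empty or whitespace-only lines."""
    return (line for line in lines if line.strip())

def to_upper(lines):
    """Filter: upper-case every line."""
    return (line.upper() for line in lines)

def number(lines):
    """Filter: prefix each line with its position in the stream."""
    return (f"{i}: {line}" for i, line in enumerate(lines, start=1))

def pipeline(*filters):
    """Connect filters left to right; each filter's output feeds the next."""
    def run(source):
        return reduce(lambda stream, f: f(stream), filters, source)
    return run

# The same filters can be recombined into different pipelines.
shout = pipeline(strip_blank, to_upper, number)
result = list(shout(["alpha", "", "beta"]))
print(result)
```

Because each filter only depends on the shape of the stream, not on its neighbours, pipeline(number, to_upper) or pipeline(strip_blank, number) are equally valid combinations.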
Some common operations that can be implemented using the pipe and filter pattern include data transformation, data validation, data aggregation, and data routing. The pattern is particularly useful when the processing logic can be decomposed into a series of independent steps, since each step can then run concurrently with the others on different items as they flow through the pipeline.
There are many real-world examples of this sort of process. For example, in a compiler, the source code is processed through a series of filters that perform lexical analysis, syntax analysis, semantic analysis, optimization, and code generation. In a data processing pipeline, the filters perform steps such as transformation, validation, aggregation, and routing.
Diagram
Just drawing a diagram doesn't really help explain the pattern.
It's better to show an example of how the pattern is used in practice. In this example, we have a data processing pipeline that reads server health from some set of machines. The first filter transforms the raw data into more structured metrics. The second splits the data into two streams based on some criterion, perhaps the health of the server. Unhealthy servers have some additional context added by another filter. Finally, the data is aggregated and sent to a monitoring system.
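The server health example can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the "host, cpu" input format, the Metric type, the 90% CPU threshold used to decide health, and a plain list standing in for the monitoring system.

```python
from dataclasses import dataclass, field

CPU_LIMIT = 90.0  # assumed health threshold, not taken from any real system

@dataclass
class Metric:
    host: str
    cpu: float
    healthy: bool
    context: dict = field(default_factory=dict)

def parse(raw_lines):
    """Filter 1: turn raw 'host, cpu' strings into structured metrics."""
    for line in raw_lines:
        host, cpu = line.split(",")
        value = float(cpu)
        yield Metric(host.strip(), value, healthy=value < CPU_LIMIT)

def split(metrics):
    """Filter 2: route metrics into (healthy, unhealthy) streams."""
    healthy, unhealthy = [], []
    for m in metrics:
        (healthy if m.healthy else unhealthy).append(m)
    return healthy, unhealthy

def add_context(metrics):
    """Filter 3: attach extra context to unhealthy servers only."""
    for m in metrics:
        m.context["reason"] = f"cpu {m.cpu:.0f}% >= {CPU_LIMIT:.0f}%"
        yield m

def aggregate(*streams):
    """Filter 4: merge the streams back into one summary."""
    merged = [m for stream in streams for m in stream]
    return {
        "total": len(merged),
        "unhealthy": [m.host for m in merged if not m.healthy],
    }

def send(summary, sink):
    """Sink: hand the summary to the (stand-in) monitoring system."""
    sink.append(summary)

raw = ["web-1, 35.0", "web-2, 97.0", "db-1, 12.0"]
healthy, unhealthy = split(parse(raw))
monitoring = []
send(aggregate(healthy, add_context(unhealthy)), monitoring)
print(monitoring)
```

Note that only the unhealthy stream passes through add_context, so healthy servers skip that filter entirely before the two streams are merged again.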
When to Use
The pipe and filter pattern is most commonly used in data processing applications where the processing can be decomposed into a series of independent steps that are executed in sequence. Both batch processing and transaction-based systems can benefit from this pattern.
Advantages
Some of the advantages of the pipe and filter pattern include:
- It is easy to understand and supports transformation reuse.
- The workflow style matches the structure of many business applications.
- Works well with both sequential and concurrent systems.
- The system can easily evolve by adding transformations.
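To illustrate the point about concurrent systems, here is a hedged sketch in which each filter runs in its own thread and the pipes are queue.Queue objects. The helper names (run_filter, run_concurrent) and the DONE sentinel are assumptions for this example. Error handling is omitted, so a filter that raises would stall the pipeline.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream

def run_filter(func, inbox, outbox):
    """Apply func to each item from inbox and put the result on outbox."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # propagate end-of-stream downstream
            return
        outbox.put(func(item))

def run_concurrent(source, *funcs):
    """Run one thread per filter, connected by FIFO queues acting as pipes."""
    pipes = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=run_filter, args=(f, pipes[i], pipes[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in source:
        pipes[0].put(item)
    pipes[0].put(DONE)

    results = []
    while (item := pipes[-1].get()) is not DONE:
        results.append(item)
    for t in threads:
        t.join()
    return results

# The same two transformations also work sequentially: (x * x) + 1.
squares_plus_one = run_concurrent(range(5), lambda x: x * x, lambda x: x + 1)
print(squares_plus_one)
```

The filter functions themselves are unchanged between the sequential and concurrent versions; only the plumbing between them differs. Adding a transformation means passing one more function to run_concurrent.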
Disadvantages
Some of the disadvantages of the pipe and filter pattern include:
- Is poorly suited to interactive systems.
- Data formats must be defined early to allow for component communication and might be difficult or impossible to change.
- Each transformation needs to parse its input and un-parse its output to the agreed-upon form, which may hurt performance.
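The parse and un-parse cost in the last point can be made concrete with a small sketch. It assumes JSON text as the agreed-upon format between filters; the filter names and the 20% tax rate are hypothetical. The first pair of filters exchanges text, so each one parses and re-serializes. The second pair shares an in-memory dict and serializes only at the pipeline boundary.

```python
import json

TAX_RATE = 1.2  # hypothetical multiplier, for illustration only

# Text-based filters: every stage pays for json.loads and json.dumps.
def add_total_text(line):
    record = json.loads(line)  # parse
    record["total"] = round(record["price"] * TAX_RATE, 2)
    return json.dumps(record)  # un-parse

def tag_expensive_text(line):
    record = json.loads(line)  # parse again
    record["expensive"] = record["total"] > 100
    return json.dumps(record)  # un-parse again

# Object-based filters: same logic, but the pipe carries dicts.
def add_total(record):
    return {**record, "total": round(record["price"] * TAX_RATE, 2)}

def tag_expensive(record):
    return {**record, "expensive": record["total"] > 100}

source = '{"price": 100}'
via_text = tag_expensive_text(add_total_text(source))
via_objects = json.dumps(tag_expensive(add_total(json.loads(source))))
print(via_text == via_objects)
```

Both pipelines produce the same output, but the text-based one parses and serializes once per filter instead of once per pipeline. The trade-off is that the object-based filters must run in the same process and agree on an in-memory representation.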