Big data processing frameworks

Data processing frameworks in the big data ecosystem fall into the following categories:

Batch processing

  • Hadoop MapReduce: Batch processing engine.

Real-time processing

  • Apache Storm
  • Apache Samza
  • IBM InfoSphere Streams
  • Apache S4 (Yahoo)
  • Apache complexion

Hybrid processing

  • Apache Spark Streaming: Batch processing engine that adds streaming capabilities by processing data in micro-batches. It follows a Lambda architecture.
  • Apache Flink: Stream processing engine in which batch processing is a special case of streaming. It follows a Kappa architecture.
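
The difference between the two hybrid models can be illustrated with a minimal sketch in plain Python. It does not use the Spark or Flink APIs; all function names are hypothetical and chosen for illustration only. It also assumes count-based micro-batches for simplicity, whereas a real micro-batch engine typically groups records by time interval.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def micro_batches(stream: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Group an incoming stream into small fixed-size batches (micro-batch model).

    Assumption: batches are cut by record count, not by time interval.
    """
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def running_sum_micro_batch(stream: Iterable[int], batch_size: int) -> List[int]:
    """Emit one updated total per micro-batch."""
    total = 0
    results = []
    for batch in micro_batches(stream, batch_size):
        total += sum(batch)  # each micro-batch is handled like a tiny batch job
        results.append(total)
    return results


def running_sum_streaming(stream: Iterable[int]) -> Iterator[int]:
    """Emit one updated total per record (record-at-a-time model)."""
    total = 0
    for record in stream:
        total += record
        yield total


def batch_sum(dataset: List[int]) -> int:
    """Batch as a special case of streaming: treat the dataset as a bounded
    stream, run the streaming operator to the end, and keep the last result."""
    result = 0
    for result in running_sum_streaming(dataset):
        pass
    return result


if __name__ == "__main__":
    events = [1, 2, 3, 4, 5]
    print(running_sum_micro_batch(events, batch_size=2))
    print(list(running_sum_streaming(events)))
    print(batch_sum(events))
```

In the micro-batch version, results only become visible once a whole batch has been collected, so latency is bounded below by the batch size. In the record-at-a-time version, every record updates the result immediately, and a batch job is simply the same operator applied to a stream that ends.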