Akka streams provide a reactive streams implementation for handling streaming data asynchronously and with backpressure in Akka. Some of the key benefits of using Akka streams include:
Asynchronous Processing
Akka streams allow for asynchronous and non-blocking processing of streaming data. This means that one stream processing stage does not have to block and wait for the next stage to complete. Each stage can process data at its own pace while propagating backpressure signals backwards when needed.
This asynchronous and non-blocking behavior makes Akka streams well-suited for high throughput and low latency stream processing applications. It also helps with better resource utilization since threads do not have to stay idle waiting for IO operations to complete.
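The non-blocking handoff between stages can be illustrated with plain Java futures (a stdlib analogy only, not Akka's API): attaching a continuation lets the second stage run when the first finishes, without any thread blocking in between.

```java
import java.util.concurrent.CompletableFuture;

public class AsyncStagesSketch {
    public static void main(String[] args) {
        // Two "stages" chained without blocking: thenApply attaches work
        // that runs when the previous stage completes, rather than having
        // a thread sit idle waiting for it.
        CompletableFuture<Integer> result =
            CompletableFuture.supplyAsync(() -> 21)  // stage 1: produce
                .thenApply(n -> n * 2);              // stage 2: transform
        System.out.println(result.join()); // 42
    }
}
```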
Backpressure Support
Akka streams have native support for backpressure. Backpressure provides a feedback mechanism where each stream processing stage can signal to upstream stages to slow down when it is overwhelmed. This prevents issues like buffer overflows or OutOfMemoryErrors from overloading a stage with more data than it can handle.
Backpressure makes Akka streams robust and resilient even under high data volume scenarios. The native backpressure support also standardizes backpressure mechanisms instead of custom implementations specific to each application.
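The effect of backpressure can be sketched with a bounded queue (a stdlib analogy only — Akka's real mechanism is pull-based demand signalling per Reactive Streams): a fast producer is forced down to the consumer's pace instead of overflowing memory.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackpressureSketch {
    public static void main(String[] args) throws InterruptedException {
        // Bounded buffer: at most 5 in-flight elements between stages.
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(5);

        Thread producer = new Thread(() -> {
            for (int i = 1; i <= 20; i++) {
                try {
                    // put() blocks when the buffer is full -- the analogue
                    // of a backpressure signal propagating upstream.
                    buffer.put(i);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });

        Thread consumer = new Thread(() -> {
            int sum = 0;
            for (int i = 0; i < 20; i++) {
                try {
                    Thread.sleep(1); // deliberately slow consumer
                    sum += buffer.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
            System.out.println("sum=" + sum);
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}
```

No element is lost and no unbounded buffering occurs, even though the producer is far faster than the consumer.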
Reactive Streams Compatibility
Akka streams implement the Reactive Streams specification for asynchronous and non-blocking stream processing. This makes Akka streams interoperable with other Reactive Streams implementations such as RxJava, Ratpack and Vert.x.
Applications can integrate different stream processing libraries that support Reactive Streams. For example, incoming data can be received via Akka Streams, processed using RxJava operators and then persisted using another reactive library.
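The Reactive Streams contract that enables this interoperability (Publisher, Subscriber, Subscription with explicit demand) is also mirrored in the JDK as java.util.concurrent.Flow. A minimal, stdlib-only sketch of the request/emit cycle:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class ReactiveStreamsSketch {
    public static void main(String[] args) throws InterruptedException {
        List<Integer> received = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>();
        publisher.subscribe(new Flow.Subscriber<Integer>() {
            private Flow.Subscription subscription;
            public void onSubscribe(Flow.Subscription s) {
                subscription = s;
                s.request(1); // demand-driven: ask for one element at a time
            }
            public void onNext(Integer item) {
                received.add(item);
                subscription.request(1); // signal readiness for the next one
            }
            public void onError(Throwable t) { done.countDown(); }
            public void onComplete() { done.countDown(); }
        });

        for (int i = 1; i <= 3; i++) publisher.submit(i);
        publisher.close();
        done.await();
        System.out.println(received); // [1, 2, 3]
    }
}
```

Any library speaking this protocol (Akka Streams via Source.fromPublisher / Sink.asPublisher, RxJava, Reactor) can plug into the same pipeline.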
Stream Composition
Akka streams make it easy to compose reusable streaming components. Complex stream processing pipelines can be built by connecting reusable Source, Flow (transformation) and Sink components.
This composition model makes it easy to reason about stream processing chains. Each component encapsulates its execution while abstracting away the complexity. Components can be independently developed, tested and reused across applications.
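As a rough stdlib analogy (not the Akka API itself), composing Flow stages resembles function composition: each reusable transformation is defined once and chained into a pipeline.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CompositionSketch {
    public static void main(String[] args) {
        // Reusable "Flow"-like stages as pure functions (analogy only).
        Function<Integer, Integer> doubler = n -> n * 2;
        Function<Integer, Integer> addTen = n -> n + 10;

        // Chaining stages, much as via() chains Akka Flow components.
        Function<Integer, Integer> pipeline = doubler.andThen(addTen);

        // A "Source" of elements run through the pipeline into a "Sink".
        List<Integer> out = List.of(1, 2, 3).stream()
                .map(pipeline)
                .collect(Collectors.toList());
        System.out.println(out); // [12, 14, 16]
    }
}
```

Each stage can be unit tested in isolation and recombined freely, which is the property the Akka composition model provides for full streaming stages.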
Abstraction over Low Level APIs
Akka streams provide a higher level abstraction over raw asynchronous IO APIs like Java NIO. Developers do not have to work with low level APIs and callbacks. Common flow control and backpressure mechanisms are handled transparently by Akka streams.
This simplifies streaming application development. Developers can focus on business logic rather than low level resource and concurrency management.
Integration with Akka Actors
Akka streams integrate seamlessly with Akka actors. Queues and buffers used internally by Akka streams are implemented using Akka actors under the hood. As such streams can consume and publish data to actors transparently.
This makes it easy to plug in actor based business logic at any point in a streaming pipeline. For example, the mapAsync operator can perform asynchronous computation on each stream element by asking actors.
Fault Tolerance
Akka streams recover gracefully from failures and exceptions. Supervision strategies decide whether a failing stage should stop the stream, drop the offending element and resume, or restart with fresh state. Transient failures can be handled with retry operators such as recoverWithRetries, while permanent failures can be routed to alternate processing paths.
This built-in fault tolerance minimizes data loss during failures and enables continuous stream processing in presence of failures.
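As a simplified, library-free sketch of the retry idea (Akka's actual mechanisms are supervision strategies and operators like recoverWithRetries), a bounded retry loop absorbs transient failures and reprocesses the element. The flaky helper here is hypothetical:

```java
public class RetrySketch {
    // Hypothetical processing step that fails on its first two attempts.
    static int attempts = 0;
    static int flaky(int n) {
        attempts++;
        if (attempts < 3) throw new RuntimeException("transient failure");
        return n * 2;
    }

    public static void main(String[] args) {
        int result = -1;
        // Bounded retry, analogous to resuming a failed stage under a
        // supervision strategy instead of tearing the stream down.
        for (int attempt = 1; attempt <= 5; attempt++) {
            try {
                result = flaky(21);
                break;
            } catch (RuntimeException e) {
                // resume: drop the failure and try again
            }
        }
        System.out.println(result); // 42
    }
}
```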
Testing Streams
Akka streams enable testing stream logic without involving actual IO or actors. The Akka Streams TestKit provides probe sources and sinks that allow programmatically injecting elements into a stream and asserting on the elements it emits, similar to testing iterators.
Unit testing stream components in isolation simplifies the testing process and enables test driven development practices for streaming applications.
Performance
Akka streams have excellent performance characteristics with optimized implementations for throughput and latency. Backpressure controls avoid overload scenarios, while batching and fused stages reduce context switching and intermediate boundaries.
Benchmarks have shown Akka streams processing millions of events per second on a single machine. These performance characteristics make them well suited for building high performance data streaming applications.
Use Cases
Here are some common use cases where Akka streams provide benefits:
- Stream processing of high volume data – Fast data streams, user interaction streams etc.
- Asynchronous IO handling – Database drivers, REST API clients, file IO etc.
- Inter-service communication – Streaming integration between services, websockets etc.
- Complex event processing – Combining and aggregating multiple data streams.
- Reactive applications – React to data streams with low latency.
Stream Components
Akka streams are composed using the following components:
Source
A source represents the origin of a data stream. It might fetch data from a database, listen on a socket connection or generate a stream algorithmically.
Flow
A flow is a stream processing stage that transforms the data elements passing through it. This is where most stream business logic resides.
Sink
A sink represents the destination or termination of a stream. It might publish data to a message broker, write to database or simply consume the incoming stream.
Runnable Graph
The composed stream processing pipeline that connects sources, flows and sinks into a complete stream is called a RunnableGraph. This represents a reusable stream definition that can be materialized into running streams as needed.
Java API Overview
Here is a brief overview of some of the common classes in the Akka streams Java API:
Class | Description |
---|---|
Source | Represents a stream source |
Sink | Represents a stream sink |
Flow | Represents a stream transformation flow |
RunnableGraph | Represents a materializable stream blueprint |
ActorMaterializer | Materializes streams using actors |
Attributes | Configures stage properties such as dispatcher, buffer sizes and names |
Akka streams integrate with Akka actors. Actor classes such as AbstractActor and Props can be used to implement logic that streams interact with, and Akka typed actors can also be used with streams.
Example
Here is a simple example of an Akka stream that generates a stream of numbers using a Source, transforms them using a Flow and prints them out using a Sink:
// Source that emits numbers
Source<Integer, NotUsed> numbers =
Source.from(Arrays.asList(1, 2, 3, 4, 5));
// Flow that doubles numbers
Flow<Integer, Integer, NotUsed> doubler =
Flow.of(Integer.class).map(num -> num * 2);
// Sink that prints numbers
Sink<Integer, CompletionStage<Done>> printSink =
Sink.foreach(num -> System.out.println(num));
// Wire up the stream
RunnableGraph<NotUsed> runnableGraph =
numbers.via(doubler).to(printSink);
// Materialize and run the stream
ActorMaterializer materializer =
ActorMaterializer.create(system);
runnableGraph.run(materializer);
This will print out 2, 4, 6, 8, 10.
Streams are materialized into running entities that can process data. The components are reusable and can be wired together in different ways to build other streaming pipelines.
Akka streams have many predefined and custom stream components. Complex use cases can be handled by composing these components in novel ways.
Integration with other Reactive Technologies
Akka streams being Reactive Streams compliant integrate nicely with other reactive libraries in the Java/JVM ecosystem. Here are some examples:
Spring WebFlux
Akka streams can be used for streaming data to and from reactive REST controllers in Spring WebFlux applications. Since both implement Reactive Streams, an Akka Source can be exposed as a Publisher (for example via Sink.asPublisher) and consumed by WebFlux, while WebFlux request bodies can be wrapped with Source.fromPublisher and processed as Akka streams.
Reactor
Reactor has an Akka streams connector that allows bridging between Project Reactor and Akka Streams. This allows building pipelines using both Reactor and Akka streams APIs.
RxJava
RxJava 2 and later implement Reactive Streams natively: an RxJava Flowable is a Reactive Streams Publisher, so it can be bridged into Akka streams with Source.fromPublisher. RxJava operators can thereby be combined with Akka stream flows for added functionality.
Kafka Streams
The Alpakka Kafka connector (formerly known as reactive-kafka) allows reading and writing data to Kafka topics using Akka streams APIs. This provides integration between Kafka consumer/producer streams and Akka.
Cassandra
The Alpakka project also provides a Cassandra connector, built on the DataStax Java driver, that allows reading and writing Cassandra data using Akka Streams style sources, flows and sinks.
Quarkus & Vert.x
Vert.x provides Reactive Streams adapters in its vert.x-reactive-streams module, and Reactive Streams based frameworks such as Quarkus can interoperate with Akka streams through the same standard interfaces.
DSL Variants
Akka streams provide both a Java API as well as a Scala DSL for defining streams. The DSL provides a more fluent and idiomatic interface in Scala. Here is an example of the same stream definition using the Scala DSL:
// Source
val numbers = Source(1 to 5)
// Flow
val doubler = Flow[Int].map(num => num * 2)
// Sink
val printSink = Sink.foreach[Int](num => println(num))
// Runnable graph
val runnableGraph = numbers via doubler to printSink
// Materialize
runnableGraph.run()
The DSL makes stream definition highly readable and expressive. Under the hood the DSL constructs the same stream topology as the Java API. The DSL is available only in Scala, while the Java API can be used from both Java and Scala.
Actor Integration
Akka actors can be easily integrated with Akka streams. Here are some examples:
Consuming Messages from Actors
// Source materialized as an ActorRef; messages sent to it enter the stream
val msgSource = Source.actorRef[String](bufferSize = 10, overflowStrategy = OverflowStrategy.dropHead)
// Flow that processes messages
val processFlow = Flow[String].map(msg => /* process msg */ msg)
// Sink
val msgSink = Sink.ignore
// Running the graph materializes the ActorRef to send messages to
val streamRef = msgSource.via(processFlow).to(msgSink).run()
// Actor that feeds messages into the stream via the materialized ref
class MessageSender(streamRef: ActorRef) extends Actor {
  var msgNum = 1
  def receive = {
    case _ =>
      streamRef ! s"Message $msgNum"
      msgNum += 1
  }
}
Source.actorRef materializes an actor reference when the stream runs; messages sent to that reference are buffered and streamed downstream transparently.
Publishing Stream Elements to Actors
// Actor that receives streamed elements
class Receiver extends Actor {
  def receive = {
    case msg: String =>
      // process msg
  }
}
// Sink that publishes elements to an actor
val actorSink = Sink.actorRef(receiverRef, onCompleteMessage = "Done")
// Runnable graph (source and flow defined elsewhere)
source.via(flow).to(actorSink).run()
The actor sink allows publishing elements to actors. The actor receives streamed elements as messages.
Using Actors for Async Processing
// Actor for async processing
class Processor extends Actor {
  def receive = {
    case input: String =>
      val result = input.toUpperCase // ... process input
      sender() ! result
  }
}
// Flow that asks the Processor actor, up to 4 elements in flight
implicit val timeout: Timeout = Timeout(3.seconds)
val processAsync = Flow[String].mapAsync(4)(elem =>
  ask(processorRef, elem).mapTo[String]
)
// Usage
source.via(processAsync)....
The mapAsync operator allows asynchronous stream processing using actors. Up to the configured number of elements (here 4) are in flight concurrently, and results are emitted downstream in the original element order.
Parallelism
Akka streams provide ways to introduce parallelism in stream processing for performance:
- Asynchronous stages – operators like mapAsync allow concurrent execution across elements
- Graph stages – single stages can be internally parallelized using custom graph stages
- Partitioning – streams can be partitioned into multiple sub-streams using custom or round-robin partitioning
These approaches can introduce parallelism across operators, elements and streams themselves. Care must be taken to avoid issues like ordering when introducing parallelism.
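Round-robin partitioning can be sketched in plain Java (an analogy for fanning elements out to sub-streams, not Akka's actual balance/partition stages):

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSketch {
    public static void main(String[] args) {
        int parallelism = 3;
        // One bucket per hypothetical sub-stream.
        List<List<Integer>> substreams = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) substreams.add(new ArrayList<>());

        // Round-robin assignment: element i goes to sub-stream i mod N,
        // so work is spread evenly across the sub-streams.
        for (int elem = 1; elem <= 7; elem++) {
            substreams.get((elem - 1) % parallelism).add(elem);
        }
        System.out.println(substreams); // [[1, 4, 7], [2, 5], [3, 6]]
    }
}
```

Note that once elements are split this way, their relative order across sub-streams is no longer guaranteed downstream, which motivates the ordering discussion below.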
Stream ordering
Akka streams maintain ordering of elements within a linear stream. However, in the presence of:
- Internal buffers/queues
- Asynchronous processing
- Multiple sub-streams
Ordering between elements can be lost. For use cases that require strict ordering, care must be taken when designing the stream topology and bounds must be placed on async processing.
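A plain-Java sketch of the problem: the tasks below finish out of order, but collecting results in submission order (the strategy Akka's mapAsync uses, as opposed to mapAsyncUnordered) restores the input ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class OrderingSketch {
    public static void main(String[] args) {
        List<Integer> inputs = List.of(1, 2, 3, 4);

        // Launch async work for every element; earlier elements sleep
        // longer, so completion order differs from input order.
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (int n : inputs) {
            final int elem = n;
            futures.add(CompletableFuture.supplyAsync(() -> {
                try {
                    Thread.sleep(50L * (5 - elem));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return elem * 10;
            }));
        }

        // Emit in submission order: each future is awaited in turn, so
        // the output matches the input ordering despite concurrency.
        List<Integer> ordered = new ArrayList<>();
        for (CompletableFuture<Integer> f : futures) ordered.add(f.join());
        System.out.println(ordered); // [10, 20, 30, 40]
    }
}
```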
Debugging
Debugging stream processing issues can be challenging due to the asynchronous non-blocking nature. Akka streams provide some useful tools:
- Logging – Extensive log messages at DEBUG level help trace element flow.
- Stats and Control – Expose metrics and stream state to monitor and debug issues.
- TestKit – Intercept and control internal stream signals for testing.
These tools help gain visibility into the internal workings and state of streams during development and debugging.
Conclusion
In summary, Akka streams provide a powerful, demand-driven reactive streaming solution for Java and Scala based applications. Their asynchronous, backpressured architecture makes them well suited for highly concurrent data streaming applications.
Akka streams interoperate smoothly with Akka actors and other reactive technologies. They simplify streaming application development by providing high level abstractions over low level non-blocking IO mechanisms.
By composing reusable stream components in flexible ways, Akka streams enable building scalable, resilient and elastic streaming applications for today’s highly demanding data processing needs.