Why?
Decoupling, improved scalability, increased availability, better performance
Questions
msg format, size, type (text/media)
repeated consumption
msg order
data retention
producer, consumers
delivery semantics ⇒ {at-least, at-most, exactly} once
target throughput, latency
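The delivery-semantics question above can be made concrete with a small simulation. This is a minimal sketch, not a real client: the function name `consume`, the `crash_at` parameter, and the single simulated crash are all assumptions for illustration. It shows that the difference between at-most-once and at-least-once comes down to whether the consumer commits its offset before or after processing a message.

```python
def consume(log, commit_first, crash_at):
    """Replay `log` with one simulated crash; return the processed messages.

    commit_first=True  -> commit the offset, then process (at-most-once).
    commit_first=False -> process, then commit the offset (at-least-once).
    """
    processed = []
    committed = 0      # offset of the next message to read after a restart
    crashed = False    # the simulated crash happens only once
    while committed < len(log):
        offset = committed
        if commit_first:
            committed = offset + 1
            if offset == crash_at and not crashed:
                crashed = True
                continue  # crash before processing: the message is lost
            processed.append(log[offset])
        else:
            processed.append(log[offset])
            if offset == crash_at and not crashed:
                crashed = True
                continue  # crash before commit: the message is replayed
            committed = offset + 1
    return processed


log = ["a", "b", "c"]
at_most_once = consume(log, commit_first=True, crash_at=1)
at_least_once = consume(log, commit_first=False, crash_at=1)
```

With the crash at offset 1, the at-most-once run skips "b" and the at-least-once run processes "b" twice. Exactly-once is usually approximated by at-least-once delivery combined with idempotent or deduplicated processing on the consumer side.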
Messaging Models
Point to point
Publish-Subscribe
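The two models above differ in who receives a message. The in-memory sketch below is illustrative only; the class names `PointToPointQueue` and `Topic` and their methods are hypothetical, not taken from any specific broker.

```python
from collections import deque


class PointToPointQueue:
    """Each message is delivered to exactly one consumer, then removed."""

    def __init__(self):
        self._messages = deque()

    def send(self, msg):
        self._messages.append(msg)

    def receive(self):
        # Whichever consumer calls first gets the message; nobody else sees it.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Every subscriber sees every message published after it subscribed."""

    def __init__(self):
        self._log = []
        self._offsets = {}  # subscriber name -> next index to read

    def subscribe(self, name):
        self._offsets[name] = len(self._log)

    def publish(self, msg):
        self._log.append(msg)

    def poll(self, name):
        # Return all unread messages for this subscriber and advance its offset.
        start = self._offsets[name]
        self._offsets[name] = len(self._log)
        return self._log[start:]
```

In the point-to-point case a second `receive()` returns `None` because the message is gone, while in the publish-subscribe case each subscriber keeps its own read position over a shared log.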
Clients
Producer: pushes messages to specific topics.
Consumer group: subscribes to topics and consumes messages.
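One way to picture the two client roles is sketched below. It assumes a Kafka-style design in which a topic is split into partitions, the producer picks a partition by hashing a message key, and each partition is read by exactly one member of a consumer group. The function names and the round-robin rule are assumptions for illustration, not a specific client API.

```python
import zlib


def pick_partition(key, num_partitions):
    """Producer side: map a message key to a partition with a stable hash."""
    # crc32 is used because Python's built-in hash() of str varies per process.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_to_group(num_partitions, consumers):
    """Consumer-group side: give each partition to exactly one group member."""
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


group_plan = assign_to_group(4, ["c1", "c2"])
```

Messages with the same key always land in the same partition, so their relative order is preserved for whichever consumer owns that partition.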
Core service and storage
- Broker: holds multiple partitions. A partition holds a subset of messages for a topic.
- Storage
  - Data storage: messages are persisted in data storage in partitions.
  - State storage: consumer states are managed by state storage.
  - Metadata storage: configuration and properties of topics are persisted in metadata storage.
- Coordination service
  - Service discovery: which brokers are alive.
  - Leader election: one broker is elected as the active controller. A cluster has exactly one active controller at a time, and it is responsible for assigning partitions.
  - Apache ZooKeeper [2] or etcd [3] is commonly used to elect a controller.
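The coordination duties above can be sketched as two small functions. This is a toy model under stated assumptions: the "lowest broker id wins" election rule and the round-robin placement are illustrative choices only, and real deployments delegate election to a coordination service such as ZooKeeper or etcd. The names `elect_controller` and `assign_partitions` are hypothetical.

```python
def elect_controller(alive_brokers):
    """Pick the single active controller; lowest id wins (illustrative rule)."""
    if not alive_brokers:
        raise ValueError("no alive brokers")
    return min(alive_brokers)


def assign_partitions(num_partitions, alive_brokers):
    """Controller duty: spread partitions round-robin over alive brokers."""
    brokers = sorted(alive_brokers)
    return {p: brokers[p % len(brokers)] for p in range(num_partitions)}


alive = {1, 2, 3}                      # result of service discovery
controller = elect_controller(alive)
plan = assign_partitions(4, alive)     # partition id -> broker id

alive.discard(1)                       # broker 1 stops responding
new_controller = elect_controller(alive)
new_plan = assign_partitions(4, alive)
```

When the controller disappears from the alive set, a new election yields a different controller, which then recomputes the partition placement over the remaining brokers.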