Message Queue¶
1. Introduction¶
In TCUP (TCS Connected Universe Platform) the Message Queue service allow users to create consumer groups and manage or modify the offsets(index for message stored in the underlying broker i.e apache kafka) with multiple parameters so that user can retriev old data/messages(which is already processed/consumed once) as per the requirement.
This Service is used to send, store and receive messages (of any volume) without the threat of losing messages or expecting the services to be available at all times to consume messages. The messages are stored in the kafka topic until they are processed. This service is essentially leveraged to dissociate heavyweight processing, ensure ease of workload and to buffer data/messages.
The service allows to extract various information from the consumer group such as:
Getting the status of the consumer group
Message statistics (incoming/outgoing message offsets)
Getting the number of topics and their partion details inside consumer group
Binding a consumer group with a topic
Unbinding a consumer group with a topic etc.
1.1 Intended Audience¶
The intended audience of this document is anyone who wants to have an overview of TCUP Message Queue Service. After going through this document, the user will understand the capability of TCUP Message Queue Service in IoT platform.
2. Key Concepts¶
In order to use the Message Queue Service, a user needs to understand some of the basic building blocks of the Service as described below:
2.1 Producer¶
Producer is a program or a component that creates and publishes messages to the message broker Apache Kafka .
2.2 Consumer¶
Consumer is a program or a component that attaches itself to the broker and subscribes to a topic precisely a partition inside a topic to consume the messages.
Note
The producer, consumer and the broker need not necessarily reside on the same machine.
2.3 Kafka Topic¶
A Topic is an element name inside the message broker(kafka) to which messages are stored and published.
2.4 Kafka topic partition¶
Kafka topics are divided into a number of segments called partition, which contain messages in an unchangeable sequence. Each message in a partition is assigned and identified by its unique index(called offset). A topic can also have multiple partitions.So thet multiple consumers can read from a topic in parallelly.
2.5 Offset¶
The offset is a simple integer number that is used by Kafka to maintain the position of a message for a partition inside a topic.
2.6 Log end offset¶
It is the index of the last message published in a particular topic partition. It can be defined by the number of messages published in the topic(partition).
2.7 Current offset¶
The current offset is a pointer to the last message that Kafka has already sent to a consumer. The current offset maintains the current position of the consumer. The consumer doesn’t get the same message twice because of the current offset.
2.8 Offset LAG¶
Number of message published in the topic but not consumed.
offset lag = (Log end offset - current offset)
2.9 Offset Reset/management¶
Changing the value of lag by increasing/decreasing the value of current offset.
The following are the different properties by which offsets can be changed:
2.9.1 to-earliest¶
Reset the current offset value to its earliest value or ‘0’.Consumers listening to the group will receive all the data from beginning.
2.9.2 to-latest¶
Reset the current offset to its latest value.2
2.9.3 to-datetime¶
Reset offsets to offset from the given datetime. The date time format will be ‘YYYY-MM-DDTHH:mm:SS.sss’. eg. 2020-09-02T03:32:31.298
2.9.4 to-offset¶
Reset offsets to a specific offset. The max value for the given offset value should be equal to the end offset value of the corresponding consumer group.
2.9.5 shift-by¶
Reset offsets shifting current offset by any long value ‘n’, where ‘n’ can be positive or negative. Offset cannot be negative The value of ‘n’ should be set accordingly.
2.9.6 by-duration¶
Reset current offsets to the offset by duration/timestamp from current timestamp. The timestamp format will be ‘PnDTnHnMnS’ . eg. ‘P2DT3H4M’ parses as “2 days, 3 hours and 4 minutes” ,’PT20S’ parses as “20 seconds”
2.10 Consumer group¶
A consumer group is a group of multiple consumers created to perform a particular function. One consumer group might be responsible for delivering messages to high-speed, in-memory microservices while another consumer group is streaming those same messages to Hadoop. Consumer groups have names to identify them from other consumer groups.Each consumer present in a group reads data directly from the exclusive partitions. In case, the number of consumers are more than the number of partitions, some of the consumers will be in an inactive state.
2.11 Bindings¶
This is a relationship between a topic and a consumer group that user can define using API by specifying topic name and consumer group name. This can be simply read as: the consumer_group is interested in messages from this topic.
3. Functional Capabilities¶
TCUPs Message Queue service has the following functional capabilities:
Can create consumer groups and manage consumer groups offsets with the various parameters such as the date, of the message received ,timestamp and many more.
Queries the following information about the consumer groups:
Status of the consumer groups
list of topics and their internal structure.
Number of undelivered messages(lag) for the topic partitions.
curret offset value of the topic partitions.
last offset value of the topic partitions.
Updates or deletes consumer groups.
Links a consumer group to a topic to receive the messages having the same identification labels from the broker.
Deletes a link between consumer group and topic.
It follows first in first out eviction policy.
4. Purpose/Usage¶
The purpose of Message Queue service is to create a general purpose data queue for other service to consume.
Using MQ service, data can be retained until consumer is ready to consume the data.
We can use MQ service to distribute the load. When multiple consumers subscribe to a single queue then the first data will be received by the first consumer and then the subsequent data will be received by the subsequent consumer and so on. Message Queue allows to transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be available.
We can use MQ service to replay the old messaged which are already processed or consumed by resetting the current position of the consumer.
5. Example¶
In order to optimize resource consumption, the energy consumption pattern in an office building can be monitored based on the employee occupancy at any given time and precautionary measures can be taken to control energy wastage as and when any anomaly gets detected in the usage pattern.
The streams of data from the energy meter and occupancy sensors are sent to Message Routing services and the output is passed for further processing by an application or service. If the application or service is not available to receive the messages or is undergoing downtime for maintenance, the messages getting generated by message routing would get lost.
Therefore a message queue is created as a solution and is linked to a message routing endpoint to prevent the loss of any messages irrespective of the status of the application or service. When the application or service is available to receive the data, it will get all the messages which got generated during its absence.
For another example consider the situation where 50,000 cars are monitored and tracked and each car is sending a data to TCUP services at an interval of every 5 seconds. As a result the producer is producing the messages at a high rate but the consumer/application is unable to hold that load. The load needs to be distributed amongst the multiple consumers in such a way that the same data does not get executed multiple times by multiple consumers. This can be achieved by connecting all the consumers to a consumer group.
6. Reference Document¶
Please refer to the following documents for more details about this service:
API Guide