I. What is a message queue?
When people see the term "message queue", I don't know whether they think of it as a high-end technology, but to me it sounds pretty impressive either way.
Message Queue is usually shortened to MQ, a very straightforward abbreviation.
Let's set aside the word "message" for now and look at "queue", which should be familiar to everyone.
A queue is a first-in-first-out data structure.
Java itself already ships with quite a few queue implementations.
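As a minimal sketch of the first-in-first-out behaviour, the following uses `java.util.ArrayDeque`, one of the JDK queue implementations. The class name `FifoDemo` and the helper `roundTrip` are made up for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class FifoDemo {
    // Enqueues the given items, then dequeues them all and returns the dequeue order.
    static List<String> roundTrip(List<String> items) {
        Queue<String> queue = new ArrayDeque<>();
        for (String item : items) {
            queue.offer(item);        // add at the tail
        }
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll());    // remove from the head
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [first, second, third]
        System.out.println(roundTrip(List.of("first", "second", "third")));
    }
}
```

Whatever goes in first comes out first, which is all "queue" means here.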
Why, then, do we need message queue (MQ) middleware? This question is very similar to one I had when I first learned Redis. Redis is a database that stores key-value data in memory. We could obviously get a similar effect with a class like HashMap, so why use Redis?
By this point, you can make a first guess as to why middleware such as a message queue (MQ) is used; I will keep adding to the answer below.
A message queue can be understood simply as a queue that holds the data to be transmitted.
- Whoever puts data into the message queue is called a producer.
- Whoever fetches data from the message queue is called a consumer.
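The two roles can be sketched with an in-memory `java.util.concurrent.LinkedBlockingQueue` standing in for a real message queue. This is only an illustration under that assumption; the class `ProducerConsumerDemo` and its `run` method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ProducerConsumerDemo {
    // Runs one producer thread and one consumer thread over a shared queue and
    // returns the messages in the order the consumer received them.
    static List<String> run(List<String> messages) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> received = new ArrayList<>();

        Thread producer = new Thread(() -> {
            for (String m : messages) {
                queue.offer(m);                  // unbounded queue, so offer always succeeds
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < messages.size(); i++) {
                    received.add(queue.take());  // blocks until a message is available
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        consumer.start();
        producer.start();
        try {
            producer.join();
            consumer.join();                     // after join, 'received' is safe to read
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return received;
    }

    public static void main(String[] args) {
        // prints [u1, u2, u3]
        System.out.println(run(List.of("u1", "u2", "u3")));
    }
}
```

The producer and the consumer never reference each other; they only share the queue.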
II. Why use a message queue?
Asking why we use message queues is really asking what benefits they bring. Let's look at the following scenarios.
2.1 Decoupling
Suppose I have a System A, which produces a userId. System B and System C both need this userId to carry out their own operations.
Written as pseudo-code it might look like this:
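public class SystemA {
    // System A holds a direct reference to every system that needs the userId
    SystemB systemB = new SystemB();
    SystemC systemC = new SystemC();
    private String userId = "Java3y";

    public void doSomething() {
        // each downstream system is called directly with the userId
        systemB.SystemBNeed2do(userId);
        systemC.SystemCNeed2do(userId);
    }
}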
The structure is illustrated below:
Everything ran fine for a couple of days.
Then one day, the person in charge of System B told the person in charge of System A that the interface SystemBNeed2do(String userId) was no longer used in System B, and asked System A to stop calling it.
The person in charge of System A said, "OK, I won't call you then," and removed the code that called the System B interface:
public void doSomething() {
    //systemB.SystemBNeed2do(userId);
    systemC.SystemCNeed2do(userId);
}
A few days later, the person in charge of System D took on a requirement that also needed System A's userId, so he went to the person in charge of System A and said, "Hey, I need your userId. Please call my interface."
System A replied, "No problem, I'll add it."
The code for System A then looked like this:
public class SystemA {
    // SystemB systemB = new SystemB();
    SystemC systemC = new SystemC();
    SystemD systemD = new SystemD();
    private String userId = "Java3y";

    public void doSomething() {
        //systemB.SystemBNeed2do(userId);
        systemC.SystemCNeed2do(userId);
        systemD.SystemDNeed2do(userId);
    }
}
Time flies:
- A few days later, the person in charge of System E comes over and tells System A that they need the userId.
- A few days after that, the person in charge of System B comes over and asks System A to start calling that interface again.
- A few days after that, the person in charge of System F comes over and tells System A that they need the userId.
- ……
So the person in charge of System A was pestered like this every day, changing the code back and forth.
There is another issue: if System C goes down while System A is calling it, System A still has to figure out how to handle that. And if a call to System D times out because of network latency, should System A report a failure or retry?
Eventually, the person in charge of System A decided that changing the code every few days was no fun, and quit.
The company then hired an expert. After a few days of getting familiar with the system, the expert said: write System A's userId to a message queue, so that System A no longer has to change all the time. Why does that help? Let's look at it together:
System A writes the userId to the message queue, and System C and System D take the data from the message queue. What is the benefit of this?
System A is only responsible for writing the data to the queue. Who wants or does not want this data (message) is of no concern to System A.
Even if System D no longer wants the userId and System B suddenly wants it again, that has nothing to do with System A. System A does not have to change a single line of code.
System D no longer gets the userId through a call from System A; it gets it from the message queue. Even if System D goes down or its request times out, that has nothing to do with System A; it is only a matter between System D and the message queue.
In this way, System A is decoupled from Systems B, C, and D.
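The decoupled design can be sketched as follows. This is a toy, single-process stand-in under stated assumptions: `UserIdTopic`, `DecoupledSystemA`, and the subscriber names are all hypothetical, and a real message queue would run as a separate service rather than as an in-memory map.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

class DecouplingDemo {
    public static void main(String[] args) {
        UserIdTopic topic = new UserIdTopic();
        DecoupledSystemA systemA = new DecoupledSystemA(topic);

        topic.subscribe("C");
        topic.subscribe("D");
        systemA.doSomething();

        // D leaves and B joins; System A's code is untouched.
        topic.unsubscribe("D");
        topic.subscribe("B");
        systemA.doSomething();

        System.out.println(topic.poll("B"));   // prints Java3y
        System.out.println(topic.poll("D"));   // prints null (D is no longer subscribed)
    }
}

// Hypothetical in-memory stand-in for a message queue with several consumers.
class UserIdTopic {
    private final Map<String, Queue<String>> subscribers = new ConcurrentHashMap<>();

    void subscribe(String name) {
        subscribers.putIfAbsent(name, new ConcurrentLinkedQueue<>());
    }

    void unsubscribe(String name) {
        subscribers.remove(name);
    }

    // Copies the message into the queue of every current subscriber.
    void publish(String userId) {
        for (Queue<String> q : subscribers.values()) {
            q.offer(userId);
        }
    }

    // Returns the next message for the subscriber, or null if none is waiting.
    String poll(String name) {
        Queue<String> q = subscribers.get(name);
        return q == null ? null : q.poll();
    }
}

class DecoupledSystemA {
    private final UserIdTopic topic;

    DecoupledSystemA(UserIdTopic topic) {
        this.topic = topic;
    }

    // System A only publishes; it holds no reference to B, C, or D.
    void doSomething() {
        topic.publish("Java3y");
    }
}
```

Adding or removing a consumer changes only the subscription list, never `DecoupledSystemA`.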
2.2 Asynchronous
Let's look at another scenario: System A still calls Systems B, C, and D directly.
The code is as follows:
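public class SystemA {
    SystemB systemB = new SystemB();
    SystemC systemC = new SystemC();
    SystemD systemD = new SystemD();
    private String userId;

    public void doOrder() {
        // main business: place the order (order() is assumed to be defined elsewhere)
        userId = this.order();
        // non-main business: each call blocks until the downstream system replies
        systemB.SystemBNeed2do(userId);
        systemC.SystemCNeed2do(userId);
        systemD.SystemDNeed2do(userId);
    }
}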
Suppose it takes 50ms for System A to work out the value of userId, and 300ms each to call the interfaces of System B, System C, and System D. This request then takes 50+300+300+300=950ms.
We also know that System A handles the main business, while Systems B, C, and D handle secondary business. For example, System A handles order placement; System B sends a text message telling the user that the order succeeded; and Systems C and D likewise handle only minor tasks.
In that case, to improve the user experience and throughput, the interfaces of Systems B, C, and D can be called asynchronously. It could look like this:
Once System A has finished its own work, it writes the userId to the message queue and returns immediately; the other operations are processed asynchronously.
- Originally, the entire request took 950ms (synchronous).
- Now that the calls to the other systems are asynchronous, it takes only 100ms (asynchronous).
(The example may not be perfect, but I think it gets the point across, so please forgive me.)
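The asynchronous flow can be sketched as follows, again with an in-memory `BlockingQueue` standing in for the message queue. The class `AsyncOrderDemo` and the `"B:"`/`"C:"`/`"D:"` labels are assumptions made for this illustration; the latch exists only so the demo can wait for the background work before printing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class AsyncOrderDemo {
    // Places one order and returns the work that was done in the background for it.
    static List<String> run() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> handled = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(1);

        // Secondary business (Systems B, C, D in the text) runs on its own thread.
        Thread worker = new Thread(() -> {
            try {
                String userId = queue.take();   // waits for System A's message
                handled.add("B:" + userId);
                handled.add("C:" + userId);
                handled.add("D:" + userId);
                done.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        // Main business: System A computes the userId and enqueues it.
        String userId = "Java3y";
        queue.offer(userId);
        // System A's request would return here, without waiting for B, C, or D.

        try {
            // Only the demo waits, so that the result can be inspected.
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new ArrayList<>(handled);
    }

    public static void main(String[] args) {
        // prints [B:Java3y, C:Java3y, D:Java3y]
        System.out.println(run());
    }
}
```

The time the user waits covers only System A's own work plus the enqueue, not the three downstream calls.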
2.3 Peak shaving / rate limiting
Let's take another scenario. Suppose we hold a big sale once a month, and concurrency during the sale can be very high, say 3000 requests per second. Suppose we have two machines handling the requests, and each machine can only handle 1000 requests at a time.
Those extra 1000 requests might just crash our whole system. One way out is to write the requests to a message queue:
System B and System C pull from the message queue only as many requests as they can handle. Even if 8000 requests arrive per second, they are simply placed in the message queue, and the rate at which they are pulled out is under each system's own control, so the whole system does not collapse.
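A minimal sketch of this idea: a burst is parked in a queue, and the consumer drains at most a fixed number of requests per round using `BlockingQueue.drainTo`. The class `PeakShavingDemo`, the method `batchSizes`, and the notion of a "tick" are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class PeakShavingDemo {
    // Enqueues 'burst' requests at once, then lets a consumer pull at most
    // 'capacityPerTick' requests per tick. Returns the batch size of each tick.
    static List<Integer> batchSizes(int burst, int capacityPerTick) {
        if (capacityPerTick <= 0) {
            throw new IllegalArgumentException("capacityPerTick must be positive");
        }
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < burst; i++) {
            queue.offer(i);                          // the burst lands in the queue, not on the servers
        }
        List<Integer> sizes = new ArrayList<>();
        List<Integer> batch = new ArrayList<>();
        while (!queue.isEmpty()) {
            batch.clear();
            queue.drainTo(batch, capacityPerTick);   // the consumer decides how much to take
            sizes.add(batch.size());
        }
        return sizes;
    }

    public static void main(String[] args) {
        // 3000 requests, two machines handling 1000 each per tick
        // prints [2000, 1000]
        System.out.println(batchSizes(3000, 2000));
    }
}
```

The burst is spread over several ticks instead of hitting the consumers all at once; the cost is that some requests wait longer.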
III. What are the problems with using message queues?
From the scenarios above, we can already see that message queues can do quite a lot.
That said, let's go back to the question from the beginning of the article: "The JDK already has quite a few queue implementations, so do we still need message queue middleware?" The answer is simple: although the JDK implements many kinds of queues, they are all simple in-memory queues. Why do I call them simple in-memory queues? Let's look at what we would have to consider in order to implement a message queue (middleware).
3.1 High Availability
Whether we use a message queue for decoupling, asynchronous processing, or peak shaving, it certainly cannot run on a single machine. Think about it: if the message queue runs on a single machine and that machine goes down, our whole system becomes almost unavailable.
So when we use a message queue in a project, it has to run as a cluster or distributed deployment. And for that, we naturally expect the message queue to provide out-of-the-box support rather than having to write the code ourselves.
3.2 Data loss issues
Suppose we write our data to the message queue, and the queue goes down before System B and System C have had a chance to fetch it. If nothing has been done to guard against this, the data is lost.
Anyone who has studied Redis knows that Redis can persist data to disk, so that if Redis goes down it can still recover the data from disk. Similarly, the data in a message queue needs to be stored somewhere else as well, so as to minimize data loss.
Where should it be stored?
- A database?
- Redis?
- A distributed file system?
And should it be stored synchronously or asynchronously?
3.3 How does a consumer get data from a message queue?
How does a consumer get data from the message queue? There are two ways:
- The producer puts data into the message queue, and as soon as the queue has data it actively calls the consumer to deliver it (commonly known as push).
- The consumer keeps polling the message queue to see whether there is new data, and consumes it if there is (commonly known as pull).
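The two delivery styles can be contrasted in a small sketch. `PushQueue` and `PullQueue` are hypothetical toy classes written for this illustration, not the API of any real message queue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Consumer;

class PushPullDemo {
    public static void main(String[] args) {
        // Push: the queue calls the consumer as soon as a message arrives.
        PushQueue push = new PushQueue();
        push.subscribe(msg -> System.out.println("pushed: " + msg));
        push.publish("m1");                          // prints pushed: m1

        // Pull: the consumer asks the queue whether anything is waiting.
        PullQueue pull = new PullQueue();
        pull.publish("m2");
        String msg;
        while ((msg = pull.poll()) != null) {
            System.out.println("pulled: " + msg);    // prints pulled: m2
        }
    }
}

// Push style: consumers register a callback, and the queue invokes it.
class PushQueue {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void subscribe(Consumer<String> listener) {
        listeners.add(listener);
    }

    void publish(String message) {
        for (Consumer<String> listener : listeners) {
            listener.accept(message);
        }
    }
}

// Pull style: messages wait in the queue until a consumer polls for them.
class PullQueue {
    private final Queue<String> messages = new ArrayDeque<>();

    void publish(String message) {
        messages.offer(message);
    }

    // Returns the next message, or null if nothing is waiting.
    String poll() {
        return messages.poll();
    }
}
```

With push, the queue sets the pace and the consumer must keep up; with pull, the consumer sets the pace but has to keep asking.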
3.4 Other
Beyond these, there are various other things to consider when using a message queue:
- What should I do if a message is consumed more than once?
- How do I guarantee that messages are processed strictly in order?
- ……
Message queues bring many benefits, but introducing one also increases the complexity of the system. There are many message queue implementations available today, each with its own characteristics, so which MQ to choose still has to be considered carefully.