Application Background
In real-world development we frequently need to read and search data stored on disk, so reading data from the database is a common operation. As the volume of data access grows, however, excessive disk reads can become the performance bottleneck of the entire system, or even overwhelm the database, leading to serious problems such as the whole system stalling.
In a typical application system we simply query the database whenever data is needed, so the overall structure of the system looks like this:
When the volume of data is large, you need to reduce the disk reads and writes inside the database, so a caching layer is usually added between the business system and the MySQL database to relieve the access pressure on the database.
In practice, however, applying a cache in a real project is not that simple. Below we walk through several classic caching scenarios and the problems that come with them:
1. Data consistency issues between cache and database
I’ve summarized the mechanisms commonly used for cache handling as follows:
Cache Aside, Read Through, Write Through, Write Behind Caching
Cache Aside Mode
In this pattern the application queries the cache first and falls back to the database when the cache has no hit. Three scenarios can occur:
Cache hit
If the queried data exists in the cache, it is returned directly from the cache.
Cache miss
If the data is not in the cache, the source data is read from the database and then written into the cache.
Cache update
When a write operation modifies data in the database, the corresponding entry in the cache must be invalidated after the write completes.
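To make the read and write paths concrete, here is a minimal Cache Aside sketch in Java. RedisCache, UserDao and User are hypothetical stand-ins for a real cache client and data-access layer, and the TTL value is illustrative.

```java
// Minimal Cache Aside sketch. RedisCache, UserDao and User are hypothetical
// stand-ins for a real cache client and data-access layer.
public class UserService {

    private final RedisCache cache;
    private final UserDao userDao;

    public UserService(RedisCache cache, UserDao userDao) {
        this.cache = cache;
        this.userDao = userDao;
    }

    // Read path: try the cache first, fall back to the database on a miss.
    public User getUser(long id) {
        String key = "user:" + id;
        User user = cache.get(key, User.class);
        if (user != null) {
            return user;                      // cache hit
        }
        user = userDao.findById(id);          // cache miss: read the source of truth
        if (user != null) {
            cache.set(key, user, 300);        // populate the cache with a TTL (seconds)
        }
        return user;
    }

    // Write path: update the database first, then invalidate the cached entry.
    public void updateUser(User user) {
        userDao.update(user);
        cache.delete("user:" + user.getId()); // invalidate rather than update in place
    }

    // Hypothetical collaborators, declared only so the sketch is self-contained.
    interface RedisCache {
        <T> T get(String key, Class<T> type);
        void set(String key, Object value, int ttlSeconds);
        void delete(String key);
    }
    interface UserDao {
        User findById(long id);
        void update(User user);
    }
    static class User {
        private long id;
        public long getId() { return id; }
    }
}
```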
The Cache Aside pattern is the most commonly used pattern in real-world application development, but it is not perfect. Consider a read that misses the cache and fetches data from the database; before it writes that data back, a concurrent write operation updates the database and invalidates the cache, and then the earlier read finally stores its now-stale value into the cache, producing dirty data. Facebook engineers have published a paper on handling this problem, available here: www.usenix.org/system/file… Fully guaranteeing data consistency in a distributed environment is extremely difficult; the best we can do is minimize the situations in which such inconsistency arises.
Read Through Mode
In the Read Through pattern the application always requests data from the cache. If the cache has no data, the cache itself is responsible for loading it from the database through an underlying provider plug-in; it then updates itself and returns the data to the calling application. The advantage of Read Through is that the calling application always retrieves data from the cache by key and knows nothing about the database, while the caching layer takes care of its own loading, which makes the code more readable and clearer. The corresponding drawback is that developers have to write the provider plug-in, which increases development effort.
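A sketch of what this can look like in code, assuming the Caffeine caching library as the provider, with the registered loader playing the role of the plug-in described above; loadUserFromDb and User are hypothetical, and the sizing and expiry values are illustrative.

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

import java.util.concurrent.TimeUnit;

public class ReadThroughExample {

    // The loader is registered once; it is invoked automatically on a cache miss.
    private final LoadingCache<Long, User> userCache = Caffeine.newBuilder()
            .maximumSize(10_000)
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .build(this::loadUserFromDb);

    public User getUser(Long id) {
        // The caller only ever talks to the cache; it knows nothing about the database.
        return userCache.get(id);
    }

    // Hypothetical database access; in a real project this would be a DAO call.
    private User loadUserFromDb(Long id) {
        return new User(id);
    }

    static class User {
        final long id;
        User(long id) { this.id = id; }
    }
}
```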
Write Through Mode
Write Through works much like Read Through: when data is updated, the cache is consulted first. If the key is found in the cache, the cache is updated and the cache layer then updates the database; if it is not found, only the database is updated.
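A minimal sketch of that flow, using hypothetical CacheStore, UserDao and User types rather than any specific library:

```java
// Minimal Write Through sketch: the caller writes through the cache layer,
// which synchronously updates the database as well.
public class WriteThroughUserStore {

    private final CacheStore cache;
    private final UserDao userDao;

    public WriteThroughUserStore(CacheStore cache, UserDao userDao) {
        this.cache = cache;
        this.userDao = userDao;
    }

    public void updateUser(User user) {
        String key = "user:" + user.getId();
        if (cache.contains(key)) {
            cache.put(key, user);   // hit: refresh the cached copy...
            userDao.update(user);   // ...and write through to the database
        } else {
            userDao.update(user);   // miss: only the database is updated
        }
    }

    // Hypothetical collaborators, declared only so the sketch is self-contained.
    interface CacheStore {
        boolean contains(String key);
        void put(String key, Object value);
    }
    interface UserDao { void update(User user); }
    static class User {
        private long id;
        public long getId() { return id; }
    }
}
```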
Write Behind Caching Mode
In Write Behind Caching, data is written to the cache first and then asynchronously written to the database. This design reduces direct access to the database and therefore its load, and multiple modifications to the same data can be merged into a single database write, which greatly improves the system's carrying capacity. The trade-off is risk: if the cache machine goes down before the write-back completes, the data may be lost.
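Below is a minimal Write Behind sketch, assuming an in-memory map as the cache and a hypothetical UserDao for the database; a production version would also need batching, retries and graceful shutdown.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal Write Behind sketch: writes land in the in-memory cache immediately,
// and a background thread drains them to the database later.
public class WriteBehindCache {

    private final Map<Long, User> cache = new ConcurrentHashMap<>();
    private final BlockingQueue<User> pendingWrites = new LinkedBlockingQueue<>();
    private final UserDao userDao;

    public WriteBehindCache(UserDao userDao) {
        this.userDao = userDao;
        // Single background writer that flushes queued changes to the database.
        Executors.newSingleThreadExecutor().submit(this::flushLoop);
    }

    public void put(User user) {
        cache.put(user.getId(), user);   // fast path: update the cache only
        pendingWrites.offer(user);       // defer the database write
    }

    public User get(long id) {
        return cache.get(id);
    }

    private void flushLoop() {
        try {
            while (true) {
                User user = pendingWrites.take();   // blocks until work arrives
                userDao.update(user);               // asynchronous write-back to the database
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // queued writes may be lost here
        }
    }

    // Hypothetical collaborators, declared only so the sketch is self-contained.
    interface UserDao { void update(User user); }
    static class User {
        private final long id;
        User(long id) { this.id = id; }
        long getId() { return id; }
    }
}
```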
2. Cache penetration issues
In highly concurrent scenarios, cache penetration is a problem that is often encountered.
What is Cache Penetration
Cache penetration occurs when a large number of requests ask for data that is not in the cache (often for keys that do not exist at all), so every one of them falls through to the database.
What will happen?
A flood of requests hitting the database in a short period of time drives up its load until it can no longer handle even a single client request, leading to downtime, stalls and similar failures.
Commonly used solutions
1. Null Cache
In some business scenarios a query may legitimately return nothing because the data simply does not exist, and such empty results will not change across repeated queries within a short period of time, so sending every one of those queries to the database is redundant. It is worth storing these empty results under their corresponding keys in the cache; the next lookup can then be answered from memory without touching the database at all. This approach greatly reduces the access pressure on the database.
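A minimal sketch of caching empty results, again with hypothetical RedisCache, UserDao and User types; the sentinel value and TTLs are illustrative choices.

```java
// Cache a sentinel for "this key does not exist" with a short TTL, so repeated
// queries for missing data are absorbed by the cache instead of the database.
public class NullCachingUserService {

    private static final String NULL_SENTINEL = "<null>";

    private final RedisCache cache;
    private final UserDao userDao;

    public NullCachingUserService(RedisCache cache, UserDao userDao) {
        this.cache = cache;
        this.userDao = userDao;
    }

    public User getUser(long id) {
        String key = "user:" + id;
        String cached = cache.get(key);
        if (NULL_SENTINEL.equals(cached)) {
            return null;                          // known-missing key, skip the database
        }
        if (cached != null) {
            return User.fromJson(cached);         // normal cache hit
        }
        User user = userDao.findById(id);
        if (user == null) {
            cache.set(key, NULL_SENTINEL, 60);    // cache the miss briefly (60 s)
        } else {
            cache.set(key, user.toJson(), 300);   // cache real data longer (300 s)
        }
        return user;
    }

    // Hypothetical collaborators, declared only so the sketch is self-contained.
    interface RedisCache {
        String get(String key);
        void set(String key, String value, int ttlSeconds);
    }
    interface UserDao { User findById(long id); }
    static class User {
        String toJson() { return "{}"; }
        static User fromJson(String json) { return new User(); }
    }
}
```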
2. Bloom Filter
The keys of the data that actually exist in the database can be loaded into a Bloom filter in advance. Each request is first checked against the Bloom filter; only if the filter says the key may exist do we query Redis, and only if Redis has no data do we query the database. This prevents requests for non-existent data from ever reaching the database.
You can refer to my notes on Bloom filters: blog.csdn.net/Danny_idea/…
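Here is a sketch of that flow, assuming Guava's BloomFilter implementation; loadAllUserIds, RedisCache, UserDao and User are hypothetical, and the sizing parameters are illustrative.

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

import java.util.List;

// Reject queries for keys that cannot exist before they reach Redis or MySQL.
public class BloomFilterGuard {

    // Sized for ~1,000,000 keys with a 1% false-positive rate (illustrative values).
    private final BloomFilter<Long> userIdFilter =
            BloomFilter.create(Funnels.longFunnel(), 1_000_000, 0.01);

    private final RedisCache cache;
    private final UserDao userDao;

    public BloomFilterGuard(RedisCache cache, UserDao userDao) {
        this.cache = cache;
        this.userDao = userDao;
        // Preload every known key into the filter, e.g. at application startup.
        for (Long id : userDao.loadAllUserIds()) {
            userIdFilter.put(id);
        }
    }

    public User getUser(long id) {
        // Definitely-absent keys are rejected here and never hit Redis or the database.
        if (!userIdFilter.mightContain(id)) {
            return null;
        }
        User user = cache.get("user:" + id);
        if (user == null) {
            user = userDao.findById(id);   // possible false positive: may still be null
        }
        return user;
    }

    // Hypothetical collaborators, declared only so the sketch is self-contained.
    interface RedisCache { User get(String key); }
    interface UserDao {
        List<Long> loadAllUserIds();
        User findById(long id);
    }
    static class User { }
}
```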
3. Cache avalanche scenarios
What is a cache avalanche
When the cache server restarts, or a large number of cached entries expire within the same narrow time window, the moment they become invalid puts enormous pressure on the back-end system (e.g., the database).
How to avoid cache avalanche problems
1. Use locking and queuing to cope with the problem. When many requests pour in after a cache entry expires, acquire a distributed lock and allow only the request that obtains the lock to read the data from the database, store it in the cache, and then release the lock; subsequent read requests get the data from the cache (a minimal sketch combining this with point 2 follows this list). The drawback is that if too many read-request threads block, the machine's memory fills up, so this does not fundamentally solve the problem.
2. Before the high-concurrency scenario occurs, manually trigger requests to warm the cache so that later requests do not all hit the database on their first query, and spread out the data expiration times as much as possible so that entries do not all expire at the same moment.
3. From the standpoint of cache availability, avoid a single point of failure in the cache by building the cache architecture with master-slave replication plus Sentinel. The drawback of this setup is that it cannot shard the cache, so the amount of data it can hold is limited; it can be upgraded to a Redis Cluster architecture for further optimization. (This has to be weighed against the company's actual budget, since building a Redis Cluster requires more machines.)
4. Use an Ehcache local cache plus Hystrix rate limiting and degradation to keep MySQL from being overwhelmed. The point of the Ehcache local cache is that even if the Redis Cluster becomes completely unavailable, the local cache can still carry the load for a while. Hystrix handles rate limiting and degradation: for example, if 5,000 requests arrive in one second, we can configure the component to let only 2,000 requests per second through; the remaining 3,000 requests go down the rate-limiting path and into our own degradation logic (for instance, returning some default values). This protects MySQL from being killed by a flood of requests.
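As a sketch of points 1 and 2 combined, here is a Redis-based mutex so that only one request rebuilds an expired key, plus a randomized TTL so hot keys do not all expire at once. It assumes the Jedis 3.x client API; loadFromDb, the key names and the TTL values are hypothetical.

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

import java.util.Collections;
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

public class AvalancheGuard {

    // Release the lock only if the caller still owns it (token match).
    private static final String UNLOCK_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then " +
            "  return redis.call('del', KEYS[1]) else return 0 end";

    private final Jedis jedis;

    public AvalancheGuard(Jedis jedis) {
        this.jedis = jedis;
    }

    public String getProduct(String key) {
        String value = jedis.get(key);
        if (value != null) {
            return value;
        }
        String lockKey = "lock:" + key;
        String token = UUID.randomUUID().toString();
        // SET NX EX: only one caller acquires the rebuild lock; it auto-expires after 10 s.
        if ("OK".equals(jedis.set(lockKey, token, SetParams.setParams().nx().ex(10)))) {
            try {
                value = loadFromDb(key);
                // Base TTL of 300 s plus up to 60 s of random jitter,
                // so cached keys do not all expire at the same instant.
                int ttl = 300 + ThreadLocalRandom.current().nextInt(60);
                jedis.setex(key, ttl, value);
            } finally {
                jedis.eval(UNLOCK_SCRIPT,
                        Collections.singletonList(lockKey),
                        Collections.singletonList(token));
            }
        } else {
            // Another request is rebuilding the key; wait briefly and retry the cache
            // instead of hitting the database.
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            value = jedis.get(key);
        }
        return value;
    }

    // Hypothetical database access.
    private String loadFromDb(String key) {
        return "value-for-" + key;
    }
}
```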