Cache Coherence: Get Its Complete Information Now [MiniTool Wiki]
What Is Cache Coherence?
To begin with, what is cache coherence? In computer architecture, cache coherence is the uniformity of shared resource data that ends up stored in multiple local caches. When clients in a system maintain caches of a common memory resource, inconsistent data can cause problems, especially for the CPUs in a multiprocessing system.
In a shared-memory multiprocessor system, each processor has a separate cache memory, and there can be many copies of shared data: one in main memory, and one in the local cache of each processor that requested it. When one copy of the data is changed, the other copies must reflect the change. Cache coherence ensures that changes to the values of shared operands (data) are propagated throughout the system in a timely fashion.
The following are the requirements for cache coherence:
- Write Propagation: Any change to the data in one cache must be propagated to the other copies of that cache line in the peer caches.
- Transaction Serialization: All processors must see the reads and writes to a single memory location in the same order.
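The Write Propagation requirement can be illustrated with a toy model. The following is only a minimal sketch, not a real protocol: the `System` class, its `propagate` flag, and the write-through memory are assumptions made for illustration. It shows a peer cache returning a stale value when writes are not propagated, and the fresh value when peer copies are invalidated on every write.

```python
class System:
    """Toy shared-memory system: one private cache (a dict) per CPU."""

    def __init__(self, n_cpus, propagate):
        self.memory = {}
        self.caches = [dict() for _ in range(n_cpus)]
        self.propagate = propagate  # hypothetical switch for this demo

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                 # miss: fetch from main memory
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]                    # hit: may be a stale copy

    def write(self, cpu, addr, value):
        self.caches[cpu][addr] = value
        self.memory[addr] = value             # write-through, for simplicity
        if self.propagate:
            # Write propagation by invalidation: drop every peer copy.
            for i, peer in enumerate(self.caches):
                if i != cpu:
                    peer.pop(addr, None)


def stale_read_demo(propagate):
    s = System(2, propagate)
    s.read(1, "X")           # CPU 1 caches X (initial value 0)
    s.write(0, "X", 42)      # CPU 0 changes X
    return s.read(1, "X")    # value CPU 1 observes afterwards


print(stale_read_demo(False))  # 0: CPU 1 still sees its old copy
print(stale_read_demo(True))   # 42: the write reached CPU 1
```

Invalidation is only one way to propagate a write. Updating the peer copies in place would satisfy the same requirement.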
In theory, cache coherence can be enforced at load/store granularity. However, in practice, it is usually enforced at the granularity of cache blocks.
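To make block granularity concrete, here is a small sketch that maps byte addresses to cache blocks. The 64-byte block size and the helper names are assumptions for illustration. Real block sizes depend on the processor.

```python
BLOCK_SIZE = 64  # bytes per cache block (assumed value)


def block_of(addr):
    """Index of the cache block that contains byte address `addr`."""
    return addr // BLOCK_SIZE


def same_block(a, b):
    """True if both addresses are tracked by coherence as one unit."""
    return block_of(a) == block_of(b)


print(same_block(0x1000, 0x1008))  # True: 8 bytes apart, same block
print(same_block(0x1000, 0x1040))  # False: 64 bytes apart, next block
```

Under this assumption, a write to address 0x1008 affects the coherence state of the whole block, including the data at 0x1000.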
Coherence defines the behavior of reads and writes to a single address location. Cache coherence deals with the situation in which the same data appears in different caches at the same time. Some systems refer to this as global memory.
In a multi-processor system, consider that more than one processor has cached a copy of memory location X. To achieve cache coherence, the following conditions must be met:
- If processor P writes X and then reads X, and no other processor writes X between P's write and P's read, the read must always return the value written by P.
- If processor P1 reads location X after another processor P2 has written X, with no other writes to X by any processor between the two accesses and with the read and write sufficiently separated in time, the read must always return the value written by P2. This condition defines the concept of a coherent view of memory. Propagating writes to shared memory locations ensures that all caches have a coherent view of memory. If P1 still reads the old value of X after P2's write, the memory is said to be incoherent.
The above conditions satisfy the Write Propagation requirement for cache coherence. However, they are not sufficient on their own, because they do not guarantee Transaction Serialization.
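The Transaction Serialization requirement can be checked on a recorded trace. The sketch below is an illustration under stated assumptions: every write to X stores a distinct value, and each processor reports the values it observed, in order, each at most once. The function name `serializable` is hypothetical. It builds a "seen before" graph and uses the standard-library `graphlib` module (Python 3.9 or later) to test whether a single order of writes is compatible with all observers.

```python
from graphlib import TopologicalSorter, CycleError


def serializable(observations):
    """Return True if one global write order explains every observer.

    `observations` holds one list per processor: the distinct values it
    saw at a single location, in the order it saw them.
    """
    preds = {}  # value -> set of values that some observer saw earlier
    for seen in observations:
        for value in seen:
            preds.setdefault(value, set())
        for earlier, later in zip(seen, seen[1:]):
            preds[later].add(earlier)
    try:
        # A cycle means two observers disagree on the order of writes.
        list(TopologicalSorter(preds).static_order())
        return True
    except CycleError:
        return False


# P2 and P3 both see the write of 1 before the write of 2.
print(serializable([[1, 2], [1, 2]]))  # True
# P2 sees 1 then 2, but P3 sees 2 then 1: writes are not serialized.
print(serializable([[1, 2], [2, 1]]))  # False
```

In the second trace each write did reach both observers, so Write Propagation holds, yet the two processors could end up with different final values for X.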
Cache Coherence Mechanisms
The two most common mechanisms for ensuring coherence are snooping and directory-based. Each mechanism has its own advantages and disadvantages. If there is enough bandwidth, snooping-based protocols tend to be faster, because every transaction is a request/response seen by all processors.
The disadvantage is that snooping does not scale. Each request must be broadcast to all nodes in the system, which means that as the system grows, the size of the bus (logical or physical) and the bandwidth it provides must grow with it.
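The broadcast cost can be sketched with a toy write-invalidate bus. This is an assumption-laden illustration rather than any specific protocol: the `SnoopingBus` class, the write-through memory, and the `snoops` counter (one count per request delivered to a peer cache) are all made up for this example.

```python
class SnoopingBus:
    """Toy snooping system: every miss or write is seen by all peers."""

    def __init__(self, n_cpus):
        self.memory = {}
        self.caches = [dict() for _ in range(n_cpus)]
        self.snoops = 0  # requests delivered to peer caches

    def _broadcast(self, sender, addr, invalidate):
        for i, cache in enumerate(self.caches):
            if i == sender:
                continue
            self.snoops += 1            # each peer snoops the request
            if invalidate:
                cache.pop(addr, None)   # drop the now-stale copy

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:           # a miss goes out on the bus
            self._broadcast(cpu, addr, invalidate=False)
            cache[addr] = self.memory.get(addr, 0)
        return cache[addr]

    def write(self, cpu, addr, value):
        self._broadcast(cpu, addr, invalidate=True)
        self.caches[cpu][addr] = value
        self.memory[addr] = value       # write-through, for simplicity


def snoops_for_one_write(n_cpus):
    bus = SnoopingBus(n_cpus)
    bus.write(0, "X", 1)
    return bus.snoops


print(snoops_for_one_write(4))   # 3
print(snoops_for_one_write(64))  # 63
```

In this model a single write costs one snoop per peer, even when no peer holds a copy of X, so the traffic grows with the number of processors.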
On the other hand, directories tend to have longer latencies (a 3-hop request/forward/response), but they use much less bandwidth because the messages are point-to-point rather than broadcast. Therefore, many larger systems (>64 processors) use this type of cache coherence.
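A directory can be sketched as a table that records which caches hold each address, so that a write only contacts the actual sharers. The code below is a simplified illustration: the `Directory` class and its message accounting (one request, one invalidation per sharer, one grant) are assumptions, and real protocols exchange more message types, such as acknowledgements.

```python
class Directory:
    """Toy directory: tracks the sharers of each address."""

    def __init__(self, n_cpus):
        self.memory = {}
        self.caches = [dict() for _ in range(n_cpus)]
        self.sharers = {}    # addr -> set of CPU ids holding a copy
        self.messages = 0    # point-to-point messages sent

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:
            self.messages += 2                  # request + data reply
            cache[addr] = self.memory.get(addr, 0)
            self.sharers.setdefault(addr, set()).add(cpu)
        return cache[addr]

    def write(self, cpu, addr, value):
        self.messages += 1                      # request to the directory
        for other in self.sharers.get(addr, set()) - {cpu}:
            self.messages += 1                  # invalidate a real sharer
            self.caches[other].pop(addr, None)
        self.messages += 1                      # grant back to the writer
        self.sharers[addr] = {cpu}
        self.caches[cpu][addr] = value
        self.memory[addr] = value               # write-through, for simplicity


def messages_for_one_write(n_cpus, readers):
    d = Directory(n_cpus)
    for cpu in readers:
        d.read(cpu, "X")
    before = d.messages
    d.write(0, "X", 1)
    return d.messages - before


print(messages_for_one_write(4, [1]))   # 3
print(messages_for_one_write(64, [1]))  # 3
```

In this model the cost of a write depends on the number of sharers, not on the number of processors in the system, which is why the approach suits larger machines.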
A coherence protocol enforces cache coherence in a multiprocessor system. The goal is that two clients never see different values for the same shared data. The protocol must fulfill the basic requirements of coherence, and it can be tailored to the target system or application.
Protocols can also be classified as snoopy or directory-based. Generally, early systems used directory-based protocols, where a directory keeps track of the data being shared and the sharers. In a snoopy protocol, transaction requests (to read, write or upgrade) are sent to all processors. All processors snoop the requests and respond appropriately.