Hot Storage vs Cold Storage: Choosing the Right Tier for Your Data
The cloud has changed the way we think about data storage. No longer are we limited by the physical space in our office or data center. But with this newfound freedom comes some challenges. How do we decide what data to store where? How do we keep track of it all? These are important questions to answer as we move more and more data to the cloud. One way to think about it is in terms of hot, warm, and cold storage.
- Hot storage holds data that is accessed frequently. This could be data that is being actively used by employees or customers. It needs to sit on fast storage so that it can be accessed quickly.
- Warm storage holds data that is accessed less frequently. This could be data that is used for reporting or analytics. It doesn’t need to be accessed as quickly as hot data, so it can be stored on slightly slower, capacity-optimized storage.
- Cold storage holds data that is rarely accessed. This could be data that is archived for compliance reasons. It can be stored on even slower, “cheap and deep” storage.
When deciding what data to store where, it’s important to consider its access patterns. Hot data needs to be stored on fast storage, for instance a solid-state drive (SSD). Warm data can be stored on slower storage, such as hard drives. Cold data can be stored on the cheapest storage, one that may incur some retrieval costs, for instance an inexpensive object storage service.
Modern data management systems allow you to optimize the placement of your data so that hot data is stored on fast storage, warm data is stored on slower storage, and cold data is stored on your “cheap and deep” storage tier. This way, you can get the performance you need for hot data while still minimizing storage costs.
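To make the idea of placing data by access pattern concrete, here is a minimal sketch in Python. It assumes that "time since last access" is the only signal, and the 30-day and 180-day thresholds are hypothetical values chosen for illustration, not recommendations. Real systems typically weigh more factors, such as object size and retrieval cost.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; tune them to your own access patterns and pricing.
WARM_AFTER = timedelta(days=30)
COLD_AFTER = timedelta(days=180)


def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Return 'hot', 'warm', or 'cold' based on how long the data has been idle."""
    idle = now - last_accessed
    if idle >= COLD_AFTER:
        return "cold"
    if idle >= WARM_AFTER:
        return "warm"
    return "hot"


now = datetime(2024, 1, 1)
print(choose_tier(now - timedelta(days=2), now))    # hot
print(choose_tier(now - timedelta(days=45), now))   # warm
print(choose_tier(now - timedelta(days=400), now))  # cold
```

The thresholds are the part worth experimenting with: moving data to a colder tier too eagerly can cost more in retrieval fees than it saves in storage.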
There are two placement approaches used by data management systems: tiering and caching.
- Tiering refers to a lifecycle that moves each data object (or file) to the most appropriate storage tier based on its access frequency. For example, new files may be created in the hot storage tier. If the system determines that they have not been accessed for some period, it moves them to a warm storage tier, and if they remain unaccessed for longer still, it eventually moves them to a cold storage tier.
- Caching is quite different from tiering. Unlike tiering, which moves your data between storage tiers, caching keeps the gold copy of all your data in an inexpensive storage tier such as object storage. To provide decent access speeds, the system keeps a cache of frequently accessed files in a fast storage tier. In other words, caching does not move data across tiers; it keeps copies of some of the data in fast storage to provide quick access.
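The difference between the two approaches can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular product works: the class names `TieredStore` and `CachedStore` are hypothetical, plain dictionaries stand in for the storage tiers, and the cache uses a simple least-recently-used (LRU) eviction policy, which is one common choice among several.

```python
from collections import OrderedDict


class TieredStore:
    """Tiering: each object lives in exactly one tier and is moved between tiers."""

    def __init__(self):
        self.tiers = {"hot": {}, "warm": {}, "cold": {}}

    def put(self, key, data):
        self.tiers["hot"][key] = data  # new files start in the hot tier

    def locate(self, key):
        """Return the name of the tier that currently holds the object."""
        for name, tier in self.tiers.items():
            if key in tier:
                return name
        raise KeyError(key)

    def move(self, key, dest):
        """Move the object to another tier; it no longer exists in the old one."""
        src = self.locate(key)
        if src != dest:
            self.tiers[dest][key] = self.tiers[src].pop(key)


class CachedStore:
    """Caching: the gold copy stays in cheap storage; copies of hot data sit in a fast cache."""

    def __init__(self, cache_size):
        self.gold = {}              # stands in for inexpensive object storage
        self.cache = OrderedDict()  # stands in for fast storage, kept in LRU order
        self.cache_size = cache_size

    def put(self, key, data):
        self.gold[key] = data
        self.cache.pop(key, None)   # drop any stale cached copy

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as most recently used
            return self.cache[key]
        data = self.gold[key]               # slow path: read the gold copy
        self.cache[key] = data              # keep a copy in fast storage
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used copy
        return data


tiered = TieredStore()
tiered.put("report.csv", b"...")
tiered.move("report.csv", "cold")
print(tiered.locate("report.csv"))  # cold

cached = CachedStore(cache_size=1)
cached.put("a", b"A")
cached.put("b", b"B")
cached.get("a")
cached.get("b")
print(list(cached.cache))   # ['b']
print(sorted(cached.gold))  # ['a', 'b']
```

Note how evicting "a" from the cache loses nothing, because the gold copy is still in the inexpensive tier, whereas in the tiered store the object exists in only one place at a time.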
If you would like to learn more about caching vs tiering, and the benefits of each, read on in my article Cloud Caching vs. Tiering: Know the Difference.