HDFS has a feature called storage types and storage policies. It makes it possible to store files on storage with different properties (for example, fast SSD or slow but cheap archival storage).
I wonder if it's possible to use this feature through HBase?
My use case: some of my data is "hot" and expected to be accessed frequently, so I want to put it on fast (SSD) storage. Other data is "cold" and accessed infrequently, so I want to put it on cheaper storage. I'm trying to find out how to organize this with HBase/HDFS.
As far as I can see, storage policies let you set a policy on a file or a directory, and the policy is then applied according to certain rules.
Remember that during HBase installation we specify the HDFS directory where the data is stored, for example:
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8030/hbase</value>
</property>
So /hbase is an HDFS directory on which you can set policies. The HBase directory structure would be something like:
/hbase/data/MyFirstNamespace/MyTable1
/hbase/data/MyFirstNamespace/MyTable2
Therefore, I would set a storage policy at the directory level in HDFS, for example Cold for MyTable1 and All_SSD for MyTable2:
hdfs storagepolicies -setStoragePolicy -path /hbase/data/MyFirstNamespace/MyTable1 -policy Cold
hdfs storagepolicies -setStoragePolicy -path /hbase/data/MyFirstNamespace/MyTable2 -policy All_SSD
This needs to be done after you create a new HBase table, since the table's directory is only created at that point.
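If you have several tables, the two commands above can be wrapped in a small script. The following is only a sketch, assuming hbase.rootdir resolves to /hbase as in the example config. The names HBASE_ROOT, DRY_RUN, table_dir, run and set_table_policy are made up for illustration and are not part of HBase or HDFS. By default it only prints the commands, so you can review them before running anything against the cluster.

```shell
#!/bin/sh
# Sketch: print (or run) the hdfs commands that pin HBase table
# directories to a storage policy.
# Assumptions: hbase.rootdir resolves to /hbase; HBASE_ROOT, DRY_RUN and
# the helper function names are hypothetical, chosen for this example.

HBASE_ROOT="${HBASE_ROOT:-/hbase}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = also execute them

# Build the HDFS directory of a table from its namespace and table name.
table_dir() {
  printf '%s/data/%s/%s\n' "$HBASE_ROOT" "$1" "$2"
}

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Set a storage policy on a table directory, then read it back to verify.
set_table_policy() {
  dir=$(table_dir "$1" "$2")
  run hdfs storagepolicies -setStoragePolicy -path "$dir" -policy "$3"
  run hdfs storagepolicies -getStoragePolicy -path "$dir"
}

set_table_policy MyFirstNamespace MyTable1 Cold
set_table_policy MyFirstNamespace MyTable2 All_SSD
```

Run it with DRY_RUN=0 to actually apply the policies. Note that a policy set on a directory affects blocks written afterwards. As far as I know, data that is already in the directory is only relocated when you run the HDFS mover (hdfs mover -p <path>).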