bitmap简介
bitmap也即位图,在图形学中用的比较多,在互联网业务中也常常作为位存储结构使用。本文主要介绍Redis bitmap数据结构在互联网业务中的使用。
redis bitmap
bitmap可以作为安全打击以及VIP之类的等级信息位记录结构,这里就会涉及bitmap选型了,常用的有:Redis bitmap——也是本文要详细介绍的bitmap选型
redis bitmap基本知识
- 1.bitmap位数限制
Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type. Since strings are binary safe blobs and their maximum length is 512 MB, they are suitable to set up to 232 different bits.
也即可以支持2^32位
- 2.bitmap位基本操作
Bits are set and retrieved using the SETBIT and GETBIT commands:
> setbit key 10 1
(integer) 1
> getbit key 10
(integer) 1
> getbit key 11
(integer) 0
The SETBIT command takes as its first argument the bit number, and as its second argument the value to set the bit to, which is 1 or 0. The command automatically enlarges the string if the addressed bit is outside the current string length.
GETBIT just returns the value of the bit at the specified index. Out of range bits (addressing a bit that is outside the length of the string stored into the target key) are always considered to be zero.
- 3.bitmap位基本操作性能
Bit operations are divided into two groups: constant-time single bit operations, like setting a bit to 1 or 0, or getting its value, and operations on groups of bits, for example counting the number of set bits in a given range of bits (e.g., population counting).
也即SETBIT
与GETBIT
都是O(1)时间复杂度
- 4.bitmap存储
One of the biggest advantages of bitmaps is that they often provide extreme space savings when storing information. For example in a system where different users are represented by incremental user IDs, it is possible to remember a single bit information (for example, knowing whether a user wants to receive a newsletter) of 4 billion of users using just 512 MB of memory.
也即bitmap是非常节省存储的数据类型
- 5.bitmap应用场景
Common use cases for bitmaps are:
- Real time analytics of all kinds.
- Storing space efficient but high performance boolean information associated with object IDs. For example imagine you want to know the longest streak of daily visits of your web site users. You start counting days starting from zero, that is the day you made your web site public, and set a bit with SETBIT every time the user visits the web site. As a bit index you simply take the current unix time, subtract the initial offset, and divide by 3600*24.
This way for each user you have a small string containing the visit information for each day. With BITCOUNT it is possible to easily get the number of days a given user visited the web site, while with a few BITPOS calls, or simply fetching and analyzing the bitmap client-side, it is possible to easily compute the longest streak.
Bitmaps are trivial to split into multiple keys, for example for the sake of sharding the data set and because in general it is better to avoid working with huge keys. To split a bitmap across different keys instead of setting all the bits into a key, a trivial strategy is just to store M bits per key and obtain the key name with bit-number/M and the Nth bit to address inside the key with bit-number MOD M.
也即bitmap应用场景是:要求节省空间或者高性能的boolean分析操作
redis bitmap性能测试
这里利用redis-benchmark压测redis bitmap,如下:
getbit
date;./redis-benchmark -e -h xxx -p xxx -c 60 -a pwd -r 10000 -n 100000000 getbit __rand_int__ 64;date
setbit
date;./redis-benchmark -e -h xxx -p xxx -c 60 -a pwd -r 10000 -n 100000000 setbit __rand_int__ 64 1;date
可以在多台机器上同时用如上命令进行压测,然后将压测结果累加,不断添加机器,直到结果不能增加为止,也即最终的压测性能结果
应用
如上文所述,redis bitmap可以作为安全打击以及VIP之类的等级信息位记录结构,而使用上也非常简单,只需要定义好哪些位代表什么含义,就可以用SETBIT
与GETBIT
命令进行设置、清除以及获取,比如第一位表示是VIP用户,则有如下操作:
- 设置VIP用户
setbit key 0 1
- 取消VIP用户
setbit key 0 0
- 获取用户是否为VIP
getbit key 0