1 Basic Concepts
Redis is famously fast, but how fast exactly? To answer that, we first need to break down the main steps between the moment a client issues a call to the server and the moment it receives the result. Consider the following code:
import redis.clients.jedis.Jedis;
Jedis jedis = new Jedis("localhost");     // connects to localhost:6379 by default
String result = jedis.set("name", "lnh"); // result is "OK" on success
The main steps involved break down as follows:
- The client initiates the call:
  - Initialize a network connection (or obtain one from a client-side connection pool);
  - Serialize the Java method call and data objects into the RESP protocol format (see the wire-format example after this list);
  - Write the bytes to network I/O.
- Network transmission:
  - The RESP-encoded data from the previous step is sent over the network to the server.
- The server processes the call:
  - Receive the request data and parse the RESP-formatted payload;
  - Execute the parsed command; if AOF is enabled, the AOF work is handled here as well;
  - Serialize the execution result into RESP format.
- Network transmission:
  - The RESP-encoded response is sent over the network back to the client.
- The client receives the response:
  - Read the data from network I/O and parse the RESP-formatted payload;
  - Deserialize it into a Java object.
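To make the serialization step concrete, the jedis.set("name", "lnh") call above travels as a RESP array of bulk strings, and the server answers with a simple string (each line on the wire is terminated by \r\n):
Request (SET name lnh):
*3\r\n
$3\r\n
SET\r\n
$4\r\n
name\r\n
$3\r\n
lnh\r\n
Reply:
+OK\r\n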
In summary, there are three major pieces: the client, network transmission, and the server. From a user's perspective, the things that deserve the most attention are client-side serialization and the cost of network connections. For example, an ill-suited data structure can inflate the amount of data transferred on every call; a connection pool that is too large, too small, or missing altogether increases the cost of establishing underlying TCP connections; and on the server side, configuration can introduce extra work per command (e.g. the AOF appendfsync setting), while long-blocking commands can degrade the server's processing capacity.
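On the connection-pool point, here is a minimal sketch with Jedis (assuming Jedis 3.x+, where Jedis is AutoCloseable; the pool sizes are illustrative assumptions, not tuned recommendations). Connections are borrowed from and returned to the pool instead of being established per call:
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

JedisPoolConfig config = new JedisPoolConfig();
config.setMaxTotal(50); // illustrative: upper bound on concurrent connections
config.setMaxIdle(10);  // illustrative: idle connections kept warm
JedisPool pool = new JedisPool(config, "localhost", 6379);

try (Jedis jedis = pool.getResource()) { // borrow a pooled connection
    jedis.set("name", "lnh");
} // close() returns the connection to the pool rather than tearing down TCP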
With these basics in mind, it becomes clear that the common "test" of simply writing a for loop that executes one operation over and over has little reference value: it ends up being a test of network transmission efficiency and not much else.
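For instance, a naive test like the following (a hypothetical snippet reusing the jedis instance from above) is dominated by the blocking network round-trip in each iteration rather than by Redis's processing speed:
long start = System.currentTimeMillis();
for (int i = 0; i < 10_000; i++) {
    jedis.set("key:" + i, "value"); // one blocking round-trip per iteration
}
System.out.println((System.currentTimeMillis() - start) + " ms");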
2 Benchmarking Tool
redis-benchmark is a benchmarking tool that ships with Redis; it can simulate N clients concurrently issuing M requests. A benchmark cannot fully reproduce real business traffic, of course, but based on the concepts above we can at least construct approximate test cases to validate the configuration we need.
View the help with redis-benchmark --help:
Usage: redis-benchmark [-h <host>] [-p <port>] [-c <clients>] [-n <requests>] [-k <boolean>]
-h <hostname> Server hostname (default 127.0.0.1)
-p <port> Server port (default 6379)
-s <socket> Server socket (overrides host and port)
-a <password> Password for Redis Auth
--user <username> Used to send ACL style 'AUTH username pass'. Needs -a.
-c <clients> Number of parallel connections (default 50)
-n <requests> Total number of requests (default 100000)
-d <size> Data size of SET/GET value in bytes (default 3)
--dbnum <db> SELECT the specified db number (default 0)
--threads <num> Enable multi-thread mode.
--cluster Enable cluster mode.
--enable-tracking Send CLIENT TRACKING on before starting benchmark.
-k <boolean> 1=keep alive 0=reconnect (default 1)
-r <keyspacelen> Use random keys for SET/GET/INCR, random values for SADD,
random members and scores for ZADD.
Using this option the benchmark will expand the string __rand_int__
inside an argument with a 12 digits number in the specified range
from 0 to keyspacelen-1. The substitution changes every time a command
is executed. Default tests use this to hit random keys in the
specified range.
-P <numreq> Pipeline <numreq> requests. Default 1 (no pipeline).
-e If server replies with errors, show them on stdout.
(no more than 1 error per second is displayed)
-q Quiet. Just show query/sec values
--precision Number of decimal places to display in latency output (default 0)
--csv Output in CSV format
-l Loop. Run the tests forever
-t <tests> Only run the comma separated list of tests. The test
names are the same as the ones produced as output.
-I Idle mode. Just open N idle connections and wait.
--help Output this help and exit.
--version Output version and exit.
Examples:
Run the benchmark with the default configuration against 127.0.0.1:6379:
$ redis-benchmark
Use 20 parallel clients, for a total of 100k requests, against 192.168.1.1:
$ redis-benchmark -h 192.168.1.1 -p 6379 -n 100000 -c 20
Fill 127.0.0.1:6379 with about 1 million keys only using the SET test:
$ redis-benchmark -t set -n 1000000 -r 100000000
Benchmark 127.0.0.1:6379 for a few commands producing CSV output:
$ redis-benchmark -t ping,set,get -n 100000 --csv
Benchmark a specific command line:
$ redis-benchmark -r 10000 -n 10000 eval 'return redis.call("ping")' 0
Fill a list with 10000 random elements:
$ redis-benchmark -r 10000 -n 10000 lpush mylist __rand_int__
On user specified command lines __rand_int__ is replaced with a random integer
with a range of values selected by the -r option.
From the help text above, the options most relevant here are (a combined example follows the list):
- -c: number of parallel clients.
- -n: total number of requests.
- -k: whether to keep connections alive (1) or reconnect per request (0).
- -d: size in bytes of the SET/GET value.
- -r: size of the random keyspace (keys are randomized in the range 0 to keyspacelen-1).
- -P: number of commands per pipeline.
- -t: run only the specified tests/commands.
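For example, combining several of these options in one run (the specific values are arbitrary, for illustration only):
$ redis-benchmark -c 100 -n 200000 -d 256 -r 100000 -P 16 -t set,get -q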
3 Test Cases
For example, run the following command to test SET and LPUSH against 1000 random keys:
$ redis-benchmark -t set,lpush -n 100000 -r 1000
====== SET ======
100000 requests completed in 2.09 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 300 10
host configuration "appendonly": yes
multi-thread: no
Latency by percentile distribution:
0.000% <= 0.191 milliseconds (cumulative count 1)
50.000% <= 0.671 milliseconds (cumulative count 50901)
75.000% <= 0.887 milliseconds (cumulative count 75431)
87.500% <= 1.063 milliseconds (cumulative count 87642)
93.750% <= 1.215 milliseconds (cumulative count 93930)
96.875% <= 1.351 milliseconds (cumulative count 96973)
98.438% <= 1.479 milliseconds (cumulative count 98461)
99.219% <= 1.623 milliseconds (cumulative count 99234)
99.609% <= 1.823 milliseconds (cumulative count 99611)
99.805% <= 2.175 milliseconds (cumulative count 99807)
99.902% <= 2.559 milliseconds (cumulative count 99904)
99.951% <= 3.367 milliseconds (cumulative count 99952)
99.976% <= 4.991 milliseconds (cumulative count 99976)
99.988% <= 5.407 milliseconds (cumulative count 99988)
99.994% <= 5.647 milliseconds (cumulative count 99994)
99.997% <= 5.783 milliseconds (cumulative count 99997)
99.998% <= 5.831 milliseconds (cumulative count 99999)
99.999% <= 5.895 milliseconds (cumulative count 100000)
100.000% <= 5.895 milliseconds (cumulative count 100000)
Cumulative distribution of latencies:
0.000% <= 0.103 milliseconds (cumulative count 0)
0.002% <= 0.207 milliseconds (cumulative count 2)
1.928% <= 0.303 milliseconds (cumulative count 1928)
9.816% <= 0.407 milliseconds (cumulative count 9816)
24.073% <= 0.503 milliseconds (cumulative count 24073)
41.608% <= 0.607 milliseconds (cumulative count 41608)
55.195% <= 0.703 milliseconds (cumulative count 55195)
67.451% <= 0.807 milliseconds (cumulative count 67451)
76.876% <= 0.903 milliseconds (cumulative count 76876)
84.431% <= 1.007 milliseconds (cumulative count 84431)
89.594% <= 1.103 milliseconds (cumulative count 89594)
93.678% <= 1.207 milliseconds (cumulative count 93678)
96.161% <= 1.303 milliseconds (cumulative count 96161)
97.743% <= 1.407 milliseconds (cumulative count 97743)
98.650% <= 1.503 milliseconds (cumulative count 98650)
99.186% <= 1.607 milliseconds (cumulative count 99186)
99.424% <= 1.703 milliseconds (cumulative count 99424)
99.591% <= 1.807 milliseconds (cumulative count 99591)
99.672% <= 1.903 milliseconds (cumulative count 99672)
99.737% <= 2.007 milliseconds (cumulative count 99737)
99.773% <= 2.103 milliseconds (cumulative count 99773)
99.936% <= 3.103 milliseconds (cumulative count 99936)
99.957% <= 4.103 milliseconds (cumulative count 99957)
99.979% <= 5.103 milliseconds (cumulative count 99979)
100.000% <= 6.103 milliseconds (cumulative count 100000)
Summary:
throughput summary: 47801.15 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.723 0.184 0.671 1.255 1.567 5.895
====== LPUSH ======
100000 requests completed in 2.09 seconds
50 parallel clients
3 bytes payload
keep alive: 1
host configuration "save": 300 10
host configuration "appendonly": yes
multi-thread: no
Latency by percentile distribution:
0.000% <= 0.215 milliseconds (cumulative count 3)
50.000% <= 0.687 milliseconds (cumulative count 51059)
75.000% <= 0.903 milliseconds (cumulative count 75163)
87.500% <= 1.087 milliseconds (cumulative count 87758)
93.750% <= 1.247 milliseconds (cumulative count 93851)
96.875% <= 1.399 milliseconds (cumulative count 96899)
98.438% <= 1.567 milliseconds (cumulative count 98484)
99.219% <= 1.767 milliseconds (cumulative count 99225)
99.609% <= 1.991 milliseconds (cumulative count 99610)
99.805% <= 2.167 milliseconds (cumulative count 99806)
99.902% <= 2.351 milliseconds (cumulative count 99903)
99.951% <= 2.503 milliseconds (cumulative count 99953)
99.976% <= 2.615 milliseconds (cumulative count 99976)
99.988% <= 2.687 milliseconds (cumulative count 99988)
99.994% <= 2.775 milliseconds (cumulative count 99994)
99.997% <= 2.831 milliseconds (cumulative count 99997)
99.998% <= 2.919 milliseconds (cumulative count 99999)
99.999% <= 2.975 milliseconds (cumulative count 100000)
100.000% <= 2.975 milliseconds (cumulative count 100000)
Cumulative distribution of latencies:
0.000% <= 0.103 milliseconds (cumulative count 0)
2.009% <= 0.303 milliseconds (cumulative count 2009)
10.383% <= 0.407 milliseconds (cumulative count 10383)
23.762% <= 0.503 milliseconds (cumulative count 23762)
39.616% <= 0.607 milliseconds (cumulative count 39616)
53.245% <= 0.703 milliseconds (cumulative count 53245)
65.687% <= 0.807 milliseconds (cumulative count 65687)
75.163% <= 0.903 milliseconds (cumulative count 75163)
83.127% <= 1.007 milliseconds (cumulative count 83127)
88.543% <= 1.103 milliseconds (cumulative count 88543)
92.636% <= 1.207 milliseconds (cumulative count 92636)
95.196% <= 1.303 milliseconds (cumulative count 95196)
97.006% <= 1.407 milliseconds (cumulative count 97006)
98.021% <= 1.503 milliseconds (cumulative count 98021)
98.705% <= 1.607 milliseconds (cumulative count 98705)
99.069% <= 1.703 milliseconds (cumulative count 99069)
99.314% <= 1.807 milliseconds (cumulative count 99314)
99.477% <= 1.903 milliseconds (cumulative count 99477)
99.629% <= 2.007 milliseconds (cumulative count 99629)
99.739% <= 2.103 milliseconds (cumulative count 99739)
100.000% <= 3.103 milliseconds (cumulative count 100000)
Summary:
throughput summary: 47938.64 requests per second
latency summary (msec):
avg min p50 p95 p99 max
0.736 0.208 0.687 1.295 1.687 2.975
Next, a pipelining comparison. With 10 commands per pipeline, throughput clearly jumps about fivefold!
$ redis-benchmark -t set,lpush -n 100000 -r 1000 -q
SET: 47984.64 requests per second, p50=0.615 msec
LPUSH: 49875.31 requests per second, p50=0.615 msec
$ redis-benchmark -t set,lpush -n 100000 -r 1000 -q -P 10
SET: 248756.22 requests per second, p50=1.695 msec
LPUSH: 253164.55 requests per second, p50=1.639 msec
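To get the same effect from application code, Jedis exposes pipelining through jedis.pipelined(); a minimal sketch:
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis.Response;

Jedis jedis = new Jedis("localhost");
Pipeline p = jedis.pipelined();
Response<String> setReply = p.set("name", "lnh");       // queued locally, not yet sent
Response<Long> pushReply = p.lpush("mylist", "a", "b"); // also queued
p.sync(); // flush all queued commands to the server in one round-trip
System.out.println(setReply.get() + ", " + pushReply.get());
The trade-off visible in the -P 10 numbers above applies here too: p50 latency per batch rises (from about 0.6 msec to about 1.6 msec) while throughput multiplies, because many commands now share each network round-trip.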