
Big O notation
As you may already know, Salvatore Sanfilippo intentionally documents the worst-case algorithmic performance of each Redis command on Redis's website at http://redis.io/commands/. This focus on an algorithmic measure of performance as a core, actionable metric differentiates Redis from other data storage technologies. A mathematical definition of big O is that it "symbolically expresses the asymptotic behavior of a given function"². Within computer science, and more pertinent to our understanding of big O notation within Redis, this notation lets us classify the performance of a Redis command by how it behaves as the number of inputs to the command increases.
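To make the quoted definition concrete, one standard way to state it formally (this formalism is my addition, not part of the Redis documentation) is that a function f(n), for example the running time of a command on an input of size n, is O(g(n)) when, beyond some input size, it never exceeds a constant multiple of g(n):

$$
f(n) = O\bigl(g(n)\bigr) \iff \exists\, c > 0,\ n_0 \ge 1 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all}\ n \ge n_0
$$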

Graphing Big O Notation
In the Redis documentation, the time complexity of each command is given as one of the following big O cases (a few concrete command examples follow the list):
- The O(1) case in big O notation is shorthand for constant time: increasing the number of inputs does not change the time or processing required. In O(1) algorithms, the upper bound of performance does not depend on the size of the input; it is bound only by the fixed cost of the algorithm itself.
- The next best big O case is O(log n), or logarithmic time, where the work grows in proportion to the logarithm of n: doubling the number of inputs adds only a constant amount of extra work.
- An intuitive grasp of the term O(n), or linear time, follows the common sense idea that each additional input increases the amount of time by a constant, proportional amount.
- In O(n log n), or log-linear time, an O(log n) operation is applied to each of the n inputs. For practical purposes, each time the input doubles, the processing time slightly more than doubles.
- For O(n^2), or quadratic time, the amount of time grows with the square of n: for each doubling of n, the processing takes four times as long. The performance of O(n^2) algorithms may be acceptable at smaller values of n but quickly becomes unrealistic for most uses as n increases in size.
- In the case of problems that are solvable in O(2^n), or exponential time, every additional input doubles the time, making O(2^n) unusable for all but the smallest values of n.
- The most time-complex algorithms are noted in O(n!), or factorial time, where processing quickly becomes prohibitive even as n increases slightly. For example, the difference for an O(n!) algorithm at 5 units vs. 6 units is significant (120 units of elapsed time vs. 720 units of elapsed time).
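As a quick reference, here are a few real Redis commands alongside the worst-case complexity documented for each at http://redis.io/commands/. The sketch below assumes the node_redis client and hypothetical key names, and it ignores the replies for brevity:

```js
// Documented worst-case complexities for a handful of Redis commands,
// issued through the node_redis client; the key names are hypothetical.
var redis = require('redis');
var client = redis.createClient();

client.get('cache:item');               // GET    -> O(1)
client.zadd('scores', 10, 'player:1');  // ZADD   -> O(log(N)) per element added
client.lrange('log', 0, 99);            // LRANGE -> O(S+N), S = start offset, N = range size
client.keys('user:*');                  // KEYS   -> O(N) over all keys; avoid on production data sets
client.sort('tags', 'ALPHA');           // SORT   -> O(N+M*log(M)), M = number of returned elements

client.quit();
```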

Computing big O notation for custom code
With the Redis documentation providing the big O notation for each command, we can calculate a rough efficiency estimate of any proposed Redis-based solution. A simplistic approach is to take the sum of all of the Redis commands' big O notations for a certain level of n and then estimate the big O notation of the implementing code, arriving at a rough estimate of the time efficiency of your entire solution. For example, a simple cache in Redis may be just a single Redis SET and GET call, leading to O(1) + O(1) ≈ 2 units of time for the solution. More complicated use cases require more commands with higher big O notations.
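For instance, the O(1) + O(1) cache mentioned above might look like the following minimal sketch, assuming the node_redis client, a local Redis server, and a hypothetical cache:welcome-page key:

```js
var redis = require('redis');
var client = redis.createClient();

// SET is documented as O(1): writing the cached value costs the same
// no matter how many keys the database already holds.
client.set('cache:welcome-page', '<html>...</html>', function (err) {
  if (err) { return console.error(err); }

  // GET is also O(1), so a cache hit is constant time as well.
  client.get('cache:welcome-page', function (err, value) {
    if (err) { return console.error(err); }
    console.log('cache hit:', value);
    client.quit();
  });
});
```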
Evaluating the time complexity of your data structures involves not just the data structure itself but also optimizing the total number of Redis commands used for both ingestion and extraction. Depending on your use case, it may be perfectly acceptable to have poor time complexity when ingesting data into, say, a sorted set, as long as the access commands have an equal or lower big O case. In the opposite use case, ingesting large amounts of data at low latency, you need faster (low-complexity big O cases) ingestion, while accessing the data may not need to be as fast. In these situations, say when storing logging information that expires after a certain amount of elapsed time, optimizing writes is more important than optimizing access, although access is still important for the application.
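As an illustration of the write-optimized case, the following sketch (again assuming node_redis, with a hypothetical log:&lt;date&gt; key of my own invention) ingests log lines with cheap O(1) LPUSH calls, caps and expires the list, and leaves the comparatively more expensive LRANGE read for the rarer reporting path:

```js
var redis = require('redis');
var client = redis.createClient();

// Ingestion path: called for every log line, so it sticks to cheap commands.
function logEvent(line) {
  var key = 'log:' + new Date().toISOString().slice(0, 10); // one list per day
  client.lpush(key, line);            // LPUSH  -> O(1) per element pushed
  client.ltrim(key, 0, 9999);         // LTRIM  -> O(N) in elements removed; keeps the list capped
  client.expire(key, 7 * 24 * 3600);  // EXPIRE -> O(1); the whole key ages out after a week
}

// Reporting path: called far less often, so LRANGE's O(S+N) cost is acceptable.
function recentEvents(callback) {
  var key = 'log:' + new Date().toISOString().slice(0, 10);
  client.lrange(key, 0, 99, callback);
}
```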
So, returning to the stationery example from before, let us compare our first custom Redis solution with the second Nohm-based solution focusing just on the Redis commands since we do not have the client code yet for the first. We'll start by examining the total time complexity of adding the blue and red stationery packages to our Redis database.
Table: Redis commands for adding a stationery package with the custom Redis solution (columns: Redis command, O(n), Total).
The total complexity for adding one stationery object to our custom Redis solution is four, so adding an additional red stationery package brings our total to eight.
Now, we will analyze the Nohm solution using the Redis MONITOR command and summarize the result of setting up and saving both the blue and the red stationery:
Table: Redis commands issued by Nohm while setting up and saving the blue and red stationery packages (columns: Redis Command, O(n), Running total).
The total complexity of adding one stationery object to the Nohm solution, including various setup commands, is 10 and, similar to our custom Redis solution, adding an additional red stationery package brings the total to 14. Depending on your application models, using a Redis object mapper in the programming language of your choice may add only a relatively minor overhead as you scale your application. Ideally, we keep the time complexity of our client code at, at most, O(n log n), with the goal being O(1) or O(log n). This grows increasingly difficult as your application matures over time and you discover and work through edge cases and end-user feedback.
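Client-side complexity matters just as much as the commands themselves. As a rough illustration (plain JavaScript, not tied to any particular Redis call), compare a quadratic and a linear way of spotting duplicate IDs returned from Redis:

```js
// O(n^2): every ID is compared against every later ID.
function hasDuplicatesQuadratic(ids) {
  for (var i = 0; i < ids.length; i++) {
    for (var j = i + 1; j < ids.length; j++) {
      if (ids[i] === ids[j]) { return true; }
    }
  }
  return false;
}

// O(n): one constant-time lookup and insert per ID.
function hasDuplicatesLinear(ids) {
  var seen = {};
  for (var i = 0; i < ids.length; i++) {
    if (seen[ids[i]]) { return true; }
    seen[ids[i]] = true;
  }
  return false;
}
```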
Although the difference between the summary time complexity of the two Redis solutions is relatively small, the Nohm Redis object mapper provides a lot of essential functionality that we would need to replicate if we wanted to build a full node.js application using Redis as our database. Again, there is a tradeoff between the extremely fast but limited support for object tracking and validation in our custom Redis solution, and the additional functionality of object metadata and field validations that Nohm provides to our application.