From 4a2893a181ae70a55f59d48f12ce5d0c8f416ea1 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Tue, 5 Aug 2014 10:05:19 -0400 Subject: [PATCH 1/4] add discussion of performance analysis to libcollection --- src/libcollections/lib.rs | 196 +++++++++++++++++++++++++++++++++++++- 1 file changed, 193 insertions(+), 3 deletions(-) diff --git a/src/libcollections/lib.rs b/src/libcollections/lib.rs index d2d8ad696d7c5..08c65c8265be0 100644 --- a/src/libcollections/lib.rs +++ b/src/libcollections/lib.rs @@ -8,9 +8,199 @@ // option. This file may not be copied, modified, or distributed // except according to those terms. -/*! - * Collection types. - */ +//! Rust's standard collections library provides several structures for organizing +//! and querying data. Choosing the right collection for the right job is a non- +//! trivial and important part of writing any good program. While Rust strives to +//! provide efficient and easy to use collections for common use-cases, given *only* +//! a list of the functions a collection provides it can be difficult to determine +//! the best choice. When in doubt, running tests on your actual code with your +//! actual data will always be the best way to identify the best collection for the +//! job. However, in practice this can be time-consuming or otherwise impractical to +//! do. As such, we strive to provide quality documentation on the absolute and +//! relative strengths and weaknesses of each collection. +//! +//! When in doubt, we recommend first considering `Vec`, `RingBuf`, `HashMap`, and +//! `HashSet` for the task, as their performance is excellent both in theoretical +//! and practical terms. These collections are easily the most commonly used ones by +//! imperative programmers, and can often be acceptable even when they aren't the +//! *best* choice. Other collections fill important but potentially subtle niches, +//! and the importance of knowing when they are more or less appropriate cannot be +//! overstated. 
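The four recommended collections can be sketched in a few lines. A minimal tour, assuming today's `std::collections` names — `RingBuf` and `DList` from this era correspond to what later became `VecDeque` and `LinkedList`:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

fn main() {
    // Vec: a contiguous growable array -- the default sequence choice.
    let mut v: Vec<i32> = Vec::new();
    v.push(1);
    v.push(2);
    assert_eq!(v.len(), 2);

    // VecDeque (called RingBuf at the time of this patch): cheap
    // insertion and removal at both ends.
    let mut q: VecDeque<i32> = VecDeque::new();
    q.push_back(1);
    q.push_front(0);
    assert_eq!(q.front(), Some(&0));

    // HashMap: key -> value lookup in O(1) expected time.
    let mut m = HashMap::new();
    m.insert("answer", 42);
    assert_eq!(m.get("answer"), Some(&42));

    // HashSet: membership queries in O(1) expected time.
    let mut s = HashSet::new();
    s.insert(7);
    assert!(s.contains(&7));
}
```

These defaults cover the common imperative patterns (sequence, queue/deque, map, set); the rest of this discussion is about knowing when to deviate from them.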
+//! +//! # Measuring Performance +//! +//! The performance of a collection is a difficult thing to precisely capture. One +//! cannot simply perform an operation and measure how long it takes or how much +//! space is used, as the results will depend on details such as how it was +//! compiled, the hardware it's running on, the software managing its execution, and +//! the current state of the program. These precise details are independent of the +//! collection's implementation itself, and are far too diverse to exhaustively test +//! against. +//! +//! To avoid this issue, we use *asymptotic analysis* to measure how performance +//! *scales* with the size of the input (commonly denoted by `n`), or other more +//! exotic properties. This is an excellent first-order approximation of +//! performance, but has some drawbacks that we discuss below. +//! +//! ## Big-Oh Notation +//! +//! The most common tool in performing asymptotic analysis is *Big-Oh notation*. +//! Big-Oh notation is a way of expressing the relation between the growth rate of +//! two functions. Given two functions `f` and `g`, when we say `f(x)` *is* +//! `O(g(x))`, you can generally read this as "`f(x)` is *on the order of* `g(x)`". +//! Informally, we say `f(x)` is `O(g(x))` if `g(x)` grows at least as fast as +//! `f(x)`. In effect, `g(x)` is an upper-bound on how `f(x)` scales with `x`. This +//! scaling ignores constant factors, so `2x` is `O(x)`, even though `2x` grows +//! faster. +//! +//! This ignoring of constants is exactly what we want when discussing the +//! performance of collections, because the precise compilation and execution +//! details will generally only provide constant-factor speed ups. *In practice*, +//! these constant factors can be large and important, and should be part of the +//! collection selection process. However, Big-Oh notation provides a useful way to +//! quickly identify what a collection does well, and what a collection does poorly, +//! 
particularly in comparison to other collections. Note also that Big-Oh notation +//! is only interested in the *asymptotic* performance of the functions. For small +//! values of `x` the relationship between these two functions may be arbitrary. +//! +//! While the functions in Big-Oh notation can have arbitrary complexity, by +//! convention the function `g(x)` in `O(g(x))` should be written as simply as +//! possible, and is expected to be as tight as possible. For instance, `2x` is +//! `O(3x^2 + 5x - 2)`, but we would generally simplify the expression to only the +//! dominant factor, with constants stripped away. In this case, `x^2` grows the +//! fastest, and so we would simply say `2x` is `O(x^2)`. Similarly, although `2x` +//! *is* `O(x^2)`, this is needlessly weak. We would instead prefer to provide the +//! stronger bound `O(x)`. +//! +//! Several functions occur very often in Big-Oh notation, and so we note them here +//! for convenience: +//! +//! * `O(1)` - *Constant*: The performance of the operation is effectively +//! independent of context. This is usually *very* cheap. +//! +//! * `O(logn)` - *Logarithmic*: Performance scales with the logarithm of `n`. +//! This is usually cheap. +//! +//! * `O(n)` - *Linear*: Performance scales proportionally to `n`. +//! This is considered expensive, but tractable. +//! +//! * `O(nlogn)`: Performance scales a bit worse than linear. +//! Not to be done frequently if possible. +//! +//! * `O(n^2)` - *Quadratic*: Performance scales with the square of `n`. +//! This is considered very expensive, and is potentially catastrophic for large inputs. +//! +//! * `O(2^n)` - *Exponential*: Performance scales exponentially with `n`. +//! This is considered intractable for anything but very small inputs. +//! +//! ## Time Complexity +//! +//! The most common measure of performance is how long something takes. However, +//! even at the abstraction level of Big-Oh notation, this is not necessarily +//! straightforward. 
Time complexity is separated into several different +//! categories, to capture important distinctions. In the simplest case, an +//! operation *always* takes `O(g(x))` time to execute. However, we may also be +//! interested in the following measures of time: +//! +//! ### Worst-Case Time +//! +//! The amount of time an operation may take can vary greatly from input to input. +//! However, it is often possible to determine how much time is taken *in the worst- +//! case*. For some operations, the worst-case may be rare and very large. For other +//! operations, it may be the most common, with rare "fast" events. +//! +//! For instance, if an operation sometimes takes `O(1)`, `O(logn)`, or `O(n)` time, +//! we simply say it takes `O(n)` worst-case time, since `O(n)` is the largest. +//! +//! Worst-case analysis is often the easiest to perform, and is always applicable to +//! any operation. As such, it is the standard default measure of time complexity. +//! If time complexity is not qualified, it can be assumed to be worst-case. Having +//! a good worst-case time complexity is the most desirable, as it provides a strong +//! guarantee of reliable performance. However, sometimes the most efficient +//! operations in practice have poor worst-case times, due to rare degenerate +//! behaviors. +//! +//! `Vec`'s push operation usually takes `O(1)` time, but occasionally takes `O(n)` time, +//! and so takes `O(n)` worst-case time. +//! +//! ### Expected Time +//! +//! The running time of some operations may depend on a random or pseudo-random +//! process. In this case, expected time is used to capture how long the operation +//! takes *on average*. The operation may take much more or less time on any given +//! input, or even on different calls on the same input. +//! +//! For instance, if an operation takes `O(nlogn)` time *with high probability*, but +//! very rarely takes `O(n^2)` time, then the operation takes `O(nlogn)` expected +//! 
time, even though it has a worst-case time of `O(n^2)`. `QuickSort` is the +//! canonical randomized operation, with exactly this performance analysis. +//! +//! ### Amortized Time +//! +//! Some operations can have a very high worst-case cost, but over a *sequence* of +//! `m` operations the total cost can sometimes be guaranteed to not exceed some +//! smaller bound than `m` times the worst-case. +//! +//! For instance, `Vec`'s push operation almost always takes `O(1)` time, but after +//! (approximately) `n` operations, a single push may take `O(n)` time. By worst-case +//! analysis, all we can say of this situation is that a sequence of `m` pushes will +//! take `O(mn)` time. However, in reality we know that the sequence will only take +//! `O(m)` time, since the expensive `O(n)` operation can be *amortized* across the +//! many cheap operations that are *guaranteed* to occur before an expensive +//! operation. Therefore, we say that `Vec.push()` takes `O(1)` amortized time. +//! +//! ## Space Complexity +//! +//! Space complexity is less commonly discussed, but still an important +//! consideration. It can be used to measure either how much space a structure +//! consumes with respect to its contents, or how much additional space an operation +//! temporarily uses. Generally, a fast operation cannot use much space, because +//! time complexity is bounded below by space complexity. That is, it takes `O(n)` +//! time to even *allocate* `O(n)` memory, let alone use it productively. +//! +//! However, space consumption can be important in resource constrained +//! environments, or just when working on large datasets. An operation that takes +//! `O(n^2)` time on a large data set might be unfortunate, but consuming `O(n^2)` +//! extra space to do it, even if only temporary, might prove catastrophic. If the +//! extra space consumed is greater than `O(1)`, it is also likely allocated on the +//! heap, which is generally an expensive operation. 
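The amortized `push` behavior described above can be observed directly through `Vec::capacity`: reallocations become geometrically rarer as the vector grows. A small sketch (the exact growth sequence is an implementation detail, so the asserted bound is deliberately loose):

```rust
fn main() {
    let mut v: Vec<u32> = Vec::new();
    let mut reallocations = 0;
    let mut last_cap = v.capacity();

    for i in 0..1000 {
        v.push(i);
        // A push is only O(n) when it outgrows the current capacity and
        // must reallocate; capacity grows geometrically, so this is rare.
        if v.capacity() != last_cap {
            reallocations += 1;
            last_cap = v.capacity();
        }
    }

    // 1000 pushes trigger only on the order of log2(1000) reallocations,
    // which is why the cost amortizes to O(1) per push.
    assert!(reallocations >= 1 && reallocations <= 12);
    println!("{} pushes, {} reallocations", v.len(), reallocations);
}
```

Pre-sizing with `Vec::with_capacity` avoids even these rare reallocations when the final length is known up front.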
Knowing this can help give +//! context to otherwise abstract time complexities. +//! +//! Like time complexity, space complexity can be expressed in worst-case, expected, +//! or amortized terms. Unless otherwise stated, an operation can be assumed to use +//! only `O(1)` additional worst-case space, and a structure containing `n` items +//! can be assumed to have worst-case size `O(n)`. +//! +//! ## Problems with Big-Oh Notation +//! +//! Big-Oh notation is great for broad-strokes analysis of collections and +//! operations, but it can sometimes be misleading in practice. +//! +//! For instance, from a pure asymptotic analysis perspective, `RingBuf` appears to +//! be a strictly superior collection to `Vec`. `RingBuf` supports every operation +//! that `Vec` does in the "same" amount of time, while improving the performance of +//! some operations. However, in practice `Vec` will outperform `RingBuf` on many of +//! the operations they appear to be equally good at. This is because `RingBuf` +//! takes a small constant performance penalty to speed up its other operations. +//! This penalty is not reflected in asymptotic analysis, precisely *because* it is +//! a constant. +//! +//! Similarly, `DList` appears to be better than `Vec` at many operations, and even +//! provides strong *worst-case* guarantees on operations like `push`, where `Vec` +//! only provides strong *amortized* guarantees. However, in practice `Vec` is +//! expected to *substantially* outperform DList over any large sequence of +//! `push`es. +//! +//! Worse yet, it can sometimes be the case that for all practically sized inputs, +//! an operation that appears to be asymptotically slower than another may be faster +//! in practice, because the "hidden" constant of the theoretically fast operation +//! can be catastrophically large. +//! +//! For these reasons, we will generally strive to discuss practical performance +//! 
considerations *in addition to* providing the much more convenient and simple +//! asymptotic bounds for high level comparisons. If an operation on a collection +//! does not provide any asymptotic performance information, it should be considered +//! a bug. #![crate_name = "collections"] #![experimental] From 9931e95786924ced8e930fbc44fe5e98ef523146 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Tue, 5 Aug 2014 10:50:05 -0400 Subject: [PATCH 2/4] hyperlinking --- src/libcollections/lib.rs | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/src/libcollections/lib.rs b/src/libcollections/lib.rs index 08c65c8265be0..0dffad3c80e6a 100644 --- a/src/libcollections/lib.rs +++ b/src/libcollections/lib.rs @@ -19,9 +19,11 @@ //! do. As such, we strive to provide quality documentation on the absolute and //! relative strengths and weaknesses of each collection. //! -//! When in doubt, we recommend first considering `Vec`, `RingBuf`, `HashMap`, and -//! `HashSet` for the task, as their performance is excellent both in theoretical -//! and practical terms. These collections are easily the most commonly used ones by +//! When in doubt, we recommend first considering [`Vec`](../vec/struct.Vec.html), +//! [`RingBuf`](struct.RingBuf.html), [`HashMap`](hashmap/struct.HashMap.html), and +//! [`HashSet`](hashmap/struct.HashSet.html) for the task, as their performance is +//! excellent both in theoretical and practical terms. +//! These collections are easily the most commonly used ones by //! imperative programmers, and can often be acceptable even when they aren't the //! *best* choice. Other collections fill important but potentially subtle niches, //! and the importance of knowing when they are more or less appropriate cannot be @@ -120,7 +122,7 @@ //! operations in practice have poor worst-case times, due to rare degenerate //! behaviors. //! -//! `Vec`'s push operation usually takes `O(1)` time, but occasionally takes `O(n)` time, +//! 
`Vec`'s push operation usually takes `O(1)` time, but occasionally takes `O(n)` time, //! and so takes `O(n)` worst-case time. //! //! ### Expected Time @@ -185,7 +187,8 @@ //! This penalty is not reflected in asymptotic analysis, precisely *because* it is //! a constant. //! -//! Similarly, `DList` appears to be better than `Vec` at many operations, and even +//! Similarly, [`DList`](struct.DList.html) appears to be better than `Vec` +//! at many operations, and even //! provides strong *worst-case* guarantees on operations like `push`, where `Vec` //! only provides strong *amortized* guarantees. However, in practice `Vec` is //! expected to *substantially* outperform DList over any large sequence of From 1acb3399140e7dd9347b579c597a1a5b4447c9f5 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Wed, 6 Aug 2014 09:43:16 -0400 Subject: [PATCH 3/4] fixups --- src/libcollections/lib.rs | 34 +++++++++++++++++++++------------- 1 file changed, 21 insertions(+), 13 deletions(-) diff --git a/src/libcollections/lib.rs b/src/libcollections/lib.rs index 0dffad3c80e6a..4494ba7b1199b 100644 --- a/src/libcollections/lib.rs +++ b/src/libcollections/lib.rs @@ -8,6 +8,8 @@ // option. This file may not be copied, modified, or distributed // except according to those terms. +//! Collection types +//! //! Rust's standard collections library provides several structures for organizing //! and querying data. Choosing the right collection for the right job is a non- //! trivial and important part of writing any good program. While Rust strives to @@ -68,10 +70,14 @@ //! While the functions in Big-Oh notation can have arbitrary complexity, by //! convention the function `g(x)` in `O(g(x))` should be written as simply as //! possible, and is expected to be as tight as possible. For instance, `2x` is -//! `O(3x^2 + 5x - 2)`, but we would generally simplify the expression to only the -//! dominant factor, with constants stripped away. In this case, `x^2` grows the -//! 
fastest, and so we would simply say `2x` is `O(x^2)`. Similarly, although `2x` -//! *is* `O(x^2)`, this is needlessly weak. We would instead prefer to provide the +//! O(3x<sup>2</sup> + 5x - 2), +//! but we would generally simplify the expression to only the +//! dominant factor, with constants stripped away. In this case, +//! x<sup>2</sup> grows the +//! fastest, and so we would simply say `2x` is +//! O(x<sup>2</sup>). Similarly, although `2x` +//! *is* O(x<sup>2</sup>), this is needlessly weak. +//! We would instead prefer to provide the //! stronger bound `O(x)`. //! //! Several functions occur very often in Big-Oh notation, and so we note them here @@ -80,19 +86,19 @@ //! * `O(1)` - *Constant*: The performance of the operation is effectively //! independent of context. This is usually *very* cheap. //! -//! * `O(logn)` - *Logarithmic*: Performance scales with the logarithm of `n`. +//! * `O(log n)` - *Logarithmic*: Performance scales with the logarithm of `n`. //! This is usually cheap. //! //! * `O(n)` - *Linear*: Performance scales proportionally to `n`. //! This is considered expensive, but tractable. //! -//! * `O(nlogn)`: Performance scales a bit worse than linear. +//! * `O(n log n)`: Performance scales a bit worse than linear. //! Not to be done frequently if possible. //! -//! * `O(n^2)` - *Quadratic*: Performance scales with the square of `n`. +//! * O(n<sup>2</sup>) - *Quadratic*: Performance scales with the square of `n`. //! This is considered very expensive, and is potentially catastrophic for large inputs. //! -//! * `O(2^n)` - *Exponential*: Performance scales exponentially with `n`. +//! * O(2<sup>n</sup>) - *Exponential*: Performance scales exponentially with `n`. //! This is considered intractable for anything but very small inputs. //! //! ## Time Complexity @@ -111,7 +117,7 @@ //! case*. For some operations, the worst-case may be rare and very large. For other //! operations, it may be the most common, with rare "fast" events. //! -//! 
For instance, if an operation sometimes takes `O(1)`, `O(logn)`, or `O(n)` time, +//! For instance, if an operation sometimes takes `O(1)`, `O(log n)`, or `O(n)` time, //! we simply say it takes `O(n)` worst-case time, since `O(n)` is the largest. //! //! Worst-case analysis is often the easiest to perform, and is always applicable to @@ -132,9 +138,10 @@ //! takes *on average*. The operation may take much more or less time on any given //! input, or even on different calls on the same input. //! -//! For instance, if an operation takes `O(nlogn)` time *with high probability*, but -//! very rarely takes `O(n^2)` time, then the operation takes `O(nlogn)` expected -//! time, even though it has a worst-case time of `O(n^2)`. `QuickSort` is the +//! For instance, if an operation takes `O(n log n)` time *with high probability*, but -//! very rarely takes O(n<sup>2</sup>) time, +//! then the operation takes `O(n log n)` expected time, even though it has a worst-case time +//! of O(n<sup>2</sup>). `QuickSort` is the //! canonical randomized operation, with exactly this performance analysis. //! //! ### Amortized Time @@ -162,7 +169,8 @@ //! //! However, space consumption can be important in resource constrained //! environments, or just when working on large datasets. An operation that takes -//! `O(n^2)` time on a large data set might be unfortunate, but consuming `O(n^2)` +//! O(n<sup>2</sup>) time on a large dataset might be unfortunate, +//! but consuming O(n<sup>2</sup>) //! extra space to do it, even if only temporary, might prove catastrophic. If the //! extra space consumed is greater than `O(1)`, it is also likely allocated on the //! heap, which is generally an expensive operation. 
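The distinction between `O(1)` and `O(n)` *additional* space discussed here can be made concrete with slices: reversing in place allocates nothing, while collecting a reversed copy allocates a second buffer. A minimal sketch using current `std` APIs:

```rust
fn main() {
    let mut data: Vec<i32> = (0..5).collect();

    // O(1) additional space: reverse the buffer in place, no allocation.
    data.reverse();
    assert_eq!(data, vec![4, 3, 2, 1, 0]);

    // O(n) additional space: build a second, reversed copy on the heap.
    let copy: Vec<i32> = data.iter().rev().cloned().collect();
    assert_eq!(copy, vec![0, 1, 2, 3, 4]);
}
```

For a five-element vector the difference is invisible; for a multi-gigabyte dataset the copying version can fail outright, which is the catastrophe the text warns about.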
Knowing this can help give From 90195defc3aebff5f97319ff93fa50bb0e3e4c26 Mon Sep 17 00:00:00 2001 From: Alexis Beingessner Date: Wed, 13 Aug 2014 08:00:21 -0400 Subject: [PATCH 4/4] simplify discussion --- src/libcollections/lib.rs | 179 +++++++------------------------------- 1 file changed, 32 insertions(+), 147 deletions(-) diff --git a/src/libcollections/lib.rs b/src/libcollections/lib.rs index 4494ba7b1199b..bfccd76a7dd6b 100644 --- a/src/libcollections/lib.rs +++ b/src/libcollections/lib.rs @@ -12,9 +12,9 @@ //! //! Rust's standard collections library provides several structures for organizing //! and querying data. Choosing the right collection for the right job is a non- -//! trivial and important part of writing any good program. While Rust strives to -//! provide efficient and easy to use collections for common use-cases, given *only* -//! a list of the functions a collection provides it can be difficult to determine +//! trivial and important part of writing any good program. While Rust +//! provides efficient and easy to use collections for common use-cases, given *only* +//! a list of the operations a collection provides it can be difficult to determine //! the best choice. When in doubt, running tests on your actual code with your //! actual data will always be the best way to identify the best collection for the //! job. However, in practice this can be time-consuming or otherwise impractical to @@ -31,7 +31,7 @@ //! and the importance of knowing when they are more or less appropriate cannot be //! overstated. //! -//! # Measuring Performance +//! ## Terminology and Notation //! //! The performance of a collection is a difficult thing to precisely capture. One //! cannot simply perform an operation and measure how long it takes or how much //! space is used, as the results will depend on details such as how it was //! compiled, the hardware it's running on, the software managing its execution, and //! the current state of the program. These precise details are independent of the //! 
collection's implementation itself, and are far too diverse to exhaustively test -//! against. -//! -//! To avoid this issue, we use *asymptotic analysis* to measure how performance -//! *scales* with the size of the input (commonly denoted by `n`), or other more -//! exotic properties. This is an excellent first-order approximation of -//! performance, but has some drawbacks that we discuss below. -//! -//! ## Big-Oh Notation -//! -//! The most common tool in performing asymptotic analysis is *Big-Oh notation*. -//! Big-Oh notation is a way of expressing the relation between the growth rate of -//! two functions. Given two functions `f` and `g`, when we say `f(x)` *is* -//! `O(g(x))`, you can generally read this as "`f(x)` is *on the order of* `g(x)`". -//! Informally, we say `f(x)` is `O(g(x))` if `g(x)` grows at least as fast as -//! `f(x)`. In effect, `g(x)` is an upper-bound on how `f(x)` scales with `x`. This -//! scaling ignores constant factors, so `2x` is `O(x)`, even though `2x` grows -//! faster. -//! -//! This ignoring of constants is exactly what we want when discussing the -//! performance of collections, because the precise compilation and execution -//! details will generally only provide constant-factor speed ups. *In practice*, -//! these constant factors can be large and important, and should be part of the -//! collection selection process. However, Big-Oh notation provides a useful way to -//! quickly identify what a collection does well, and what a collection does poorly, -//! particularly in comparison to other collections. Note also that Big-Oh notation -//! is only interested in the *asymptotic* performance of the functions. For small -//! values of `x` the relationship between these two functions may be arbitrary. -//! -//! While the functions in Big-Oh notation can have arbitrary complexity, by -//! convention the function `g(x)` in `O(g(x))` should be written as simply as -//! possible, and is expected to be as tight as possible. 
For instance, `2x` is -//! O(3x<sup>2</sup> + 5x - 2), -//! but we would generally simplify the expression to only the -//! dominant factor, with constants stripped away. In this case, -//! x<sup>2</sup> grows the -//! fastest, and so we would simply say `2x` is -//! O(x<sup>2</sup>). Similarly, although `2x` -//! *is* O(x<sup>2</sup>), this is needlessly weak. -//! We would instead prefer to provide the -//! stronger bound `O(x)`. +//! against. To abstract these issues away, we use Big-Oh notation, which, roughly +//! speaking, expresses how performance scales with input size. //! //! Several functions occur very often in Big-Oh notation, and so we note them here -//! for convenience: +//! for convenience. Generally, we will denote the size of the input or number of +//! elements in the collection as `n`: //! //! * `O(1)` - *Constant*: The performance of the operation is effectively //! independent of context. This is usually *very* cheap. @@ -101,117 +64,39 @@ //! * O(2<sup>n</sup>) - *Exponential*: Performance scales exponentially with `n`. //! This is considered intractable for anything but very small inputs. //! -//! ## Time Complexity -//! -//! The most common measure of performance is how long something takes. However, -//! even at the abstraction level of Big-Oh notation, this is not necessarily -//! straightforward. Time complexity is separated into several different -//! categories, to capture important distinctions. In the simplest case, an -//! operation *always* takes `O(g(x))` time to execute. However, we may also be -//! interested in the following measures of time: -//! -//! ### Worst-Case Time -//! -//! The amount of time an operation may take can vary greatly from input to input. -//! However, it is often possible to determine how much time is taken *in the worst- -//! case*. For some operations, the worst-case may be rare and very large. For other -//! operations, it may be the most common, with rare "fast" events. -//! -//! 
For instance, if an operation sometimes takes `O(1)`, `O(log n)`, or `O(n)` time, -//! we simply say it takes `O(n)` worst-case time, since `O(n)` is the largest. -//! -//! Worst-case analysis is often the easiest to perform, and is always applicable to -//! any operation. As such, it is the standard default measure of time complexity. -//! If time complexity is not qualified, it can be assumed to be worst-case. Having -//! a good worst-case time complexity is the most desirable, as it provides a strong -//! guarantee of reliable performance. However, sometimes the most efficient -//! operations in practice have poor worst-case times, due to rare degenerate -//! behaviors. -//! -//! `Vec`'s push operation usually takes `O(1)` time, but occasionally takes `O(n)` time, -//! and so takes `O(n)` worst-case time. -//! -//! ### Expected Time -//! -//! The running time of some operations may depend on a random or pseudo-random -//! process. In this case, expected time is used to capture how long the operation -//! takes *on average*. The operation may take much more or less time on any given -//! input, or even on different calls on the same input. -//! -//! For instance, if an operation takes `O(n log n)` time *with high probability*, but -//! very rarely takes O(n<sup>2</sup>) time, -//! then the operation takes `O(n log n)` expected time, even though it has a worst-case time -//! of O(n<sup>2</sup>). `QuickSort` is the -//! canonical randomized operation, with exactly this performance analysis. -//! -//! ### Amortized Time -//! -//! Some operations can have a very high worst-case cost, but over a *sequence* of -//! `m` operations the total cost can sometimes be guaranteed to not exceed some -//! smaller bound than `m` times the worst-case. -//! -//! For instance, `Vec`'s push operation almost always takes `O(1)` time, but after -//! (approximately) `n` operations, a single push may take `O(n)` time. By worst-case -//! 
analysis, all we can say of this situation is that a sequence of `m` pushes will -//! take `O(mn)` time. However, in reality we know that the sequence will only take -//! `O(m)` time, since the expensive `O(n)` operation can be *amortized* across the -//! many cheap operations that are *guaranteed* to occur before an expensive -//! operation. Therefore, we say that `Vec.push()` takes `O(1)` amortized time. -//! -//! ## Space Complexity -//! -//! Space complexity is less commonly discussed, but still an important -//! consideration. It can be used to measure either how much space a structure -//! consumes with respect to its contents, or how much additional space an operation -//! temporarily uses. Generally, a fast operation cannot use much space, because -//! time complexity is bounded below by space complexity. That is, it takes `O(n)` -//! time to even *allocate* `O(n)` memory, let alone use it productively. -//! -//! However, space consumption can be important in resource constrained -//! environments, or just when working on large datasets. An operation that takes -//! O(n<sup>2</sup>) time on a large dataset might be unfortunate, -//! but consuming O(n<sup>2</sup>) -//! extra space to do it, even if only temporary, might prove catastrophic. If the -//! extra space consumed is greater than `O(1)`, it is also likely allocated on the -//! heap, which is generally an expensive operation. Knowing this can help give -//! context to otherwise abstract time complexities. +//! In addition, performance may be one of the following: //! -//! Like time complexity, space complexity can be expressed in worst-case, expected, -//! or amortized terms. Unless otherwise stated, an operation can be assumed to use -//! only `O(1)` additional worst-case space, and a structure containing `n` items -//! can be assumed to have worst-case size `O(n)`. +//! * Worst-Case: This is the worst possible behavior of the operation. For some operations, this +//! may be common or uncommon. 
If performance is unqualified, it is a worst-case bound. //! -//! ## Problems with Big-Oh Notation +//! * Expected: Performance depends internally on a randomized process, but this performance +//! is expected *on average*. Usually this occurs with high probability, and can be relied upon, +//! but operations with expected performance may be inappropriate for real-time or otherwise +//! resource-constrained applications. //! -//! Big-Oh notation is great for broad-strokes analysis of collections and -//! operations, but it can sometimes be misleading in practice. +//! * Amortized: Performance depends on the internal state of the structure, but over a +//! sufficiently long sequence of operations, cost per-operation averages out to this. This is +//! deterministically guaranteed, but the occasional high-cost operation may make these operations +//! inappropriate for real-time or otherwise resource-constrained applications. //! -//! For instance, from a pure asymptotic analysis perspective, `RingBuf` appears to -//! be a strictly superior collection to `Vec`. `RingBuf` supports every operation -//! that `Vec` does in the "same" amount of time, while improving the performance of -//! some operations. However, in practice `Vec` will outperform `RingBuf` on many of -//! the operations they appear to be equally good at. This is because `RingBuf` -//! takes a small constant performance penalty to speed up its other operations. -//! This penalty is not reflected in asymptotic analysis, precisely *because* it is -//! a constant. +//! ## Time vs Space //! -//! Similarly, [`DList`](struct.DList.html) appears to be better than `Vec` -//! at many operations, and even -//! provides strong *worst-case* guarantees on operations like `push`, where `Vec` -//! only provides strong *amortized* guarantees. However, in practice `Vec` is -//! expected to *substantially* outperform DList over any large sequence of -//! `push`es. +//! 
Usually, we are only interested in performance in terms of time taken to perform the operation. +//! As such, any unqualified discussion of performance should be assumed to be in terms of +//! time taken. However, performance may also occasionally be in terms of memory consumed. +//! Conveniently, a collection on `n` elements almost always simply occupies `O(n)` space, and +//! operations often only take `O(1)` additional memory. Therefore, space concerns are usually +//! excluded from analysis, and these bounds on memory usage can be assumed in that case. //! -//! Worse yet, it can sometimes be the case that for all practically sized inputs, -//! an operation that appears to be asymptotically slower than another may be faster -//! in practice, because the "hidden" constant of the theoretically fast operation -//! can be catastrophically large. +//! Note that while well-defined, Big-Oh notation is often imprecise from a practical perspective. +//! It should be used for broad-strokes comparison and evaluation of operations and collections. +//! One `O(1)` may be better than another in practice. Similarly, operations with +//! good amortized or expected performance often out-perform similar operations with worst-case +//! guarantees under sufficiently active usage patterns. //! //! For these reasons, we will generally strive to discuss practical performance //! considerations *in addition to* providing the much more convenient and simple -//! asymptotic bounds for high level comparisons. If an operation on a collection -//! does not provide any asymptotic performance information, it should be considered -//! a bug. +//! Big-Oh notation for high level comparisons. #![crate_name = "collections"] #![experimental]
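As a closing, concrete illustration of the asymptotic gaps this documentation relies on, the sketch below counts comparisons performed by an `O(n)` linear scan versus an `O(log n)` binary search; the helper functions are hypothetical, written here purely for illustration and not part of the library:

```rust
// Count comparisons performed by a linear scan for `needle`.
fn linear_comparisons(haystack: &[u32], needle: u32) -> usize {
    let mut count = 0;
    for &x in haystack {
        count += 1;
        if x == needle {
            break;
        }
    }
    count
}

// Count comparisons performed by a binary search over sorted input.
fn binary_comparisons(haystack: &[u32], needle: u32) -> usize {
    let (mut lo, mut hi, mut count) = (0, haystack.len(), 0);
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        count += 1;
        if haystack[mid] == needle {
            return count;
        } else if haystack[mid] < needle {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    count
}

fn main() {
    let sorted: Vec<u32> = (0..1024).collect();
    // Searching for the last element: O(n) versus O(log n) in action.
    let linear = linear_comparisons(&sorted, 1023);
    let binary = binary_comparisons(&sorted, 1023);
    assert_eq!(linear, 1024);
    assert!(binary <= 11);
    println!("linear: {} comparisons, binary: {} comparisons", linear, binary);
}
```

At `n = 1024` the gap is roughly 1024 versus 10 comparisons, and it widens as `n` grows — which is exactly the kind of broad-strokes conclusion Big-Oh notation is good for, constant factors notwithstanding.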