Skip to content

Latest commit

 

History

History
317 lines (213 loc) · 18.5 KB

File metadata and controls

317 lines (213 loc) · 18.5 KB

C++ Containers and Iterators

Text extracted from book: “C++ 17 STL Cookbook”

Summary

Types of containers

The standard containers implement structures that are commonly used in our programs, such as:

  • dynamic arrays
  • queues
  • stacks
  • linked lists
  • trees
  • associative sets

The C++ container library categorizes containers into four types:

Sequence containers

Sequence containers are used for data structures that store objects of the same type in a linear manner. The STL SequenceContainer types are:

  • array provides a static contiguous array
  • vector provides a dynamic contiguous array
  • forward_list provides a singly-linked list
  • list provides a doubly-linked list
  • deque provides a double-ended queue, where elements can be added to the front or back of the queue.

While std::string is not included in most container lists, it does in fact meet the requirements of a SequenceContainer.

Sequence container adapters

Container adapters are a special type of container class. They are not full container classes on their own, but wrappers around other container types (such as a vector, deque, or list). These container adapters encapsulate the underlying container type and limit the user interfaces accordingly. Let’s consider std::stack. std::stack is a container that enforces a LIFO-type data structure. Here’s the declaration of std::stack: The standard container adapters are:

  • stack provides an LIFO data structure, stack can be implemented on top of vector, deque or list
  • queue provides a FIFO data structure, queue can be implemented on top of deque or list
  • priority_queue provides a priority queue, which allows for constant-time lookup of the largest element (by default), priority_queue can be implemented on top of deque or vector

Associative containers

Associative containers provide sorted data structures that provide a fast lookup (O(log n) time) using keys. The STL AssociativeContainer types are can be divided in two ways: containers which require unique keys, and those which allow multiple entries using the same key. #Keys are unique

  • set is a collection of unique keys, sorted by keys
  • map is a collection of key-value pairs, sorted by keys

#Multiple entries for the same key are permitted

  • multiset is a collection of keys, sorted by keys
  • multimap is a collection of key-value pairs, sorted by keys
  • Note that set and map are typically implemented using red-black trees.

Unordered associative containers

Unordered associative containers provide unsorted data structures that can be accessed using a hash. Access times are O(n) in the worst-case, but much faster than linear time for most operations. For all STL UnorderedAssociativeContainer types, a hashed key is used to access the data. Similar to the AssociativeContainer, the standard types are split between those that require unique keys, and those that do not: Keys are unique

  • unordered set is a collection of keys, hashed by keys
  • unordered_map is a collection of key-value pairs, hashed by keys

Multiple entires for the same key are permitted

  • unordered_multiset is a collection of keys, hashed by keys
  • unordered_multimap is a collection of key-value pairs, hashed by keys

Hashtables

C++ doesn’t have natively a container called “hash” they are called unorder_map or unorder_set, under the wood they are hastables

  • unordered_set
  • unordered_multiset
  • unordered_map
  • unordered_multimap

Standard Container Thread Safety

I want to share with you some relevant points regarding container thread safety, as many embedded systems will likely require sharing objects across threads. All container functions are safe to be called concurrently on different objects of the same container type. For operating on objects of the same type, the following guidelines are helpful to keep in mind:

  • All const member functions can be called concurrently by different threads
  • The following member functions are thread-safe: begin(), end(), rbegin(), rend(), front(), back(), data(), find(), lower_bound(), upper_bound(), equal_range(), at()
  • Operator [] is thread-safe for sequence containers, unordered associative containers, and container adapters.
  • Operator [] is NOT thread-safe for associative containers!
  • Different elements in the same container can be modified concurrently, except for the elements of std::vector<bool>.
  • Iterator operations read from a container, but do not modify it, so they are thread-safe
  • Container operations that invalidate iterators are NOT thread-safe, as they modify the container
  • In general, common-sense concurrency rules apply. If you are modifying the container in multiple threads, you will need to protect that container to prevent concurrent access.

Iterators

Text from book “C++ 17 STL Cookbook”

Types of iterators

Input iterator Input iterators can be dereferenced only for reading the values they point to. Once they are incremented, the last value they pointed to has been invalidated during the incrementation. This means that it is not possible to iterate over such a range multiple times. The std::istream_iterator is an example for this category.

source code from: http://www.cplusplus.com/reference/iterator/istream_iterator/

// istream_iterator example
#include <iostream>     // std::cin, std::cout
#include <iterator>     // std::istream_iterator

int main ()
{
	 double value1, value2;
 	 std::cout << "Please, insert two values: ";

  	std::istream_iterator<double> eos;              // end-of-stream iterator
  	std::istream_iterator<double> iit (std::cin);   // stdin iterator

  	if (iit!=eos)
	{ 
		value1=*iit;
	}
 	++iit;
  	
	if (iit!=eos)
	{ 
		value2=*iit;
	}
  	std::cout << value1 << "*" << value2 << "=" << (value1*value2) << '\n';

  	return 0;
}

Forward iterator Forward iterators are the same as input iterators,but they differ in that regard that the ranges they represent can be iterated over multiple times. The  std::forward_list iterators are an example of that. Such a list can only be iterated over forward , not backward, but it can be iterated over as often as we like to.

Bidirectional iterator The bidirectional iterator, as the name suggests, can be incremented and decremented, in order to iterate forward or backward. The iterators of std::list , std::set , and std::map , for example, support that.

Random access iterator Random access iterators allow jumping over multiple values at once, instead of single-stepping. This is the case for iterators of std::vector and std::deque.

Contiguous iterator This category specifies all of the aforementioned requirements, plus the requirement that the data that is being iterated through lies in contiguous memory, like it does in an array, or std::vector.

Output iterator Output iterators are detached from the other categories. This is because an iterator can be a pure output iterator, which can only be incremented and used to write to the data it points to. If they are being read from, the value will be undefined.

Mutable iterator If an iterator is an output iterator and one of the other categories at the same time, it is a mutable iterator. It can be read from and written to. If we obtain an iterator from a non-const container instance, it will usually be of this kind.

Resume

Iterator CategoryAbilityProviders
Input iteratorReads forwardistream
Output iteratorWrites forwardostream, inserter
Forward iteratorReads/writes forwardforward_list, unordered_[multi]set, unordered_[multi]map
Bidirectional it.Reads/writes forward/backwardlist, [multi]set, [multi]map
Random access it.Reads/writes with random accessvector, deque string, array

from source: https://stackoverflow.com/questions/5211914/types-of-iterator-output-vs-input-vs-forward-vs-random-access-iterator

Iterators Adapters

std::back_insert_iterator The back_insert_iterator can be wrapped around std::vector , std::deque , std::list , and so on. It will call the container’s push_back method, which inserts the new item past the existing items. If the container instance is not large enough, it will be grown automatically.

std::front_insert_iterator The front_insert_iterator does exactly the same thing as back_insert_iterator , but it calls the container’s push_front method, which inserts the new item before all the existing items. Note that for a container like std::vector , this means that all the existing items need to be moved one slot further in order to leave space for the new item at the front.

std::insert_iterator This iterator adapter is similar to the other inserters, but is able to insert new items between existing ones. The std::inserter helper function which constructs such a wrapper takes two arguments. The first argument is the container and the second argument is an iterator that points to the position where new items shall be inserted.

std::istream_iterator The istream_iterator is another very handy adapter. It can be used with any std::istream object (which can be the standard input or files for example and will try to parse the input from that stream object according to the template parameter it was instantiated with. In this advance how long the stream is. That leaves the question, where will the end iterator point to if we do not know where the stream’s end is? The way this works is that the iterator knows when it reaches the end of the stream. When it is compared to the end iterator, it will effectively not really compare itself with the end iterator but return if the stream has any tokens left . That’s why the end iterator constructor does not take any arguments.

std::ostream_iterator The ostream_iterator is the same thing as the istream_iterator , but it works the other way around: It doesn’t take tokens from an input stream–it pushes tokens into an output stream. Another difference to istream_iterator is that its constructor takes a second argument, which is a string that shall be pushed into the output stream after each item. That is useful because this way we can print a separating “,” or a new line after each item.

When to use it ?

Nice Flowchart !!!

../imgs/stl.png

https://stackoverflow.com/questions/471432/in-which-scenario-do-i-use-a-particular-stl-container

Complexity Big-Oh

What are the complexity guarantees of the standard containers?

STL Container Performance

Scott Meyers

Look at Effective STL by Scott Meyers. It’s good at explaining how to use the STL.

If you want to store a determined/undetermined number of objects and you’re never going to delete any, then a vector is what you want. It’s the default replacement for a C array, and it works like one, but doesn’t overflow. You can set its size beforehand as well with reserve().

If you want to store an undetermined number of objects, but you’ll be adding them and deleting them, then you probably want a list…because you can delete an element without moving any following elements - unlike vector. It takes more memory than a vector, though, and you can’t sequentially access an element.

If you want to take a bunch of elements and find only the unique values of those elements, reading them all into a set will do it, and it will sort them for you as well.

If you have a lot of key-value pairs, and you want to sort them by key, then a map is useful…but it will only hold one value per key. If you need more than one value per key, you could have a vector/list as your value in the map, or use a multimap.

It’s not in the STL, but it is in the TR1 update to the STL: if you have a lot of key-value pairs that you’re going to look up by key, and you don’t care about their order, you might want to use a hash - which is tr1::unordered_map. I’ve used it with Visual C++ 7.1, where it was called stdext::hash_map. It has a lookup of O(1) instead of a lookup of O(log n) for map.

Stack overflow

It all depends on what you want to store and what you want to do with the container. Here are some (very non-exhaustive) examples for the container classes that I tend to use most: vector: Compact layout with little or no memory overhead per contained object. Efficient to iterate over. Append, insert and erase can be expensive, particularly for complex objects. Cheap to find a contained object by index, e.g. myVector[10]. Use where you would have used an array in C. Good where you have a lot of simple objects (e.g. int). Don’t forget to use reserve() before adding a lot of objects to the container. list: Small memory overhead per contained object. Efficient to iterate over. Append, insert and erase are cheap. Use where you would have used a linked list in C. set (and multiset): Significant memory overhead per contained object. Use where you need to find out quickly if that container contains a given object, or merge containers efficiently. map (and multimap): Significant memory overhead per contained object. Use where you want to store key-value pairs and look up values by key quickly. The flow chart on the cheat sheet suggested by zdan provides a more exhaustive guide.

SGI

http://www.sgi.com/tech/stl/complexity.html http://www.sgi.com/tech/stl/table_of_contents.html

Fundamentally, it is difficult to define the notion of asymptotic algorithm complexity precisely for real computer hardware instead of an abstract machine model. Thus we settle for the following guidelines:

For an algorithm A to have running time O(f(n)), there must be a corresponding algorithm A’ that is correct on machines with arbitrarily long pointer and size_t types, such that A and A’ perform essentially the same sequence of operations on the actual hardware. (In simple cases A and A’ will be the same. In other cases A may have been simplified with the knowledge that adresses are bounded.) For inputs of sufficiently large size n, A’ must take at most time Cf(n), where C is a constant, independent of both n and the address size. (Pointer, size_t, and ptrdiff_t operations are presumed to take constant time independent of their size.) All container or iterator complexity specifications refer to amortized complexity. An individual operation may take longer than specified. But any sufficiently long sequence of operations on the same container or iterator will take at most as long as the corresponding sum of the specified operation costs. Algorithms specify either worst-case or average case performance, and identify which. Unless otherwise stated, averages assume that container elements are chosen from a finite type with more possible values than the size of the container, and that container elements are independently uniformly distributed. A complexity specification for an operation f assumes that operations invoked by f require at most the specified runtime. But algorithms generally remain appropriate if the invoked operations are no more than a logarithmic factor slower than specified in the expected case. If operations are more expensive than assumed by a function F in the current STL, then F will slow down at most in proportion to the added cost. Any future operations that fail to satisfy this property will make that explicit.

To make this precise, assume F is specified to use time f(m) for input of size m. F uses operations Gk, with specified running times gk(n) on input size n. If F is used in a context in which each Gk is slower than expected by at most a factor h(n), then F slows down by at most a factor h(m). This holds because none of the current algorithms ever apply the operations Gk to inputs significantly larger than m.

http://www.cs.northwestern.edu/~riesbeck/programming/c++/stl-summary.html

http://landenlabs.com/code/perf-stl/perf-stl.html

https://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html

General Rules of Thumb

Use sequential containers when you need to access elements by position

  • Use std:vector as your default sequential container, especially as an alternative to built-in arrays
  • If size is known in advance, use std::array instead of a built-in array
  • If you add or remove elements frequently at both the front and back of a container, use std::deque
  • Use a std::list (not std::deque) if you need to insert/remove elements in the middle of the sequence
  • Do not use std::list if you need random access to objects
  • Prefer std::vector over std::list if your system uses a cache
  • std::string is almost always better than a C-string

Use associative containers when you need to access elements by key

  • For key/value pair, default to std::unordered_map, or if element order matters, std::map
  • If you need multiple entries for the same key, use std::unordered_multimap, or if element order matters, std::multimap

Memory allocation may also be a factor in your decision. Here are the general rules of thumb for how the different sequential containers are storing memory:

  • std:vector, std::array, and std::string store memory contiguously and are compatible with C-style APIs
  • std::deque allocates memory in chunks
  • std::list allocates memory by node

Important Notes

Vector should be used by default as the (sequence) container:

  • It is more (space and time) efficient than other STL containers.
  • It is more convenient and safer than primitive array.
  • automatic memory management
  • rich interface

Associative containers properties

The underlying data structure is a balanced search tree:

  • logarithmic access time
  • requires order comparisons of keys
  • iteration in key order
  • Iterators, pointers and references stay valid until the pointed to element is removed.

Cheat Sheet

C++ Containers Cheat Sheet

C++ Iterators & Algorithms Cheat Sheet

Links and References

https://www.cs.helsinki.fi/u/tpkarkka/alglib/k06/lectures/containers.html

https://arne-mertz.de/2018/05/modern-c-features-stdvariant-and-stdvisit/