This is the documentation for the PGM‑index C++ library.
| Members | Descriptions |
|---|---|
class pgm::PGMIndex |
A space-efficient index that enables fast search operations on a sorted sequence of n numbers. |
class pgm::DynamicPGMIndex |
A sorted associative container that contains key-value pairs with unique keys. |
class pgm::CompressedPGMIndex |
A variant of PGMIndex that uses compression on the segments to reduce the space of the index. |
class pgm::OneLevelPGMIndex |
A variant of PGMIndex that does not build a recursive structure but uses a binary search on the segments. |
class pgm::BucketingPGMIndex |
A simple variant of OneLevelPGMIndex that builds a top-level lookup table to speed up the search on the segments. |
class pgm::EliasFanoPGMIndex |
A variant of OneLevelPGMIndex that builds a top-level succinct structure to speed up the search on the segments. |
class pgm::MappedPGMIndex |
A disk-backed container storing a sorted sequence of numbers and a PGMIndex for fast search operations. |
class pgm::MultidimensionalPGMIndex |
A multidimensional container that uses a PGMIndex for fast orthogonal range queries. |
struct pgm::ApproxPos |
A struct that stores the result of a query to a PGMIndex, that is, a range [lo, hi) centered around an approximate position pos of the sought key. |
pgm::PGMIndexA space-efficient index that enables fast search operations on a sorted sequence of n numbers.
A search returns a struct ApproxPos containing an approximate position of the sought key in the sequence and the bounds of a range of size 2*Epsilon+1 where the sought key is guaranteed to be found if present. If the key is not present, the range is guaranteed to contain a key that is not less than (i.e. greater or equal to) the sought key, or n if no such key is found. In the case of repeated keys, the index finds the position of the first occurrence of a key.
The Epsilon template parameter should be set according to the desired space-time trade-off. A smaller value makes the estimation more precise and the range smaller but at the cost of increased space usage.
Internally the index uses a succinct piecewise linear mapping from keys to their position in the sorted order. This mapping is represented as a sequence of linear models (segments) which, if EpsilonRecursive is not zero, are themselves recursively indexed by other piecewise linear mappings.
Parameters:
K the type of the indexed keys
Epsilon controls the size of the returned search range
EpsilonRecursive controls the size of the search range in the internal structure
Floating the floating-point type to use for slopes
| Members | Descriptions |
|---|---|
public PGMIndex() = default |
Constructs an empty index. |
public explicit PGMIndex(const std::vector<K> & data) |
Constructs the index on the given sorted vector. |
public template<> inline PGMIndex(RandomIt first,RandomIt last) |
Constructs the index on the sorted keys in the range [first, last). |
public ApproxPos search(const K & key) const |
Returns the approximate position and the range where key can be found. |
public size_t segments_count() const |
Returns the number of segments in the last level of the index. |
public size_t height() const |
Returns the number of levels of the index. |
public size_t size_in_bytes() const |
Returns the size of the index in bytes. |
public PGMIndex() = defaultConstructs an empty index.
public explicit PGMIndex(const std::vector<K> & data)Constructs the index on the given sorted vector.
Parameters:
data the vector of keys to be indexed, must be sortedpublic template<> inline PGMIndex(RandomIt first,RandomIt last)Constructs the index on the sorted keys in the range [first, last).
Parameters:
first last the range containing the sorted keys to be indexedpublic ApproxPos search(const K & key) constReturns the approximate position and the range where key can be found.
Parameters:
key the value of the element to search forReturns: a struct with the approximate position and bounds of the range
public size_t segments_count() constReturns the number of segments in the last level of the index.
Returns: the number of segments
public size_t height() constReturns the number of levels of the index.
Returns: the number of levels of the index
public size_t size_in_bytes() constReturns the size of the index in bytes.
Returns: the size of the index in bytes
pgm::DynamicPGMIndexA sorted associative container that contains key-value pairs with unique keys.
Parameters:
K the type of a key
V the type of a value
PGMType the type of PGMIndex to use in the container
MinIndexedLevel the minimum level (of size 2^MinIndexedLevel) on which a PGMType index is constructed
| Members | Descriptions |
|---|---|
public DynamicPGMIndex() |
Constructs an empty container. |
public template<> inline DynamicPGMIndex(Iterator first,Iterator last) |
Constructs the container on the sorted data in the range [first, last). |
public void insert_or_assign(const K & key,const V & value) |
Inserts an element into the container if key does not exists in the container. If key already exists, the corresponding value is updated with value. |
public void erase(const K & key) |
Removes the specified element from the container. |
public iterator find(const K & key) const |
Finds an element with key equivalent to key. |
public iterator lower_bound(const K & key) const |
Returns an iterator pointing to the first element that is not less than (i.e. greater or equal to) key. |
public bool empty() const |
Checks if the container has no elements, i.e. whether begin() == end(). |
public iterator begin() const |
Returns an iterator to the beginning. |
public iterator end() const |
Returns an iterator to the end. |
public size_t count(const K & key) const |
Returns the number of elements with key that compares equal to the specified argument key, which is either 1 or 0 since this container does not allow duplicates. |
public size_t size() const |
Returns the number of elements in the container. |
public size_t size_in_bytes() const |
Returns the size of the container in bytes. |
public size_t index_size_in_bytes() const |
Returns the size of the index used in this container in bytes. |
typedef key_type |
|
typedef mapped_type |
|
typedef value_type |
|
typedef size_type |
|
typedef iterator |
public DynamicPGMIndex()Constructs an empty container.
public template<> inline DynamicPGMIndex(Iterator first,Iterator last)Constructs the container on the sorted data in the range [first, last).
Parameters:
IteratorParameters:
first last the range containing the sorted elements to be indexedpublic void insert_or_assign(const K & key,const V & value)Inserts an element into the container if key does not exists in the container. If key already exists, the corresponding value is updated with value.
Parameters:
key element key to insert or update
value element value to insert
public void erase(const K & key)Removes the specified element from the container.
Parameters:
key key value of the element to removepublic iterator find(const K & key) constFinds an element with key equivalent to key.
Parameters:
key key value of the element to search forReturns:
an iterator to an element with key equivalent to key. If no such element is found, end() is returned
public iterator lower_bound(const K & key) constReturns an iterator pointing to the first element that is not less than (i.e. greater or equal to) key.
Parameters:
key key value to compare the elements toReturns:
an iterator to an element with key not less than key. If no such element is found, end() is returned
public bool empty() constChecks if the container has no elements, i.e. whether begin() == end().
Returns: true if the container is empty, false otherwise
public iterator begin() constReturns an iterator to the beginning.
Returns: an iterator to the beginning
public iterator end() constReturns an iterator to the end.
Returns: an iterator to the end
public size_t count(const K & key) constReturns the number of elements with key that compares equal to the specified argument key, which is either 1 or 0 since this container does not allow duplicates.
Parameters:
key key value of the elements to countReturns: number of elements with the given key, which is either 1 or 0.
public size_t size() constReturns the number of elements in the container.
Returns: the number of elements in the container
public size_t size_in_bytes() constReturns the size of the container in bytes.
Returns: the size of the container in bytes
public size_t index_size_in_bytes() constReturns the size of the index used in this container in bytes.
Returns: the size of the index used in this container in bytes
typedef key_typetypedef mapped_typetypedef value_typetypedef size_typetypedef iteratorpgm::CompressedPGMIndexA variant of PGMIndex that uses compression on the segments to reduce the space of the index.
Parameters:
K the type of the indexed keys
Epsilon controls the size of the returned search range
EpsilonRecursive controls the size of the search range in the internal structure
Floating the floating-point type to use for slopes
| Members | Descriptions |
|---|---|
public CompressedPGMIndex() = default |
Constructs an empty index. |
public explicit CompressedPGMIndex(const std::vector<K> & data) |
Constructs the compressed index on the given sorted vector. |
public template<> inline CompressedPGMIndex(Iterator first,Iterator last) |
Constructs the compressed index on the sorted elements in the range [first, last). |
public size_t size_in_bytes() const |
Returns the size of the index in bytes. |
public ApproxPos search(const K & key) const |
Returns the approximate position and the range where key can be found. |
public size_t segments_count() const |
Returns the number of segments in the last level of the index. |
public size_t height() const |
Returns the number of levels of the index. |
public CompressedPGMIndex() = defaultConstructs an empty index.
public explicit CompressedPGMIndex(const std::vector<K> & data)Constructs the compressed index on the given sorted vector.
Parameters:
data the vector of elements to be indexed, must be sortedpublic template<> inline CompressedPGMIndex(Iterator first,Iterator last)Constructs the compressed index on the sorted elements in the range [first, last).
Parameters:
first last the range containing the sorted elements to be indexedpublic size_t size_in_bytes() constReturns the size of the index in bytes.
Returns: the size of the index in bytes
public ApproxPos search(const K & key) constReturns the approximate position and the range where key can be found.
Parameters:
key the value of the element to search forReturns: a struct with the approximate position and bounds of the range
public size_t segments_count() constReturns the number of segments in the last level of the index.
Returns: the number of segments
public size_t height() constReturns the number of levels of the index.
Returns: the number of levels of the index
pgm::OneLevelPGMIndexA variant of PGMIndex that does not build a recursive structure but uses a binary search on the segments. It is defined as an alias for PGMIndex where the EpsilonRecursive template argument is set to zero.
pgm::BucketingPGMIndexA simple variant of OneLevelPGMIndex that builds a top-level lookup table to speed up the search on the segments.
The size of the top-level table is specified as a constructor argument. Additionally, the TopLevelBitSize template argument allows to specify the bit-size of the memory cells in the top-level table. It can be set either to a power of two or to 0. If set to 0, the bit-size of the cells will be determined dynamically so that the table is bit-compressed.
Parameters:
K the type of the indexed keys
Epsilon controls the size of the returned search range
TopLevelBitSize the bit-size of the cells in the top-level table, must be either 0 or a power of two
Floating the floating-point type to use for slopes
| Members | Descriptions |
|---|---|
public BucketingPGMIndex() = default |
Constructs an empty index. |
public BucketingPGMIndex(const std::vector<K> & data,size_t top_level_size) |
Constructs the index on the given sorted vector, with the specified top level size. |
public template<> inline BucketingPGMIndex(RandomIt first,RandomIt last,size_t top_level_size) |
Constructs the index on the sorted keys in the range [first, last), with the specified top level size. |
public BucketingPGMIndex(const std::vector<K> & data) |
Constructs the index on the given sorted vector, with a dynamically-set top level size. |
public template<> inline BucketingPGMIndex(RandomIt first,RandomIt last) |
Constructs the index on the sorted keys in the range [first, last), with a dynamically-set top level size. |
public ApproxPos search(const K & key) const |
Returns the approximate position and the range where key can be found. |
public size_t segments_count() const |
Returns the number of segments in the last level of the index. |
public size_t height() const |
Returns the number of levels of the index. |
public size_t size_in_bytes() const |
Returns the size of the index in bytes. |
public BucketingPGMIndex() = defaultConstructs an empty index.
public BucketingPGMIndex(const std::vector<K> & data,size_t top_level_size)Constructs the index on the given sorted vector, with the specified top level size.
Parameters:
data the vector of keys, must be sorted
top_level_size the number of cells allocated for the top-level table
public template<> inline BucketingPGMIndex(RandomIt first,RandomIt last,size_t top_level_size)Constructs the index on the sorted keys in the range [first, last), with the specified top level size.
Parameters:
first last the range containing the sorted keys to be indexed
top_level_size the number of cells allocated for the top-level table
public BucketingPGMIndex(const std::vector<K> & data)Constructs the index on the given sorted vector, with a dynamically-set top level size.
Parameters:
data the vector of keys, must be sortedpublic template<> inline BucketingPGMIndex(RandomIt first,RandomIt last)Constructs the index on the sorted keys in the range [first, last), with a dynamically-set top level size.
Parameters:
first last the range containing the sorted keys to be indexedpublic ApproxPos search(const K & key) constReturns the approximate position and the range where key can be found.
Parameters:
key the value of the element to search forReturns: a struct with the approximate position and bounds of the range
public size_t segments_count() constReturns the number of segments in the last level of the index.
Returns: the number of segments
public size_t height() constReturns the number of levels of the index.
Returns: the number of levels of the index
public size_t size_in_bytes() constReturns the size of the index in bytes.
Returns: the size of the index in bytes
pgm::EliasFanoPGMIndexA variant of OneLevelPGMIndex that builds a top-level succinct structure to speed up the search on the segments.
Parameters:
K the type of the indexed keys
Epsilon controls the size of the returned search range
Floating the floating-point type to use for slopes
| Members | Descriptions |
|---|---|
public EliasFanoPGMIndex() = default |
Constructs an empty index. |
public explicit EliasFanoPGMIndex(const std::vector<K> & data) |
Constructs the index on the given sorted vector. |
public template<> inline EliasFanoPGMIndex(RandomIt first,RandomIt last) |
Constructs the index on the sorted keys in the range [first, last). |
public ApproxPos search(const K & key) const |
Returns the approximate position and the range where key can be found. |
public size_t segments_count() const |
Returns the number of segments in the last level of the index. |
public size_t height() const |
Returns the number of levels of the index. |
public size_t size_in_bytes() const |
Returns the size of the index in bytes. |
public EliasFanoPGMIndex() = defaultConstructs an empty index.
public explicit EliasFanoPGMIndex(const std::vector<K> & data)Constructs the index on the given sorted vector.
Parameters:
data the vector of keys, must be sortedpublic template<> inline EliasFanoPGMIndex(RandomIt first,RandomIt last)Constructs the index on the sorted keys in the range [first, last).
Parameters:
first last the range containing the sorted keys to be indexedpublic ApproxPos search(const K & key) constReturns the approximate position and the range where key can be found.
Parameters:
key the value of the element to search forReturns: a struct with the approximate position and bounds of the range
public size_t segments_count() constReturns the number of segments in the last level of the index.
Returns: the number of segments
public size_t height() constReturns the number of levels of the index.
Returns: the number of levels of the index
public size_t size_in_bytes() constReturns the size of the index in bytes.
Returns: the size of the index in bytes
pgm::MappedPGMIndexclass pgm::MappedPGMIndex
: public pgm::PGMIndex< K, Epsilon, 4, float >
A disk-backed container storing a sorted sequence of numbers and a PGMIndex for fast search operations.
Parameters:
K the type of the indexed keys
Epsilon controls the size of the returned search range
EpsilonRecursive controls the size of the search range in the internal structure
Floating the floating-point type to use for slopes
| Members | Descriptions |
|---|---|
public template<> inline MappedPGMIndex(RandomIt first,RandomIt last,const std::string & out_filename) |
Constructs a new disk-backed container on the sorted keys in the range [first, last). |
public MappedPGMIndex(const std::string & in_filename,const std::string & out_filename) |
Constructs a new disk-backed container on the sorted keys from the given input file. |
public explicit MappedPGMIndex(const std::string & in_filename) |
Loads a disk-backed container from the given file. |
public ~MappedPGMIndex() |
Destructs the object and closes the file backing the container. |
public bool contains(const K & key) const |
Checks if there is an element with key equivalent to key in the container. |
public auto lower_bound(const K & key) const |
Returns an iterator pointing to the first element that is not less than (i.e. greater or equal to) key. |
public auto upper_bound(const K & key) const |
Returns an iterator pointing to the first element that is greater than key. |
public size_t count(const K & key) const |
Returns the number of elements with key equal to the specified argument. |
public size_t size() const |
Returns the number of elements in the container. |
public size_t file_size_in_bytes() const |
Returns the size in bytes of the file backing the container. |
public auto begin() const |
Returns an iterator to the first element of the container. |
public auto end() const |
Returns an iterator to the element following the last element of the container. |
public template<> inline MappedPGMIndex(RandomIt first,RandomIt last,const std::string & out_filename)Constructs a new disk-backed container on the sorted keys in the range [first, last).
Parameters:
first last the range containing the sorted keys to copy
out_filename the name of the output file
public MappedPGMIndex(const std::string & in_filename,const std::string & out_filename)Constructs a new disk-backed container on the sorted keys from the given input file.
The in_filename must be a binary file containing a sorted sequence numbers of type K with the same endianness of the CPU.
Parameters:
in_filename the name of the input file
out_filename the name of the output file
public explicit MappedPGMIndex(const std::string & in_filename)Loads a disk-backed container from the given file.
Parameters:
in_filename the name of the input filepublic ~MappedPGMIndex()Destructs the object and closes the file backing the container.
public bool contains(const K & key) constChecks if there is an element with key equivalent to key in the container.
Parameters:
key the value of the element to search forReturns:
true if there is such an element, otherwise false
public auto lower_bound(const K & key) constReturns an iterator pointing to the first element that is not less than (i.e. greater or equal to) key.
Parameters:
key value to compare the elements toReturns:
iterator to the first element that is not less than key, or end() if no such element is found
public auto upper_bound(const K & key) constReturns an iterator pointing to the first element that is greater than key.
Parameters:
key value to compare the elements toReturns:
iterator to the first element that is greater than key, or end() if no such element is found
public size_t count(const K & key) constReturns the number of elements with key equal to the specified argument.
Parameters:
key value of the elements to countReturns:
the number of elements with key equal to key
public size_t size() constReturns the number of elements in the container.
Returns: the number of elements in the container
public size_t file_size_in_bytes() constReturns the size in bytes of the file backing the container.
Returns: the size in bytes of the file backing the container
public auto begin() constReturns an iterator to the first element of the container.
Returns: an iterator to the first element of the container
public auto end() constReturns an iterator to the element following the last element of the container.
Returns: an iterator to the element following the last element of the container
pgm::MultidimensionalPGMIndexA multidimensional container that uses a PGMIndex for fast orthogonal range queries.
Parameters:
Dimensions the number of fields/dimensions
T the type of the stored elements
Epsilon the Epsilon parameter for the internal PGMIndex
EpsilonRecursive the EpsilonRecursive parameter for the internal PGMIndex
Floating the Floating parameter for the internal PGMIndex
| Members | Descriptions |
|---|---|
public MultidimensionalPGMIndex() = default |
Constructs an empty multidimensional container. |
public template<> inline MultidimensionalPGMIndex(RandomIt first,RandomIt last) |
Constructs the multidimensional container with the elements in the range [first, last). |
public size_t size_in_bytes() const |
Returns the size of the index in bytes. |
public bool contains(const value_type & p) |
Checks if there is an element equal to p in the container. |
public iterator begin() const |
Returns an iterator to the first element of the container. |
public iterator end() const |
Returns an iterator to the element following the last element of the container. |
public iterator range(const value_type& min,constvalue_type & max) |
Returns an iterator pointing to an element satisfying an orthogonal range query. |
typedef iterator |
|
typedef size_type |
|
typedef value_type |
public MultidimensionalPGMIndex() = defaultConstructs an empty multidimensional container.
public template<> inline MultidimensionalPGMIndex(RandomIt first,RandomIt last)Constructs the multidimensional container with the elements in the range [first, last).
Parameters:
first last the range to copy the elements frompublic size_t size_in_bytes() constReturns the size of the index in bytes.
Returns: the size of the index in bytes
public bool contains(const value_type & p)Checks if there is an element equal to p in the container.
Parameters:
p the element to search forReturns:
true if there is such an element, false otherwise
public iterator begin() constReturns an iterator to the first element of the container.
Returns: an iterator to the first element of the container
public iterator end() constReturns an iterator to the element following the last element of the container.
Returns: an iterator to the element following the last element of the container
public iterator range(const value_type& min,constvalue_type & max)Returns an iterator pointing to an element satisfying an orthogonal range query.
Equivalently, returns an iterator to an element lying inside the hyperrectangle defined by the extreme points min and max.
Successive elements can be retrieved by calling operator++() on the returned iterator it until it != end().
Parameters:
min the lower extreme of the query hyperrectangle
max the upper extreme of the query hyperrectangle, must be greater than or equal to min.
Returns: an iterator pointing to an element inside the query hyperrectangle
typedef iteratortypedef size_typetypedef value_typepgm::ApproxPosA struct that stores the result of a query to a PGMIndex, that is, a range [lo, hi) centered around an approximate position pos of the sought key.
| Members | Descriptions |
|---|---|
public size_t pos |
The approximate position of the key. |
public size_t lo |
The lower bound of the range. |
public size_t hi |
The upper bound of the range. |
public size_t posThe approximate position of the key.
public size_t loThe lower bound of the range.
public size_t hiThe upper bound of the range.
pgm::MultidimensionalPGMIndex::RangeIterator| Members | Descriptions |
|---|---|
public RangeIterator(internal_iterator it) |
|
public RangeIterator(internal_iterator it,const value_type & p) |
|
public RangeIterator(const decltype(super) super,const value_type& min,constvalue_type & max) |
|
public RangeIterator&operator++() |
|
public RangeIterator operator++(int) |
|
public reference operator*() const |
|
public pointer operator->() const |
|
public bool operator==(const iterator & rhs) const |
|
public bool operator!=(const iterator & rhs) const |
|
typedef value_type |
|
typedef difference_type |
|
typedef pointer |
|
typedef reference |
|
typedef iterator_category |
public RangeIterator(internal_iterator it)public RangeIterator(internal_iterator it,const value_type & p)public RangeIterator(const decltype(super) super,const value_type& min,constvalue_type & max)public RangeIterator&operator++()public RangeIterator operator++(int)public reference operator*() constpublic pointer operator->() constpublic bool operator==(const iterator & rhs) constpublic bool operator!=(const iterator & rhs) consttypedef value_typetypedef difference_typetypedef pointertypedef referencetypedef iterator_category