btrdb.stream¶
Module for Stream and related classes
-
class
btrdb.stream.
Stream
(btrdb, uuid, **db_values)¶ An object that represents a specific time series stream in the BTrDB database.
Parameters: - btrdb (BTrDB) – A reference to the BTrDB object connecting this stream back to the physical server.
- uuid (UUID) – The unique UUID identifier for this stream.
- db_values (kwargs) – Framework only initialization arguments. Not for developer use.
-
aligned_windows
(start, end, pointwidth, version=0)¶ Read statistical aggregates of windows of data from BTrDB.
Query BTrDB for aggregates (or roll ups or windows) of the time series with version between time start (inclusive) and end (exclusive) in nanoseconds. Each point returned is a statistical aggregate of all the raw data within a window of width 2**`pointwidth` nanoseconds. These statistical aggregates currently include the mean, minimum, and maximum of the data and the count of data points composing the window.
Note that start is inclusive, but end is exclusive. That is, results will be returned for all windows that start in the interval [start, end). If end < start+2^pointwidth you will not get any results. If start and end are not powers of two, the bottom pointwidth bits will be cleared. Each window will contain statistical summaries of the window. Statistical points with count == 0 will be omitted.
Parameters: - start (int or datetime like object) – The start time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - end (int or datetime like object) – The end time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - pointwidth (int) – Specify the number of ns between data points (2**pointwidth)
- version (int) – Version of the stream to query
Returns: Returns a tuple containing windows of data. Each window is a tuple containing data tuples. Each data tuple contains a StatPoint and the stream version.
Return type: tuple
Notes
As the window-width is a power-of-two, it aligns with BTrDB internal tree data structure and is faster to execute than windows().
- start (int or datetime like object) – The start time in nanoseconds for the range to be queried. (see
-
annotations
(refresh=False)¶ Returns a stream’s annotations
Annotations returns the annotations of the stream (and the annotation version). It will always require a round trip to the server. If you are ok with stale data and want a higher performance version, use Stream.CachedAnnotations().
Do not modify the resulting map.
Parameters: refresh (bool) – Indicates whether a round trip to the server should be implemented regardless of whether there is a local copy. Returns: A tuple containing a dictionary of annotations and an integer representing the version of the metadata (tuple(dict, int)). Return type: tuple
-
btrdb
¶ Returns the stream’s BTrDB object.
Parameters: None – Returns: The BTrDB database object. Return type: BTrDB
-
collection
¶ Returns the collection of the stream. It may require a round trip to the server depending on how the stream was acquired.
Parameters: None – Returns: the collection of the stream Return type: str
-
count
(start=-1152921504606846976, end=3458764513820540927, pointwidth=62, precise=False, version=0)¶ Compute the total number of points in the stream
Counts the number of points in the specified window and version. By default returns the latest total count of points in the stream. This helper method sums the counts of all StatPoints returned by
aligned_windows
. Because of this, note that the start and end timestamps may be adjusted if they are not powers of 2. For smaller windows of time, you may also need to adjust the pointwidth to ensure that the count granularity is captured appropriately.Alternatively you can set the precise argument to True which will give an exact count to the nanosecond but may be slower to execute.
Parameters: - start (int or datetime like object, default: MINIMUM_TIME) – The start time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - end (int or datetime like object, default: MAXIMUM_TIME) – The end time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - pointwidth (int, default: 62) – Specify the number of ns between data points (2**pointwidth). If the value is too large for the time window than the next smallest, appropriate pointwidth will be used.
- precise (bool, default: False) – Forces the use of a windows query instead of aligned_windows to determine exact count down to the nanosecond. This will be some amount slower than the aligned_windows version.
- version (int, default: 0) – Version of the stream to query
Returns: The total number of points in the stream for the specified window.
Return type: int
- start (int or datetime like object, default: MINIMUM_TIME) – The start time in nanoseconds for the range to be queried. (see
-
current
(version=0)¶ Returns the point that is closest to the current timestamp, e.g. the latest point in the stream up until now. Note that no future values will be returned. Returns None if errors during lookup or there are no values before now.
Parameters: version (int, default: 0) – Specify the version of the stream to query; if zero, queries the latest stream state rather than pinning to a version. Returns: The last data point in the stream up until now and the version of the stream the value was retrieved at (tuple(RawPoint, int)). Return type: tuple
-
delete
(start, end)¶ “Delete” all points between [start, end)
“Delete” all points between start (inclusive) and end (exclusive), both in nanoseconds. As BTrDB has persistent multiversioning, the deleted points will still exist as part of an older version of the stream.
Parameters: - start (int or datetime like object) – The start time in nanoseconds for the range to be deleted. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - end (int or datetime like object) – The end time in nanoseconds for the range to be deleted. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types)
Returns: The version of the new stream created
Return type: int
- start (int or datetime like object) – The start time in nanoseconds for the range to be deleted. (see
-
earliest
(version=0)¶ Returns the first point of data in the stream. Returns None if error encountered during lookup or no values in stream.
Parameters: version (int, default: 0) – Specify the version of the stream to query; if zero, queries the latest stream state rather than pinning to a version. Returns: The first data point in the stream and the version of the stream the value was retrieved at (tuple(RawPoint, int)). Return type: tuple
-
exists
()¶ Check if stream exists
Exists returns true if the stream exists. This is essential after using StreamFromUUID as the stream may not exist, causing a 404 error on later stream operations. Any operation that returns a stream from collection and tags will have ensured the stream exists already.
Parameters: None – Returns: Indicates whether stream is extant in the BTrDB server. Return type: bool
-
flush
()¶ Flush writes the stream buffers out to persistent storage.
-
insert
(data)¶ Insert new data in the form (time, value) into the series.
Inserts a list of new (time, value) tuples into the series. The tuples in the list need not be sorted by time. If the arrays are larger than appropriate, this function will automatically chunk the inserts. As a consequence, the insert is not necessarily atomic, but can be used with a very large array.
Parameters: data (list[tuple[int, float]]) – A list of tuples in which each tuple contains a time (int) and value (float) for insertion to the database Returns: The version of the stream after inserting new points. Return type: int
-
latest
(version=0)¶ Returns last point of data in the stream. Returns None if error encountered during lookup or no values in stream.
Parameters: version (int, default: 0) – Specify the version of the stream to query; if zero, queries the latest stream state rather than pinning to a version. Returns: The last data point in the stream and the version of the stream the value was retrieved at (tuple(RawPoint, int)). Return type: tuple
-
name
¶ Returns the stream’s name which is parsed from the stream tags. This may require a round trip to the server depending on how the stream was acquired.
Returns: The name of the stream. Return type: str
-
nearest
(time, version, backward=False)¶ Finds the closest point in the stream to a specified time.
Return the point nearest to the specified time in nanoseconds since Epoch in the stream with version while specifying whether to search forward or backward in time. If backward is false, the returned point will be >= time. If backward is true, the returned point will be < time. The version of the stream used to satisfy the query is returned.
Parameters: - time (int or datetime like object) – The time (in nanoseconds since Epoch) to search near (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - version (int) – Version of the stream to use in search
- backward (boolean) – True to search backwards from time, else false for forward
Returns: The closest data point in the stream and the version of the stream the value was retrieved at (tuple(RawPoint, int)).
Return type: tuple
- time (int or datetime like object) – The time (in nanoseconds since Epoch) to search near (see
-
obliterate
()¶ Obliterate deletes a stream from the BTrDB server. An exception will be raised if the stream could not be found.
Raises: BTrDBError [404] stream does not exist – The stream could not be found in BTrDB
-
refresh_metadata
()¶ Refreshes the locally cached meta data for a stream
Queries the BTrDB server for all stream metadata including collection, annotation, and tags. This method requires a round trip to the server.
Returns the stream’s tags.
Tags returns the tags of the stream. It may require a round trip to the server depending on how the stream was acquired.
Parameters: refresh (bool) – Indicates whether a round trip to the server should be implemented regardless of whether there is a local copy. Returns: A dictionary containing the tags. Return type: dict
-
unit
¶ Returns the stream’s unit which is parsed from the stream tags. This may require a round trip to the server depending on how the stream was acquired.
Returns: The unit for values of the stream. Return type: str
-
update
(tags=None, annotations=None, collection=None, encoder=<class 'btrdb.utils.conversion.AnnotationEncoder'>, replace=False)¶ Updates metadata including tags, annotations, and collection as an UPSERT operation.
By default, the update will only affect the keys and values in the specified tags and annotations dictionaries, inserting them if they don’t exist, or updating the value for the key if it does exist. If any of the update arguments (i.e. tags, annotations, collection) are None, they will remain unchanged in the database.
To delete either tags or annotations, you must specify exactly which keys and values you want set for the field and set replace=True. For .. rubric:: Example
>>> annotations, _ = stream.anotations() >>> del annotations["key_to_delete"] >>> stream.update(annotations=annotations, replace=True)
This ensures that all of the keys and values for the annotations are preserved except for the key to be deleted.
Parameters: - tags (dict, optional) – Specify the tag key/value pairs as a dictionary to upsert or replace. If None, the tags will remain unchanged in the database.
- annotations (dict, optional) – Specify the annotations key/value pairs as a dictionary to upsert or replace. If None, the annotations will remain unchanged in the database.
- collection (str, optional) – Specify a new collection for the stream. If None, the collection will remain unchanged.
- encoder (json.JSONEncoder or None) – JSON encoder class to use for annotation serialization. Set to None to prevent JSON encoding of the annotations.
- replace (bool, default: False) – Replace all annotations or tags with the specified dictionaries instead of performing the normal upsert operation. Specifying True is the only way to remove annotation keys.
Returns: The version of the metadata (separate from the version of the data) also known as the “property version”.
Return type: int
-
uuid
¶ Returns the stream’s UUID. The stream may nor may not exist yet, depending on how the stream object was obtained.
Returns: The unique identifier of the stream. Return type: UUID See also
stream.exists
-
values
(start, end, version=0)¶ Read raw values from BTrDB between time [a, b) in nanoseconds.
RawValues queries BTrDB for the raw time series data points between start and end time, both in nanoseconds since the Epoch for the specified stream version.
Parameters: - start (int or datetime like object) – The start time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - end (int or datetime like object) – The end time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - version (int) – The version of the stream to be queried
Returns: Returns a list of tuples containing a RawPoint and the stream version (list(tuple(RawPoint,int))).
Return type: list
Notes
Note that the raw data points are the original values at the sensor’s native sampling rate (assuming the time series represents measurements from a sensor). This is the lowest level of data with the finest time granularity. In the tree data structure of BTrDB, this data is stored in the vector nodes.
- start (int or datetime like object) – The start time in nanoseconds for the range to be queried. (see
-
version
()¶ Returns the current data version of the stream.
Version returns the current data version of the stream. This is not cached, it queries each time. Take care that you do not intorduce races in your code by assuming this function will always return the same vaue
Parameters: None – Returns: The version of the stream. Return type: int
-
windows
(start, end, width, depth=0, version=0)¶ Read arbitrarily-sized windows of data from BTrDB. StatPoint objects will be returned representing the data for each window.
Parameters: - start (int or datetime like object) – The start time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - end (int or datetime like object) – The end time in nanoseconds for the range to be queried. (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - width (int) – The number of nanoseconds in each window, subject to the depth parameter.
- depth (int) – The precision of the window duration as a power of 2 in nanoseconds. E.g 30 would make the window duration accurate to roughly 1 second
- version (int) – The version of the stream to query.
Returns: Returns a tuple containing windows of data. Each window is a tuple containing data tuples. Each data tuple contains a StatPoint and the stream version (tuple(tuple(StatPoint, int), …)).
Return type: tuple
Notes
Windows returns arbitrary precision windows from BTrDB. It is slower than AlignedWindows, but still significantly faster than RawValues. Each returned window will be width nanoseconds long. start is inclusive, but end is exclusive (e.g if end < start+width you will get no results). That is, results will be returned for all windows that start at a time less than the end timestamp. If (end - start) is not a multiple of width, then end will be decreased to the greatest value less than end such that (end - start) is a multiple of width (i.e., we set end = start + width * floordiv(end - start, width). The depth parameter is an optimization that can be used to speed up queries on fast queries. Each window will be accurate to 2^depth nanoseconds. If depth is zero, the results are accurate to the nanosecond. On a dense stream for large windows, this accuracy may not be required. For example for a window of a day, +- one second may be appropriate, so a depth of 30 can be specified. This is much faster to execute on the database side.
- start (int or datetime like object) – The start time in nanoseconds for the range to be queried. (see
-
class
btrdb.stream.
StreamSetBase
(streams)¶ A lighweight wrapper around a list of stream objects
-
aligned_windows
(pointwidth)¶ Stores the request for an aligned windowing operation when the query is eventually materialized.
Parameters: pointwidth (int) – The length of each returned window as computed by 2^pointwidth. Returns: Returns self Return type: StreamSet Notes
aligned_windows reads power-of-two aligned windows from BTrDB. It is faster than Windows(). Each returned window will be 2^pointwidth nanoseconds long, starting at start. Note that start is inclusive, but end is exclusive. That is, results will be returned for all windows that start in the interval [start, end). If end < start+2^pointwidth you will not get any results. If start and end are not powers of two, the bottom pointwidth bits will be cleared. Each window will contain statistical summaries of the window. Statistical points with count == 0 will be omitted.
-
clone
()¶ Returns a deep copy of the object. Attributes that cannot be copied will be referenced to both objects.
Parameters: None – Returns: Returns a new copy of the instance Return type: StreamSet
-
count
()¶ Compute the total number of points in the streams using filters.
Computes the total number of points across all streams using the specified filters. By default, this returns the latest total count of all points in the streams. The count is modified by start and end filters or by pinning versions.
Note that this helper method sums the counts of all StatPoints returned by
aligned_windows
. Because of this the start and end timestamps may be adjusted if they are not powers of 2. You can also set the pointwidth property for smaller windows of time to ensure that the count granularity is captured appropriately.Parameters: None – Returns: The total number of points in all streams for the specified filters. Return type: int
-
current
()¶ Returns the points of data in the streams closest to the current timestamp. If the current timestamp is outside of the filtered range of data, a ValueError is raised.
Parameters: None – Returns: The latest points of data found among all streams Return type: tuple
-
earliest
()¶ Returns earliest points of data in streams using available filters.
Parameters: None – Returns: The earliest points of data found among all streams Return type: tuple
-
filter
(start=None, end=None, collection=None, name=None, unit=None, tags=None, annotations=None)¶ Provides a new StreamSet instance containing stored query parameters and stream objects that match filtering criteria.
The collection, name, and unit arguments will be used to select streams from the original StreamSet object. If a string is supplied, then a case-insensitive exact match is used to select streams. Otherwise, you may supply a compiled regex pattern that will be used with re.search.
The tags and annotations arguments expect dictionaries for the desired key/value pairs. Any stream in the original instance that has the exact key/values will be included in the new StreamSet instance.
Parameters: - start (int or datetime like object) – the inclusive start of the query (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - end (int or datetime like object) – the exclusive end of the query (see
btrdb.utils.timez.to_nanoseconds()
for valid input types) - collection (str or regex) – string for exact (case-insensitive) matching of collection when filtering streams or a compiled regex expression for re.search of stream collections.
- name (str or regex) – string for exact (case-insensitive) matching of name when filtering streams or a compiled regex expression for re.search of stream names.
- unit (str or regex) – string for exact (case-insensitive) matching of unit when filtering streams or a compiled regex expression for re.search of stream units.
- tags (dict) – key/value pairs for filtering streams based on tags
- annotations (dict) – key/value pairs for filtering streams based on annotations
Returns: a new instance cloned from the original with filters applied
Return type: StreamSet
- start (int or datetime like object) – the inclusive start of the query (see
-
latest
()¶ Returns latest points of data in the streams using available filters.
Parameters: None – Returns: The latest points of data found among all streams Return type: tuple
-
pin_versions
(versions=None)¶ Saves the stream versions that future materializations should use. If no pin is requested then the first materialization will automatically pin the return versions. Versions can also be supplied through a dict object with key:UUID, value:stream.version().
Parameters: versions (dict[UUID: int]) – A dict containing the stream UUID and version ints as key/values Returns: Returns self Return type: StreamSet
-
rows
()¶ Returns a materialized list of tuples where each tuple contains the points from each stream at a unique time. If a stream has no value for that time than None is provided instead of a point object.
Parameters: None – Returns: A list of tuples containing a RawPoint (or StatPoint) and the stream version (list(tuple(RawPoint, int))). Return type: list
-
values
()¶ Returns a fully materialized list of lists for the stream values/points
-
values_iter
()¶ Must return context object which would then close server cursor on __exit__
-
versions
()¶ Returns a dict of the stream versions. These versions are the pinned values if previously pinned or the latest stream versions if not pinned.
Parameters: None – Returns: A dict containing the stream UUID and version ints as key/values Return type: dict
-
windows
(width, depth)¶ Stores the request for a windowing operation when the query is eventually materialized.
Parameters: - width (int) – The number of nanoseconds to use for each window size.
- depth (int) – The requested accuracy of the data up to 2^depth nanoseconds. A depth of 0 is accurate to the nanosecond.
Returns: Returns self
Return type: StreamSet
Notes
Windows returns arbitrary precision windows from BTrDB. It is slower than aligned_windows, but still significantly faster than values. Each returned window will be width nanoseconds long. start is inclusive, but end is exclusive (e.g if end < start+width you will get no results). That is, results will be returned for all windows that start at a time less than the end timestamp. If (end - start) is not a multiple of width, then end will be decreased to the greatest value less than end such that (end - start) is a multiple of width (i.e., we set end = start + width * floordiv(end - start, width). The depth parameter is an optimization that can be used to speed up queries on fast queries. Each window will be accurate to 2^depth nanoseconds. If depth is zero, the results are accurate to the nanosecond. On a dense stream for large windows, this accuracy may not be required. For example for a window of a day, +- one second may be appropriate, so a depth of 30 can be specified. This is much faster to execute on the database side.
-