Viewing Stream Data
At a high level, there are two options available when you are ready to retrieve the time series data in a stream. You may view the values directly by timestamp or you can view a window of data at a resolution of your choice. When viewing by window, there are further options available with different arguments and related performance benefits.
View Individual Data Points
To view the values directly, call the Stream.values method, which fully materializes the stream values at the stream version you specify (the default value of zero selects the latest version). Both the start and end arguments are required when making this request.
Calling Stream.values returns a series of tuples, each containing a RawPoint and the version of the stream (int). As described in the API reference, a RawPoint has both a time and a value property.
start = 1500000000000000000
end = 1547241923338098176
for point, _ in stream.values(start=start, end=end, version=133):
    print(point)
>> RawPoint(1500000000000000000, 2.35)
>> RawPoint(1500000000100000000, 2.41)
>> RawPoint(1500000000200000000, 2.8)
>> RawPoint(1500000000300000000, 3.66)
...
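The timestamps above are easier to read as dates. The following is a minimal sketch, assuming that a RawPoint's time is a count of nanoseconds since the Unix epoch (UTC); ns_to_datetime is a hypothetical helper written for this illustration, not part of the library.

```python
from datetime import datetime, timedelta, timezone


def ns_to_datetime(ns):
    """Convert nanoseconds since the Unix epoch to a timezone-aware datetime.

    Hypothetical helper. Python datetimes only resolve microseconds, so any
    sub-microsecond remainder is truncated.
    """
    seconds, remainder_ns = divmod(ns, 1_000_000_000)
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return whole + timedelta(microseconds=remainder_ns // 1_000)


print(ns_to_datetime(1500000000000000000))
# 2017-07-14 02:40:00+00:00
```

Integer arithmetic is used rather than dividing by 1e9 because a float cannot represent every nanosecond timestamp of this magnitude exactly.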
Helpers for Dates/Times
To find the point closest to a particular datetime, use the Stream.nearest method. To find the first or last point in a stream, call the Stream.earliest or Stream.latest method. These two are often useful when you want to view all of the data within the stream using the Stream.windows method described below (querying all of the data with the Stream.values method is not recommended because of the memory consumption it implies). Each of these three methods returns a tuple containing a RawPoint and the data version number. The exact timestamp can be obtained from the RawPoint. All of these methods accept a version argument, so you can ask for the earliest, latest, or nearest point from a previous version of the stream.
stream = db.stream_from_uuid("6f8ebaf0-78ea-416e-a0ff-5c3c5d83c279")
stream.earliest()
>> (RawPoint(1364860800000000000, 42516.03), 3934)
stream.earliest()[0].time
>> 1364860800000000000
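One way to turn the earliest and latest points into query bounds is sketched below. covering_range is a hypothetical helper, not part of the library. It assumes the stream is non-empty, that earliest and latest accept a version keyword as described above, and that a windowed query with an exclusive end drops any window that would extend past it, so the end is rounded up to a whole number of windows.

```python
def covering_range(stream, width, version=0):
    """Return (start, end) bounds that cover every point in the stream.

    Hypothetical helper. `width` is the window width in nanoseconds.
    """
    first, _ = stream.earliest(version=version)
    last, _ = stream.latest(version=version)
    start = first.time
    # The end bound is exclusive, so add one nanosecond to include the last point.
    span = last.time + 1 - start
    # Ceiling division: the number of whole windows needed to cover the span.
    n_windows = -(-span // width)
    return start, start + n_windows * width
```

The returned pair can then be passed as the start and end arguments of a windowed query that uses the same width.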
View Windows of Data
If you don’t need to view every single point of data, it is faster to view higher-order representations of the data. BTrDB stores data in a tree structure in which the leaves contain the actual values and the higher nodes store statistical summaries (min, max, mean, etc.). Viewing summaries therefore involves reading from higher levels of the tree, so fewer nodes need to be read from disk.
Wanting a high-level summary of data is quite common. For example, when rendering the plot of a time series it is often useful to present a view at a resolution of one hour, one day, or perhaps one year. With samples that occur more often than 1 Hz, this requires you to summarize the values and plot the average (or min, max, etc.) rather than each individual value.
Because BTrDB usually provides summaries of data when windowing, it returns instances of StatPoint rather than RawPoint. A StatPoint contains statistical information about a range of time and provides properties for min, mean, max, count, and stddev, along with the start time of the range that the summary covers.
For statistical aggregates of your data, the Stream.aligned_windows method is the fastest way to query your data. Each point returned is a statistical aggregate of all the raw data within a window of width 2^pointwidth nanoseconds.
Note that start is inclusive, but end is exclusive. That is, results will be returned for all windows that start in the interval [start, end). If end < start+2^pointwidth you will not get any results. If start and end are not multiples of 2^pointwidth, their bottom pointwidth bits will be cleared, which rounds each down to a window boundary. Statistical points with count == 0 will be omitted.
start = 1500000000000000000
end = 1500000001000000000
# view underlying data for comparison
for point, _ in stream.values(start=start, end=end):
    print(point)
>> RawPoint(1500000000000000000, 1.0)
>> RawPoint(1500000000100000000, 2.0)
>> RawPoint(1500000000200000000, 3.0)
>> RawPoint(1500000000300000000, 4.0)
>> RawPoint(1500000000400000000, 5.0)
>> RawPoint(1500000000500000000, 6.0)
>> RawPoint(1500000000600000000, 7.0)
>> RawPoint(1500000000700000000, 8.0)
>> RawPoint(1500000000800000000, 9.0)
>> RawPoint(1500000000900000000, 10.0)
# aggregate over 2^28 nanoseconds (268,435,456)
pointwidth = 28
# view data aggregates
for point, _ in stream.aligned_windows(start=start, end=end,
                                       pointwidth=pointwidth):
    print(point)
>> StatPoint(1499999999814008832, 1.0, 1.0, 1.0, 1, 0.0)
>> StatPoint(1500000000082444288, 2.0, 3.0, 4.0, 3, 0.816496580927726)
>> StatPoint(1500000000350879744, 5.0, 6.0, 7.0, 3, 0.816496580927726)
>> StatPoint(1500000000619315200, 8.0, 8.5, 9.0, 2, 0.5)
The Stream.windows method allows you to request windows of data while specifying the precision you require. Each window covers width nanoseconds. The precision of the result is determined by the depth parameter, such that each window is accurate to 2^depth nanoseconds.
Using a larger depth value results in faster query execution. For instance, if you are viewing 24 hours of data, you may only require a precision of +/- 1 second, so a depth of 30 may be appropriate. A chart of sample depths is provided below.
Depth | Calculation | Precision in Nanoseconds | Time |
---|---|---|---|
0 | 2^0 | 1 | 1 nanosecond |
10 | 2^10 | 1024 | ~1 microsecond |
20 | 2^20 | 1048576 | ~1 millisecond |
30 | 2^30 | 1073741824 | ~1 second |
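For precisions that are not in the table, a depth can be derived from the target precision. The sketch below is one possible approach; depth_for_precision is a hypothetical helper that picks the largest depth whose 2^depth does not exceed the requested number of nanoseconds.

```python
def depth_for_precision(precision_ns):
    """Return the largest depth such that 2**depth <= precision_ns.

    Hypothetical helper; `precision_ns` must be a positive integer.
    """
    if precision_ns < 1:
        raise ValueError("precision_ns must be at least 1 nanosecond")
    return precision_ns.bit_length() - 1


for label, ns in [("1 microsecond", 1_000),
                  ("1 millisecond", 1_000_000),
                  ("1 second", 1_000_000_000)]:
    depth = depth_for_precision(ns)
    print(f"{label}: depth {depth} (2^{depth} = {2 ** depth} ns)")
# 1 microsecond: depth 9 (2^9 = 512 ns)
# 1 millisecond: depth 19 (2^19 = 524288 ns)
# 1 second: depth 29 (2^29 = 536870912 ns)
```

Because this helper rounds down, it returns 29 for one second, whereas the table lists 30, which is slightly coarser than one second. Use the next depth up if approximate precision is acceptable and query speed matters more.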
As usual when querying data from BTrDB, the start time is inclusive while the end time is exclusive. Note that if your last window spans across the end time, it will not be included in the results.
start = 1500000000000000000
end = 1500000001000000000
# view underlying data for comparison
for point, _ in stream.values(start=start, end=end):
    print(point)
>> RawPoint(1500000000000000000, 1.0)
>> RawPoint(1500000000100000000, 2.0)
>> RawPoint(1500000000200000000, 3.0)
>> RawPoint(1500000000300000000, 4.0)
>> RawPoint(1500000000400000000, 5.0)
>> RawPoint(1500000000500000000, 6.0)
>> RawPoint(1500000000600000000, 7.0)
>> RawPoint(1500000000700000000, 8.0)
>> RawPoint(1500000000800000000, 9.0)
>> RawPoint(1500000000900000000, 10.0)
# each window spans 300 milliseconds
width = 300000000
# request a precision of roughly 1 millisecond
depth = 20
# view windowed data
for point, _ in stream.windows(start=start, end=end,
                               width=width, depth=depth):
    print(point)
>> StatPoint(1500000000000000000, 1.0, 2.0, 3.0, 3, 0.816496580927726)
>> StatPoint(1500000000300000000, 4.0, 5.0, 6.0, 3, 0.816496580927726)
>> StatPoint(1500000000600000000, 7.0, 8.0, 9.0, 3, 0.816496580927726)
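To see where these numbers come from, the windowing rules can be emulated client-side on the ten raw points shown above. This is only an illustrative sketch, not how the database computes its results: emulate_windows is a hypothetical function, it ignores depth (it is exact), it assumes the reported stddev is the population standard deviation (which matches the sample output), and it assumes empty windows are skipped.

```python
from statistics import fmean, pstdev

# The ten raw points from the example: (time in ns, value).
raw = [(1500000000000000000 + i * 100_000_000, float(i + 1)) for i in range(10)]


def emulate_windows(points, start, end, width):
    """Summarize points in consecutive windows of `width` ns within [start, end).

    A window that would extend past `end` is dropped.
    """
    results = []
    window_start = start
    while window_start + width <= end:
        values = [v for t, v in points
                  if window_start <= t < window_start + width]
        if values:  # assumption: empty windows are skipped
            results.append((window_start, min(values), fmean(values),
                            max(values), len(values), pstdev(values)))
        window_start += width
    return results


summaries = emulate_windows(raw, 1500000000000000000,
                            1500000001000000000, 300_000_000)
for summary in summaries:
    print(summary)
```

Three windows are produced. The fourth would cover 1500000000900000000 through 1500000001200000000, which extends past end, so the point with value 10.0 is not summarized.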