Ensemble Architecture

We refer to devices as “Ensembles” in the parlance of TickTalk due to the large amount of heteogeniety in the hardware elements accessible to each device. Here, we describe the internal software architecture of an Ensemble at runtime, which is designed to support many SQs on devices with potentially multiple CPU cores by running multiple processes with well-defined and consistent interfaces.

The image below shows these processes on an ensemble. Note that a ‘Runtime Manager’ Process would only run on the Ensemble that has been designated as the runtime manager; it does not exist on most Ensembles in practice.

../_images/Ensemble_processes.png

These processes’ exact definitions can be found in TTInputTokenProcess, TTExecuteProcess, TTNetworkManager, and TTRuntimeManagerProcess.

We follow a common paradigm here: Each process receives all its input through a singular thread/process-safe queue, and each message is self-describing such that the process knows exactly how to interpret the message payload when it arrives (gracefully ignoring those it does not recognize). These processes do not directly share any memory, although the InputTokenProcess and ExecuteProcess both use the same syscalls to access the synchronized clock (of which the TTClocks derive their current timestamps from). This is an architectural decision meant to provide consistency and easily extensible interfaces. However, this abstraction does carry nontrivial overhead, particularly in terms of how long it takes to send data between processes (copying virtual memory, serializing objects, context-switching at the OS level). Implementing these processes as threads reduces context-switching and memory-sharing overhead, but prevents efficient use of multi-core processors since Python3 is otherwise single-threaded.

SQ Synchronization and Timing Control

The InputTokenProcess handles the more novel aspects of our timed dataflow model, in that it holds input TTTokens in intermediate storage (also called a waiting-matching section) while it checks the TTFiringRule. In timed dataflow, this includes looking a time intervals for overlaps that suggest concurrency in the sampled data.

This portion of the Ensemble also handles the majority of timing control mechanisms, such as when to sample for a stream again or when a deadline should expire. Usually, this entails creating an Message that will release a token some time in the future.

As inputs are synchronized such that the firing rule is satisfied, those tokens are generally evicted from the waiting-matching section. However, since we use a fuzzy and fungible mechanism for passing this synchronization barrier, we ask ourselves, “what happens with the tokens that fall through the cracks?” In Dataflow literature, this was avoided entirely by self-cleaning graphs, which operated by more idealized synchronization. We have no such privilege here. This issue is no different from a classic problem with dynamic memory – garbage collection. Our existing solution to this problem is to once again leverage time. In a stream-processing context, the older a token is, the less likely it is to be consumed and eventually, it is stale enough to garbage collect. In practice, we check against some staleness criteria whenever a firing rule passes to evict tokens that are too old to use.

SQ Function Execution

The TTExecuteProcess handles SQ execution and SQ execution only. As SQs arrive and are instantiated, this process creates a unique namespace for each SQ and pre-executes (Python exec) the SQ so that the actual function within the SQ is prepared for execution (this helps each SQ run slightly faster, especially for the first invocation).

When executing, this process grabs values from the tokens and provides them to the programmer-defined function. In some instances, the programmer may specify they need access to the full tokens (using an optional keyword argument TTExecuteOnFullToken=True in the function definition; see Instructions for an example). In this scenario, they can manipulate time intervals and values however they like, but are not free to manipulate clocks given the potential side effects for other SQs.

Some SQs require a new time-interval to be generated, such as the output of a stream-generating (i.e., STREAMify -ed) SQ. For these, the timestamp of sampling is approximated as the midpoint between when execution started and finished. The time-interval is then set to be that timestamp plus-minus some data-validity interval, which the programmer should have specified according to their needs (defaults to the period). Otherwise, the output tokens carry the same time-interval as the intersection of the input tokens’ intervals.

SQ Execution is also where any physical IO will occur. The programmer’s SQify'd code will do this. If this requires libraries to interact with lower-level mechanisms implemented within the ensemble, those libraries must be imported within user’s defined function (putting imports at the top of the file that SQify -s is not sufficient.)

SQ Output Forwarding and Network Management

The TTNetworkManager plays a central role in that it implements the TTNetworkInterface, such that any messages that arrive to the Ensemble come to this process before being unpacked and sent to the correct process. This also handles the Forwarding section of SQs.

In managing the TTNetwork, this process is also responsible for maintaining a routing table that helps associate ensemble names with the location to send TTNetworkMessages to. This routing table is generally constructed while Ensembles are joining the network with the help of the Runtime Manager. The format of these addresses depends on the specific TTNetworkInterface implementation, but each Ensemble should be able to send messages to another Ensemble without fear of an inaccessible address (Note that we take a decentralized approach at runtime. A less efficient but more robust strategy might be to use pub-sub like MQTT). Ensembles send TTNetworkMessages to each other by dereferencing a TTEnsemble’s name into an address that it can use to send the bytes of that message through the network interface that it implements.

Each TTNetworkMessage contains one or more Messages, which have clearly defined information to delineate who (among the set of local processes) and how the individual components of the Network Message should be handled.

The Runtime Manager

The Runtime Manager is a priveleged ensemble within the TickTalk Runtime Environment. For the time being, there is one of these per Runtime Environment. The Runtime Manager, as the name suggests, only handles operations within the ‘management’ plane at run time; it plays a minimal role during graph interpretation.

In essence, the Runtime Manager helps setup and teardown the system before and after graph interpretation. A Runtime Manager is any ensemble that implements the TTRuntimeManagerProcess. Its responsibilities include accepting new TTEnsembles, keeping track of all TTEnsembles that have connected thus far, propagating or updating routing information to help ensembles find each other in the network (typically using a reachable IP and port), mapping SQ, instantiating SQs, sending TTClock specifications to ensembles, injecting initial tokens into the graph along input arcs (to initiate graph interpretation), and logging of output tokens. The TTRuntimeManagerProcess only interacts with the TTNetworkManager

More details on the Runtime Manager and system-wide procedures are described in the TickTalk Runtime Environment page.