Faster, Better, More Reliable Dataflow: Handling Telematics and Insurance Big Data Ingest
There are key business questions an insurance company might want to answer when offering pay-by-the-mile and other real-time insurance products or discounts: Is my customer driving too long? Too fast? Braking too hard? The example environment pictured above shows the case of a fairly complex schema with significant throughput requirements. It begins with embedded sensors in customers' vehicles, communicating over industrial CAN bus connections and acting as the data source. That data is converted in situ to JSON and then ingested from the network by the SigmaX stack. The data is moved into the highly efficient Apache Arrow in-memory columnar format and dropped into Apache Pulsar, organized into streams and topics to which client applications subscribe. All of this is done at a scale appropriate to the millions of vehicles under coverage.
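To make the row-to-columnar step concrete, here is a minimal plain-Python sketch of the transformation the stack performs when coercing JSON telemetry into a columnar layout like Arrow's. The field names and sample values are illustrative assumptions, not actual SigmaX schema elements:

```python
import json

# Hypothetical JSON telemetry records, as they might arrive from a
# vehicle's CAN-to-JSON converter (field names are illustrative only).
raw_messages = [
    '{"vehicle_id": "V1", "speed_kph": 92.5, "brake_pct": 10}',
    '{"vehicle_id": "V2", "speed_kph": 61.0, "brake_pct": 0}',
    '{"vehicle_id": "V3", "speed_kph": 118.2, "brake_pct": 85}',
]

def to_columnar(messages):
    """Pivot row-oriented JSON records into column-oriented lists --
    the same row-to-column reshaping a columnar format performs, so
    analytics can scan one field without touching the others."""
    rows = [json.loads(m) for m in messages]
    return {key: [row[key] for row in rows] for key in rows[0]}

columns = to_columnar(raw_messages)
print(columns["speed_kph"])  # → [92.5, 61.0, 118.2]
```

In production this reshaping happens in hardware at ingest time; the point of the sketch is only to show why downstream analytics can then scan a single field (say, speed) as one contiguous column.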
Telematics data streaming in at this scale can enable innovative product offerings, but it also brings challenges common to edge computing and big data analytics. High message volume, the need to reliably connect data with analytics clients that may live in fog, enterprise, or cloud environments, and the demand for analytics efficiency are the key data engineering concerns to address. The SigmaX stack offers unique value to these industries by providing fast, low-latency, geo-replicated messaging along with efficient data ingest, getting data quickly to client applications in distributed and heterogeneous compute environments.
SigmaX offers a full-stack solution: hardware and software plus services and support, all under one roof. Deal with a single supplier instead of tying together multiple point vendors to build a reliable streaming solution.
For example, the node.IoT edge ingest includes FPGA-assisted data coercion to the Apache Arrow format, a real-time, wire-speed capability. Transformed data no longer needs indexing, Avro, or SerDes operations before it can be computed on. The same FPGA resource offers customizable data pre-processing and alert generation, recognizing priority data as it is ingested in real time. Data are moved into streams and topics within Pulsar. Normal, non-alert data can be withheld and either processed separately on-site or moved out to long-term cloud storage. Every layer in the stack offers innate geo-replicated reliability.
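The alert-generation idea can be sketched in a few lines. The thresholds, field names, and topic names below are illustrative assumptions, not taken from the SigmaX product; in the real stack this classification would run on the FPGA at wire speed:

```python
# Hypothetical priority rules answering the business questions above:
# is the driver going too fast, or braking too hard?
SPEED_LIMIT_KPH = 110.0   # assumed threshold, for illustration
HARSH_BRAKE_PCT = 80      # assumed threshold, for illustration

def classify(record):
    """Route a telemetry record to an 'alerts' topic if it matches a
    priority rule, else to the 'normal' topic for deferred handling."""
    if record["speed_kph"] > SPEED_LIMIT_KPH or record["brake_pct"] >= HARSH_BRAKE_PCT:
        return "alerts"
    return "normal"

records = [
    {"vehicle_id": "V1", "speed_kph": 92.5, "brake_pct": 10},
    {"vehicle_id": "V3", "speed_kph": 118.2, "brake_pct": 85},
]
routed = {"alerts": [], "normal": []}
for rec in records:
    routed[classify(rec)].append(rec)

print([r["vehicle_id"] for r in routed["alerts"]])  # → ['V3']
```

In a deployment following this pattern, the alert path would feed a dedicated Pulsar topic that analytics clients subscribe to, while the normal path is withheld for local processing or long-term cloud storage, as described above.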
Reach out and talk with us more about how we can help you solve your data engineering challenges!