Databricks Accelerates Apache Spark’s Structured Streaming and Launches Production Platform

Company’s cloud platform operationalizes streaming with monitoring, debugging, lower latency and higher throughput


SAN FRANCISCO, June 06, 2017 (GLOBE NEWSWIRE) -- Databricks, the company founded by the creators of the popular Apache Spark project, today announced the general availability on its cloud platform of Structured Streaming, a high-level API that enables stream processing at up to five times the throughput of other engines. Databricks is also contributing new code to Apache Spark that lowers the latency of Structured Streaming to the sub-millisecond range and greatly increases its throughput.

“With Structured Streaming, customers can now get best-in-class latency while simultaneously benefiting from Spark’s much simpler streaming APIs and lowering the operational cost of their streaming applications by up to five times,” said Matei Zaharia, cofounder and chief technologist at Databricks. “We are excited to keep working with the open source community to build out Structured Streaming and to deliver continuous application capabilities to our customers.”

Available today on Databricks' managed cloud service when users choose "Databricks Runtime 3.0," Structured Streaming makes it easier for users to build end-to-end streaming applications that integrate with storage, serving systems and batch jobs in a consistent and fault-tolerant way. Additional features of Structured Streaming include:

  • Custom stateful processing for complex business logic such as sessionization;
  • Production monitoring for streaming jobs, alerting and management;
  • Connection to common data sources, including S3, Kinesis and Kafka.

Read more about this announcement in the blog post: https://databricks.com/blog/2017/06/06/simple-super-fast-streaming-engine-apache-spark.html

Access a trial of Databricks: databricks.com/try-databricks

About Databricks:
Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Founded by the team who created Apache Spark™, Databricks provides a Unified Analytics Platform for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership. Databricks, venture-backed by Andreessen Horowitz and NEA, has a global customer base that includes Capital One, Salesforce, Viacom, Amgen, Shell and HP. For more information, visit www.databricks.com.

Media Contact:
Suzanne Block for Databricks
P: 415-247-1666
E: databricksmg@merrittgrp.com