Ebook Preview

Learning Spark Streaming

Best Practices For Scaling And Optimizing Apache Spark

François Garillot

Gerard Maas

Audience: Developers, Architects

Technical level: Introductory

Building analytics tools that deliver faster insights requires processing data in real time, which means moving from batch processing to stream processing. Fortunately, Apache Spark, the in-memory data processing framework, includes an extension devoted to fault-tolerant stream processing: Spark Streaming. If you're familiar with Apache Spark and want to learn how to use it for streaming jobs, this practical book is for you.

  • Understand how Spark Streaming fits in the big picture
  • Learn core concepts such as Spark RDDs, Spark Streaming clusters, and the fundamentals of a DStream
  • Discover how to create a robust deployment
  • Dive into streaming algorithmics
  • Learn how to tune, measure, and monitor Spark Streaming

Grab your copy

Please enter your information to receive your E-book chapter(s) of Learning Spark Streaming and be signed up for the Lightbend Newsletter. Once you've entered your information and submitted the form, the PDF will be emailed to your address.


François Garillot

François Garillot worked on Scala's type system in 2006, earned his PhD from the French École Polytechnique in 2011, and worked at Lightbend, formerly known as Typesafe, after a brief stint in Internet advertising. He's worked on interactive interfaces to the Scala compiler, while nourishing a strong enthusiasm for data analytics in his spare time, until Apache Spark let him fulfill this passion as his main job. He received the first Spark Certification in November 2014, and worked in London and Philadelphia, among other places. In his spare time, he can be found practicing one of a half-dozen ways of making coffee, climbing up or skiing down a not-necessarily-Alpine mountain, or sailing a not-necessarily-coastal course.

Gerard Maas

Gerard Maas is the lead engineer at Kensu.io, an early-stage startup where he works on context management for big-data environments. Before that, he led the design and development of the data processing pipeline of Virdata.com, a startup building a cloud-native IoT platform, where Scala, Apache Spark, and Spark Streaming were crucial building blocks. He enjoys contributing to open source projects, small and large. Throughout his career in technology companies like Alcatel-Lucent, Bell Labs, Sony, and Technicolor, he has been mostly involved in the interaction of services and devices, from early-days service adaptation, when mobile screens had only a few lines of text, through multi-device interactions to IoT device management. He has a degree in Computer Engineering from the Simón Bolívar University, Venezuela.

About Lightbend

Lightbend (Twitter: @Lightbend) provides the leading Reactive application development platform for building distributed systems. Based on a message-driven runtime, these distributed systems, which include microservices and streaming fast data applications, can effortlessly scale on multi-core and cloud architectures. Many of the most admired brands around the globe are transforming their businesses with our platform, engaging billions of users every day through software that is changing the world. For more information on Lightbend, visit: lightbend.com