Ebook Preview

Learning Spark Streaming

Best Practices For Scaling And Optimizing Apache Spark

François Garillot

Gerard Maas

Audience: Developers, Architects

Technical level: Introductory

To build analytics tools that deliver faster insights, you need to know how to process data in real time, and that means moving from batch processing to stream processing. Fortunately, Apache Spark, the in-memory data processing framework, includes an extension devoted to fault-tolerant stream processing: Spark Streaming. If you're familiar with Apache Spark and want to learn how to use it for streaming jobs, this practical book is a must.

  • Understand how Spark Streaming fits in the big picture
  • Learn core concepts such as Spark RDDs, Spark Streaming clusters, and the fundamentals of a DStream
  • Discover how to create a robust deployment
  • Dive into streaming algorithmics
  • Learn how to tune, measure, and monitor Spark Streaming

About the Authors

François Garillot

François Garillot worked on Scala's type system in 2006, earned his PhD from the French École Polytechnique in 2011, and worked at Lightbend (formerly known as Typesafe) after a brief stint in Internet advertising. He worked on interactive interfaces to the Scala compiler while nourishing a strong enthusiasm for data analytics on the side, until Apache Spark let him pursue this passion as his main job. He received the first Spark Certification in November 2014, and has worked in London and Philadelphia, among other places. In his spare time, he can be found practicing one of a half-dozen ways of making coffee, climbing up or skiing down a not-necessarily-Alpine mountain, or sailing a not-necessarily-coastal course.

Gerard Maas

Gerard Maas is the lead engineer at Kensu.io, an early-stage startup where he works on context management for big-data environments. Before that, he led the design and development of the data processing pipeline of Virdata.com, a startup building a cloud-native IoT platform, where Scala, Apache Spark, and Spark Streaming were crucial building blocks. He enjoys contributing to open source projects, small and large. Throughout his career at technology companies like Alcatel-Lucent, Bell Labs, Sony, and Technicolor, he has mostly worked on the interaction of services and devices: from early service adaptation, when mobile screens had only a few lines of text, through multi-device interactions, to IoT device management. He has a degree in Computer Engineering from the Simón Bolívar University, Venezuela.

About Lightbend

Lightbend (@Lightbend) is leading the enterprise transformation toward real-time, cloud-native applications. Lightbend Platform provides scalable, high-performance microservices frameworks and streaming engines for building data-centric systems that are optimized to run on cloud-native infrastructure. The most admired brands around the globe are transforming their businesses with Lightbend, engaging billions of users every day through software that is changing the world. For more information, visit lightbend.com.