Himanshu Arora & Nityanand Yadav – 10 things I wish I’d known before using Spark in production
Have you recently started working with Spark, only to find that your jobs take forever to finish? This talk is for you! We have compiled many Spark best practices, optimisations and tweaks that we have applied over the years in production to make our jobs faster and less resource-hungry. In this talk we will learn about advanced Spark tuning, data serialisation formats, storage formats, hardware tuning, better data locality, control over parallelism, GC tuning and more. We will also discover the appropriate usage of RDD, DataFrame and Dataset in order to take full advantage of Spark's internal optimisations.
Jakub Kozłowski – Conquering Concurrency with Functional Programming (45 minutes)
Some people claim functional programming is useless because in the end there will be side effects - or else your program won't do anything useful. That's not true (at least not in user code), and as it turns out, purely functional programming is really good at solving some real-world problems, like concurrency. I'll talk about how shared mutable state, queues, and streams can be used in purely functional programs, and why such a solution might be preferred over the classical ways of managing concurrent state.
John A. De Goes – Thinking Functionally (45 minutes)
In Thinking Functionally, you'll learn how a functional programmer thinks about problems as John live-refactors a concurrent imperative program to its purely functional equivalent—which is shorter, more powerful, more type-safe, and far easier to reason about and test. Don't miss this chance to witness both the "why" and the "how" of functional programming and learn more about ZIO, the hot new Scala library for asynchronous and concurrent programming that makes it easy to conquer the impossible.
Jon Pretty – How I rebuilt the Typelevel Ecosystem with Fury (45 minutes)
Fury is a new build tool and dependency manager for Scala, based on a radical model of source dependencies and distributed builds. The Typelevel ecosystem offers a platform of useful Scala projects which have been converted to use Fury. I will talk about the experience of building them with Fury. This talk will catalog the steps involved in writing Fury builds for a widely-used and coherent subset of the Scala ecosystem, covering the common cases, the biggest challenges, and any compromises that had to be made to support everything.
Julien Tournay – Data Processing @Spotify using Scio (45 minutes)
Two years ago, Spotify introduced Scio, an open-source Scala framework to develop data pipelines and deploy them on Google Dataflow. In this talk, we will discuss the evolution of Scio and share the highlights of running Scio in production for two years. We will showcase several interesting data processing workflows run at Spotify, what we learned from running them in production, and how we leverage that knowledge to make Scio faster, safer and easier to use.