The promise of many distributed system frameworks (Spark, Flink, and Heron, to name a few) is that the programmer may write her software in a way that treats a collection of physically discrete compute resources as a single giant computer. This abstraction is exactly how most developers want to interact with a collection of resources: the programmer provides the computational logic, and the framework handles the messy business of work distribution, job scheduling, retries, and the collation and reassembly of results. Yet this attractive abstraction of a single giant computer has a number of lurking sharp edges and leaks: the gap between the promise and the reality of an implementation. This talk explores the gap between the pristine theoretical model of these computational frameworks and their messy real-world implementations, and how we could find better ways to communicate the sharp edges of a leaky abstraction.
Mark has been developing software for over 20 years. For the past 6 years, he has been developing scalable software for distributed systems at his day job using functional languages and idioms. Mark also runs a functional programming meetup in Houston, Texas.