GOTO is a vendor-independent international software development conference with more than 90 top speakers and 1300 attendees. The conference covers topics such as .NET, Java, Open Source, Agile, Architecture and Design, Web, Cloud, New Languages, and Processes.

Presentation: "KEYNOTE: A Monadic Model for Big Data"

Time: Tuesday 17:10 - 18:00 / Location: Effectenbeurszaal

Since its inception in 1970, Codd's relational model has fueled a multi-billion-dollar industry that after 40 years still experiences double-digit growth. Because the relational database and SQL are based on the strong mathematical foundations of sets and relations, complementary producers such as educators, tool vendors, and consultants can all target the same underlying conceptual model, creating a strong ecosystem of users and domain experts.

If we look at the relational model through our developer eyes, we notice that the relational algebra is one particular implementation of a more general interface, an interface that mathematicians call monads. By generalizing from relations to arbitrary monads, we are able to query many different kinds of data using a single query language; in particular, we can formulate queries over data of any size, finite or infinite. Similarly, when we use another programmer’s trick and turn foreign-key/primary-key relationships between flat rows into pointers between nested structures, we can query both relational and pointer-based data, including documents and graphs, using a single monadic algebra of query operators. Lastly, by leveraging the mathematical trick of duality, we can implement our monadic interface over both push-based and pull-based data.
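To make the idea of one set of query operators over finite, infinite, flat, and nested data more concrete, here is a minimal Python sketch for the pull-based case. It is not taken from the talk: the names `unit`, `select_many`, `where`, and `select`, as well as the sample `customers` data, are assumptions chosen for illustration. Everything is derived from a single bind-like operator (`select_many`), which is the part that makes the interface monadic.

```python
from itertools import count, islice
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def unit(value: T) -> Iterator[T]:
    """Wrap a single value as a one-element pull-based collection."""
    yield value


def select_many(source: Iterable[T], f: Callable[[T], Iterable[U]]) -> Iterator[U]:
    """Monadic bind: apply f to each element and lazily flatten the results."""
    for item in source:
        yield from f(item)


def where(source: Iterable[T], predicate: Callable[[T], bool]) -> Iterator[T]:
    """Filtering, expressed purely in terms of select_many and unit."""
    return select_many(source, lambda x: unit(x) if predicate(x) else iter(()))


def select(source: Iterable[T], f: Callable[[T], U]) -> Iterator[U]:
    """Projection, expressed purely in terms of select_many and unit."""
    return select_many(source, lambda x: unit(f(x)))


# Finite, nested data (hypothetical sample): orders hang off each customer
# as a nested list instead of living in a separate table joined by keys.
customers = [
    {"name": "Ann", "orders": [{"item": "book", "total": 12},
                               {"item": "lamp", "total": 40}]},
    {"name": "Bob", "orders": [{"item": "pen", "total": 3}]},
]

# Flatten customer/order pairs, keep orders above 10, project name and item.
pairs = select_many(customers, lambda c: ((c["name"], o) for o in c["orders"]))
big_orders = list(
    select(
        where(pairs, lambda pair: pair[1]["total"] > 10),
        lambda pair: (pair[0], pair[1]["item"]),
    )
)

# Infinite, flat data: the same operators work because evaluation is lazy.
even_squares = select(where(count(1), lambda n: n % 2 == 0), lambda n: n * n)
first_three = list(islice(even_squares, 3))
```

Because the operators are generators, the query over `count(1)` never tries to materialize the infinite source; only `islice` decides how many results are pulled.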

By generalizing from sets and relations to monads, we have created a three-dimensional design space for data spanned by the dimensions of Volume, Variety, and Velocity, together with a set of monadic standard query operators. Looking at data in the context of this “cube”, we can categorize many data sources according to these three elementary dimensions. For example, a mouse is a database whose (a) volume is infinite, (b) variety is flat, and (c) velocity is push, while the typical document database is (a) finite, (b) nested, and (c) pull-based. In other words, monads provide a mathematical and practical basis for what the industry nowadays calls “big data”.
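The mouse example can be sketched in code as well. The following Python fragment is an illustrative assumption, not the implementation from the talk or from any particular library: `Observable`, `Subject`, `unit`, and `empty` are hypothetical names, and the sketch deliberately ignores completion, errors, and unsubscription. It shows the same kind of monadic operators applied to a push-based source, where the source calls the consumer instead of the consumer pulling from the source.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class Observable(Generic[T]):
    """A push-based source: it calls the observer for every value it produces."""

    def __init__(self, subscribe: Callable[[Callable[[T], None]], None]) -> None:
        self.subscribe = subscribe

    def select_many(self, f: "Callable[[T], Observable[U]]") -> "Observable[U]":
        """Monadic bind: forward every value of every inner observable."""
        def subscribe(observer: Callable[[U], None]) -> None:
            self.subscribe(lambda x: f(x).subscribe(observer))
        return Observable(subscribe)

    def where(self, predicate: Callable[[T], bool]) -> "Observable[T]":
        """Filtering, expressed in terms of select_many, unit, and empty."""
        return self.select_many(lambda x: unit(x) if predicate(x) else empty())

    def select(self, f: Callable[[T], U]) -> "Observable[U]":
        """Projection, expressed in terms of select_many and unit."""
        return self.select_many(lambda x: unit(f(x)))


def unit(value: T) -> Observable[T]:
    """An observable that pushes exactly one value to each subscriber."""
    return Observable(lambda observer: observer(value))


def empty() -> Observable:
    """An observable that never pushes anything."""
    return Observable(lambda observer: None)


class Subject(Observable[T]):
    """Hypothetical stand-in for a mouse: values are pushed in from outside."""

    def __init__(self) -> None:
        self._observers: List[Callable[[T], None]] = []
        super().__init__(self._observers.append)

    def emit(self, value: T) -> None:
        for observer in list(self._observers):
            observer(value)


# Query the (potentially unbounded) stream of mouse events.
mouse: Subject = Subject()
seen = []
(mouse
 .where(lambda e: e["x"] > 100)
 .select(lambda e: (e["x"], e["y"]))
 .subscribe(seen.append))

mouse.emit({"x": 50, "y": 10})
mouse.emit({"x": 150, "y": 20})
mouse.emit({"x": 300, "y": 5})
```

After the three events are emitted, `seen` holds the projected coordinates of the two events whose `x` exceeds 100. The query itself reads the same as it would over a list; only the direction of data flow differs.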

In this talk we will explain how any programmer could have invented this unified model of big data herself and, perhaps even more importantly, how any modern programming language allows you to use these principles to simplify your day-to-day data programmability problems.

Erik Meijer, University of Delft

Biography: Erik Meijer

Erik Meijer is a Dutch computer scientist and entrepreneur. From 2000 to early 2013 he was a software architect at Microsoft, where he headed the Cloud Programmability Team. Before that, he was an associate professor at Utrecht University. He received his Ph.D. from Nijmegen University in 1992.

Meijer's research has included the areas of functional programming (particularly Haskell), compiler implementation, parsing, programming language design, XML, and foreign function interfaces.

His work at Microsoft included C#, Visual Basic, LINQ, Volta, and the reactive programming framework for .NET (Reactive Extensions). He has been involved in over 300 software patent applications, of which 101 have been granted.

In 2009, he was the recipient of the Microsoft Outstanding Technical Leadership Award and in 2007 the Outstanding Technical Achievement Award as a member of the C# team.

Meijer lived in the Netherlands Antilles until the age of 14, when his father retired from his job and the family moved back to the Netherlands.

In 2011 Erik Meijer was appointed part-time professor of Cloud Programming within the Software Engineering Research Group at Delft University of Technology. He is also a member of the ACM Queue Editorial Board.

Twitter: @headinthebox