Presentation: "Big Data in the Browser - Live Coding Terabytes in a Notebook"
The web browser has become the de facto user interface for pretty much anything that isn't consumed on a smartphone. Yet as software engineers, we are more inclined to work with text editors and operating system shells, because these allow us to write code instead of clicking on things. Here's news for you: you can do that in the browser too! Plus, the browser can directly show interactive visualisations and neatly laid out output, and it lets you annotate code with comments written in markup. Put all of this on top of Hadoop and you can interactively code against big data sets to extract information and create visualisations on the fly.
In this talk and live coding session, we'll show you how to use the popular browser-based notebook environment Jupyter on top of Hadoop/Spark to process large data sets with Scala or Python and produce data visualisations and interactive reports. Examples include performing web analytics on clickstream data and mining Stack Overflow data for trending topics.