GOTO is a vendor-independent international software development conference with more than 90 top speakers and 1300 attendees. The conference covers topics such as .Net, Java, Open Source, Agile, Architecture and Design, Web, Cloud, New Languages and Processes.
Artem Aliev, Software Engineer at DataStax
Biography: Artem Aliev
Artem Aliev is a software developer on the DataStax Enterprise Analytics team. He works on integrating the Apache Cassandra NoSQL database with analytics solutions such as Spark and Hive. Before that he worked as a Big Data Solution Architect, as a developer of the Apache Harmony J2SE implementation, and as lead of the performance optimisation team for enterprise storage software at EMC Corporation. So he can talk about the big data processing pipeline: from data on disks to machine learning and visualisation.
Twitter: @__ali
Presentation: Solving classical data analytics tasks using modern distributed databases
- Apache Spark benefits, architecture and Scala API. (Don't be afraid of Scala, we are here to help you)
- Load data from and store data to the Cassandra NoSQL database
- Data enrichments and joins
- Spark Machine learning and graph algorithms
Workshop: Intro to Apache Spark
This one-day session features a mix of hands-on technical exercises, brief lectures, demos, and case studies – structured to get developers up to speed on leveraging Apache Spark for a range of use cases.
Topics:
- Overview of Big Data and Spark
- Installing Spark Locally
- Using Spark’s Core APIs in Scala, Java, & Python
- Building Spark Applications
- Deploying on a Big Data Cluster
- Combining SQL, Machine Learning, and Streaming for Unified Pipelines
Target Audience:
This class is intended for developers who have some background developing apps in Java, Python, or Scala, but are not already familiar with Spark.