2:45 PM Sunday Room: VPA-115
Today, Microsoft Azure allows us to deploy a functional Apache Spark cluster in minutes. That means we can focus on analyzing vast amounts of information and applying machine learning algorithms on that data to develop valuable predictions. In this session, we will pick a fairly sized machine learning competition from kaggle.com, create a Spark machine learning pipeline to preprocess data and make predictions, as well as evaluate different algorithms against each other. We will also touch on several several integration points between Spark and Azure that make Spark on Azure a good choice for data analytics in the cloud.