Using R for Big Data with Spark - Training DVD

by O'Reilly Media
New from: US $49.99
Shipping: see website
Prices may incl. VAT *
Last refresh Feb/17/2018 03:09 PM
EAN: 5055197638265
Brand: O'Reilly Media
Number of Videos: 2.5 hours - 20 lessons
Ships on: DVD-ROM
User Level: Intermediate
Data analysts familiar with R will learn to leverage the power of Spark, distributed computing and cloud storage in this course that shows you how to use your R skills in a big data environment.

You'll learn to create Spark clusters on the Amazon Web Services (AWS) platform; perform cluster-based data modeling using Gaussian generalized linear models, binomial generalized linear models, Naive Bayes, and K-means modeling; access data from S3, Spark DataFrames, and other sources like CSV, JSON, and HDFS; and do cluster-based data manipulation with tools like SparkR and SparkSQL. By course end, you'll be capable of working with data sets too large to process on a single computer. This hands-on class requires each learner to set up their own extremely low-cost, easily terminated AWS account.

Discover how to use your R skills in a big data distributed cloud computing cluster environment
Gain hands-on experience setting up Spark clusters on Amazon's AWS cloud services platform
Understand how to control a cloud instance on AWS using SSH or PuTTY
Explore basic distributed modeling techniques like GLM, Naive Bayes, and K-means
Learn to do cloud-based data manipulation and processing using SparkR and SparkSQL
Understand how to access data in CSV and JSON formats and from HDFS and S3

Manuel Amunategui is a data science practitioner, consultant, teacher, and author with 16+ years of data science experience. A former quantitative analyst for a Wall Street brokerage firm, he now serves as the lead data scientist for Providence Health & Services in Portland, Oregon. In his free time, Manuel does competitive data modeling on Kaggle.com, CrowdANALYTIX.com, Datascience.net, and DrivenData.org.

* Prices and shipping costs may have changed since the last update. It is not technically possible to update prices in real time. The price shown on the seller's website at the time of purchase is used as the reference.