Mastering Spark with R: The Complete Guide to Large-Scale Analysis and Modeling

AUTHOR Ruiz, Edgar; Kuo, Kevin; Luraschi, Javier
PUBLISHER O'Reilly Media (11/19/2019)
PRODUCT TYPE Paperback

Description

If you're like most R users, you have a deep knowledge and love of statistics. But as your organization continues to collect huge amounts of data, adding tools such as Apache Spark makes a lot of sense. With this practical book, data scientists and professionals working with large-scale data applications will learn how to use Spark from R to tackle big data and big compute problems.

Authors Javier Luraschi, Kevin Kuo, and Edgar Ruiz show you how to use R with Spark to solve different data analysis problems. This book covers relevant data science topics, cluster computing, and issues that should interest even the most advanced users.

  • Analyze, explore, transform, and visualize data in Apache Spark with R
  • Create statistical models to extract information and predict outcomes; automate the process in production-ready workflows
  • Perform analysis and modeling across many machines using distributed computing techniques
  • Use large-scale data from multiple sources and different formats with ease from within Spark
  • Learn about alternative modeling frameworks for graph processing, geospatial analysis, and genomics at scale
  • Dive into advanced topics including custom transformations, real-time data processing, and creating custom Spark extensions
Product Details
ISBN-13: 9781492046370
ISBN-10: 149204637X
Binding: Paperback or Softback (Trade Paperback (US))
Content Language: English
More Product Details
Page Count: 293
Carton Quantity: 13
Product Dimensions: 7.00 x 0.62 x 9.19 inches
Weight: 1.05 pound(s)
Feature Codes: Bibliography, Index, Price on Product
Country of Origin: US
Subject Information
BISAC Categories
Computers | Data Science - Data Analytics
Computers | Languages - General
Computers | Computer Engineering
Dewey Decimal: 006.312
Library of Congress Control Number: 2020276781
List Price $55.99
Your Price  $55.43
Paperback