Algorithmic Aspects of Parallel Data Processing
| AUTHOR | Saliho?lu, Semih; Salihoglu, Semih; Koutris, Paraschos et al. |
| PUBLISHER | Now Publishers (02/22/2018) |
| PRODUCT TYPE | Paperback (Paperback) |
Description
The last decade has seen a huge and growing interest in processing large data sets on large distributed clusters. This trend began with the MapReduce framework, and has been widely adopted by several other systems, including PigLatin, Hive, Scope, Dremmel, Spark and Myria to name a few. While the applications of such systems are diverse (for example, machine learning, data analytics), most involve relatively standard data processing tasks like identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results. This has generated great interest in the study of algorithms for data processing on large distributed clusters. Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing. It uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, which is a simplification of the BSP model where the only cost is given by the amount of communication and the number of communication rounds. The survey studies several algorithms for multi-join queries, sorting, and matrix multiplication. It discusses their relationships and common techniques applied across the different data processing tasks.
Show More
Product Format
Product Details
ISBN-13:
9781680834062
ISBN-10:
1680834061
Binding:
Paperback or Softback (Trade Paperback (Us))
Content Language:
English
More Product Details
Page Count:
144
Carton Quantity:
54
Product Dimensions:
6.14 x 0.31 x 9.21 inches
Weight:
0.47 pound(s)
Country of Origin:
US
Subject Information
BISAC Categories
Computers | Database Administration & Management
Computers | Computer Science
Descriptions, Reviews, Etc.
publisher marketing
The last decade has seen a huge and growing interest in processing large data sets on large distributed clusters. This trend began with the MapReduce framework, and has been widely adopted by several other systems, including PigLatin, Hive, Scope, Dremmel, Spark and Myria to name a few. While the applications of such systems are diverse (for example, machine learning, data analytics), most involve relatively standard data processing tasks like identifying relevant data, cleaning, filtering, joining, grouping, transforming, extracting features, and evaluating results. This has generated great interest in the study of algorithms for data processing on large distributed clusters. Algorithmic Aspects of Parallel Data Processing discusses recent algorithmic developments for distributed data processing. It uses a theoretical model of parallel processing called the Massively Parallel Computation (MPC) model, which is a simplification of the BSP model where the only cost is given by the amount of communication and the number of communication rounds. The survey studies several algorithms for multi-join queries, sorting, and matrix multiplication. It discusses their relationships and common techniques applied across the different data processing tasks.
Show More
List Price $95.00
Your Price
$94.05
