Conference: "Performance Characterization of Apache Spark based Data Analytics on Scale-up Servers"

Masters in Computer Science and Engineering

FACULTAT D'INFORMÀTICA
DE BARCELONA

Posted: 01-10-2015

The HPC (CAP) research group invites you to attend the following talk.
Speaker: Ahsan Javed Awan (KTH-UPC/BSC)
Date: Mon, 5/Oct/2015, 10:00
Room: C6-E106

ABSTRACT
With a deluge in the volume and variety of data collected, large-scale web enterprises (such as Yahoo, Facebook, and Google) run big data analytics applications on clusters of commodity servers. However, it has recently been reported that using clusters is a case of over-provisioning, since the majority of analytics jobs do not process huge data sets and modern scale-up servers are adequate to run them. Additionally, commonly used predictive analytics such as machine learning algorithms work on filtered datasets that easily fit into the memory of modern scale-up servers. Therefore, modern scale-up servers are becoming an important processing platform for big data analytics. In this seminar, I will talk about lessons learned from deploying Apache Spark based data analysis workloads on a scale-up server. I will explain why Spark based applications do not scale up and what the bottlenecks are at the application, thread, JVM, and micro-architectural levels. We will also discuss potential techniques to improve the single-node performance of Apache Spark.
