Talk: "Performance Characterization of Apache Spark based Data Analytics on Scale-up Servers"

Posted: 01-10-2015
The HPC (CAP) research group invites you to attend the talk.
Speaker: Ahsan Javed Awan (KTH-UPC/BSC)
Date: Mon, 5/Oct/2015, 10:00
Room: C6-E106
ABSTRACT
With a deluge in the volume and variety of data collected, large-scale web enterprises (such as Yahoo, Facebook, and Google) run big data analytic applications using clusters of commodity servers. However, it has recently been reported that using clusters is a case of over-provisioning, since a majority of analytic jobs do not process huge data sets and modern scale-up servers are adequate to run them. Additionally, commonly used predictive analytics such as machine learning algorithms work on filtered datasets that easily fit into the memory of modern scale-up servers. Therefore, modern scale-up servers are becoming an important processing platform for big data analytics. In this seminar, I will talk about lessons learned from deploying Apache Spark based data analysis workloads on a scale-up server. I will explain why Spark based applications do not scale up and what the bottlenecks are at the application, thread, JVM and micro-architectural levels. We will also discuss potential techniques to improve the single-node performance of Apache Spark.
