Cluster management system design for big data infrastructures

Shekhar Gupta

doi:10.4233/uuid:de1d4543-9bbe-4a2f-ac9a-f648f4066d0f

Algorithmics

Research output: Thesis › Dissertation (TU Delft)

Abstract

In recent years, we have seen a major shift in computing systems: data volumes are growing very fast, but the hardware capabilities to store, process, and transfer these massive data are not improving at the same rate. Today, data are generated from a variety of sources, such as social networking websites, business transactions, and the banking sector. These data are valuable and contain vital information, provided they are analyzed efficiently. The processing capabilities of single machines, however, are insufficient, which makes it hard to use them for data analysis. As a result, most web companies, as well as traditional business organizations, research labs, and universities, are scaling out their major computational frameworks to clusters of thousands of machines. To find hidden and interesting insights in the data, complex machine learning algorithms and graph processing, in addition to simple queries, are becoming a common choice in many areas. The problem of collecting, storing, and analyzing these data is now called the Big Data problem.

Original language	English
Awarding Institution	Delft University of Technology
Supervisors/Advisors	Witteveen, C., Supervisor; de Kleer, J., Advisor (external person)
Award date	14 Dec 2016
Print ISBNs	978-94-6186-757-5
DOIs	https://doi.org/10.4233/uuid:de1d4543-9bbe-4a2f-ac9a-f648f4066d0f
Publication status	Published - 2016

Access to Document

10.4233/uuid:de1d4543-9bbe-4a2f-ac9a-f648f4066d0f

shekhar-thesis: Final published version, 3.54 MB

Cite this

@phdthesis{de1d45439bbe4a2fac9af648f4066d0f,

title = "Cluster management system design for big data infrastructures",

abstract = "In recent years, we have seen a major shift in computing systems: data volumes are growing very fast, but hardware capabilities to store, process, and transfer the massive data are not speeding up at the same rate. Today, data are generated from a variety of sources, such as social networking websites, business transactions, banking sectors, etc. These data are valuable and contain lots of vital information if they are analyzed efficiently. The processing capabilities of single machines, however, are not sufficient enough, which makes it harder to use them for data analysis. As a result, most web companies, but also the traditional business organizations, research labs, and universities, are scaling out their major computational frameworks to clusters of thousands of machines. To find the hidden and interesting insights from the data, in addition to simple queries, also complex machine learning algorithms and graphs processing are becoming a common choice in many areas. Nowadays, the problem to collect, store and analyze these data is called the Big Data problem.",

author = "Shekhar Gupta",

year = "2016",

doi = "10.4233/uuid:de1d4543-9bbe-4a2f-ac9a-f648f4066d0f",

language = "English",

isbn = "978-94-6186-757-5",

type = "Dissertation (TU Delft)",

school = "Delft University of Technology",

}

TY - THES

T1 - Cluster management system design for big data infrastructures

AU - Gupta, Shekhar

PY - 2016

Y1 - 2016

N2 - In recent years, we have seen a major shift in computing systems: data volumes are growing very fast, but hardware capabilities to store, process, and transfer the massive data are not speeding up at the same rate. Today, data are generated from a variety of sources, such as social networking websites, business transactions, banking sectors, etc. These data are valuable and contain lots of vital information if they are analyzed efficiently. The processing capabilities of single machines, however, are not sufficient enough, which makes it harder to use them for data analysis. As a result, most web companies, but also the traditional business organizations, research labs, and universities, are scaling out their major computational frameworks to clusters of thousands of machines. To find the hidden and interesting insights from the data, in addition to simple queries, also complex machine learning algorithms and graphs processing are becoming a common choice in many areas. Nowadays, the problem to collect, store and analyze these data is called the Big Data problem.

AB - In recent years, we have seen a major shift in computing systems: data volumes are growing very fast, but hardware capabilities to store, process, and transfer the massive data are not speeding up at the same rate. Today, data are generated from a variety of sources, such as social networking websites, business transactions, banking sectors, etc. These data are valuable and contain lots of vital information if they are analyzed efficiently. The processing capabilities of single machines, however, are not sufficient enough, which makes it harder to use them for data analysis. As a result, most web companies, but also the traditional business organizations, research labs, and universities, are scaling out their major computational frameworks to clusters of thousands of machines. To find the hidden and interesting insights from the data, in addition to simple queries, also complex machine learning algorithms and graphs processing are becoming a common choice in many areas. Nowadays, the problem to collect, store and analyze these data is called the Big Data problem.

UR - http://resolver.tudelft.nl/uuid:de1d4543-9bbe-4a2f-ac9a-f648f4066d0f

U2 - 10.4233/uuid:de1d4543-9bbe-4a2f-ac9a-f648f4066d0f

DO - 10.4233/uuid:de1d4543-9bbe-4a2f-ac9a-f648f4066d0f

M3 - Dissertation (TU Delft)

SN - 978-94-6186-757-5

ER -
