Standard

Big data software analytics with Apache Spark. / Gousios, Georgios.

Proceedings of the 40th International Conference on Software Engineering, ICSE '18: Companion Proceedings. Vol. Part F137351 New York, NY : Association for Computing Machinery (ACM), 2018. p. 542-543.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

Harvard

Gousios, G 2018, Big data software analytics with Apache Spark. in Proceedings of the 40th International Conference on Software Engineering, ICSE '18: Companion Proceedings. vol. Part F137351, Association for Computing Machinery (ACM), New York, NY, pp. 542-543, ICSE 2018, Gothenburg, Sweden, 27/05/18. https://doi.org/10.1145/3183440.3183458

APA

Gousios, G. (2018). Big data software analytics with Apache Spark. In Proceedings of the 40th International Conference on Software Engineering, ICSE '18: Companion Proceedings (Vol. Part F137351, pp. 542-543). New York, NY: Association for Computing Machinery (ACM). https://doi.org/10.1145/3183440.3183458

Vancouver

Gousios G. Big data software analytics with Apache Spark. In Proceedings of the 40th International Conference on Software Engineering, ICSE '18: Companion Proceedings. Vol. Part F137351. New York, NY: Association for Computing Machinery (ACM). 2018. p. 542-543 https://doi.org/10.1145/3183440.3183458

Author

Gousios, Georgios. / Big data software analytics with Apache Spark. Proceedings of the 40th International Conference on Software Engineering, ICSE '18: Companion Proceedings. Vol. Part F137351 New York, NY : Association for Computing Machinery (ACM), 2018. pp. 542-543

BibTeX

@inproceedings{b074ba6632c0415eb694ad0f632aa27c,
title = "Big data software analytics with Apache Spark",
abstract = "At the beginning of every research effort, researchers in empirical software engineering have to go through the processes of extracting data from raw data sources and transforming them to what their tools expect as inputs. This step is time consuming and error prone, while the produced artifacts (code, intermediate datasets) are usually not of scientific value. In the recent years, Apache Spark has emerged as a solid foundation for data science and has taken the big data analytics domain by storm. We believe that the primitives exposed by Apache Spark can help software engineering researchers create and share reproducible, high-performance data analysis pipelines. In our technical briefing, we discuss how researchers can profit from Apache Spark, through a hands-on case study.",
keywords = "Apache Spark, Big data, Data analytics",
author = "Georgios Gousios",
year = "2018",
doi = "10.1145/3183440.3183458",
language = "English",
volume = "Part F137351",
pages = "542--543",
booktitle = "Proceedings of the 40th International Conference on Software Engineering, ICSE '18",
publisher = "Association for Computing Machinery (ACM)",
address = "United States",

}

RIS

TY - GEN

T1 - Big data software analytics with Apache Spark

AU - Gousios, Georgios

PY - 2018

Y1 - 2018

N2 - At the beginning of every research effort, researchers in empirical software engineering have to go through the processes of extracting data from raw data sources and transforming them to what their tools expect as inputs. This step is time consuming and error prone, while the produced artifacts (code, intermediate datasets) are usually not of scientific value. In the recent years, Apache Spark has emerged as a solid foundation for data science and has taken the big data analytics domain by storm. We believe that the primitives exposed by Apache Spark can help software engineering researchers create and share reproducible, high-performance data analysis pipelines. In our technical briefing, we discuss how researchers can profit from Apache Spark, through a hands-on case study.

AB - At the beginning of every research effort, researchers in empirical software engineering have to go through the processes of extracting data from raw data sources and transforming them to what their tools expect as inputs. This step is time consuming and error prone, while the produced artifacts (code, intermediate datasets) are usually not of scientific value. In the recent years, Apache Spark has emerged as a solid foundation for data science and has taken the big data analytics domain by storm. We believe that the primitives exposed by Apache Spark can help software engineering researchers create and share reproducible, high-performance data analysis pipelines. In our technical briefing, we discuss how researchers can profit from Apache Spark, through a hands-on case study.

KW - Apache Spark

KW - Big data

KW - Data analytics

UR - http://www.scopus.com/inward/record.url?scp=85049675827&partnerID=8YFLogxK

U2 - 10.1145/3183440.3183458

DO - 10.1145/3183440.3183458

M3 - Conference contribution

VL - Part F137351

SP - 542

EP - 543

BT - Proceedings of the 40th International Conference on Software Engineering, ICSE '18

PB - Association for Computing Machinery (ACM)

CY - New York, NY

ER -
