Abstract
Git repositories are an important source of empirical software engineering product and process data. Running the Git command-line tool and processing its output with other Unix tools allows the incremental construction of sophisticated data processing pipelines. Git data analytics on the command line can be presented systematically through a pattern that involves fetching, selection, processing, summarization, and reporting. For each part of the processing pipeline, we examine the tools and techniques that can be used most effectively to perform the task at hand. The presented techniques can be easily applied, first to get a feel for the version control repository data at hand and then to extract empirical results.
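As a rough illustration of the pattern named in the abstract, the following sketch counts commits per author in a repository. It is not taken from the paper: the function name `commits_per_author`, the choice of metric, and the top-ten cutoff are assumptions made for this example only.

```shell
#!/bin/sh
# Minimal sketch of a fetch/select/process/summarize/report pipeline.
# commits_per_author is a hypothetical helper name chosen for this example.
#
# Stages, in pipeline order:
#   git log --format='%an'  fetching and selection: read the history reachable
#                           from HEAD and keep only each commit's author name
#   sort                    processing: bring identical names together
#   uniq -c                 summarization: count the commits for each name
#   sort -rn | head -n 10   reporting: most active authors first, top ten only
commits_per_author() {
  repo=${1:-.}   # repository path; defaults to the current directory
  git -C "$repo" log --format='%an' |
    sort |
    uniq -c |
    sort -rn |
    head -n 10
}
```

Calling `commits_per_author /path/to/repo` prints one line per author, with the commit count followed by the author name. Because each stage is a separate filter, a stage can be swapped or extended (for example, selecting `%ae` instead of `%an` to group by e-mail address) without touching the others.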
Original language | English |
---|---|
Title of host publication | Proceedings of the 40th International Conference on Software Engineering, ICSE '18 |
Subtitle of host publication | Companion Proceedings |
Place of Publication | New York, NY |
Publisher | Association for Computing Machinery (ACM) |
Pages | 540-541 |
Number of pages | 2 |
Volume | Part F137351 |
ISBN (Electronic) | 978-1-4503-5663-3 |
Publication status | Published - 2018 |
Event | ICSE 2018: 40th International Conference on Software Engineering - Gothenburg, Sweden. Duration: 27 May 2018 → 3 Jun 2018. Conference number: 40. https://www.icse2018.org/ |
Conference
Conference | ICSE 2018 |
---|---|
Country/Territory | Sweden |
City | Gothenburg |
Period | 27/05/18 → 03/06/18 |
Keywords
- Command-line tools
- Data analytics
- Empirical software engineering
- Git
- Pipes and filters