Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all comments have the same goal and target audience. In this paper, we investigate how six diverse Java OSS projects use code comments, with the aim of understanding their purpose. Through our analysis, we produce a taxonomy of source code comments comprising 16 classes; subsequently, we investigate how often each category occurs by manually classifying more than 2,000 code comments from the aforementioned projects. In addition, we conduct an initial evaluation of how to automatically classify code comments at line level into our taxonomy using machine learning; initial results are promising and suggest that an accurate classification is within reach.
Original language: English
Title of host publication: MSR 2017 (14th International Conference on Mining Software Repositories)
Number of pages: 11
State: Accepted/In press - 2017

