The application of data mining in astronomical surveys, such as the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) survey, provides an effective approach to automatically analyzing a large amount of complex survey data. Unsupervised clustering can help astronomers find associations and outliers in a big data set. In this paper, we employ the k-means method to cluster the line indices of LAMOST spectra with the software AstroStat. The line index approach is an effective way to extract spectral features from low resolution spectra, and these features can represent the main spectral characteristics of stars. A total of 144 340 line indices for A type stars are analyzed by calculating the intra- and inter-distances between pairs of stars. For the intra-distance, we use the definition of Mahalanobis distance to explore the degree of clustering for each class, while for outlier detection, we define a local outlier factor for each spectrum. AstroStat furnishes a set of visualization tools for illustrating the analysis results. Checking the spectra detected as outliers, we find that most of them are problematic data and only a few correspond to rare astronomical objects. We show two examples of these outliers: a spectrum with an abnormal continuum and a spectrum with emission lines. Our work demonstrates that line index clustering is a good method for examining data quality and identifying rare objects.
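The clustering and distance steps described in the abstract can be illustrated with a minimal sketch. It is not AstroStat's interface and not the authors' pipeline: it assumes synthetic data with two hypothetical line indices per star, uses a plain Lloyd's k-means written in standard-library Python, and uses the Mahalanobis distance of each point from its own cluster as a simple isolation score. The local outlier factor used in the paper is not reproduced here. All function names (`nearest`, `mean_2d`, `kmeans`, `mahalanobis_2d`) are made up for illustration.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: (point[0] - centers[j][0]) ** 2 + (point[1] - centers[j][1]) ** 2,
    )


def mean_2d(members):
    """Component-wise mean of a non-empty list of (x, y) tuples."""
    n = len(members)
    return (sum(m[0] for m in members) / n, sum(m[1] for m in members) / n)


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's k-means on 2-D points; returns (centers, labels)."""
    centers = random.Random(seed).sample(points, k)  # random initial centers
    for _ in range(n_iter):
        labels = [nearest(p, centers) for p in points]  # assignment step
        updated = []
        for j in range(k):  # update step: move each center to its cluster mean
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(mean_2d(members) if members else centers[j])
        if updated == centers:  # assignments no longer change the centers
            break
        centers = updated
    return centers, [nearest(p, centers) for p in points]


def mahalanobis_2d(point, members):
    """Mahalanobis distance of `point` from the sample mean and covariance of `members`."""
    n = len(members)
    mx, my = mean_2d(members)
    sxx = sum((m[0] - mx) ** 2 for m in members) / (n - 1)
    syy = sum((m[1] - my) ** 2 for m in members) / (n - 1)
    sxy = sum((m[0] - mx) * (m[1] - my) for m in members) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    dx, dy = point[0] - mx, point[1] - my
    # Quadratic form with the closed-form inverse of the 2x2 covariance matrix.
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


# Synthetic stand-in for line-index measurements: two compact groups plus one odd point.
rng = random.Random(42)
blob_a = [(rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)) for _ in range(50)]
blob_b = [(rng.gauss(5.0, 0.3), rng.gauss(5.0, 0.3)) for _ in range(50)]
points = blob_a + blob_b + [(0.0, 3.0)]  # last point plays the role of an outlier

centers, labels = kmeans(points, k=2)

# Score every point by its Mahalanobis distance from the cluster it was assigned to.
scores = []
for p, lab in zip(points, labels):
    members = [q for q, other in zip(points, labels) if other == lab]
    scores.append(mahalanobis_2d(p, members))

most_isolated = max(range(len(points)), key=lambda i: scores[i])
print("cluster centers:", centers)
print("most isolated point:", points[most_isolated])
```

In this toy setup the point with the largest score is the natural candidate for visual inspection, mirroring the abstract's workflow of checking flagged spectra by eye. With real line indices there would be more than two features per spectrum, so a general covariance inverse would be needed instead of the 2x2 closed form used here.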
Funding: partially supported by the National Key Basic Research Program of China (2014CB845700), the China Postdoctoral Science Foundation (2016M600850), the National Natural Science Foundation of China (No. 11443006), and the Joint Research Fund in Astronomy (Nos. U1531244 and U1631236).
Funding: supported by the National Key Research and Development Program of China (Grant No. 2016YFE0100300), the Joint Research Fund in Astronomy (Grant Nos. U1531132, U1631129 and U1231205) under cooperative agreement between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS), and the National Natural Science Foundation of China (Grant Nos. 11603044, 11703044, 11503042, 11403009 and 11463003).