TY - GEN
T1 - Comparing Generalisation Using Crowd-Sourced vs Expert Labels for Galaxies Classification
AU - Variawa, M. Z.
AU - Van Zyl, T. L.
AU - Woolway, M.
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/11/14
Y1 - 2020/11/14
AB - Increasingly large amounts of data are available to astronomers. The Sloan Digital Sky Survey notes that manually classifying all captured galaxies would be an intractable task, taking on the order of a hundred years to complete. It follows that scientists have turned to crowd-sourced labelling, such as Galaxy Zoo 2, and to machine learning to automate this task. If galaxy classification is to be automated in this way, then it is timely to understand how these results hold up against expert Hubble tuning fork classification schemes such as the academically accepted Revised Shapley-Ames catalogue of bright galaxies. We use the Galaxy Zoo 2 data to train a base ResNet-50 model for transfer learning, which we use to predict the Hubble types of galaxies in the Revised Shapley-Ames catalogue. This result demonstrates how well the Galaxy Zoo 2 responses generalise to an expertly labelled catalogue of galaxies. We also train a ResNet-50 model on the EFIGI catalogue, which contains expertly labelled galaxies, and test it on the Revised Shapley-Ames catalogue to compare against the performance of the model trained on the crowd-sourced Galaxy Zoo 2 labels. Our results show the effectiveness of using transfer learning for galaxy classification. Further, we reinforce that more recent architectures such as ResNet-50 improve on the state of the art in this task. We also demonstrate that an expertly labelled dataset of galaxies is marginally better for training a model to predict Hubble types of galaxies than a non-expert, citizen-science labelled dataset of galaxies. However, the generalisation to the Revised Shapley-Ames catalogue is not sufficient for real-world application, and further investigation is required.
KW - galaxy classification
KW - galaxy zoo
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85100347648&partnerID=8YFLogxK
U2 - 10.1109/ISCMI51676.2020.9311606
DO - 10.1109/ISCMI51676.2020.9311606
M3 - Conference contribution
AN - SCOPUS:85100347648
T3 - 2020 7th International Conference on Soft Computing and Machine Intelligence, ISCMI 2020
SP - 158
EP - 162
BT - 2020 7th International Conference on Soft Computing and Machine Intelligence, ISCMI 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th International Conference on Soft Computing and Machine Intelligence, ISCMI 2020
Y2 - 14 November 2020 through 15 November 2020
ER -