Comparing Generalisation Using Crowd-Sourced vs Expert Labels for Galaxies Classification

M. Z. Variawa, T. L. Van Zyl, M. Woolway

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Increasingly large amounts of data are available to astronomers. The Sloan Digital Sky Survey notes that manually classifying all captured galaxies would be an intractable task, taking on the order of a hundred years to complete. Scientists have therefore turned to crowd-sourced labelling, such as Galaxy Zoo 2, and to machine learning to automate this task. If galaxy classification is to be automated in this way, it is timely to understand how these results hold up against expert Hubble tuning fork classification schemes such as the academically accepted Revised Shapley-Ames catalogue of bright galaxies. We use the Galaxy Zoo 2 data to train a base ResNet-50 model for transfer learning, which we then use to predict the Hubble types of galaxies in the Revised Shapley-Ames catalogue. This result demonstrates how well the Galaxy Zoo 2 responses generalise to an expertly labelled catalogue of galaxies. We also train a ResNet-50 model on the EFIGI catalogue, which contains expertly labelled galaxies, and test it on the Revised Shapley-Ames catalogue to compare against the model trained on the crowd-sourced Galaxy Zoo 2 labels. Our results show the effectiveness of transfer learning for galaxy classification. Further, we reinforce that more recent architectures such as ResNet-50 improve on the state of the art in this task. We also demonstrate that an expertly labelled dataset of galaxies is marginally better for training a model to predict Hubble types than a non-expert, citizen-science labelled dataset. However, generalisation to the Revised Shapley-Ames catalogue is not sufficient for real-world application, and further investigation is required.
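
The comparison described in the abstract scores two models, one trained on crowd-sourced labels and one on expert labels, against the same expertly labelled test catalogue. The following is a minimal sketch of that scoring step only, assuming plain accuracy as the metric (the abstract does not state which metric the authors used). The function names, the model names, and the toy label lists are hypothetical and are not results from the paper.

```python
def accuracy(predicted, expert):
    """Fraction of galaxies whose predicted Hubble type matches the expert label."""
    if len(predicted) != len(expert):
        raise ValueError("prediction and label lists must have equal length")
    if not expert:
        raise ValueError("empty test set")
    correct = sum(p == e for p, e in zip(predicted, expert))
    return correct / len(expert)


def compare_sources(predictions_by_source, expert_labels):
    """Score each training source against the same held-out expert labels."""
    return {
        source: accuracy(preds, expert_labels)
        for source, preds in predictions_by_source.items()
    }


# Made-up labels that only exercise the functions; NOT data from the paper.
toy_expert_labels = ["E", "S0", "Sa", "Sb", "Sc", "SBb"]
toy_predictions = {
    "crowd_sourced_model": ["E", "S0", "Sb", "Sb", "Sc", "Sb"],
    "expert_model": ["E", "S0", "Sa", "Sb", "Sc", "Sb"],
}

scores = compare_sources(toy_predictions, toy_expert_labels)
for source, score in scores.items():
    print(f"{source}: {score:.2f}")
```

Holding the test catalogue fixed for both models is what makes the two scores comparable: any difference then reflects the training labels rather than the evaluation set.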

Original language: English
Title of host publication: 2020 7th International Conference on Soft Computing and Machine Intelligence, ISCMI 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 158-162
Number of pages: 5
ISBN (Electronic): 9781728175591
Publication status: Published - 14 Nov 2020
Event: 7th International Conference on Soft Computing and Machine Intelligence, ISCMI 2020 - Virtual, Stockholm, Sweden
Duration: 14 Nov 2020 to 15 Nov 2020

Publication series

Name: 2020 7th International Conference on Soft Computing and Machine Intelligence, ISCMI 2020

Conference

Conference: 7th International Conference on Soft Computing and Machine Intelligence, ISCMI 2020
Country/Territory: Sweden
City: Virtual, Stockholm
Period: 14/11/20 to 15/11/20

Keywords

  • galaxy classification
  • galaxy zoo
  • transfer learning

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Computational Mathematics
  • Modeling and Simulation
  • Numerical Analysis
