
A Computer Vision Application For Galaxy Detection

Computer Vision is an interdisciplinary field that combines knowledge from disciplines such as Physics, Computer Science, and Electrical Engineering. Its main goal is to develop algorithms and systems capable of reproducing human vision skills. The fields most closely related to computer vision are image processing, image analysis, and machine vision.

The core applications of computer vision have historically been in the healthcare, automotive, and agriculture industries, mostly because of the large investments required to develop and deploy these systems. During the last 8 years, the situation has changed dramatically: the barriers to entry have decreased, and open-source libraries have proliferated. Nowadays, students and professionals from low-income countries have access to a large stack of computer vision resources and can develop high-impact applications in short timescales.

Figure 1: Disciplines related to Computer Vision. Credit: Roberto E. Gonzalez.

Observational astronomy is a division of astronomy that is concerned with recording data about the observable Universe. Ground-based and space telescopes are used nightly to observe planets and distant galaxies. Specialized telescope instruments collect raw data that is stored in remote servers and later processed using several image processing and analysis pipelines.

The usual tasks in the processing of astronomical images are the removal of systematic effects, point source detection, and image enhancement. These tasks are available in applications such as IRAF1 and in libraries such as Astropy2, and they are routinely used by astronomers and engineers.

Nowadays, most of the data acquisition and processing tasks are fully automated. The community has developed several data reduction pipelines and frameworks that are freely available and can be easily used by professionals working at telescopes, universities, and outside academia.

Figure 2: A map showing some of what the Sloan Digital Sky Survey has discovered over the last twenty years. Image Credit: V. Belokurov, M. R. Blanton, A. Bonaca, X. Fan, M. C. Geha, R. H. Lupton, the SDSS Collaboration (https://www.sdss.org)

The Sloan Digital Sky Survey (SDSS3) is the largest astronomical survey ever executed, producing a catalog that contains about 500 million sources. The entire dataset exceeds 100 TB and includes images, spectra, and catalogs covering one-third of the celestial sky. The data reduction and analysis were initially done using customized pipelines developed by astronomers, data scientists, and engineers from several universities and institutes in the USA, and this work was later expanded by professionals from all over the world.


Although data reduction and preparation are mostly done using classical image processing methods, there is still a lot of room for improvement in data analysis and visualization. Computer vision looks like a promising way to facilitate the analysis of big data in astronomy and to accelerate the discovery of structures and phenomena in the Universe. However, this is not an easy task: the introduction of methods and techniques from other fields (interdisciplinarity) is slow and usually lags several years behind the state of the art. Two factors may explain this. First, there are not many interdisciplinary scientists who bring knowledge from other fields. Second, knowledge tends to spread into other fields only after it has become mature and well developed. For example, computer vision techniques developed in Computer Science and associated with machine or deep learning arrive in Astrophysics ~4-5 years after they are developed. This is easy to see by counting the publications related to computer vision, deep learning, or machine learning: there are fewer than 300, and concepts such as Deep Learning, Faster R-CNN, and SSD have only appeared in papers since ~2017-2018.

In this context, the AstroCV4 repository is an invitation to join efforts to reduce this delay in knowledge transfer from Computer Vision to Astrophysics, especially now, given the overwhelming growth of knowledge in Computer Vision and the access to new development frameworks and cheaper GPU computing power.

As part of the AstroCV initiative, we train a galaxy detection and identification model using a state-of-the-art SSD neural network framework (Darknet), and we develop a new data augmentation procedure to make the model robust against images coming from different filters and instruments. The training set is built from the Galaxy Zoo5 database, with galaxies classified as elliptical, spiral, edge-on, or merging. Data augmentation is very important in any model training scenario: it helps to improve the results obtained from small training sets and makes models more reliable under different conditions. In particular, astronomical images are taken in multiple filters and stored in FITS format, with raw CCD data for each pixel. The conversion from FITS to an RGB image is therefore not unique: it depends on the telescope's camera, the band filters, the reduction schema, and the method used to scale photon counts to a color scale.
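
To illustrate why the conversion from photon counts to a color scale is not unique, here is a minimal Python sketch. It is not the AstroCV code: the stretch functions, their parameters (`softening`, `a`), the function names, and the toy pixel values are all assumptions chosen for illustration, and it handles a single band rather than a full RGB image. It maps the same raw counts to 8-bit values with four common scalings, so each object yields several different-looking training examples.

```python
import math

def normalize(counts):
    """Rescale raw counts linearly to the range [0, 1]."""
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0.0 for _ in counts]
    return [(c - lo) / (hi - lo) for c in counts]

# Illustrative stretch functions; each maps [0, 1] onto [0, 1].
def linear(x):
    return x

def sqrt_stretch(x):
    return math.sqrt(x)

def asinh_stretch(x, softening=0.1):
    # 'softening' is an assumed parameter controlling where the curve bends.
    return math.asinh(x / softening) / math.asinh(1.0 / softening)

def log_stretch(x, a=1000.0):
    # 'a' is an assumed contrast parameter.
    return math.log1p(a * x) / math.log1p(a)

STRETCHES = {
    "linear": linear,
    "sqrt": sqrt_stretch,
    "asinh": asinh_stretch,
    "log": log_stretch,
}

def to_8bit(counts, stretch):
    """Convert raw counts to 0-255 pixel values using one stretch."""
    return [round(255 * stretch(x)) for x in normalize(counts)]

def augment(counts):
    """Return one 8-bit version of the same pixels per stretch."""
    return {name: to_8bit(counts, fn) for name, fn in STRETCHES.items()}

# Toy single-band data: faint background plus one bright source.
pixel_counts = [12.0, 15.0, 40.0, 300.0, 5000.0]
variants = augment(pixel_counts)
for name, pixels in variants.items():
    print(name, pixels)
```

In this toy example the faintest and brightest pixels map to 0 and 255 under every stretch, but the intermediate pixels differ widely. The nonlinear stretches lift faint structure that a linear scaling leaves almost black, so a detector trained on only one conversion sees quite different inputs when another one is used.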

We produced a data augmentation schema that applies several color conversion methods to the same objects. Although our training set comes from the SDSS instrument only, this schema leads to an important improvement in detection for images coming from other telescopes and instruments. In Figure 3, we show results for images from SDSS, where the model reaches a recall of 90%. For images taken with different color filters and telescopes, the results are not as good, and recall may drop to as low as 20%. With our data augmentation procedure, we obtain up to 3x better recall. In Figure 4, we show results for an image taken from the Hubble Deep Field.
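
For readers unfamiliar with the metric, recall is the fraction of true galaxies that the detector finds. The sketch below shows one common way to compute it for bounding boxes, assuming that a detection counts as a match when its intersection over union (IoU) with a true box is at least 0.5. The threshold, the greedy matching, the function names, and the toy boxes are illustrative assumptions, not the evaluation code behind the figures quoted above.

```python
def area(box):
    """Area of a box given as (x_min, y_min, x_max, y_max)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

def iou(a, b):
    """Intersection over union of two boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def recall(truth, detections, threshold=0.5):
    """Fraction of true boxes matched by a distinct detection."""
    unused = list(detections)
    found = 0
    for t in truth:
        # Greedy matching: take the unused detection that overlaps most.
        best = max(unused, key=lambda d: iou(t, d), default=None)
        if best is not None and iou(t, best) >= threshold:
            unused.remove(best)  # each detection can match only once
            found += 1
    return found / len(truth) if truth else 0.0

# Toy data: five true galaxies and two hypothetical sets of detections.
truth = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50),
         (60, 60, 70, 70), (80, 80, 90, 90)]
detections_plain = [(1, 1, 10, 10), (100, 100, 110, 110)]
detections_augmented = [(1, 1, 10, 10), (20, 20, 31, 31), (41, 41, 50, 50)]

print("recall without augmentation:", recall(truth, detections_plain))
print("recall with augmentation:", recall(truth, detections_augmented))
```

The toy boxes are constructed so that one of five galaxies is recovered in the first case and three of five in the second, mirroring the scale of the 20% recall and 3x improvement mentioned in the text. They are not measured results.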

Figure 3: Galaxies found using our model in a typical SDSS image. Credit: Roberto Gonzalez
Figure 4: Galaxies found in a Hubble Deep Field image using data augmentation (without data augmentation, we could find only one third of the galaxies). Credit: Roberto Gonzalez

Roberto Gonzalez and Roberto Muñoz are former astronomers who moved to the computer vision industry at the Chilean company MetricArts6, so knowledge transfer between Astrophysics, Computer Science, and industry has become part of their daily work. They think that interdisciplinarity and collaboration between the technology industry and academia are fundamental to leading in the Computer Vision and AI fields. However, this requires a change of thinking in both traditional academia and traditional industry, where interdisciplinarity and knowledge transfer are given little value, especially in less developed countries.

These findings are described in the article entitled Galaxy detection and identification using deep learning and data augmentation, recently published in the journal Astronomy and Computing.

References:

  1. Image Reduction and Analysis Facility http://iraf.noao.edu/
  2. http://www.astropy.org/
  3. https://www.sdss.org/
  4. https://github.com/astroCV
  5. https://www.galaxyzoo.org
  6. www.metricarts.com
