Welcome to astroNN’s documentation!
astroNN is a python package for building various kinds of neural networks with targeted applications in astronomy, using the Keras API for model and training prototyping while taking advantage of Tensorflow's flexibility.
For non-astronomy applications, astroNN contains custom loss functions and layers which are compatible with Tensorflow. The custom loss functions are mostly designed to deal with incomplete labels. astroNN also contains a demo of implementing a Bayesian Neural Net with Dropout Variational Inference, from which you can get reasonable uncertainty estimation, as well as other neural nets.
For astronomy applications, astroNN contains some tools to deal with APOGEE, Gaia and LAMOST data. astroNN is mainly designed to apply neural nets to APOGEE spectra analysis and to predict luminosity from spectra using data from Gaia parallax, with reasonable uncertainty from a Bayesian Neural Net. Generally, astroNN can handle 2D and 2D colored images too. Currently astroNN is a python package being developed by the main author to facilitate his research project on deep learning applications in stellar and galactic astronomy using SDSS APOGEE, Gaia and LAMOST data.
For learning purposes, astroNN includes a deep learning toy dataset for astronomers - the Galaxy10 DECals Dataset.
Indices, tables and astroNN structure
astroNN/
├── apogee/
│ ├── apogee_shared.py [shared codes across apogee module]
│ ├── chips.py [functions to deal with apogee detectors and spectra]
│ ├── downloader.py [functions to download apogee data]
│ └── plotting.py [functions to plot apogee data]
├── data/
│ └── ... [multiple pre-compiled data in numpy format]
├── datasets/
│ ├── apogee_distances.py
│ ├── apogee_rc.py
│ ├── apokasc.py
│ ├── galaxy10.py [astroNN's galaxy10 related codes]
│ ├── h5.py
│ └── xmatch.py [coordinates cross matching]
├── gaia/
│ ├── downloader.py [functions to download gaia data]
│ └── gaia_shared.py [functions related to astrometry and magnitude]
├── lamost/
│ ├── chips.py [functions to deal with lamost detectors and spectra]
│ └── lamost_shared.py [shared codes across lamost module]
├── models/ [contains neural network models]
│ └── ... [NN models codes and modules]
├── nn/
│ ├── callbacks.py [Keras's callbacks]
│ ├── layers.py [Tensorflow layers]
│ ├── losses.py [Tensorflow losses]
│ ├── metrics.py [Tensorflow metrics]
│ └── numpy.py [handy numpy implementation of NN tools]
└── shared/ [shared codes across modules]
Galaxy10 DECals Dataset
Welcome! Galaxy10 DECals is a much improved version of our original Galaxy10. The source code is here: https://github.com/henrysky/Galaxy10
The original Galaxy10 dataset was created with Galaxy Zoo (GZ) Data Release 2, in which volunteers classified ~270k SDSS galaxy images; ~22k of those images were selected into 10 broad classes using volunteer votes. GZ later utilized images from the DESI Legacy Imaging Surveys (DECals) with much better resolution and image quality.
Galaxy10 DECals combines all three results (GZ DR2 with DECals images instead of SDSS images, and DECals campaigns ab and c), covering ~441k unique galaxies in DECals, of which ~18k images were selected into 10 broad classes using volunteer votes with more rigorous filtering. The 10 broad classes of Galaxy10 DECals were also tweaked a bit so that each class is more distinct from the others, and the Edge-on Disk with Boxy Bulge class, which had only 17 images in the original Galaxy10, was abandoned. The source code for this dataset is released under this repository, so you are welcome to play around with it if you like; otherwise you can use the compiled Galaxy10 DECals with the download link below.
Download Galaxy10 DECals
Galaxy10_DECals.h5: https://www.astro.utoronto.ca/~hleung/shared/Galaxy10/Galaxy10_DECals.h5
SHA256: 19AEFC477C41BB7F77FF07599A6B82A038DC042F889A111B0D4D98BB755C1571
Size: 2.54 GB
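To verify the download against the checksum above, here is a plain Python sketch using only the standard library (it assumes the file sits in the current directory):

import hashlib

# compute the SHA256 of the downloaded file in chunks to keep memory usage low
sha256 = hashlib.sha256()
with open('Galaxy10_DECals.h5', 'rb') as f:
    for chunk in iter(lambda: f.read(1 << 20), b''):
        sha256.update(chunk)

print(sha256.hexdigest().upper())  # should match the SHA256 listed above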
Introduction
Galaxy10 DECals is a dataset containing 17736 256x256-pixel colored galaxy images (g, r and z band) separated into 10 classes.
Galaxy10_DECals.h5 has the columns images (with shape (17736, 256, 256, 3)), ans, ra, dec, redshift and pxscale (in units of arcseconds per pixel).
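If you want the non-image columns as well, a minimal h5py sketch (assuming the compiled Galaxy10_DECals.h5 is in the current directory and the column names are exactly as listed above):

import h5py
import numpy as np

with h5py.File('Galaxy10_DECals.h5', 'r') as F:
    ra = np.array(F['ra'])              # sky coordinates of each galaxy
    dec = np.array(F['dec'])
    redshift = np.array(F['redshift'])
    pxscale = np.array(F['pxscale'])    # arcseconds per pixel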
Galaxy10 DECals images come from DESI Legacy Imaging Surveys and labels come from Galaxy Zoo.
Galaxy10 dataset (17736 images)
├── Class 0 (1081 images): Disturbed Galaxies
├── Class 1 (1853 images): Merging Galaxies
├── Class 2 (2645 images): Round Smooth Galaxies
├── Class 3 (2027 images): In-between Round Smooth Galaxies
├── Class 4 ( 334 images): Cigar Shaped Smooth Galaxies
├── Class 5 (2043 images): Barred Spiral Galaxies
├── Class 6 (1829 images): Unbarred Tight Spiral Galaxies
├── Class 7 (2628 images): Unbarred Loose Spiral Galaxies
├── Class 8 (1423 images): Edge-on Galaxies without Bulge
└── Class 9 (1873 images): Edge-on Galaxies with Bulge
For more information on the original Galaxy Zoo 2 classification tree: Galaxy Zoo Decision Tree

Load with astroNN
from astroNN.datasets import load_galaxy10
from tensorflow.keras import utils
import numpy as np

# To load images and labels (will download automatically at the first time)
# First time downloading location will be ~/.astroNN/datasets/
images, labels = load_galaxy10()

# To convert the labels to categorical 10 classes
labels = utils.to_categorical(labels, 10)

# To convert to desirable type
labels = labels.astype(np.float32)
images = images.astype(np.float32)
OR Load with Python & h5py
You should download Galaxy10_DECals.h5 first, open python at the same location, and run the following to open it:
import h5py
import numpy as np
from tensorflow.keras import utils

# To get the images and labels from file
with h5py.File('Galaxy10_DECals.h5', 'r') as F:
    images = np.array(F['images'])
    labels = np.array(F['ans'])

# To convert the labels to categorical 10 classes
labels = utils.to_categorical(labels, 10)

# To convert to desirable type
labels = labels.astype(np.float32)
images = images.astype(np.float32)
Split into train and test set
import numpy as np
from sklearn.model_selection import train_test_split

train_idx, test_idx = train_test_split(np.arange(labels.shape[0]), test_size=0.1)
train_images, train_labels, test_images, test_labels = images[train_idx], labels[train_idx], images[test_idx], labels[test_idx]
Lookup Galaxy10 Class
You can look up the name corresponding to a Galaxy10 class number by

from astroNN.datasets.galaxy10 import galaxy10cls_lookup

# pass a class number (0-9) to get the class name back
galaxy10cls_lookup(0)
Acknowledgments
For astroNN acknowledgment, please refer to Acknowledging astroNN
Galaxy10 dataset classification labels come from Galaxy Zoo
Galaxy10 dataset images come from DESI Legacy Imaging Surveys
Galaxy Zoo is described in Lintott et al. 2008, the GalaxyZoo Data Release 2 is described in Lintott et al. 2011, Galaxy Zoo DECals Campaign is described in Walmsley M. et al. 2021, DESI Legacy Imaging Surveys is described in Dey A. et al., 2019
The Legacy Surveys consist of three individual and complementary projects: the Dark Energy Camera Legacy Survey (DECaLS; Proposal ID #2014B-0404; PIs: David Schlegel and Arjun Dey), the Beijing-Arizona Sky Survey (BASS; NOAO Prop. ID #2015A-0801; PIs: Zhou Xu and Xiaohui Fan), and the Mayall z-band Legacy Survey (MzLS; Prop. ID #2016A-0453; PI: Arjun Dey). DECaLS, BASS and MzLS together include data obtained, respectively, at the Blanco telescope, Cerro Tololo Inter-American Observatory, NSF’s NOIRLab; the Bok telescope, Steward Observatory, University of Arizona; and the Mayall telescope, Kitt Peak National Observatory, NOIRLab. The Legacy Surveys project is honored to be permitted to conduct astronomical research on Iolkam Du’ag (Kitt Peak), a mountain with particular significance to the Tohono O’odham Nation.
Some papers that used Galaxy 10
- DeepAstroUDA: Semi-Supervised Universal Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection - Aleksandra Ćiprijanović et al. (2023)
- Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection - Aleksandra Ćiprijanović et al. (2022)
- Equivariance-aware Architectural Optimization of Neural Networks - Kaitlin Maile et al. (2022)
- Machine learning in introductory astrophysics laboratory activities - Alireza Vafaei Sadr (2022)
- Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling - Naoya Takeishi et al. (2021)
- SetGAN: Improving the stability and diversity of generative models through a permutation invariant architecture - Alessandro Ferrero et al. (2019)
- Input Selection for Bandwidth-Limited Neural Network Inference - Stefan Oehmcke et al. (2019)
Galaxy10 SDSS Dataset
Note
This page has been renamed to Galaxy10 SDSS. A new Galaxy10 using DECals is available at https://github.com/henrysky/Galaxy10
Introduction
Galaxy10 SDSS is a dataset containing 21785 69x69-pixel colored galaxy images (g, r and i band) separated into 10 classes. Galaxy10 SDSS images come from the Sloan Digital Sky Survey and labels come from Galaxy Zoo.
Galaxy10 dataset (21785 images)
├── Class 0 (3461 images): Disk, Face-on, No Spiral
├── Class 1 (6997 images): Smooth, Completely round
├── Class 2 (6292 images): Smooth, in-between round
├── Class 3 (394 images): Smooth, Cigar shaped
├── Class 4 (1534 images): Disk, Edge-on, Rounded Bulge
├── Class 5 (17 images): Disk, Edge-on, Boxy Bulge
├── Class 6 (589 images): Disk, Edge-on, No Bulge
├── Class 7 (1121 images): Disk, Face-on, Tight Spiral
├── Class 8 (906 images): Disk, Face-on, Medium Spiral
└── Class 9 (519 images): Disk, Face-on, Loose Spiral
These classes are mutually exclusive, but Galaxy Zoo relies on human volunteers to classify galaxy images and the volunteers do not agree on all images. For this reason, Galaxy10 only contains images for which more than 55% of the votes agree on the class. That is, more than 55% of the votes among the 10 classes are for a single class for that particular image. If none of the classes gets more than 55%, the image is not included in Galaxy10 as no agreement was reached. As a result, 21785 images remain after the cut.
The justification of 55% as the threshold is based on validation. Galaxy10 is meant to be an alternative to MNIST or Cifar10 as a deep learning toy dataset for astronomers, so astroNN.models.Cifar10_CNN with Cifar10 is used as a reference and the validation was done on the same astroNN.models.Cifar10_CNN. A 50% threshold results in poor neural network classification accuracy: although around 36000 images would be in the dataset, many are probably misclassified and the neural network has a difficult time learning. A 60% threshold gives results similar to 55% (both reach a classification accuracy similar to the Cifar10 dataset on the same network), but the 55% threshold includes more images in the dataset. Thus 55% was chosen as the threshold to cut the data.
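As a toy illustration of this cut (not the actual Galaxy Zoo pipeline), suppose each image has a fraction of votes per class; keeping only images whose most-voted class exceeds 55% would look like:

import numpy as np

# hypothetical vote fractions: one row per image, one column per class
vote_fractions = np.random.dirichlet(np.ones(10), size=1000)

top_fraction = vote_fractions.max(axis=1)   # agreement on the most-voted class
keep = top_fraction > 0.55                  # the 55% agreement cut
kept_classes = vote_fractions[keep].argmax(axis=1)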
The original images are 424x424 pixels, but were cropped to 207x207 centered on the images and then downscaled 3 times via bilinear interpolation to 69x69 in order to make them manageable on most computers and graphics card memory.
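A minimal sketch of that preprocessing using Pillow (illustrative only; the exact tooling used to build the dataset is not stated here, and the filename is a placeholder):

import numpy as np
from PIL import Image

img = Image.open('some_galaxy.png')                    # hypothetical 424x424 input image
left = (424 - 207) // 2
img = img.crop((left, left, left + 207, left + 207))   # central 207x207 crop
img = img.resize((69, 69), Image.BILINEAR)             # 3x bilinear downscale
img = np.asarray(img)                                  # 69x69x3 array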
There is no guarantee on the accuracy of the labels. Moreover, Galaxy10 is not a balanced dataset and it should only be used for educational or experimental purposes. If you use Galaxy10 for research purposes, please cite Galaxy Zoo and the Sloan Digital Sky Survey.
For more information on the original classification tree: Galaxy Zoo Decision Tree

Download Galaxy10 SDSS
Galaxy10.h5: http://www.astro.utoronto.ca/~bovy/Galaxy10/Galaxy10.h5
SHA256: 969A6B1CEFCC36E09FFFA86FEBD2F699A4AA19B837BA0427F01B0BC6DED458AF
Size: 200 MB (210,234,548 bytes)
Or see below to load (and download automatically) the dataset with astroNN
TL;DR for Beginners
You can view the Jupyter notebook in here: https://github.com/henrysky/astroNN/blob/master/demo_tutorial/galaxy10/Galaxy10_Tutorial.ipynb
OR you can train with astroNN by just copying and pasting the following script to get and train a simple neural network on Galaxy10
Basically we first load Galaxy10 with astroNN and split it into a train and a test set. astroNN will split the training set into training data and validation data as well as normalize them automatically.
Galaxy10CNN is a simple 4-layer convolutional neural network consisting of 2 convolutional layers and 2 dense layers.
# import everything we need first
from tensorflow.keras import utils
import numpy as np
from sklearn.model_selection import train_test_split
import pylab as plt

from astroNN.models import Galaxy10CNN
from astroNN.datasets import load_galaxy10sdss
from astroNN.datasets.galaxy10sdss import galaxy10cls_lookup, galaxy10_confusion

# To load images and labels (will download automatically at the first time)
# First time downloading location will be ~/.astroNN/datasets/
images, labels = load_galaxy10sdss()

# To convert the labels to categorical 10 classes
labels = utils.to_categorical(labels, 10)

# Select 10 of the images at random to inspect
img = None
plt.ion()
print('===================Data Inspection===================')
for counter, i in enumerate(np.random.randint(0, labels.shape[0], size=10)):
    img = plt.imshow(images[i])
    plt.title('Class {}: {} \n Random Demo images {} of 10'.format(np.argmax(labels[i]), galaxy10cls_lookup(labels[i]), counter + 1))
    plt.draw()
    plt.pause(2.)
plt.close('all')
print('===============Data Inspection Finished===============')

# To convert to desirable type
labels = labels.astype(np.float32)
images = images.astype(np.float32)

# Split the dataset into training set and testing set
train_idx, test_idx = train_test_split(np.arange(labels.shape[0]), test_size=0.1)
train_images, train_labels, test_images, test_labels = images[train_idx], labels[train_idx], images[test_idx], labels[test_idx]

# To create a neural network instance
galaxy10net = Galaxy10CNN()

# set maximum epochs the neural network can run, set 5 to get quick result
galaxy10net.max_epochs = 5

# To train the neural net
# astroNN will normalize the data by default
galaxy10net.train(train_images, train_labels)

# print the model summary
galaxy10net.keras_model.summary()

# After the training, you can test the neural net performance
# Please notice predicted_labels are labels predicted by the neural network while test_labels are ground truth from the dataset
predicted_labels = galaxy10net.test(test_images)

# Convert predicted_labels to class
prediction_class = np.argmax(predicted_labels, axis=1)

# Convert test_labels to class
test_class = np.argmax(test_labels, axis=1)

# Prepare a confusion matrix
confusion_matrix = np.zeros((10, 10))

# create the confusion matrix
for counter, i in enumerate(prediction_class):
    confusion_matrix[i, test_class[counter]] += 1

# Plot the confusion matrix
galaxy10_confusion(confusion_matrix)
Load with astroNN
from astroNN.datasets import load_galaxy10sdss
from tensorflow.keras import utils
import numpy as np

# To load images and labels (will download automatically at the first time)
# First time downloading location will be ~/.astroNN/datasets/
images, labels = load_galaxy10sdss()

# To convert the labels to categorical 10 classes
labels = utils.to_categorical(labels, 10)

# To convert to desirable type
labels = labels.astype(np.float32)
images = images.astype(np.float32)
OR Load with Python & h5py
You should download Galaxy10.h5 first and open python at the same location and run the following to open it:
import h5py
import numpy as np
from tensorflow.keras import utils

# To get the images and labels from file
with h5py.File('Galaxy10.h5', 'r') as F:
    images = np.array(F['images'])
    labels = np.array(F['ans'])

# To convert the labels to categorical 10 classes
labels = utils.to_categorical(labels, 10)

# To convert to desirable type
labels = labels.astype(np.float32)
images = images.astype(np.float32)
Split into train and test set
import numpy as np
from sklearn.model_selection import train_test_split

train_idx, test_idx = train_test_split(np.arange(labels.shape[0]), test_size=0.1)
train_images, train_labels, test_images, test_labels = images[train_idx], labels[train_idx], images[test_idx], labels[test_idx]
Lookup Galaxy10 Class
You can look up the name corresponding to a Galaxy10 class number by

from astroNN.datasets.galaxy10sdss import galaxy10cls_lookup

# pass a class number (0-9) to get the class name back
galaxy10cls_lookup(0)
Acknowledgments
For astroNN acknowledgment, please refer to Acknowledging astroNN
Galaxy10 dataset classification labels come from Galaxy Zoo
Galaxy10 dataset images come from Sloan Digital Sky Survey (SDSS)
Galaxy Zoo is described in Lintott et al. 2008, MNRAS, 389, 1179 and the data release is described in Lintott et al. 2011, MNRAS, 410, 166
Funding for the SDSS and SDSS-II has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Society, and the Higher Education Funding Council for England. The SDSS Web Site is http://www.sdss.org/.
The SDSS is managed by the Astrophysical Research Consortium for the Participating Institutions. The Participating Institutions are the American Museum of Natural History, Astrophysical Institute Potsdam, University of Basel, University of Cambridge, Case Western Reserve University, University of Chicago, Drexel University, Fermilab, the Institute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle Astrophysics and Cosmology, the Korean Scientist Group, the Chinese Academy of Sciences (LAMOST), Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Ohio State University, University of Pittsburgh, University of Portsmouth, Princeton University, the United States Naval Observatory, and the University of Washington.
Getting Started
astroNN is developed on GitHub. You can download astroNN from its Github.
But the easiest way to install is via pip: astroNN on Python PyPI
pip install astroNN
For the latest version, you can clone the latest commit of astroNN from GitHub
git clone --depth=1 https://github.com/henrysky/astroNN
then open a command line window in the package folder and run the following command to install:
python -m pip install .
or to develop:
python -m pip install -e .
Prerequisites
The latest version of Anaconda is recommended; generally, the use of Anaconda is highly recommended
Python 3.7 or above
Tensorflow (the latest version is recommended)
Tensorflow-Probability (the latest version is recommended)
CUDA and CuDNN (optional)
graphviz and pydot are required to plot the model architecture
scikit-learn, tqdm, pandas, h5py and astroquery are required for some astroNN functions
Since Tensorflow and Tensorflow-Probability are rapidly developing packages and astroNN heavily depends on Tensorflow, the support policy of astroNN is that only the last two official versions of these packages are supported (i.e. the latest version and the second latest version are included in the test suite). Generally, using the latest versions of Tensorflow and Tensorflow-Probability is recommended. The currently supported versions (i.e. included in the test suites) are
Tensorflow 2.12.x (corresponding to Tensorflow-Probability 0.19.x)
Tensorflow 2.11.x (corresponding to Tensorflow-Probability 0.19.x)
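To check which versions you have installed against the list above (nothing astroNN-specific, just the packages' own version attributes):

import tensorflow as tf
import tensorflow_probability as tfp

# print the installed versions to compare with the supported versions listed above
print(tf.__version__, tfp.__version__)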
Note
Due to bugs in Tensorflow 1.12.x: https://github.com/tensorflow/tensorflow/issues/22952, 1.14.x: https://github.com/tensorflow/tensorflow/issues/27543 or 2.5.x: https://github.com/tensorflow/tensorflow/pull/47957, you have to patch a few lines in order for astroNN to work properly. You can patch Tensorflow by running the following code
from astroNN.config import tf_patch
tf_patch()
You can also unpatch Tensorflow to undo changes made by astroNN by running the following code
from astroNN.config import tf_unpatch
tf_unpatch()
For instructions on how to install Tensorflow, please refer to their official website Installing TensorFlow
Recommended system requirement:
64-bits operating system
CPU which supports AVX2 (List of CPUs: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2)
16GB RAM or above
NVIDIA Graphics card (Optional, GTX 10 series or above) or Apple Silicon
(If using NVIDIA GPU): At least 4GB VRAM on GPU
Using astroNN on Google Colab
To use the latest commit of astroNN on Google colab, you can copy and paste the following
!pip install tensorflow
!pip install tensorflow_probability
!pip install git+https://github.com/henrysky/astroNN.git
Basic FAQ
My hardware or software cannot meet the prerequisites, what should I do?
The hardware and software requirements are just an estimation; it is entirely possible to run astroNN without meeting them. But generally, a recent Python 3 version (as required by Tensorflow) and mid-to-high end hardware are recommended.
Can I contribute to astroNN?
You can contact me (Henry: henrysky.leung [at] utoronto.ca) or refer to Contributor and Issue Reporting guide.
I have found a bug in astroNN
Please try to use the latest commit of astroNN. If the issue persists, please report to https://github.com/henrysky/astroNN/issues
I keep receiving warnings on APOGEE and Gaia environment variables
If you are not dealing with APOGEE or Gaia data, please ignore those warnings. If an error is raised that prevents you from using some astroNN functionality, please report it as a bug to https://github.com/henrysky/astroNN/issues
If you don't want those warnings to be shown again, go to astroNN's configuration file and set environmentvariablewarning to False
I have installed pydot and graphviz but still fail to plot the model
If you are encountering this issue, please uninstall both pydot and graphviz and run the following commands
pip install pydot
conda install graphviz
Then if you are using Mac, run the following command
brew install graphviz
If you are using Windows, go to https://graphviz.gitlab.io/_pages/Download/Download_windows.html to download the Windows package and add the package to the PATH environment variable.
Configuration File
The astroNN configuration file is located at ~/.astroNN/config.ini, which contains a few astroNN settings.
Currently, the default configuration file should look like this
[Basics]
magicnumber = -9999.0
multiprocessing_generator = False
environmentvariablewarning = True
[NeuralNet]
custommodelpath = None
cpufallback = False
gpu_mem_ratio = True
magicnumber refers to the Magic Number representing missing labels/data; the default is -9999. Please do not change this value if you rely on APOGEE data.
multiprocessing_generator refers to whether to enable multiprocessing in the astroNN data generator. The default is False except on Linux and MacOS.
environmentvariablewarning refers to whether you will be warned about not setting the APOGEE and Gaia environment variables.
custommodelpath refers to a list of custom model paths, i.e. paths to folders containing custom models (.py files); multiple paths can be separated by ;. The default value is None, meaning no additional path will be searched when loading a model. For example: /users/astroNN/custom_models/;/local/some_other_custom_models/ if you have self-defined models in those locations.
cpufallback refers to whether to force the use of CPU. It has no effect if you are using tensorflow instead of tensorflow-gpu.
gpu_mem_ratio refers to GPU memory management. Set True to dynamically allocate memory (the astroNN default), enter a float between 0 and 1 to set the maximum ratio of GPU memory to use, or set None to let Tensorflow pre-occupy all available GPU memory, which is the designed default behavior of Tensorflow.
If for whatever reason you want to reset the configuration file:
from astroNN.config import config_path

# astroNN will reset the config file if the flag = 2
config_path(flag=2)
Folder Structure for astroNN, APOGEE, Gaia and LAMOST data
This code depends on environment variables and folders for APOGEE, Gaia and LAMOST data. The environment variables are
SDSS_LOCAL_SAS_MIRROR: top-level directory that will be used to (selectively) mirror the SDSS Science Archive Server (SAS)
GAIA_TOOLS_DATA: top-level directory under which the Gaia data will be stored.
LASMOT_DR5_DATA: top-level directory under which the LAMOST DR5 data will be stored.
How to set environment variable on different operating system: Guide here
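For example, you can set them for a single Python session before importing astroNN (the paths below are placeholders to be replaced with your own data directories):

import os

# placeholder paths; point these at your own data directories
os.environ['SDSS_LOCAL_SAS_MIRROR'] = '/path/to/sdss_mirror'
os.environ['GAIA_TOOLS_DATA'] = '/path/to/gaia_data'
os.environ['LASMOT_DR5_DATA'] = '/path/to/lamost_dr5'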
$SDSS_LOCAL_SAS_MIRROR/
├── dr14/
│ ├── apogee/spectro/redux/r8/stars/
│ │ ├── apo25m/
│ │ │ ├── 4102/
│ │ │ │ ├── apStar-r8-2M21353892+4229507.fits
│ │ │ │ ├── apStar-r8-**********+*******.fits
│ │ │ │ └── ****/
│ │ ├── apo1m/
│ │ │ ├── hip/
│ │ │ │ ├── apStar-r8-2M00003088+5933348.fits
│ │ │ │ ├── apStar-r8-**********+*******.fits
│ │ │ │ └── ***/
│ │ ├── l31c/l31c.2/
│ │ │ ├── allStar-l30e.2.fits
│ │ │ ├── allVisit-l30e.2.fits
│ │ │ ├── 4102/
│ │ │ │ ├── aspcapStar-r8-l30e.2-2M21353892+4229507.fits
│ │ │ │ ├── aspcapStar-r8-l30e.2-**********+*******.fits
│ │ │ │ └── ****/
│ │ │ └── Cannon/
│ │ │ └── allStarCannon-l31c.2.fits
└── dr13/
└── *similar to dr14 above/*
$GAIA_TOOLS_DATA/
└── Gaia/
├── gdr1/tgas_source/fits/
│ ├── TgasSource_000-000-000.fits
│ ├── TgasSource_000-000-001.fits
│ └── ***.fits
└── gdr2/gaia_source_with_rv/fits/
├── GaiaSource_2851858288640_1584379458008952960.fits
├── GaiaSource_1584380076484244352_2200921635402776448.fits
└── ***.fits
$LASMOT_DR5_DATA/
└── DR5/
├── LAMO5_2MS_AP9_SD14_UC4_PS1_AW_Carlin_M.fits
├── 20111024
│ ├── F5902
│ │ ├──spec-55859-F5902_sp01-001.fits.gz
│ │ └── ****.fits.gz
│ └── ***/
├── 20111025
│ ├── B6001
│ │ ├──spec-55860-B6001_sp01-001.fits.gz
│ │ └── ****.fits.gz
│ └── ***/
└── ***/
Note
The APOGEE and Gaia folder structures should be consistent with the apogee and gaia_tools python packages by Jo Bovy, which are tools for dealing with APOGEE and Gaia data
A dedicated project folder is recommended to run astroNN; always run astroNN under the root of the project folder, so that astroNN will create the folders for every neural network you run in the same place.
Contributor and Issue Reporting guide
When contributing to this repository, please first discuss big changes you wish to make with the maintainers of this repository by opening an issue, via email, or any other method.
Submitting bug reports and feature requests
Bug reports and feature requests should be submitted by creating an issue on https://github.com/henrysky/astroNN
Pull Request
This is a general guideline to make pull request (PR).
1. Go to https://github.com/henrysky/astroNN and click the Fork button.
2. Download your own astroNN fork to your computer by
$ git clone https://github.com/your_username/astroNN
3. Create a new branch with a short simple name that represents the change you want to make
4. Make commits locally in that new branch, and push to your own astroNN fork repository
5. Create a pull request by clicking the New pull request button.
New Model Proposal guide
astroNN acts as a platform to share astronomy-oriented neural networks, so you are welcome to do so.
To add new models:
1. Import your model in astroNN\models\__init__.py and add the model class name to __all__
2. Add a documentation page for the new model and link it appropriately in docs\source\index.rst
3. Add the new model to the tree diagram and API under the appropriate class in docs\source\neuralnets\basic_usage.rst
4. Add the new model to the release history in docs\source\history.rst
5. If your new model is proposed along with a paper, add your model to the test suite in tests\test_paper_models.rst just to make sure your model works fine against future changes in astroNN.
Possible New Features and Improvement in the future
GPU/performance related issues
Data reduction pipeline on GPU?
Multiple GPU support!
Training on large datasets that can't fit into memory?
Neural Network related issues
Currently the Bayesian NN models only use Dropout VI, maybe introduce more methods especially from TF-Probability
Have some nice VAE or GAN thing, maybe on spectroscopic data first
History
v1.1 series
v1.1.0 (26 April 2023)
This release mainly targets the paper A variational encoder-decoder approach to precise spectroscopic age estimation for large Galactic surveys, available at [arXiv:2302.05479] [ADS]
Added models: ApogeeKplerEchelle and ApokascEncoderDecoder
Input data can now be a dict, such as nn.train({'input': input_data, 'input_aux': aux_input_data}, {'output': labels, 'output_aux': aux_labels})
Added numerical integrator for NeuralODE
tqdm progress bar for model prediction
Added a new and improved version of Galaxy10
Added multiple metrics based on median
Added function transfer_weights for transfer learning
Fully compatible with Tensorflow 2
Model training/inference should be much faster by using Tensorflow v2 eager execution (see: https://github.com/tensorflow/tensorflow/issues/33024#issuecomment-551184305)
Improved continuous integration testing with Github Actions, now actually test models learn properly with real world data instead of checking no syntax error with random data
Support sample_weight in all loss functions and training
Improved catalog coordinates matching
New documentation webpages
~15% faster in Bayesian neural network inference by using parallelized loop
Loss/metrics functions and normalizer now check for NaN too
Updated many of the notebooks to be compatible with the latest Tensorflow
Deprecated support for all Tensorflow 1.x
Tested with Tensorflow 2.11 and 2.12
Python 3.8 or above only
Incompatible with Tensorflow 1.x and <=2.3 due to necessary changes for the Tensorflow eager execution API
Renamed the neural network model methods train(), test(), train_on_batch() to fit(), predict(), fit_on_batch()
The old Galaxy10 has been renamed to Galaxy10 SDSS; the new version will replace it and be called Galaxy10
v1.0 series
v1.0.1 (5 March 2019)
This release mainly targets the paper Simultaneous calibration of spectro-photometric distances and the Gaia DR2 parallax zero-point offset with deep learning, available at [arXiv:1902.08634] [ADS]
Documentation for this version is available at https://astronn.readthedocs.io/en/v1.0.1/
Better and faster with IPython tab auto-completion
Added models: ApogeeDR14GaiaDR2BCNN
Improved data pipeline to generate data for NNs
Tested with Tensorflow 1.11.0/1.12.0/1.13.1 and Keras 2.2.0/2.2.4
v1.0.0 (16 August 2018)
This is the first release of astroNN. This release mainly targets the paper Deep learning of multi-element abundances from high-resolution spectroscopic data, available at [arXiv:1804.08622] [ADS]
Documentation for this version is available at https://astronn.readthedocs.io/en/v1.0.0/
Initial Release!!
Tested with Tensorflow 1.8.0/1.9.0 and Keras 2.2.0/2.2.2
Python 3.6 or above only
v0.0 series
v0.0.0 (13 October 2017)
First commit of astroNN on Github!!!
Publications using astroNN
- Deep learning of multi-element abundances from high-resolution spectroscopic data - Henry W. Leung, Jo Bovy (2019) - Original astroNN paper
- Dynamical heating across the Milky Way disc using APOGEE and Gaia - J. Ted Mackereth, Jo Bovy, Henry W. Leung, et al. (2019) - Uses ApogeeBCNN to infer spectroscopic age
- Simultaneous calibration of spectro-photometric distances and the Gaia DR2 parallax zero-point offset with deep learning - Henry W. Leung, Jo Bovy (2019) - Uses ApogeeDR14GaiaDR2BCNN to infer spectro-photometric distances
- Solar image denoising with convolutional neural networks - C. J. Díaz Baso, J. de la Cruz Rodríguez, S. Danilovic (2019)
- A variational encoder-decoder approach to precise spectroscopic age estimation for large Galactic surveys - Henry W. Leung, Jo Bovy, J. Ted Mackereth, Andrea Miglio (2023) - Uses ApokascEncoderDecoder to infer spectroscopic age trained on APOGEE and Kepler
Publication figure style
astroNN contains a function that helps standardize the matplotlib figure style used in my publications.
This function can be used by simply calling it before using matplotlib to plot any figure
import matplotlib.pylab as plt
from astroNN.shared import pylab_style

pylab_style(paper=True)

# matplotlib code goes here
If you do not have \(\LaTeX\) installed on your computer, you can set the paper option to False like pylab_style(paper=False)
Here is a figure comparing the different styles using the following matplotlib code
plt.figure(figsize=(5, 5))
plt.plot([0, 1], [0, 1], label="Test")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()

Loss Functions and Metrics
astroNN provides modified loss functions under the astroNN.nn.losses module which are capable of dealing with incomplete labels, represented by magicnumber in the astroNN configuration file or Magic Number in the equations below.
Since they are built on Tensorflow and follow the Keras API requirements, all astroNN loss functions are fully compatible with Keras with the Tensorflow backend and can also be imported and used directly with Tensorflow. For most loss functions, the first argument is the ground truth tensor and the second argument is the prediction tensor from the neural network (see the short example after the variable definitions below).
Note
Always make sure when you are normalizing your data, keep the magic number as magic number. If you use astroNN normalizer, astroNN will take care of that.
Here are some explanations on variables in the following loss functions:
\(y_i\) means the ground truth labels, always represented by the python variable y_true in astroNN
\(\hat{y_i}\) means the prediction from the neural network, always represented by the python variable y_pred in astroNN
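For instance, a loss can be imported and evaluated directly on tensors. A minimal sketch (the tensors below are made up for illustration, with -9999. marking a missing label):

import tensorflow as tf
from astroNN.nn.losses import mean_squared_error

# made-up example tensors; -9999. is the Magic Number for a missing label
y_true = tf.constant([[1.0, -9999.0], [0.5, 0.2]])
y_pred = tf.constant([[1.1, 0.3], [0.4, 0.2]])

# first argument is the ground truth, second is the prediction
loss = mean_squared_error(y_true, y_pred)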
Correction Term for Magic Number
- astroNN.nn.losses.magic_correction_term(y_true)[source]
Calculate a correction term to prevent the loss being “lowered” by magic_num or NaN
- Parameters
y_true (tf.Tensor) – Ground Truth
- Returns
Correction Term
- Return type
tf.Tensor
- History
- 2018-Jan-30 - Written - Henry Leung (University of Toronto); 2018-Feb-17 - Updated - Henry Leung (University of Toronto)
Since astroNN deals with the Magic Number by assuming the prediction from the neural network is correct for any ground truth label equal to the Magic Number, we need a correction term.
The correction term in astroNN is defined by the following equation and we call it \(\mathcal{F}_{correction}\)
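A plausible form of \(\mathcal{F}_{correction}\), consistent with the statement below that it equals 1 when no Magic Number labels are present, is:

\[\mathcal{F}_{correction} = \frac{\text{Total number of labels}}{\text{Number of labels which are not Magic Number}}\]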
In case no labels with the Magic Number are present, \(\mathcal{F}_{correction}\) will equal 1
Mean Squared Error
- astroNN.nn.losses.mean_squared_error(y_true, y_pred, sample_weight=None)[source]
Calculate mean square error losses
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Mean Squared Error
- Return type
tf.Tensor
- History
2017-Nov-16 - Written - Henry Leung (University of Toronto)
MSE is based on the equation
And thus the loss for mini-batch is
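A plausible reconstruction of these equations, assuming the standard squared error with astroNN's Magic Number masking and the correction term \(\mathcal{F}_{correction}\) defined above:

\[\mathrm{Loss}_i = \begin{cases} (\hat{y_i}-y_i)^2, & \text{for } y_i \neq \text{Magic Number} \\ 0, & \text{for } y_i = \text{Magic Number} \end{cases}\]

\[\mathrm{Loss}_{NN} = \frac{1}{D} \sum_{i=1}^{D} \mathrm{Loss}_i \cdot \mathcal{F}_{correction}\]

The other point-wise losses below (MAE, mean error, MSLE and the percentage errors) follow the same masking and mini-batch averaging pattern with their respective error terms.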
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=mean_squared_error, ...)
Mean Absolute Error
- astroNN.nn.losses.mean_absolute_error(y_true, y_pred, sample_weight=None)[source]
Calculate mean absolute error, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Mean Absolute Error
- Return type
tf.Tensor
- History
2018-Jan-14 - Written - Henry Leung (University of Toronto)
MAE is based on the equation
And thus the loss for mini-batch is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=mean_absolute_error, ...)
Mean Error
- astroNN.nn.losses.mean_error(y_true, y_pred, sample_weight=None)[source]
Calculate mean error as a way to get the bias in prediction, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Mean Error
- Return type
tf.Tensor
- History
2018-May-22 - Written - Henry Leung (University of Toronto)
Mean Error is a metrics to evaluate the bias of prediction and is based on the equation
And thus the loss for mini-batch is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=mean_error, ...)
Regression Loss and Predictive Variance Loss for Bayesian Neural Net
- astroNN.nn.losses.robust_mse(y_true, y_pred, variance, labels_err, sample_weight=None)[source]
Calculate predictive variance, and takes account of labels error in Bayesian Neural Network
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
variance (Union(tf.Tensor, tf.Variable)) – Log Predictive Variance
labels_err (Union(tf.Tensor, tf.Variable)) – Known labels error, give zeros if unknown/unavailable
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Robust Mean Squared Error, can be used directly with Tensorflow
- Return type
tf.Tensor
- History
2018-April-07 - Written - Henry Leung (University of Toronto)
- astroNN.nn.losses.mse_lin_wrapper(var, labels_err)[source]
Calculate predictive variance, and takes account of labels error in Bayesian Neural Network
- Parameters
var (Union(tf.Tensor, tf.Variable)) – Predictive Variance
labels_err (Union(tf.Tensor, tf.Variable)) – Known labels error, give zeros if unknown/unavailable
- Returns
Robust MSE function for labels prediction neurones, which matches Keras losses API
- Return type
function
- Returned Function Parameter
function(y_true, y_pred)
y_true (tf.Tensor): Ground Truth
y_pred (tf.Tensor): Prediction
Return (tf.Tensor): Robust Mean Squared Error
- History
2017-Nov-16 - Written - Henry Leung (University of Toronto)
- astroNN.nn.losses.mse_var_wrapper(lin, labels_err)[source]
Calculate predictive variance, and takes account of labels error in Bayesian Neural Network
- Parameters
lin (Union(tf.Tensor, tf.Variable)) – Prediction
labels_err (Union(tf.Tensor, tf.Variable)) – Known labels error, give zeros if unknown/unavailable
- Returns
Robust MSE function for predictive variance neurones which matches Keras losses API
- Return type
function
- Returned Function Parameter
function(y_true, y_pred)
y_true (tf.Tensor): Ground Truth
y_pred (tf.Tensor): Predictive Variance
Return (tf.Tensor): Robust Mean Squared Error
- History
2017-Nov-16 - Written - Henry Leung (University of Toronto)
It is based on the equation implemented as robust_mse(); please notice \(s_i\) represents \(\log((\sigma_{predictive, i})^2 + (\sigma_{known, i})^2)\). The neural network does not predict the variance directly, to avoid numerical instability, but instead predicts \(\log((\sigma_{i})^2)\)
And thus the loss for mini-batch is
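A plausible reconstruction, following the heteroscedastic regression loss of Kendall & Gal (2017, arXiv:1703.04977) with \(s_i\) as defined above and the same Magic Number masking as the other losses:

\[\mathrm{Loss}_i = \begin{cases} \frac{1}{2} (\hat{y_i}-y_i)^2 e^{-s_i} + \frac{1}{2} s_i, & \text{for } y_i \neq \text{Magic Number} \\ 0, & \text{for } y_i = \text{Magic Number} \end{cases}\]

\[\mathrm{Loss}_{NN} = \frac{1}{D} \sum_{i=1}^{D} \mathrm{Loss}_i \cdot \mathcal{F}_{correction}\]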
They basically do the same things and can be used with Keras, you just have to import the functions from astroNN
def keras_model():
    # Your keras_model define here

    variance_output = Dense(name='variance_output', ...)
    output = Dense(name='output', ...)

    # model for the training process
    model = Model(inputs=[input_tensor, labels_err_tensor], outputs=[output, variance_output])

    # model for the prediction
    model_prediction = Model(inputs=input_tensor, outputs=[output, variance_output])

    predictive_variance_loss = mse_var_wrapper(output, labels_err_tensor)
    output_loss = mse_lin_wrapper(variance_output, labels_err_tensor)

    return model, model_prediction, output_loss, predictive_variance_loss

model, model_prediction, output_loss, predictive_variance_loss = keras_model()
# remember to import astroNN loss function first
model.compile(loss={'output': output_loss, 'variance_output': predictive_variance_loss}, ...)
To better understand this loss function, you can see the following plot of Loss vs Variance colored by squared difference which is \((\hat{y_i}-y_i)^2\)

Mean Squared Logarithmic Error
- astroNN.nn.losses.mean_squared_logarithmic_error(y_true, y_pred, sample_weight=None)[source]
Calculate mean squared logarithmic error, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Mean Squared Logarithmic Error
- Return type
tf.Tensor
- History
2018-Feb-17 - Written - Henry Leung (University of Toronto)
MSLE will first clip the values of prediction from neural net for the sake of numerical stability,
Then MSLE is based on the equation
And thus the loss for mini-batch is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=mean_squared_logarithmic_error, ...)
Mean Absolute Percentage Error
- astroNN.nn.losses.mean_absolute_percentage_error(y_true, y_pred, sample_weight=None)[source]
Calculate mean absolute percentage error, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Mean Absolute Percentage Error
- Return type
tf.Tensor
- History
2018-Feb-17 - Written - Henry Leung (University of Toronto)
Mean Absolute Percentage Error will first clip the values of prediction from neural net for the sake of numerical stability,
Then Mean Absolute Percentage Error is based on the equation
And thus the loss for mini-batch is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=mean_absolute_percentage_error, ...)
Mean Percentage Error
- astroNN.nn.losses.mean_percentage_error(y_true, y_pred, sample_weight=None)[source]
Calculate mean percentage error, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Mean Percentage Error
- Return type
tf.Tensor
- History
2018-Jun-06 - Written - Henry Leung (University of Toronto)
Mean Percentage Error will first clip the values of prediction from neural net for the sake of numerical stability,
Then Mean Percentage Error is based on the equation
And thus the loss for mini-batch is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=mean_percentage_error, ...)
Categorical Cross-Entropy
- astroNN.nn.losses.categorical_crossentropy(y_true, y_pred, sample_weight=None, from_logits=False)[source]
Categorical cross-entropy between an output tensor and a target tensor, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
from_logits (boolean) – From logits space or not. If you want to use logits, please use from_logits=True
- Returns
Categorical Cross-Entropy
- Return type
tf.Tensor
- History
2018-Jan-14 - Written - Henry Leung (University of Toronto)
Categorical Cross-Entropy will first clip the values of prediction from neural net for the sake of numerical stability if the prediction is not coming from logits (before softmax activated)
and then based on the equation
And thus the loss for mini-batch is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=categorical_crossentropy(from_logits=False), ...)
Binary Cross-Entropy
- astroNN.nn.losses.binary_crossentropy(y_true, y_pred, sample_weight=None, from_logits=False)[source]
Binary cross-entropy between an output tensor and a target tensor, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
from_logits (boolean) – From logits space or not. If you want to use logits, please use from_logits=True
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Binary Cross-Entropy
- Return type
tf.Tensor
- History
2018-Jan-14 - Written - Henry Leung (University of Toronto)
Binary Cross-Entropy will first clip the values of prediction from the neural net for the sake of numerical stability if from_logits=False and is then based on the equation
To avoid numerical instability if from_logits=True, we can reformulate it as
And thus the loss for mini-batch is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=binary_crossentropy(from_logits=False), ...)
Categorical Cross-Entropy and Predictive Logits Variance for Bayesian Neural Net
- astroNN.nn.losses.robust_categorical_crossentropy(y_true, y_pred, logit_var, sample_weight)[source]
Calculate robust categorical cross-entropy, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction in logits space
logit_var (Union(tf.Tensor, tf.Variable)) – Predictive variance in logits space
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
categorical cross-entropy
- Return type
tf.Tensor
- History
2018-Mar-15 - Written - Henry Leung (University of Toronto)
- astroNN.nn.losses.bayesian_categorical_crossentropy_wrapper(logit_var)[source]
- Categorical crossentropy between an output tensor and a target tensor for Bayesian Neural Network; equation (12) of arxiv:1703.04977
- Parameters
logit_var (Union(tf.Tensor, tf.Variable)) – Predictive variance
- Returns
Robust categorical_crossentropy function for predictive variance neurones which matches Keras losses API
- Return type
function
- Returned Function Parameter
- function(y_true, y_pred)
y_true (tf.Tensor): Ground Truth
y_pred (tf.Tensor): Prediction in logits space
Return (tf.Tensor): Robust categorical crossentropy
- History
2018-Mar-15 - Written - Henry Leung (University of Toronto)
- astroNN.nn.losses.bayesian_categorical_crossentropy_var_wrapper(logits)[source]
- Categorical crossentropy between an output tensor and a target tensor for Bayesian Neural Network; equation (12) of arxiv:1703.04977
- Parameters
logits (Union(tf.Tensor, tf.Variable)) – Prediction in logits space
- Returns
Robust categorical_crossentropy function for predictive variance neurones which matches Keras losses API
- Return type
function
- Returned Function Parameter
- function(y_true, y_pred)
y_true (tf.Tensor): Ground Truth
y_pred (tf.Tensor): Predictive variance in logits space
Return (tf.Tensor): Robust categorical crossentropy
- History
2018-Mar-15 - Written - Henry Leung (University of Toronto)
It is based on Equation 12 from arxiv:1703.04977. \(s_i\) is representing the predictive variance of logits
where Distorted Categorical Cross-Entropy is defined as
And thus the loss for mini-batch is
bayesian_categorical_crossentropy_wrapper is for the prediction neurones
bayesian_categorical_crossentropy_var_wrapper is for the predictive variance neurones
They basically do the same things and can be used with Keras, you just have to import the functions from astroNN
def keras_model():
    # Your keras_model define here

    variance_output = Dense(name='variance_output', ...)
    output = Dense(name='output', ...)

    # model for the training process
    model = Model(inputs=[input_tensor], outputs=[output, variance_output])

    # model for the prediction
    model_prediction = Model(inputs=input_tensor, outputs=[output, variance_output])

    predictive_variance_loss = bayesian_categorical_crossentropy_var_wrapper(output)
    output_loss = bayesian_categorical_crossentropy_wrapper(variance_output)

    return model, model_prediction, output_loss, predictive_variance_loss

model, model_prediction, output_loss, predictive_variance_loss = keras_model()
# remember to import astroNN loss function first
model.compile(loss={'output': output_loss, 'variance_output': predictive_variance_loss}, ...)
Binary Cross-Entropy and Predictive Logits Variance for Bayesian Neural Net
- astroNN.nn.losses.robust_binary_crossentropy(y_true, y_pred, logit_var, sample_weight)[source]
Calculate robust binary cross-entropy, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction in logits space
logit_var (Union(tf.Tensor, tf.Variable)) – Predictive variance in logits space
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
binary cross-entropy
- Return type
tf.Tensor
- History
2018-Mar-15 - Written - Henry Leung (University of Toronto)
- astroNN.nn.losses.bayesian_binary_crossentropy_wrapper(logit_var)[source]
- Binary crossentropy between an output tensor and a target tensor for Bayesian Neural Network; equation (12) of arxiv:1703.04977
- Parameters
logit_var (Union(tf.Tensor, tf.Variable)) – Predictive variance
- Returns
Robust binary_crossentropy function for predictive variance neurones which matches Keras losses API
- Return type
function
- Returned Function Parameter
- function(y_true, y_pred)
y_true (tf.Tensor): Ground Truth
y_pred (tf.Tensor): Prediction in logits space
Return (tf.Tensor): Robust binary crossentropy
- History
2018-Mar-15 - Written - Henry Leung (University of Toronto)
- astroNN.nn.losses.bayesian_binary_crossentropy_var_wrapper(logits)[source]
- Binary crossentropy between an output tensor and a target tensor for Bayesian Neural Network; equation (12) of arxiv:1703.04977
- Parameters
logits (Union(tf.Tensor, tf.Variable)) – Prediction in logits space
- Returns
Robust binary_crossentropy function for predictive variance neurones which matches Keras losses API
- Return type
function
- Returned Function Parameter
- function(y_true, y_pred)
y_true (tf.Tensor): Ground Truth
y_pred (tf.Tensor): Predictive variance in logits space
Return (tf.Tensor): Robust binary crossentropy
- History
2018-Mar-15 - Written - Henry Leung (University of Toronto)
It is based on Equation 12 from arxiv:1703.04977. \(s_i\) is representing the predictive variance of logits
where Distorted Binary Cross-Entropy is defined as
And thus the loss for mini-batch is
bayesian_binary_crossentropy_wrapper is for the prediction neurones
bayesian_binary_crossentropy_var_wrapper is for the predictive variance neurones
They basically do the same things and can be used with Keras, you just have to import the functions from astroNN
def keras_model():
    # Your keras_model define here

    variance_output = Dense(name='variance_output', ...)
    output = Dense(name='output', ...)

    # model for the training process
    model = Model(inputs=[input_tensor], outputs=[output, variance_output])

    # model for the prediction
    model_prediction = Model(inputs=input_tensor, outputs=[output, variance_output])

    predictive_variance_loss = bayesian_binary_crossentropy_var_wrapper(output)
    output_loss = bayesian_binary_crossentropy_wrapper(variance_output)

    return model, model_prediction, output_loss, predictive_variance_loss

model, model_prediction, output_loss, predictive_variance_loss = keras_model()
# remember to import astroNN loss function first
model.compile(loss={'output': output_loss, 'variance_output': predictive_variance_loss}, ...)
Categorical Classification Accuracy
- astroNN.nn.losses.categorical_accuracy(y_true, y_pred)[source]
Calculate categorical accuracy, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
- Returns
Categorical Classification Accuracy
- Return type
tf.Tensor
- History
2018-Jan-21 - Written - Henry Leung (University of Toronto)
Categorical Classification Accuracy will first deal with Magic Number
Then based on the equation
And thus the accuracy is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's metrics function first
model.compile(metrics=categorical_accuracy, ...)
Note
Please make sure you use categorical_accuracy when using categorical_crossentropy as the loss function
Binary Classification Accuracy
- astroNN.nn.losses.binary_accuracy(*args, **kwargs)[source]
Calculate binary accuracy, ignoring the magic number
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
- Returns
Binary accuracy
- Return type
tf.Tensor
- History
2018-Jan-31 - Written - Henry Leung (University of Toronto)
Binary Classification Accuracy will round the values of the prediction if from_logits=False, or will apply sigmoid first and then round the values of the prediction if from_logits=True, and is then based on the equation
And thus the accuracy is
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's metrics function first
model.compile(metrics=binary_accuracy(from_logits=False), ...)
Note
Please make sure you use binary_accuracy when using binary_crossentropy as the loss function
Zeros Loss
- astroNN.nn.losses.zeros_loss(y_true, y_pred, sample_weight=None)[source]
Always return zeros
- Parameters
y_true (Union(tf.Tensor, tf.Variable)) – Ground Truth
y_pred (Union(tf.Tensor, tf.Variable)) – Prediction
sample_weight (Union(tf.Tensor, tf.Variable, list)) – Sample weights
- Returns
Zeros
- Return type
tf.Tensor
- History
2018-May-24 - Written - Henry Leung (University of Toronto)
zeros_loss is a loss function that will always return zero loss, and the function matches the Keras API. It is mainly designed for testing or experiments.
It can be used with Keras, you just have to import the function from astroNN
def keras_model():
    # Your keras_model define here
    return model

model = keras_model()
# remember to import astroNN's loss function first
model.compile(loss=zeros_loss, ...)
Layers
astroNN provides some customized layers under the astroNN.nn.layers module, which are built on tensorflow.keras. You can just treat astroNN customized layers as conventional Keras layers.
Monte Carlo Dropout Layer
- class astroNN.nn.layers.MCDropout(*args, **kwargs)[source]
Dropout Layer for Bayesian Neural Network; this layer will always be on regardless of the learning phase flag
- Parameters
rate (float) – Dropout Rate between 0 and 1
disable (boolean) – Dropout on or off
- Returns
A layer
- Return type
- History
2018-Feb-05 - Written - Henry Leung (University of Toronto)
MCDropout is basically Keras's Dropout layer without seed argument support. Moreover, the layer will ignore Keras's learning phase flag, so the layer will always stay on even in the prediction phase.
Dropout can be described by the following formula; let's say we have \(i\) neurones after activation with values \(y_i\)
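A plausible formulation of the standard Bernoulli dropout this layer applies (scaling conventions inside Keras may differ slightly):

\[r_i \sim \text{Bernoulli}(p), \qquad \hat{y_i} = r_i \, y_i\]

where \(p\) is the dropout rate and \(r_i\) is sampled anew on every forward pass, including at prediction time for this layer.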
And here is an example of usage
def keras_model():
    # Your keras_model define here, assuming you are using functional API
    b_dropout = MCDropout(0.2)(some_keras_layer)
    return model
If you really want to disable the dropout, you do it by
# Your keras_model define here, assuming you are using functional API
b_dropout = MCDropout(0.2, disable=True)(some_keras_layer)
Monte Carlo Dropout with Continuous Relaxation Layer Wrapper
- class astroNN.nn.layers.MCConcreteDropout(*args, **kwargs)[source]
- Monte Carlo Dropout with Continuous Relaxation Layer Wrapper. This layer will learn the dropout probability. arXiv:1705.07832
- Parameters
layer (keras.layers.Layer) – The layer to be applied concrete dropout
- Returns
A layer
- Return type
- History
2018-Mar-04 - Written - Henry Leung (University of Toronto)
MCConcreteDropout is an implementation of arXiv:1705.07832, modified from the original implementation here. Moreover, the layer will ignore Keras's learning phase flag, so the layer will always stay on even in the prediction phase. This layer should be used for experimental purposes only as it has not been tested rigorously. MCConcreteDropout is technically a layer wrapper instead of a standard layer, so it needs to take a layer as an input argument.
The main difference between MCConcreteDropout and standard Bernoulli dropout is that MCConcreteDropout learns the dropout rate during training instead of using a fixed probability. Tuning/learning the dropout rate is not a novel idea; it can be traced back to one of the original papers on variational dropout, arXiv:1506.02557. But MCConcreteDropout focuses on the role and importance of dropout with Bayesian techniques.
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 c_dropout = MCConcreteDropout(some_keras_layer)(previous_layer)
4 return model
If you really want to disable the dropout, you can do it by
1# Your keras_model define here, assuming you are using functional API
2c_dropout = MCConcreteDropout((some_keras_layer), disable=True)(previous_layer)
Monte Carlo Spatial Dropout Layer
MCSpatialDropout1D should be used with Conv1D and MCSpatialDropout2D should be used with Conv2D
- class astroNN.nn.layers.MCSpatialDropout1D(*args, **kwargs)[source]
Spatial 1D version of the Dropout layer for Bayesian Neural Network; this layer will always be on regardless of the learning phase flag
- Parameters
rate (float) – Dropout Rate between 0 and 1
disable (boolean) – Dropout on or off
- Returns
A layer
- Return type
- History
2018-Mar-07 - Written - Henry Leung (University of Toronto)
- call(inputs, training=None)
- Note
Equivalent to __call__()
- Parameters
inputs (tf.Tensor) – Tensor to be applied
- Returns
Tensor after applying the layer
- Return type
tf.Tensor
- class astroNN.nn.layers.MCSpatialDropout2D(*args, **kwargs)[source]
Spatial 2D version of the Dropout layer for Bayesian Neural Network; this layer will always be on regardless of the learning phase flag
- Parameters
rate (float) – Dropout Rate between 0 and 1
disable (boolean) – Dropout on or off
- Returns
A layer
- Return type
- History
2018-Mar-07 - Written - Henry Leung (University of Toronto)
- call(inputs, training=None)
- Note
Equivalent to __call__()
- Parameters
inputs (tf.Tensor) – Tensor to be applied
- Returns
Tensor after applying the layer
- Return type
tf.Tensor
MCSpatialDropout1D and MCSpatialDropout2D are basically Keras's Spatial Dropout layers without seed and noise_shape argument support. Moreover, the layers will ignore Keras's learning phase flag, so the layers will always stay on even in the prediction phase.
This version performs the same function as Dropout, however it drops entire 1D feature maps instead of individual elements. If adjacent frames within feature maps are strongly correlated (as is normally the case in early convolution layers) then regular dropout will not regularize the activations and will otherwise just result in an effective learning rate decrease. In this case, SpatialDropout1D will help promote independence between feature maps and should be used instead.
For technical detail, you can refer to the original paper arXiv:1411.4280
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 b_dropout = MCSpatialDropout1D(0.2)(keras_conv_layer)
4 return model
If you really want to disable the dropout, you can do it by
1# Your keras_model define here, assuming you are using functional API
2b_dropout = MCSpatialDropout1D(0.2, disable=True)(keras_conv_layer)
Monte Carlo Gaussian Dropout Layer
- class astroNN.nn.layers.MCGaussianDropout(*args, **kwargs)[source]
Gaussian Dropout layer for Bayesian Neural Network; this layer will always be on regardless of the learning phase flag. The multiplicative Gaussian noise has standard deviation sqrt(rate / (1 - rate))
- Parameters
rate (float) – Dropout Rate between 0 and 1
disable (boolean) – Dropout on or off
- Returns
A layer
- Return type
- History
2018-Mar-07 - Written - Henry Leung (University of Toronto)
MCGaussianDropout is basically Keras's GaussianDropout layer without seed argument support. Moreover, the layer will ignore Keras's learning phase flag, so the layer will always stay on even in the prediction phase.
MCGaussianDropout should be used with caution for Bayesian Neural Network: https://arxiv.org/abs/1711.02989
Gaussian Dropout can be described by the following formula; suppose we have \(i\) neurons after activation with values \(y_i\): \(r_{i} \sim \mathcal{N}\!\left(1, \sqrt{\tfrac{p}{1-p}}\right)\), \(\hat{y}_i = r_{i}\, y_i\), where \(p\) is the dropout rate.
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 b_dropout = MCGaussianDropout(0.2)(some_keras_layer)
4 return model
If you really want to disable the dropout, you can do it by
1# Your keras_model define here, assuming you are using functional API
2b_dropout = MCGaussianDropout(0.2, disable=True)(some_keras_layer)
Monte Carlo Batch Normalization Layer
- class astroNN.nn.layers.MCBatchNorm(*args, **kwargs)[source]
Monte Carlo Batch Normalization Layer for Bayesian Neural Network
- Parameters
disable (boolean) – Dropout on or off
- Returns
A layer
- Return type
- History
2018-Apr-12 - Written - Henry Leung (University of Toronto)
MCBatchNorm is a layer that does Batch Normalization, originally described in arXiv: https://arxiv.org/abs/1502.03167
MCBatchNorm should be used with caution for Bayesian Neural Network: https://openreview.net/forum?id=BJlrSmbAZ
Batch Normalization can be described by the following formula; suppose we have \(N\) activations \(x_{1 \ldots N}\) for a layer: \(\mu = \frac{1}{N}\sum_{i} x_i\), \(\sigma^2 = \frac{1}{N}\sum_{i} (x_i - \mu)^2\), \(\hat{x}_i = \frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}}\), \(y_i = \gamma \hat{x}_i + \beta\), where \(\gamma\) and \(\beta\) are learnable parameters.
MCBatchNorm can be imported by
1from astroNN.nn.layers import MCBatchNorm
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 b_dropout = MCBatchNorm()(some_keras_layer)
4 return model
Error Propagation Layer
- class astroNN.nn.layers.ErrorProp(*args, **kwargs)[source]
Propagate Error Layer, which adds Gaussian noise (mean=0, std=err) from the input_err tensor during the testing phase
- Returns
A layer
- Return type
- History
2018-Feb-05 - Written - Henry Leung (University of Toronto)
ErrorProp is a layer designed to do error propagation in a neural network. It acts as an identity transformation layer during the training phase but adds Gaussian noise to the input during the test phase. The idea is that if you have known uncertainty in the input, you may want to understand how the input uncertainty (more specifically, this layer assumes the uncertainty is Gaussian) affects the output. Since this layer adds random Gaussian uncertainty of known size to the input, you can run the model prediction a few times to get a set of predictions; the mean of those predictions will be the final prediction and the standard deviation of the predictions will be the propagated uncertainty.
ErrorProp can be imported by
1from astroNN.nn.layers import ErrorProp
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 input = Input(.....)
4 input_error = Input(.....)
5 input_with_error = ErrorProp()([input, input_error])
6 return model
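Since the noise is only added at test time, the propagation itself is done by repeated forward passes. Below is a minimal Monte Carlo sketch, assuming model is a trained Keras model containing an ErrorProp layer and x_test, x_test_err are placeholder arrays you would prepare yourself:
import numpy as np

mc_num = 100  # number of Monte Carlo forward passes
predictions = np.stack([model.predict([x_test, x_test_err]) for _ in range(mc_num)])

final_prediction = predictions.mean(axis=0)       # mean over the Monte Carlo samples
propagated_uncertainty = predictions.std(axis=0)  # standard deviation as the propagated uncertainty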
KL-Divergence Layer for Variational Autoencoder
- class astroNN.nn.layers.KLDivergenceLayer(*args, **kwargs)[source]
- Identity transform layer that adds KL divergence to the final model losses. The KL divergence is used to force the latent space to match the prior (in this case it's a unit Gaussian)
- Returns
A layer
- Return type
- History
2018-Feb-05 - Written - Henry Leung (University of Toronto)
KLDivergenceLayer is a layer designed to be used in a Variational Autoencoder. It acts as an identity transformation layer but adds the KL-divergence to the total loss.
KLDivergenceLayer can be imported by
1from astroNN.nn.layers import KLDivergenceLayer
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 z_mu = Encoder_Mean_Layer(.....)
4 z_log_var = Encoder_Var_Layer(.....)
5 z_mu, z_log_var = KLDivergenceLayer()([z_mu, z_log_var])
6 # And then decoder or whatever
7 return model
Polynomial Fitting Layer
- class astroNN.nn.layers.PolyFit(*args, **kwargs)[source]
n-deg polynomial fitting layer which acts as a neural network layer to be optimized
- Parameters
deg (int) – degree of polynomial
output_units (int) – number of output neurons
use_xbias (bool) – If True, then fitting output=P(inputs)+inputs, else fitting output=P(inputs)
init_w (Union[NoneType, list]) – [Optional] list of initial weights if there is any, the list should be [n-degree, input_size, output_size]
name (Union[NoneType, str]) – [Optional] name of the layer
activation (Union[NoneType, str]) – [Optional] activation, default is ‘linear’
kernel_regularizer (Union[NoneType, str]) – [Optional] kernel regularizer
kernel_constraint (Union[NoneType, str]) – [Optional] kernel constraint
- Returns
A layer
- Return type
- History
2018-Jul-24 - Written - Henry Leung (University of Toronto)
PolyFit is a layer designed to do n-degree polynomial fitting in a neural network style by treating the polynomial coefficients as neural network weights and optimizing them with the neural network optimizer. For a single input and output value, the fitted polynomial is of the form \(p(x) = w_0 + w_1 x + \cdots + w_n x^n\) (with an additional \(+\,x\) term if use_xbias=True), and you can specify the initial weights by init_w=[[[\(w_0\)]], [[\(w_1\)]], …, [[\(w_n\)]]].
For multiple i input values and j output values and n-deg polynomial (you can specify initial weights by init_w=[[[\(w_{0, 1, 0}\), \(w_{0, 1, 1}\), …, \(w_{0, 1, j}\)], [\(w_{0, 2, 0}\), \(w_{0, 2, 1}\), …, \(w_{0, 2, j}\)], … [\(w_{0, i, 0}\), \(w_{0, i, 1}\), …, \(w_{0, i, j}\)]], …, [[\(w_{n, 1, 0}\), \(w_{n, 1, 1}\), …, \(w_{n, 1, j}\)], [\(w_{n, 2, 0}\), \(w_{n, 2, 1}\), …, \(w_{n, 2, j}\)], … [\(w_{n, i, 0}\), \(w_{n, i, 1}\), …, \(w_{n, i, j}\)]]])
The polynomial takes the analogous form for multiple \(i\) input values and \(j\) output values with an n-deg polynomial, with coefficient \(w_{k, i, j}\) weighting the \(k\)-th degree term of input \(i\) for output \(j\).
PolyFit can be imported by
1from astroNN.nn.layers import PolyFit
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 input = Input(.....)
4 output = PolyFit(deg=1)(input)
5 return model(inputs=input, outputs=output)
To show it works as a polynomial, you can refer the following example:
1import numpy as np
2from astroNN.nn.layers import PolyFit
3
4from astroNN.shared.nn_tools import cpu_fallback
5from tensorflow import keras
6
7cpu_fallback() # force tf to use CPU
8
9Input = keras.layers.Input
10Model = keras.models.Model
11
12# Data preparation
13polynomial_coefficient = [0.1, -0.05]
14random_xdata = np.random.normal(0, 3, (100, 1))
15random_ydata = polynomial_coefficient[1] * random_xdata + polynomial_coefficient[0]
16
17input = Input(shape=[1, ])
18# set initial weights
19output = PolyFit(deg=1, use_xbias=False, init_w=[[[0.1]], [[-0.05]]], name='polyfit')(input)
20model = Model(inputs=input, outputs=output)
21
22# predict without training (i.e. without gradient updates)
23np.allclose(model.predict(random_xdata), random_ydata)
24>>> True # True means prediction approx close enough
Mean and Variance Calculation Layer for Bayesian Neural Net
- class astroNN.nn.layers.FastMCInferenceMeanVar(*args, **kwargs)[source]
Take mean and variance of the results of a TimeDistributed layer, assuming axis=1 is the timestamp axis
- Returns
A layer
- Return type
- History
- 2018-Feb-02 - Written - Henry Leung (University of Toronto)
- 2018-Apr-13 - Update - Henry Leung (University of Toronto)
If you want fast MC inference on GPU and you are using Keras models, you should just use FastMCInference.
FastMCInferenceMeanVar is a layer designed to be used with a Bayesian Neural Network with Dropout Variational Inference. FastMCInferenceMeanVar should be used with FastMCInference in general. The advantage of the FastMCInferenceMeanVar layer is that you can copy the data and calculate the mean and variance on the GPU (if any) when you are doing dropout variational inference.
FastMCInferenceMeanVar can be imported by
1from astroNN.nn.layers import FastMCInferenceMeanVar
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 input = Input(.....)
4 monte_carlo_dropout = FastMCRepeat(mc_num_here)
5 # some layer here, you should use MCDropout from astroNN instead of Dropout from Tensorflow:)
6 result_mean_var = FastMCInferenceMeanVar()(previous_layer_here)
7 return model
8
9model.compile(loss=loss_func_here, optimizer=optimizer_here)
10
11# Use the model to predict
12output = model.predict(x)
13
14# with dropout variational inference
15# prediction and model uncertainty (variance) from the model
16mean = output[0]
17variance = output[1]
Repeat Vector Layer for Bayesian Neural Net
- class astroNN.nn.layers.FastMCRepeat(*args, **kwargs)[source]
Prepare data to do inference: repeats the input n times at axis=1
- Parameters
n (int) – Number of Monte Carlo integration
- Returns
A layer
- Return type
- History
- 2018-Feb-02 - Written - Henry Leung (University of Toronto)
- 2018-Apr-13 - Update - Henry Leung (University of Toronto)
If you want fast MC inference on GPU and you are using Keras models, you should just use FastMCInference.
FastMCRepeat is a layer to repeat training data to do Monte Carlo integration required by Bayesian Neural Network.
FastMCRepeat is a layer designed to be used with a Bayesian Neural Network with Dropout Variational Inference. FastMCRepeat should be used with FastMCInferenceMeanVar in general. The advantage of the FastMCRepeat layer is that you can copy the data and calculate the mean and variance on the GPU (if any) when you are doing dropout variational inference.
FastMCRepeat can be imported by
1from astroNN.nn.layers import FastMCRepeat
And here is an example of usage
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 input = Input(.....)
4 monte_carlo_dropout = FastMCRepeat(mc_num_here)
5 # some layer here, you should use MCDropout from astroNN instead of Dropout from Tensorflow:)
6 result_mean_var = FastMCInferenceMeanVar()(previous_layer_here)
7 return model
8
9model.compile(loss=loss_func_here, optimizer=optimizer_here)
10
11# Use the model to predict
12output = model.predict(x)
13
14# with dropout variational inference
15# prediction and model uncertainty (variance) from the model
16mean = output[0]
17variance = output[1]
Fast Monte Carlo Integration Layer for Keras Model
- class astroNN.nn.layers.FastMCInference(n, **kwargs)[source]
Turn a model for fast Monte Carlo (Dropout, Flipout, etc) Inference on GPU
- Parameters
n (int) – Number of Monte Carlo integration
- Returns
A layer
- Return type
- History
- 2018-Apr-13 - Written - Henry Leung (University of Toronto)
- 2021-Apr-14 - Updated - Henry Leung (University of Toronto)
FastMCInference is a layer designed for fast Monte Carlo inference on GPU. One of the main challenges of MC integration on GPU is keeping the data on the GPU so that the whole MC integration runs there; moving data from drives to the GPU is a very expensive operation. FastMCInference will create a new Keras model that replicates the data on the GPU, does the Monte Carlo integration and calculates the mean and variance on the GPU, and returns the result.
Benchmark (Nvidia GTX1060 6GB): for 98,000 APOGEE spectra with 7,514 pixels each, 25 forward passes traditionally took ~270 seconds; with FastMCInference the exact same task took only ~65 seconds.
It can only be used with a Keras model. If you are using a customised model purely with Tensorflow, you should use FastMCRepeat and FastMCInferenceMeanVar.
You can import the function from astroNN by
1from astroNN.nn.layers import FastMCInference
2
3# keras_model is your keras model with 1 output which is a concatenation of labels prediction and predictive variance
4keras_model = Model(....)
5
6# fast_mc_model is the new keras model capable of doing fast monte carlo integration on GPU
7fast_mc_model = FastMCInference(100)(keras_model)  # e.g. 100 Monte Carlo forward passes
8
9# You can just use keras API with the new model such as
10result = fast_mc_model.predict(.....)
11
12# here is the result dimension
13# labels_std below refers to the standard deviation used to normalize the labels during training (1 if not normalized)
14predictions = result[:, :(result.shape[1] // 2), 0]  # mean prediction
15mc_dropout_uncertainty = result[:, :(result.shape[1] // 2), 1] * (labels_std ** 2)  # model uncertainty
16predictions_var = np.exp(result[:, (result.shape[1] // 2):, 0]) * (labels_std ** 2)  # predictive uncertainty
Gradient Stopping Layer
- class astroNN.nn.layers.StopGrad(*args, **kwargs)[source]
Stop gradient backpropagation via this layer during training, act as an identity layer during testing by default.
- Parameters
always_on (bool) – Default False, which means the layer stops gradients during training and is off (identity) during testing; set True to enable it in every situation
- Returns
A layer
- Return type
- History
2018-May-23 - Written - Henry Leung (University of Toronto)
It uses tf.stop_gradient
and acts as a Keras layer.
StopGrad can be imported by
1from astroNN.nn.layers import StopGrad
It can be used with keras or tensorflow.keras; you just have to import the function from astroNN
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 input = Input(.....)
4 # some layers ...
5 stopped_grad_layer = StopGrad()(...)
6 # some layers ...
7 return model
For example, if you have a model with multiple branches and you only want the error to backpropagate to one branch but not the other,
1from astroNN.nn.layers import StopGrad
2# we use zeros loss just to demonstrate StopGrad works and no error backprop from StopGrad layer
3from astroNN.nn.losses import zeros_loss
4import numpy as np
5from astroNN.shared.nn_tools import cpu_fallback
6from tensorflow import keras
7
8cpu_fallback() # force tf to use CPU
9
10Input = keras.layers.Input
11Dense = keras.layers.Dense
12concatenate = keras.layers.concatenate
13Model = keras.models.Model
14
15# Data preparation
16random_xdata = np.random.normal(0, 1, (100, 7514))
17random_ydata = np.random.normal(0, 1, (100, 25))
18input2 = Input(shape=[7514])
19dense1 = Dense(100, name='normaldense')(input2)
20dense2 = Dense(25, name='wanted_dense')(input2)
21dense2_stopped = StopGrad(name='stopgrad', always_on=True)(dense2)
22output2 = Dense(25, name='wanted_dense2')(concatenate([dense1, dense2_stopped]))
23model2 = Model(inputs=input2, outputs=[output2, dense2])
24model2.compile(optimizer=keras.optimizers.SGD(lr=0.1),
25 loss={'wanted_dense2': 'mse', 'wanted_dense': zeros_loss})
26weight_b4_train = model2.get_layer(name='wanted_dense').get_weights()[0]
27weight_b4_train2 = model2.get_layer(name='normaldense').get_weights()[0]
28model2.fit(random_xdata, [random_ydata, random_ydata])
29weight_a4_train = model2.get_layer(name='wanted_dense').get_weights()[0]
30weight_a4_train2 = model2.get_layer(name='normaldense').get_weights()[0]
31
32print(np.all(weight_b4_train == weight_a4_train))
33>>> True # meaning all the elements from Dense with StopGrad layer are equal due to no gradient update
34print(np.all(weight_b4_train2 == weight_a4_train2))
35>>> False # meaning not all the elements from normal Dense layer are equal due to gradient update
Boolean Masking Layer
- class astroNN.nn.layers.BoolMask(*args, **kwargs)[source]
Boolean Masking layer, please notice it is best to flatten input before using BoolMask
- Parameters
mask (np.ndarray) – numpy boolean array as a mask for incoming tensor
- Returns
A layer
- Return type
- History
2018-May-28 - Written - Henry Leung (University of Toronto)
BoolMask takes a numpy boolean array as layer initialization and masks the input tensor.
BoolMask can be imported by
1from astroNN.nn.layers import BoolMask
It can be used with keras or tensorflow.keras; you just have to import the function from astroNN
1def keras_model():
2 # Your keras_model define here, assuming you are using functional API
3 input = Input(.....)
4 # some layers ...
5 masked_layer = BoolMask(mask=....)(...)
6 # some layers ...
7 return model
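Below is a minimal self-contained sketch (the toy model and data are made up for illustration) showing how the boolean mask removes features along the last axis:
import numpy as np
from tensorflow import keras
from astroNN.nn.layers import BoolMask

# keep the 1st and 3rd features, drop the 2nd
mask = np.array([True, False, True])

x_in = keras.layers.Input(shape=[3])
x_masked = BoolMask(mask=mask)(x_in)
toy_model = keras.models.Model(inputs=x_in, outputs=x_masked)

print(toy_model.predict(np.array([[1., 2., 3.]])))
# expected to keep only the unmasked elements, i.e. [[1., 3.]]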
TensorInput Layer
- class astroNN.nn.layers.TensorInput(*args, **kwargs)[source]
TensorInput layer
- Parameters
tensor (tf.Tensor) – tensor, usually is a tensor generating random number
- Returns
A layer
- Return type
- History
2020-May-3 - Written - Henry Leung (University of Toronto)
TensorInput takes a tensorflow tensor as layer initialization and returns the tensor.
TensorInput can be imported by
1from astroNN.nn.layers import TensorInput
For example, if you want to generate a random tensor as input to other layers and do not want to register it as a model input, you can
1from astroNN.nn.layers import TensorInput
2# we generate a random tensor inside the model without registering it as a model input
3# (no extra loss function is needed for this example)
4import numpy as np
5from astroNN.shared.nn_tools import cpu_fallback
6import tensorflow as tf
7from tensorflow import keras
8
9cpu_fallback() # force tf to use CPU
10
11Input = keras.layers.Input
12Dense = keras.layers.Dense
13concatenate = keras.layers.concatenate
14Model = keras.models.Model
15
16# Data preparation
17random_xdata = np.random.normal(0, 1, (100, 7514))
18random_ydata = np.random.normal(0, 1, (100, 25))
19input1 = Input(shape=[7514])
20input2 = TensorInput(tensor=tf.random.normal(mean=0., stddev=1., shape=tf.shape(input1)))([])
21output = Dense(25, name='dense')(concatenate([input1, input2]))
22model = Model(inputs=input1, outputs=output)
23model.compile(optimizer=keras.optimizers.SGD(lr=0.1),
24 loss='mse')
25print(model.input_names)
26>>> ['input_1'] # only input_1, as input_2 is not really an input we require the user to provide
Callbacks and Utilities
A callback is a set of functions under the astroNN.nn.callbacks and astroNN.nn.utilities modules to be applied at given stages of the training procedure.
astroNN provides some customized callbacks which are built on tensorflow.keras. You can just treat astroNN customized callbacks as conventional Keras callbacks.
astroNN also contains some handy utilities for data processing
Virtual CSVLogger (Callback)
- class astroNN.nn.callbacks.VirutalCSVLogger(filename='training_history.csv', separator=',', append=False)[source]
A modification of Keras's CSVLogger; it does not actually write a file until you call the method to save
- Parameters
- Returns
callback instance
- Return type
- History
- 2018-Feb-22 - Written - Henry Leung (University of Toronto)
- 2018-Mar-12 - Update - Henry Leung (University of Toronto)
VirutalCSVLogger is basically Keras's CSVLogger without Python 2 support, and it won't write the file to disk until the savefile() method is called after training, whereas Keras's CSVLogger writes to disk immediately.
VirutalCSVLogger can be imported by
1from astroNN.nn.callbacks import VirutalCSVLogger
It can be used with Keras; you just have to import the function from astroNN
1def keras_model():
2 # Your keras_model define here
3 return model
4
5# Create a Virtual_CSVLogger instance first
6csvlogger = VirutalCSVLogger()
7
8# Default filename is training_history.csv
9# You have to set filename first before passing to Keras
10csvlogger.filename = 'training_history.csv'
11
12model = keras_model()
13model.compile(....)
14
15model.fit(...,callbacks=[csvlogger])
16
17# Save the file to current directory
18csvlogger.savefile()
19
20# OR to save the file to other directory
21csvlogger.savefile(folder_name='some_folder')
Raising Error on Nan (Callback)
- class astroNN.nn.callbacks.ErrorOnNaN(monitor='loss')[source]
Callback that raises an error when a NaN is encountered.
- Returns
callback instance
- Return type
- History
- 2018-May-07 - Written - Henry Leung (University of Toronto)
- 2021-Apr-22 - Written - Henry Leung (University of Toronto)
ErrorOnNaN is basically Keras's TerminateOnNaN but it raises a ValueError on NaN; it's useful for Python unit tests to make sure you can catch the error and know something is wrong.
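A minimal usage sketch, assuming model, x_train and y_train are already defined:
from astroNN.nn.callbacks import ErrorOnNaN

# a ValueError will be raised (instead of silently terminating) if the monitored loss becomes NaN
model.fit(x_train, y_train, callbacks=[ErrorOnNaN()])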
Normalizer (Utility)
astroNN's Normalizer is called when the train() method is called and is involved via the pre_training_checklist_master() method defined in the NeuralNetMaster class. The Normalizer will not normalize data/labels equal to the magicnumber defined in the configuration file, so that astroNN loss functions can recognize those missing/bad data.
The Normalizer provides a few modes you can choose from, but in general a mode subtracts a mean from the data and divides by a standard deviation.
Mode 0 means normalizing data with mean=0 and standard deviation=1 (same as doing nothing)
1# If we have some data
2data = np.array([[1,2,3], [9,8,7]])
3
4# The normalized data, mean and std are as follows in this mode
5norm_data = array([[1,2,3], [9,8,7]])
6# the mean and standard deviation used to do the normalization
7mean = [0.]
8std = [1.]
Mode 1 means normalizing data with a single mean and a single standard deviation of the data
1# If we have some data
2data = np.array([[1,2,3], [9,8,7]])
3
4# The normalized data, mean and std are as follows in this mode
5norm_data = array([[-1.28653504, -0.96490128, -0.64326752], [ 1.28653504, 0.96490128, 0.64326752]])
6# the mean and standard deviation used to do the normalization
7mean = [5.0]
8std = [3.11]
Mode 2 means normalizing data with pixelwise means and pixelwise standard deviations of the data
1# If we have some data
2data = np.array([[1,2,3], [9,8,7]])
3
4# The normalized data, mean and std are as follows in this mode
5norm_data = array([[-1., -1., -1.], [ 1., 1., 1.]])
6# the mean and standard deviation used to do the normalization
7mean = [5., 5., 5.]
8std = [4., 3., 2.]
Mode 3 means normalizing data with a featurewise mean and standard deviation=1 (only centering the data); it is useful for normalizing spectra
1# If we have some data
2data = np.array([[1,2,3], [9,8,7]])
3
4# The normalized data, mean and std are as follows in this mode
5norm_data = array([[-4., -3., -2.], [ 4., 3., 2.]])
6# the mean and standard deviation used to do the normalization
7mean = [5., 5., 5.]
8std = [1.]
Mode 3s means normalizing data with a featurewise mean and standard deviation=1 (only centering the data), then applying a sigmoid for normalization or the inverse sigmoid for denormalization. It is useful for normalizing spectra for a Variational Autoencoder with a Negative Log Likelihood objective.
Mode 255 means normalizing data with mean=127.5 and standard deviation=127.5; this mode is designed to normalize 8-bit images
1# If we have some data
2data = np.array([[255,125,100], [99,87,250]])
3
4# The normalized data, mean and std are as follows in this mode
5norm_data = array([[ 1. , -0.01960784, -0.21568627], [-0.22352941, -0.31764706, 0.96078431]])
6# the mean and standard deviation used to do the normalization
7mean = [127.5]
8std = [127.5]
You can set the mode on an astroNN neural net instance before calling the train() method by
1# To set the normalization mode for input and labels
2astronn_neuralnet.input_norm_mode = ...
3astronn_neuralnet.labels_norm_mode = ...
You can use Normalizer() independently to take advantage of the fact that it won't touch data equal to magicnumber.
Normalizer() always returns the normalized data together with the mean and standard deviation used to do the normalization
1from astroNN.nn.utilities.normalizer import Normalizer
2import numpy as np
3
4# Make some data up
5data = np.array([[1.,2.,3.], [9.,8.,7.]])
6
7# Setup a normalizer instance with a mode, lets say mode 1
8normer = Normalizer(mode=1)
9
10# Use the instance method normalize to normalize the data
11norm_data = normer.normalize(data)
12
13print(norm_data)
14>>> array([[-1.28653504, -0.96490128, -0.64326752], [ 1.28653504, 0.96490128, 0.64326752]])
15print(normer.mean_labels)
16>>> 5.0
17print(normer.std_labels)
18>>> 3.1091263510296048
19
20# You can use the same instance (with the same mean, std and mode) to denormalize data
21denorm_data = normer.denormalize(norm_data)
22
23print(denorm_data)
24>>> array([[1.,2.,3.], [9.,8.,7.]])
Useful Handy Tensorflow function - astroNN.nn
- astroNN.nn.reduce_var(x, axis=None, keepdims=False)[source]
Calculate variance using Tensorflow (as opposed to tf.nn.moments which returns both variance and mean)
- Parameters
x (tf.Tensor) – Data
axis (int) – Axis
keepdims (boolean) – Keeping variance dimension as data or not
- Returns
Variance
- Return type
tf.Tensor
- History
2018-Mar-04 - Written - Henry Leung (University of Toronto)
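A small usage sketch (the expected value assumes the usual population variance, i.e. the same convention as tf.nn.moments):
import tensorflow as tf
from astroNN.nn import reduce_var

print(reduce_var(tf.constant([1., 2., 3., 4.])))
# expected to be 1.25, the (population) variance of [1, 2, 3, 4]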
- astroNN.nn.intpow_avx2(x, n)[source]
Calculate the integer power of a float (including negative values) even when Tensorflow is compiled with AVX2, since the --fast-math compiler flag (commonly used together with the AVX2 flag) aggressively optimizes float operations
- Parameters
x (tf.Tensor) – input tensor
n (int) – an integer power (a float will be cast to an integer!!)
- Returns
powered float(s)
- Return type
tf.Tensor
- History
2018-Aug-13 - Written - Henry Leung (University of Toronto)
1from astroNN.nn import intpow_avx2
2import tensorflow as tf
3
4print(intpow_avx2(tf.constant([-1.2]), 2))
5>>> tf.Tensor([1.44], shape=(1,), dtype=float32)
6
7print(tf.pow(tf.constant([-1.2]), 2))
8# if your tensorflow is compiled with AVX2 or --fast-math
9>>> tf.Tensor([nan], shape=(1,), dtype=float32)
10# if your tensorflow is NOT compiled with AVX2 or --fast-math
11>>> tf.Tensor([1.44], shape=(1,), dtype=float32)
NumPy Implementation of Tensorflow function - astroNN.nn.numpy
astroNN has handy NumPy implementations of a number of tensorflow functions. The available functions are listed below.
- astroNN.nn.numpy.kl_divergence(x, y)[source]
NumPy implementation of tf.distributions.kl_divergence
Either both x and y are ndarray or both x and y are astropy.Quantity; the return value carries no astropy units in all cases
- astroNN.nn.numpy.mean_absolute_error(x, y, axis=None)[source]
NumPy implementation of tf.keras.metrics.mean_absolute_error with the capability to deal with magicnumber and astropy Quantity. Either both x and y are ndarray or both x and y are astropy.Quantity; the return value carries no astropy units in all cases
- Parameters
- Raise
TypeError when only either x or y contains astropy units. Both x, y should carry/not carry astropy units at the same time
- Returns
Mean Absolute Error
- Return type
Union[ndarray, float]
- History
2018-Apr-11 - Written - Henry Leung (University of Toronto)
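A small usage sketch with plain ndarrays (astropy Quantity inputs work the same way as long as both arguments carry units):
import numpy as np
from astroNN.nn.numpy import mean_absolute_error

print(mean_absolute_error(np.array([1., 2., 3.]), np.array([2., 2., 2.])))
# expected to be about 0.667, i.e. the mean of |[-1, 0, 1]|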
- astroNN.nn.numpy.mean_absolute_percentage_error(x, y, axis=None)[source]
- NumPy implementation of tf.keras.metrics.mean_absolute_percentage_error with the capability to deal with magicnumber and astropy Quantity. Either both x and y are ndarray or both x and y are astropy.Quantity; the return value carries no astropy units in all cases
- Parameters
- Raise
TypeError when only either x or y contains astropy units. Both x, y should carry/not carry astropy units at the same time
- Returns
Mean Absolute Percentage Error
- Return type
Union[ndarray, float]
- History
2018-Apr-11 - Written - Henry Leung (University of Toronto)
- astroNN.nn.numpy.median_absolute_error(x, y, axis=None)[source]
NumPy implementation of a median version of tf.keras.metrics.mean_absolute_error with the capability to deal with magicnumber and astropy Quantity. Either both x and y are ndarray or both x and y are astropy.Quantity; the return value carries no astropy units in all cases
- Parameters
- Raise
TypeError when only either x or y contains astropy units. Both x, y should carry/not carry astropy units at the same time
- Returns
Median Absolute Error
- Return type
Union[ndarray, float]
- History
2018-May-13 - Written - Henry Leung (University of Toronto)
- astroNN.nn.numpy.median_absolute_percentage_error(x, y, axis=None)[source]
- NumPy implementation of a median version of tf.keras.metrics.mean_absolute_percentage_error with the capability to deal with magicnumber and astropy Quantity. Either both x and y are ndarray or both x and y are astropy.Quantity; the return value carries no astropy units in all cases
- Parameters
- Raise
TypeError when only either x or y contains astropy units. Both x, y should carry/not carry astropy units at the same time
- Returns
Median Absolute Percentage Error
- Return type
Union[ndarray, float]
- History
2018-May-13 - Written - Henry Leung (University of Toronto)
NeuralODE
The Neural ODE (astroNN.neuralODE; Neural Ordinary Differential Equation) module provides a numerical integrator implemented in Tensorflow for solving ODE systems, and it can calculate gradients.
Numerical Integrator
astroNN implements a numerical integrator in Tensorflow
- astroNN.neuralode.odeint.odeint(func=None, x=None, t=None, aux=None, method='dop853', precision=tf.float32, *args, **kwargs)[source]
Computes the numerical solution of a system of first-order ordinary differential equations y'=f(x,y). Default precision is float32.
- Parameters
func (callable) – function of the differential equation, usually take func([position, velocity], time) and return velocity, acceleration
x (Union([tf.Tensor, numpy.ndarray, list])) – initial x, usually is [position, velocity]
t (Union([tf.Tensor, numpy.ndarray, list])) – set of times at which one wants the result
method (str) – numerical integrator to use, available integrators are [‘dop853’, ‘rk4’]
precision (type) – float precision, tf.float32 or tf.float64
- Returns
integrated result
- Return type
tf.Tensor
- History
2020-May-31 - Written - Henry Leung (University of Toronto)
An example of integrating an ODE for sin(x)
1import time
2import pylab as plt
3import numpy as np
4import tensorflow as tf
5from astroNN.shared.nn_tools import cpu_fallback, gpu_memory_manage
6from astroNN.neuralode import odeint
7
8cpu_fallback()
9gpu_memory_manage()
10
11# time array
12t = tf.constant(np.linspace(0, 100, 10000))
13# initial condition
14true_y0 = tf.constant([0., 1.])
15# analytical ODE system for sine wave [x, t] -> [v, a]
16ode_func = lambda y, t: tf.stack([tf.cos(t), tf.sin(t)])
17
18start_t = time.time()
19true_y = odeint(ode_func, true_y0, t, method='dop853')
20print(time.time() - start_t) # approx. 4.3 seconds on i7-9750H GTX1650
21
22# plot the solution and compare
23plt.figure(dpi=300)
24plt.title("sine(x)")
25plt.plot(t, np.sin(t), label='Analytical')
26plt.plot(t, true_y[:, 0], ls='--', label='astroNN odeint')
27plt.legend(loc='best')
28plt.xlabel("t")
29plt.ylabel("y")
30plt.show()

Moreover, odeint supports numerical integration in parallel; the example below integrates sin(x) for 50 initial conditions. You can see the execution time is the same!
1start_t = time.time()
2# initial conditions, 50 of them instead of a single initial condition
3true_y0sss = tf.random.normal((50, 2), 0, 1)
4# time array, 50 of them instead of the same time array for every initial condition
5tsss = tf.random.normal((50, 10000), 0, 1)
6true_y = odeint(ode_func, true_y0sss, tsss, method='dop853')
7print(time.time() - start_t) # also approx. 4.3 seconds on i7-9750H GTX1650
Neural Network model with Numerical Integrator
You can use odeint along with a neural network model; below is an example
1import numpy as np
2import tensorflow as tf
3from astroNN.shared.nn_tools import gpu_memory_manage, cpu_fallback
4from astroNN.neuralode import odeint
5
6cpu_fallback()
7gpu_memory_manage()
8
9t = tf.constant(np.linspace(0, 1, 20))
10# initial condition
11true_y0 = tf.constant([0., 1.])
12
13class MyModel(tf.keras.Model):
14 def __init__(self):
15 super(MyModel, self).__init__()
16 self.dense1 = tf.keras.layers.Dense(2, activation=tf.nn.relu)
17 self.dense2 = tf.keras.layers.Dense(16, activation=tf.nn.relu)
18 self.dense3 = tf.keras.layers.Dense(2)
19
20 def call(self, inputs, t, *args):
21 inputs = tf.expand_dims(inputs, axis=0)
22 x = self.dense2(self.dense1(inputs))
23 return tf.squeeze(self.dense3(x))
24
25model = MyModel()
26
27with tf.GradientTape() as g:
28 g.watch(true_y0)
29 y = odeint(model, true_y0, t)
30# gradient of the result w.r.t. model's weights
31g.gradient(y, model.trainable_variables) # well define, no None, no inf or no NaN
Neural Nets Classes and Basic Usage
Available astroNN Neural Net Classes
All astroNN Neural Nets inherit from child classes which in turn inherit from NeuralNetMaster; NeuralNetMaster also relies on two major components, Normalizer and GeneratorMaster
Normalizer (astroNN.nn.utilities.normalizer.Normalizer)
GeneratorMaster (astroNN.nn.utilities.generator.GeneratorMaster)
├── CNNDataGenerator
├── Bayesian_DataGenerator
└── CVAE_DataGenerator
NeuralNetMaster (astroNN.models.base_master_nn.NeuralNetMaster)
├── CNNBase
│ ├── ApogeeCNN
│ ├── StarNet2017
│ ├── ApogeeKplerEchelle
│ ├── SimplePloyNN
│ └── Cifar10CNN
├── BayesianCNNBase
│ ├── MNIST_BCNN # For authors testing only
│ ├── ApogeeBCNNCensored
│ └── ApogeeBCNN
├── ConvVAEBase
│ └── ApogeeCVAE # For authors testing only
└── CGANBase
└── GalaxyGAN2017 # For authors testing only
NeuralNetMaster Class API
All astroNN Neural Nets classes inherit from astroNN.models.base_master_nn.NeuralNetMaster, and thus the methods of this class are shared across all astroNN Neural Nets classes.
- class astroNN.models.base_master_nn.NeuralNetMaster[source]
Top-level class for an astroNN neural network
- Variables
name – Full English name
_model_type – Type of model
_model_identifier – Unique model identifier, by default using class name as ID
_implementation_version – Version of the model
_python_info – Placeholder to store python version used for debugging purpose
_astronn_ver – astroNN version detected
_keras_ver – Keras version detected
_tf_ver – Tensorflow version detected
currentdir – Current directory of the terminal
folder_name – Folder name to be saved
fullfilepath – Full file path
batch_size – Batch size for training, by default 64
autosave – Boolean to flag whether autosave model or not
task – Task
lr – Learning rate
max_epochs – Maximum epochs
val_size – Validation set size in percentage
val_num – Validation set actual number
beta_1 – Exponential decay rate for the 1st moment estimates for optimization algorithm
beta_2 – Exponential decay rate for the 2nd moment estimates for optimization algorithm
optimizer_epsilon – A small constant for numerical stability for optimization algorithm
optimizer – Placeholder for optimizer
targetname – Full name for every output neuron
- History
- 2017-Dec-23 - Written - Henry Leung (University of Toronto)
- 2018-Jan-05 - Updated - Henry Leung (University of Toronto)
- flush()[source]
- Experimental, I don’t think it works. Flush GPU memory from tensorflow
- History
2018-Jun-19 - Written - Henry Leung (University of Toronto)
- get_config()[source]
Get model configuration as a dictionary
- Returns
dict
- History
2018-May-23 - Written - Henry Leung (University of Toronto)
- get_weights()[source]
Get all model weights
- Returns
weights arrays
- Return type
ndarray
- History
2018-May-23 - Written - Henry Leung (University of Toronto)
- property has_model
Get whether the instance has a model; usually a model is created after you call train(), and the instance will have no model if you did not call train()
- Returns
bool
- History
2018-May-21 - Written - Henry Leung (University of Toronto)
- hessian(x=None, mean_output=False, mc_num=1, denormalize=False)[source]
- Calculate the Hessian of the output with respect to the input. Please notice that the de-normalization (if True) assumes the output depends on the input data to first order, in which case the Hessian does not depend on the input scaling and only depends on the output scaling. The Hessian can be all zeros; a common cause is that you did not use any activation, or used an activation that is still too linear in some sense, like ReLU.
- Parameters
- Returns
An array of Hessian
- Return type
ndarray
- History
2018-Jun-14 - Written - Henry Leung (University of Toronto)
- property input_shape
Get input shape of the prediction model
- Returns
input shape expectation
- Return type
- History
2018-May-21 - Written - Henry Leung (University of Toronto)
- jacobian(x=None, mean_output=False, mc_num=1, denormalize=False)[source]
- Calculate the Jacobian of the output with respect to the input (high performance calculation updated on 15 April 2018). Please notice that the de-normalization (if True) assumes the output depends on the input data to first order, in which case the equation is simply the Jacobian divided by the input scaling, usually a good approximation if you use ReLU all the way.
- Parameters
- Returns
An array of Jacobian
- Return type
ndarray
- History
- 2017-Nov-20 - Written - Henry Leung (University of Toronto)
- 2018-Apr-15 - Updated - Henry Leung (University of Toronto)
- property output_shape
Get output shape of the prediction model
- Returns
output shape expectation
- Return type
- History
2018-May-19 - Written - Henry Leung (University of Toronto)
- plot_dense_stats()[source]
Plot dense layers weight statistics
- Returns
A plot
- History
2018-May-12 - Written - Henry Leung (University of Toronto)
- plot_model(name='model.png', show_shapes=True, show_layer_names=True, rankdir='TB')[source]
Plot model architecture with pydot and graphviz
- Parameters
- Returns
No return but will save the model architecture as png to disk
- save(name=None, model_plot=False)[source]
Save the model to disk
- Parameters
name (string or path) – Folder name/path to be saved
model_plot (boolean) – True to plot model too
- Returns
A saved folder on disk
- summary()[source]
Get model summary
- Returns
None, just print
- History
2018-May-23 - Written - Henry Leung (University of Toronto)
- transfer_weights(model, exclusion_output=False)[source]
Transfer the weights of a model to the current model if possible # TODO: remove layers after successful transfer so they won't get mixed up?
- Parameters
model (astroNN.model.NeuralNetMaster or keras.models.Model) – astroNN model
exclusion_output (bool) – whether to exclude output in the transfer or not
- Returns
bool
- History
2022-Mar-06 - Written - Henry Leung (University of Toronto)
- property uses_learning_phase
To determine whether the model depends on keras learning flag. If False, then setting learning phase will not affect the model
- Returns
the boolean to indicate keras learning flag dependence of the model
- Return type
- History
2018-Jun-03 - Written - Henry Leung (University of Toronto)
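A minimal sketch of calling a few of these shared methods, assuming astronn_neuralnet is a trained astroNN neural net instance and x_test is prepared test data (these names are placeholders):
# print the underlying Keras model summary
astronn_neuralnet.summary()

# model configuration as a dictionary
config = astronn_neuralnet.get_config()

# derivatives of the outputs with respect to the inputs
jac = astronn_neuralnet.jacobian(x_test, mean_output=True)
hess = astronn_neuralnet.hessian(x_test, mean_output=True)

# save the model folder to disk
astronn_neuralnet.save(name='some_folder_name')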
CNNBase
Documented Members:
astroNN.models.SimplePloyNN()
- class astroNN.models.base_cnn.CNNBase[source]
Top-level class for a convolutional neural network
- evaluate(input_data, labels)[source]
Evaluate neural network by provided input data and labels and get back a metrics score
- Parameters
input_data (ndarray) – Data to be inferred with neural network
labels (ndarray) – labels
- Returns
metrics score dictionary
- Return type
- History
2018-May-20 - Written - Henry Leung (University of Toronto)
- fit(input_data, labels, sample_weight=None)[source]
Train a Convolutional neural network
- Parameters
input_data (ndarray) – Data to be trained with neural network
labels (ndarray) – Labels to be trained with neural network
sample_weight (Union([NoneType, ndarray])) – Sample weights (if any)
- Returns
None
- Return type
NoneType
- History
2017-Dec-06 - Written - Henry Leung (University of Toronto)
- fit_on_batch(input_data, labels, sample_weight=None)[source]
Train a neural network by running a single gradient update on all of your data, suitable for fine-tuning
- Parameters
input_data (ndarray) – Data to be trained with neural network
labels (ndarray) – Labels to be trained with neural network
sample_weight (Union([NoneType, ndarray])) – Sample weights (if any)
- Returns
None
- Return type
NoneType
- History
2018-Aug-22 - Written - Henry Leung (University of Toronto)
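A minimal usage sketch, assuming net is a CNNBase-derived instance (for example ApogeeCNN) and x_train, y_train, x_test, y_test are prepared arrays:
from astroNN.models import ApogeeCNN

net = ApogeeCNN()
net.fit(x_train, y_train)             # train the network
score = net.evaluate(x_test, y_test)  # returns a metrics score dictionary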
BayesianCNNBase
Documented Members:
- class astroNN.models.base_bayesian_cnn.BayesianCNNBase[source]
Top-level class for a Bayesian convolutional neural network
- History
2018-Jan-06 - Written - Henry Leung (University of Toronto)
- evaluate(input_data, labels, inputs_err=None, labels_err=None, batch_size=None)[source]
Evaluate neural network by provided input data and labels and get back a metrics score
- Parameters
input_data (ndarray) – Data to be trained with neural network
labels (ndarray) – Labels to be trained with neural network
inputs_err (Union([NoneType, ndarray])) – Error for input_data (if any), same shape with input_data.
labels_err (Union([NoneType, ndarray])) – Labels error (if any)
- Returns
metrics score dictionary
- Return type
- History
2018-May-20 - Written - Henry Leung (University of Toronto)
- fit(input_data, labels, inputs_err=None, labels_err=None, sample_weight=None, experimental=False)[source]
Train a Bayesian neural network
- Parameters
input_data (ndarray) – Data to be trained with neural network
labels (ndarray) – Labels to be trained with neural network
inputs_err (Union([NoneType, ndarray])) – Error for input_data (if any), same shape with input_data.
labels_err (Union([NoneType, ndarray])) – Labels error (if any)
sample_weight (Union([NoneType, ndarray])) – Sample weights (if any)
- Returns
None
- Return type
NoneType
- History
- 2018-Jan-06 - Written - Henry Leung (University of Toronto)
- 2018-Apr-12 - Updated - Henry Leung (University of Toronto)
- fit_on_batch(input_data, labels, inputs_err=None, labels_err=None, sample_weight=None)[source]
Train a Bayesian neural network by running a single gradient update on all of your data, suitable for fine-tuning
- Parameters
input_data (ndarray) – Data to be trained with neural network
labels (ndarray) – Labels to be trained with neural network
inputs_err (Union([NoneType, ndarray])) – Error for input_data (if any), same shape with input_data.
labels_err (Union([NoneType, ndarray])) – Labels error (if any)
sample_weight (Union([NoneType, ndarray])) – Sample weights (if any)
- Returns
None
- Return type
NoneType
- History
- 2018-Aug-25 - Written - Henry Leung (University of Toronto)
- predict(input_data, inputs_err=None, batch_size=None)[source]
Test the model; a high performance version designed for fast variational inference on GPU
- Parameters
input_data (ndarray) – Data to be inferred with neural network
inputs_err (Union([NoneType, ndarray])) – Error for input_data, same shape with input_data.
- Returns
prediction and prediction uncertainty
- History
- 2018-Jan-06 - Written - Henry Leung (University of Toronto)
- 2018-Apr-12 - Updated - Henry Leung (University of Toronto)
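A minimal usage sketch, assuming bnn is a BayesianCNNBase-derived instance (for example ApogeeBCNN) and the data and error arrays are prepared beforehand:
from astroNN.models import ApogeeBCNN

bnn = ApogeeBCNN()
# train with optional input and label errors
bnn.fit(x_train, y_train, inputs_err=x_train_err, labels_err=y_train_err)

# returns the prediction and the prediction uncertainty
pred, pred_uncertainty = bnn.predict(x_test, inputs_err=x_test_err)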
ConvVAEBase
Documented Members:
- class astroNN.models.base_vae.ConvVAEBase[source]
Top-level class for a Convolutional Variational Autoencoder
- History
2018-Jan-06 - Written - Henry Leung (University of Toronto)
- evaluate(input_data, labels)[source]
Evaluate neural network by provided input data and labels/reconstruction target to get back a metrics score
- Parameters
input_data (ndarray) – Data to be inferred with neural network
labels (ndarray) – labels
- Returns
metrics score
- Return type
- History
2018-May-20 - Written - Henry Leung (University of Toronto)
- fit(input_data, input_recon_target, sample_weight=None)[source]
Train a Convolutional Autoencoder
- Parameters
input_data (ndarray) – Data to be trained with neural network
input_recon_target (ndarray) – Data to be reconstructed
sample_weight (Union([NoneType, ndarray])) – Sample weights (if any)
- Returns
None
- Return type
NoneType
- History
2017-Dec-06 - Written - Henry Leung (University of Toronto)
- fit_on_batch(input_data, input_recon_target, sample_weight=None)[source]
Train an AutoEncoder by running a single gradient update on all of your data, suitable for fine-tuning
- Parameters
input_data (ndarray) – Data to be trained with neural network
input_recon_target (ndarray) – Data to be reconstructed
sample_weight (Union([NoneType, ndarray])) – Sample weights (if any)
- Returns
None
- Return type
NoneType
- History
2018-Aug-25 - Written - Henry Leung (University of Toronto)
- jacobian_latent(x=None, mean_output=False, mc_num=1, denormalize=False)[source]
- Calculate the Jacobian of the latent space with respect to the input (high performance calculation updated on 15 April 2018). Please notice that the de-normalization (if True) assumes the output depends on the input data to first order, in which case the equation is simply the Jacobian divided by the input scaling, usually a good approximation if you use ReLU all the way.
- Parameters
- Returns
An array of Jacobian
- Return type
ndarray
- History
- 2017-Nov-20 - Written - Henry Leung (University of Toronto)
- 2018-Apr-15 - Updated - Henry Leung (University of Toronto)
- predict(input_data)[source]
Use the neural network to do inference and get reconstructed data
- Parameters
input_data (ndarray) – Data to be inferred with neural network
- Returns
reconstructed data
- Return type
ndarray
- History
2017-Dec-06 - Written - Henry Leung (University of Toronto)
- predict_decoder(z)[source]
Use the decoder to reconstruct data from latent space vectors
- Parameters
z (ndarray) – Latent space vectors
- Returns
output reconstruction
- Return type
ndarray
- History
2022-Dec-08 - Written - Henry Leung (University of Toronto)
- predict_encoder(input_data)[source]
Use the encoder to get the hidden layer encoding/representation
- Parameters
input_data (ndarray) – Data to be inferred with neural network
- Returns
hidden layer encoding/representation mean and std
- Return type
ndarray
- History
2017-Dec-06 - Written - Henry Leung (University of Toronto)
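A minimal usage sketch, assuming vae is a trained ConvVAEBase-derived instance (for example ApogeeCVAE) and x_data is prepared input data:
# reconstruct the input directly
reconstruction = vae.predict(x_data)

# or go through the latent space explicitly
z_mean, z_std = vae.predict_encoder(x_data)   # latent representation mean and std
reconstruction = vae.predict_decoder(z_mean)  # reconstruct from the latent vectors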
Workflow of Setting up astroNN Neural Nets Instances and Training
astroNN contains some predefined neural networks which work well in certain aspects. For most general usage, I recommend creating your own neural network for more flexibility and to take advantage of astroNN's custom loss functions or layers.
For a predefined neural network, you generally have to set up an instance of an astroNN Neural Nets class with some predefined architecture. For example,
1# import the neural net class from astroNN first
2from astroNN.models import ApogeeCNN
3
4# astronn_neuralnet is an astroNN's neural network instance
5# In this case, it is an instance of ApogeeCNN
6astronn_neuralnet = ApogeeCNN()
Let's say you have your training data prepared; you should specify what the neural network is outputting by setting up the targetname
1# Just an example, if the training data is Teff, logg, Fe and absmag
2astronn_neuralnet.targetname = ['teff', 'logg', 'Fe', 'absmag']
By default, astroNN will generate the folder name automatically with the naming scheme astroNN_[month][day]_run[run number], but you can specify a custom name by
1# astronn_neuralnet is an astroNN's neural network instance
2astronn_neuralnet.folder_name = 'some_custom_name'
You can enable autosave (save everything immediately after training) or save it yourself by
1# To enable autosave
2astronn_neuralnet.autosave = True
3
4# To save all the stuffs, model_plot=True to plot models too, otherwise wont plot, needs pydot_ng and graphviz
5astronn_neuralnet.save(model_plot=False)
astroNN will normalize your data when you call the train() method. The advantage is that, if you use the normalization provided by astroNN, you can make sure that when the test() method is called, the testing data will be normalized and the prediction will be denormalized in exactly the same way as the training data. This can minimize human error.
If you want to normalize by yourself, you can disable it by
1# astronn_neuralnet is an astroNN's neural network instance
2astronn_neuralnet.input_norm_mode=0
3astronn_neuralnet.labels_norm_mode = 0
You can add a list of Keras/astroNN callback by
1astronn_neuralnet.callbacks = [...]  # add some callback(s) here
So now everything is set up for training
1# Start the training
2astronn_neuralnet.train(x_train,y_train)
If you did not enable autosave, you can save it after training by
1# To save all the stuffs, model_plot=True to plot models too, otherwise wont plot, needs pydot_ng and graphviz
2astronn_neuralnet.save(model_plot=False)
Load astroNN Generated Folders
The first way to load an astroNN generated folder is to use the following code. You need to replace astroNN_0101_run001 with your folder name; it should be something like astroNN_[month][day]_run[run number]
- astroNN.models.load_folder(folder=None)[source]
To load astroNN model object from folder
- Parameters
folder (str) – [optional] you should provide the folder name if you are outside the folder; do not specify it when you are inside the folder
- Returns
astroNN Neural Network instance
- Return type
astroNN.nn.NeuralNetMaster.NeuralNetMaster
- History
2017-Dec-29 - Written - Henry Leung (University of Toronto)
1from astroNN.models import load_folder
2astronn_neuralnet = load_folder('astroNN_0101_run001')

OR second way to open astroNN generated folders is to open the folder and run command line window inside there, or switch directory of your command line window inside the folder and run
1from astroNN.models import load_folder
2astronn_neuralnet = load_folder()

astronn_neuralnet will be an astroNN neural network object in this case. astroNN will detect the neural network type automatically, and you can access methods like doing inference or continuing the training (fine-tuning). You should refer to the tutorial for each type of neural network for more detail.
There are a few attributes of keras_model you can always access,
1# The model summary from Keras
2astronn_neuralnet.keras_model.summary()
3
4# The model input
5astronn_neuralnet.keras_model.input
6
7# The model input shape expectation
8astronn_neuralnet.keras_model.input_shape
9
10# The model output
11astronn_neuralnet.keras_model.output
12
13# The model output shape expectation
14astronn_neuralnet.keras_model.output_shape
The astroNN neuralnet object also carries targetname (hopefully correctly set by the writer of the neural net) and the parameters used to normalize the training data (the normalization of the training and testing data must be the same)
1# The targetname corresponding to the output neurons
2astronn_neuralnet.targetname
3
4# The model input
5astronn_neuralnet.keras_model.input
6
7# The mean used to normalized training data
8astronn_neuralnet.input_mean_norm
9
10# The standard derivation used to normalized training data
11astronn_neuralnet.input_std_norm
12
13# The mean used to normalized training labels
14astronn_neuralnet.labels_mean_norm
15
16# The standard derivation used to normalized training labels
17astronn_neuralnet.labels_std_norm
Load and Use Multiple astroNN Generated Folders
Note
astroNN fully supports eager execution now, and you no longer need to context-manage graphs and sessions in order to use multiple models at the same time
It is tricky to load and use multiple models at once since Keras shares a global session by default if no default tensorflow session is provided, and astroNN might encounter namespace/scope collisions. So astroNN assigns a separate Graph and Session to each astroNN neural network model. You can do:
1from astroNN.models import load_folder
2
3astronn_model_1 = load_folder("astronn_model_1")
4astronn_model_2 = load_folder("astronn_model_2")
5astronn_model_3 = load_folder("astronn_model_3")
6
7with astronn_model_1.graph.as_default():
8 with astronn_model_1.session.as_default():
9 # do stuff with astronn_model_1 here
10
11with astronn_model_2.graph.as_default():
12 with astronn_model_2.session.as_default():
13 # do stuff with astronn_model_2 here
14
15with astronn_model_3.graph.as_default():
16 with astronn_model_3.session.as_default():
17 # do stuff with astronn_model_3 here
18
19# For example do things with astronn_model_1 again
20with astronn_model_1.graph.as_default():
21 with astronn_model_1.session.as_default():
22 # do more stuff with astronn_model_1 here
Workflow of Testing and Distributing astroNN Models
The first step of the workflow should be loading an astroNN folder as described above.
Let's say you have loaded the folder and have some testing data. You just need to provide the testing data without any normalization if you used astroNN normalization during training; the testing data will be normalized and the prediction denormalized in exactly the same way as the training data.
1# Run a forward pass for the test data through the neural net to get a prediction
2# The prediction should be denormalized if you use astroNN normalization during training
3prediction = astronn_neuralnet.test(x_test)
You can always train on new data based on existing weights
1# Start the training on existing models (fine-tuning), astronn_neuralnet is a trained astroNN models
2astronn_neuralnet.train(x_train,y_train)
Creating Your Own Model with astroNN Neural Net Classes
You can create your own neural network model that inherits from an astroNN Neural Network class to take advantage of the existing code in this package. Here we will go through how to create a simple model to do classification on the MNIST dataset with a neural network of one convolutional layer and one fully connected layer.
Let's create a python script named custom_models.py under an arbitrary folder, let's say ~/ which is your home folder, and add ~/custom_models.py to the astroNN configuration file.
1# import everything we need
2from tensorflow import keras
3# this is the astroNN neural net abstract class we are going to inherit from
4from astroNN.models.base_cnn import CNNBase
5
6regularizers = keras.regularizers
7MaxPooling2D, Conv2D, Dense, Flatten, Activation, Input, Model = keras.layers.MaxPooling2D, keras.layers.Conv2D, \
8 keras.layers.Dense, keras.layers.Flatten, \
9 keras.layers.Activation, keras.layers.Input, keras.models.Model
10
11# now we are creating a custom model based on astroNN neural net abstract class
12class my_custom_model(CNNBase):
13 def __init__(self, lr=0.005):
14 # standard super for inheriting abstract class
15 super().__init__()
16
17 # some default hyperparameters
18 self._implementation_version = '1.0'
19 self.initializer = 'he_normal'
20 self.activation = 'relu'
21 self.num_filters = [8]
22 self.filter_len = (3, 3)
23 self.pool_length = (4, 4)
24 self.num_hidden = [128]
25 self.max_epochs = 1
26 self.lr = lr
27 self.reduce_lr_epsilon = 0.00005
28
29 self.task = 'classification'
30 # you should set the targetname some that you know what those output neurones are representing
31 # in this case the outpu the neurones are simply representing digits
32 self.targetname = ['Zero', 'One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven', 'Eight', 'Nine']
33
34 # set default input norm mode to 255 to normalize images correctly
35 self.input_norm_mode = 255
36 # set default labels norm mode to 0 (equivalent to do nothing) to normalize labels correctly
37 self.labels_norm_mode = 0
38
39 def model(self):
40 input_tensor = Input(shape=self._input_shape, name='input')
41 cnn_layer_1 = Conv2D(kernel_initializer=self.initializer, padding="same", filters=self.num_filters[0],
42 kernel_size=self.filter_len)(input_tensor)
43 activation_1 = Activation(activation=self.activation)(cnn_layer_1)
44 maxpool_1 = MaxPooling2D(pool_size=self.pool_length)(activation_1)
45 flattener = Flatten()(maxpool_1)
46 layer_2 = Dense(units=self.num_hidden[0], kernel_initializer=self.initializer)(flattener)
47 activation_2 = Activation(activation=self.activation)(layer_2)
48 layer_3 = Dense(units=self.labels_shape, kernel_initializer=self.initializer)(activation_2)
49 output = Activation(activation=self._last_layer_activation, name='output')(layer_3)
50
51 model = Model(inputs=input_tensor, outputs=output)
52
53 return model
Save the file, and then we can open Python in the same location as the script.
# import everything we need
from custom_models import my_custom_model
from tensorflow.keras.datasets import mnist
from tensorflow.keras import utils

# load MNIST
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# convert to the appropriate type
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
y_train = utils.to_categorical(y_train, 10)

# create a neural network instance
net = my_custom_model()

# train
net.train(x_train, y_train)

# save the model after training
net.save("trained_models_folder")
If you want to share the trained model, one way is to copy custom_models.py into the trained model folder so that astroNN can load it successfully on other computers. The second way is to send custom_models.py to the target computer and install it by adding the file to config.ini on the target computer.
You can then load the folder on other computers by running Python inside the folder and running
# import everything we need
from astroNN.models import load_folder

net = load_folder()
Or, from outside the folder trained_models_folder:
# import everything we need
from astroNN.models import load_folder

net = load_folder("trained_models_folder")
NeuralNetMaster Class
NeuralNetMaster is the top-level abstract class for all astroNN neural network classes. NeuralNetMaster defines the structure of how an astroNN neural network class should look.
NeuralNetMaster performs pre-training checks (checking input and label shapes, CPU/GPU checks) and creates an astroNN folder for every run.
Bayesian Neural Net with Dropout Variational Inference
With a traditional neural network, the weights are point estimates which result in point-estimate predictions. Unlike statistical modelling which comes with uncertainty estimates, standard machine learning just learns from data and predicts a single outcome. Uncertainty estimates are important in astronomy, so it would be best if we could add uncertainty to neural networks.
Background Knowledge
To understand Bayesian Neural Net, we first need to understand some background knowledge.
Bayes Rule
To understand how a Bayesian Neural Net works, we must first know about Bayesian statistics. The core of Bayesian statistics is Bayes Rule.
Suppose we have events A and B. Bayes Rule tells us \(P(A|B)=\frac{P(B|A)P(A)}{P(B)}\) where \(P(A|B)\) is the conditional probability which represents the likelihood of event A occurring given that B occurred. \(P(B|A)\) represents the likelihood of event B occurring given that A occurred. \(P(A)\) and \(P(B)\) are the probabilities of observing A and B independently of each other.
The Bayesian interpretation of a probability is a measure of a prior belief. In such a case, \(P(A)\) can be viewed as a prior belief in A and \(P(A|B)\) measures the posterior belief having accounted for B.
Simple Bayesian Regression
The problem is a linear regression problem: we have some input data \(X\) and output data \(Y\) and we want to find \(w\) such that \(Y = wX\). Suppose we use the Mean Squared Error (L2) loss which is commonly found in neural networks. The objective is to minimize \((Y-wX)^2\).
First, we need to somehow turn this into a probability. We want to maximize the likelihood of generating \(Y\) given that we have \(X\) and \(w\), i.e. \(P(Y|X,w)\).
Notice that minimizing the Mean Squared Error (L2) loss is equivalent to maximizing the log-likelihood of a Gaussian, i.e. assuming \(Y\) is Gaussian distributed.
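To make this step explicit (assuming a fixed Gaussian noise level \(\sigma\)): the Gaussian log-likelihood is \(\log P(Y|X,w) = -\frac{(Y-wX)^2}{2\sigma^2} - \log(\sigma\sqrt{2\pi})\), so maximizing the log-likelihood over \(w\) is the same as minimizing \((Y-wX)^2\).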
But we want this problem to be Bayesian, so we impose a prior belief on our weight, \(P(Y|X,w) P(w)\). Usually we take a Gaussian distribution as our prior.
By Bayes Rule, the posterior distribution of the weight is \(P(w|X,Y)=\frac{P(Y|X,w)P(w)}{C}\) where \(C\) is \(P(Y)\) or \(\int P(X, w) dw\), an integral that is usually very difficult to calculate.
Variational Inference
To solve this problem we need to use Variational Inference. How do we do Variational Inference?
The first step is to introduce a parameterised distribution \(Q(w|v)\) over \(w\), where Q represents a variational distribution and \(v\) is the variational parameter, to approximate the true posterior.
And bingo, another advantage: instead of an integration problem, we now have an optimization problem over the variational parameter \(v\). What are we optimizing towards? We need to find a \(v\) such that \(Q(w|v)\) matches the true posterior distribution \(P(w|Y,X)\) as closely as possible.
Approximating the integral of a probability distribution (\(\int P(X, w) dw\) in this case) can be done by Monte Carlo sampling (similar to the estimation of \(\pi\) by MC sampling).
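As a toy illustration of Monte Carlo sampling (not part of astroNN), here is a minimal sketch that estimates \(\pi\) by drawing random points in the unit square and counting the fraction that falls inside the quarter circle:

import numpy as np

# toy Monte Carlo estimate of pi: sample points uniformly in the unit square
n_samples = 1_000_000
x = np.random.uniform(0., 1., n_samples)
y = np.random.uniform(0., 1., n_samples)
inside = (x ** 2 + y ** 2) <= 1.  # points inside the quarter circle of radius 1
pi_estimate = 4. * np.mean(inside)  # the area ratio times 4 approximates pi
print(pi_estimate)  # close to 3.14, accuracy improves with more samples

The same idea applies to the integral above: draw samples of \(w\) and average the integrand instead of integrating analytically.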
Dropout Variational Inference
The core idea of a Bayesian Neural Network here is that a neural net with Dropout Variational Inference and a Gaussian prior on the weights is Bayesian. This is done by reparameterising the approximate variational distribution \(Q(w|v)\) to be Bernoulli, which is exactly what dropout does.
Thus the loss is
How is uncertainty calculated from the neural network for a regression task?
Or if you have known input data uncertainty, you should add the propagated uncertainty to the final variance too.
The final prediction will be
Inverse Model Precision is by definition
For more detail, please see my demonstration here
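As a rough sketch of how such a prediction and its uncertainty can be computed (following the standard Gal & Ghahramani dropout formulation, not astroNN's exact internal code; stochastic_forward_pass is a hypothetical function standing in for one forward pass with dropout kept on at test time):

import numpy as np

def mc_dropout_predict(stochastic_forward_pass, x, n_mc=100, inv_model_precision=0.0):
    # run the network n_mc times with dropout enabled; each pass gives a different prediction
    preds = np.stack([stochastic_forward_pass(x) for _ in range(n_mc)])
    mean_pred = preds.mean(axis=0)  # final prediction is the Monte Carlo average
    model_var = preds.var(axis=0)   # model uncertainty from the spread of the dropout passes
    total_var = model_var + inv_model_precision  # add the inverse model precision term
    return mean_pred, np.sqrt(total_var)  # prediction and total standard deviation

Known input data uncertainty, if available, would be propagated and added to total_var as well.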
A simple way to think about predictive, model and propagated uncertainty
A Bayesian Neural Network involves multiple sources of uncertainty and they can be confusing, so here is one simple way to think about these uncertainties.
Let's say you have a student, some maths problems with solutions, and some maths problems without solutions. For simplicity, all the maths problems are either differentiation or integration. You want the solutions for those maths problems without solutions. One way to do it is to let the student do the maths problems with known solutions and evaluate his/her performance. If the student did all the integration problems wrong, then you know the integration solutions from the student cannot be trusted.
In a more real-life situation, you don't know the training process/data, but you can interact with a trained student. Now if you give an integration problem to the student, the student should tell you he/she does not have confidence in that problem at all, because it is about integration and the student knows his/her own ability to do integration is poor. This is something that is predictable, so we call it predictive uncertainty.
Let's say the student has done very well on differentiation problems, so you should expect he/she has high confidence in this area. But if you are a teacher, you know that when students say they understand a topic, they probably do not really understand it. One way to measure the model uncertainty from the student is to give the problems to the student to solve and get back a set of solutions. And after a week or so, you give the same problems to the student again and you get another set of solutions. If the two sets of solutions are the same, and the student said he/she is confident, then you know the solutions are probably right. If the two sets of solutions are not the same, then even if the student said he/she is confident, you should not trust those solutions.
The propagated uncertainty can be as simple as having some typos in the problems, leading the student to give some wrong answers.
Gaia DR2 with astroNN result
Gaia DR2 was released on 25 April 2018 with data collected from 25 July 2014 to 23 May 2016, containing 1.5 billion sources.
Official Gaia DR2 page: https://www.cosmos.esa.int/web/gaia/dr2
astroNN is used to train a neural network with Gaia DR1 parallax to predict the intrinsic brightness of stars from APOGEE spectra. Gaia uses a geometric method to infer distances to stars, which has its own limitation, the major one being that the star must be close to us. If a neural network can infer intrinsic brightness from an APOGEE spectrum, then together with the apparent magnitude we can get the distance, as long as we have the stellar spectrum.
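For example, once the neural network provides an absolute magnitude \(M\) for a star, the distance follows from the usual distance modulus. A minimal sketch (ignoring extinction; the function name and arguments are illustrative only):

def distance_from_magnitudes(apparent_mag, absolute_mag):
    # standard distance modulus: m - M = 5 log10(d / 10 pc)
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)  # distance in parsec

print(distance_from_magnitudes(10.0, 0.0))  # m=10, M=0 gives 1000 parsec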
This page will act as a notebook for the author (Henry) and share his latest update on Gaia DR2 preparation.
FAQ: What is fakemag? : http://astronn.readthedocs.io/en/latest/tools_gaia.html#fakemag-dummy-scale
FAQ: Which band will be used for apparent magnitude?: K-band magnitude will be used to minimize the effect of extinction
(25 Apr 2018 update) Neural Network Distance Prediction on the whole APOGEE DR14 result with Gaia DR2
Procedure to reproduce the result is described here: https://github.com/henrysky/astroNN/tree/master/demo_tutorial/gaia_dr1_dr2/
Neural Network trained only on the Gaia DR1 (20% parallax error cuts)-APOGEE DR14 (SNR>50, STARFLAG==0) overlap, around 12,000 spectra. Results are expressed in mean absolute percentage error. Gaia DR2 refers to the subset of DR2 matched with APOGEE DR14, parallax > 0 and parallax error < 0.2.
Outperformed Apogee Distances DR14 BPG Catalog:
Apogee Distances DR14 BPG (20% Model Confidence Cut): 77,401 spectra - 20.6%
astroNN ApogeeBCNN (20% Neural Network Confidence Cut): 57,704 spectra - 14.5%
astroNN ApogeeBCNN (25% Neural Network Confidence Cut): 76,136 spectra - 16.8%
astroNN ApogeeBCNN (100% Neural Network Confidence Cut): 92,887 spectra - 22.6%
Slightly outperformed the "teacher" Gaia DR1 (20% error cuts) on training set spectra:
astroNN ApogeeBCNN (20% Neural Network Confidence Cut): 10,039 spectra - 6.74% mean absolute percentage error with DR2
Gaia DR1 (20% error cuts): 9,019 spectra - 6.97% mean absolute percentage error with DR2
Gaia DR1 and Anderson2017 with 20% error cuts, cross-matched with APOGEE DR14:
Gaia DR1 (20% Observation Error Cut): 20,675 spectra - 8.3% mean absolute percentage error with DR2
Anderson2017 (20% Model Confidence Cut): 25,303 spectra - 8.4% mean absolute percentage error with DR2
Apogee Red Clumps - astroNN - Gaia DR2 cross-matched; the Red Clumps Catalog DR14 is better than the NN:
The whole Red Clumps Catalog: 22,421 spectra - 20.6% mean absolute percentage error with DR2
Red Clumps Catalog cross-matched: 12,476 spectra - 18.9% mean absolute percentage error with DR2
astroNN cross-matched: 12,476 spectra - 25.0% mean absolute percentage error with DR2
Internal model identifier for the author: astroNN_0422_run001

Neural Network Mean Absolute Percentage Error to Gaia DR2 as a function of Teff

Neural Network Mean Absolute Percentage Error to Gaia DR2 as a function of neural network uncertainty estimation

Plans/Questions
Train neural network on Gaia DR1 and validate on Gaia DR2 (result stated above)
Temperature cuts on spectra? (Didn’t do it)
If neural network turns out very accurate when DR2 comes out, how did neural network predict those distance?
If neural network turns out very accurate when DR2 comes out, then we can get distance for many APOGEE spectra?
(No need, the result is pretty good) If the neural network had failed, would predicting intrinsic brightness from APOGEE spectra be impossible, or would it simply be because the training set in DR1 was too small?
Neural Network Distance Prediction on the whole APOGEE DR14
Neural Network trained only on the Gaia DR1 (20% parallax error cuts)-APOGEE DR14 (SNR>50, STARFLAG==0) overlap
Testing on the whole APOGEE DR14 (SNR>50, STARFLAG==0 cuts), around 120,000 spectra

2M16363993+3654060 Distance Disagreement between astroNN and Gaia/Anderson2017 Parallax
Internal model identifier for the author: astroNN_0128_run002

A Neural Network trained on Anderson2017 parallax consistently predicted an almost constant offset, with very small uncertainty, from the ground truth (Anderson2017) for the star with APOGEE_ID 2M16363993+3654060. astroNN agreed pretty well with APOGEE_distances BPG_dist50. It seems like Gaia/Anderson2017 are the ones which are far off.
I have to emphasise that the neural network is trained on the parallax from Anderson2017, which is an improved parallax from Gaia DR1. There is no surprise that the neural network identified outliers from the training/testing set. But the fact that the neural network managed to have a similar answer to APOGEE_distances BPG_dist50 may indicate the neural network learned some "correct" physics to infer distance from APOGEE spectra.
The result:
astroNN Bayesian Neural Network 1 : \(2287.61 \text{ parsec} \pm 107.27 \text{ parsec}\)
APOGEE_distances BPG_dist50 2 : \(2266.15 \text{ parsec} \pm 266.1705 \text{ parsec}\)
Anderson2017 parallax: \(568.08 \text{ parsec} \pm 403.86 \text{ parsec}\)
Gaia DR1 parallax: \(318.05 \text{ parsec} \pm 1021.73 \text{ parsec}\)
Footnotes
Distance Prediction with APOGEE-North Spectra
Internal model identifier for the author: astroNN_0224_run002
Using astroNN.models.Apogee_BCNN to train a neural network on Anderson2017 improved Gaia parallax (predicting stellar intrinsic brightness from spectra), here is the result:
First image: Anderson2017 is the ground truth, and the neural network is tested on individual spectra
Second image: assuming APOGEE Distances DR14 is the ground truth, the neural network is tested on individual spectra


Distance Prediction with APOGEE-South Spectra
Internal model identifier for the author: astroNN_0224_run002
The neural network was trained on APOGEE-North spectra and Gaia parallax, and then tested on spectra from APOGEE-South (a different telescope and cameras).

Milkyway via the Eye of Neural Network
Internal model identifier for the author: astroNN_0224_run002
Both the temperature and distance are predictions from the neural network. Combined with the observed coordinates and apparent magnitude, we can get a 3D map of stellar parameters via a neural network.
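For instance, the observed sky coordinates and the neural network distances can be combined into 3D positions with astropy (a minimal sketch; the ra, dec and distance_pc arrays are placeholders for your own data):

import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord, Galactocentric

# placeholder arrays: sky coordinates in degrees and NN-predicted distances in parsec
ra = np.array([10.0, 20.0])
dec = np.array([-5.0, 30.0])
distance_pc = np.array([1500.0, 2300.0])

coords = SkyCoord(ra=ra * u.deg, dec=dec * u.deg, distance=distance_pc * u.pc)
galacto = coords.transform_to(Galactocentric())  # 3D positions relative to the Galactic centre
print(galacto.x, galacto.y, galacto.z)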
It seems like the neural network constantly overestimates the intrinsic brightness of low-temperature stars, which is why low-temperature stars appear to dominate at large distances.


Uncertainty Analysis of Neural Nets with Variational Methods
Training neural net with DR14 APOGEE_Distances Value Added Catalogue using astroNN
Mini Tools for APOGEE data
Note
astroNN only contains a limited amount of necessary tools. For a more comprehensive python tool to deal with APOGEE data, please refer to Jo Bovy’s APOGEE tools
astroNN.apogee
module has a handful of tools to deal with APOGEE data.
The Apache Point Observatory Galactic Evolution Experiment (APOGEE) employs high-resolution, high signal-to-noise infrared spectroscopy
to penetrate the dust that obscures significant fractions of the disk and bulge of our Galaxy. APOGEE is surveying
red giant stars across the full range of the Galactic bulge, bar, disk, and halo. APOGEE generated precise
radial velocities and detailed chemical abundances, providing unprecedented insights into the dynamical structure and
chemical history of the Galaxy. In conjunction with the planet-finding surveys, Kepler and CoRoT, APOGEE unravels
problems in fundamental astrophysics.
SDSS APOGEE: http://www.sdss.org/surveys/apogee-2/
Continuum Normalization of APOGEE Spectra
You can access the default astroNN continuum mask for APOGEE spectra by
import os
import astroNN
import numpy as np

dr = 14

mask_path = os.path.join(os.path.dirname(astroNN.__path__[0]), 'astroNN', 'data', f'dr{dr}_contmask.npy')
cont_mask = np.load(mask_path)
When you do continuum normalization using astroNN, you can just use cont_mask=None to use the default mask provided by Jo Bovy's APOGEE Tools. astroNN will use a SINGLE continuum pixel mask to normalize all the spectra you provide. Moreover, astroNN will normalize the spectra chip by chip instead of normalizing them all together.
- astroNN.apogee.apogee_continuum(spectra, spectra_err, cont_mask=None, deg=2, dr=None, bitmask=None, target_bit=None, mask_value=1.0)[source]
It is designed only for apogee spectra by fitting Chebyshev polynomials to the flux values in the continuum mask by chips. The resulting continuum will have the same shape as fluxes.
- Parameters
spectra (ndarray) – spectra
spectra_err (ndarray) – spectra uncertainty, same shape as spectra
cont_mask (ndarray[bool]) – continuum mask
deg (int) – The degree of Chebyshev polynomial to use in each region, default is 2 which works the best so far
dr (int) – apogee dr
bitmask (ndarray) – bitmask array of the spectra, same shape as spectra
target_bit (Union(int, list[int], ndarray[int])) – a list of bit to be masked
mask_value (Union(int, float)) – if a pixel is determined to be a bad pixel, this value will be used to replace that pixel flux
- Returns
normalized spectra, normalized spectra uncertainty
- Return type
ndarray, ndarray
- History
2018-Mar-21 - Written - Henry Leung (University of Toronto)
from astroNN.apogee import apogee_continuum

# spectra_errs refers to the 1-sigma error array provided by APOGEE
# spectra can be multiple spectra at a time
norm_spec, norm_spec_err = apogee_continuum(apogee_spectra, spectra_errs, cont_mask=None, deg=2, dr=14)

# If you deal with a bitmask too and want to set some target bits to zero, you can add additional arguments to apogee_continuum()
# Use target_bit=[a list of bits] or target_bit=None to use the default target_bit
apogee_continuum(apogee_spectra, spectra_errs, cont_mask=None, deg=2, dr=14, bitmask=apogee_bitmask, target_bit=None)
norm_spec refers to the normalized spectra while norm_spec_err refers to the normalized spectra error

You can use continuum() to normalize any spectra, while apogee_continuum() is specifically designed for APOGEE spectra.
from astroNN.apogee import continuum

spec, spec_err = continuum(spectra, spectra_errs, cont_mask, deg=2)
APOGEE Data Downloader
The astroNN APOGEE data downloaders always act as functions that return the path of the downloaded file(s), downloading the file first if it does not exist locally. If the file cannot be found on the server, astroNN will generally return False as the path.
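A short sketch of handling the return value, using allstar as an example (the same pattern applies to the other downloaders):

from astroNN.apogee import allstar

local_path = allstar(dr=16)  # downloads the file if it is not already on disk
if local_path is False:
    # the file could not be found on the server
    raise FileNotFoundError("allStar file is not available on the server")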
General Way to Open Fits File
astropy.io.fits documentation: http://docs.astropy.org/en/stable/io/fits/
from astropy.io import fits

data = fits.open(local_path_to_file)
allstar file
Data Model: https://data.sdss.org/datamodel/files/APOGEE_ASPCAP/APRED_VERS/ASPCAP_VERS/allStar.html
- astroNN.apogee.allstar(dr=None, flag=None)[source]
Download the allStar file (catalog of ASPCAP stellar parameters and abundances from combined spectra)
from astroNN.apogee import allstar

local_path_to_file = allstar(dr=16)
allvisit file
Data Model: https://data.sdss.org/datamodel/files/APOGEE_ASPCAP/APRED_VERS/ASPCAP_VERS/allVisit.html
- astroNN.apogee.allvisit(dr=None, flag=None)[source]
Download the allVisit file (catalog of properties from individual visit spectra)
from astroNN.apogee import allvisit

local_path_to_file = allvisit(dr=16)
Combined Spectra (aspcapStar)
Data Model: https://data.sdss.org/datamodel/files/APOGEE_ASPCAP/APRED_VERS/ASPCAP_VERS/TELESCOPE/FIELD/aspcapStar.html
- astroNN.apogee.combined_spectra(dr=None, location=None, field=None, apogee=None, telescope=None, verbose=1, flag=None)[source]
Download the required combined spectra file a.k.a aspcapStar
- Parameters
- Returns
full file path and download in background if not found locally, False if cannot be found on server
- Return type
- History
- 2017-Oct-15 - Written - Henry Leung (University of Toronto)2018-Aug-31 - Updated - Henry Leung (University of Toronto)
from astroNN.apogee import combined_spectra

local_path_to_file = combined_spectra(dr=16, location=a_location_id, apogee=a_apogee_id)
Visit Spectra (apStar)
Data Model: https://data.sdss.org/datamodel/files/APOGEE_REDUX/APRED_VERS/stars/TELESCOPE/FIELD/apStar.html
- astroNN.apogee.visit_spectra(dr=None, location=None, field=None, apogee=None, telescope=None, verbose=1, flag=None, commission=False)[source]
Download the required individual spectra file a.k.a apStar or asStar
- Parameters
dr (int) – APOGEE DR
location (int) – Location ID [Optional]
field (str) – Field [Optional]
apogee (str) – Apogee ID
telescope (str) – Telescope ID, for example ‘apo25m’ or ‘lco25m’
verbose (int) – verbose, set 0 to silent most logging
flag (int) – 0: normal, 1: force to re-download
commission (bool) – whether the spectra is taken during commissioning
- Returns
full file path and download in background if not found locally, False if cannot be found on server
- Return type
- History
- 2017-Nov-11 - Written - Henry Leung (University of Toronto)2018-Aug-31 - Updated - Henry Leung (University of Toronto)
from astroNN.apogee import visit_spectra

local_path_to_file = visit_spectra(dr=16, location=a_location_id, apogee=a_apogee_id)
astroNN catalogue for APOGEE
Data Model (DR16): https://data.sdss.org/datamodel/files/APOGEE_ASTRONN/apogee_astronn.html
- astroNN.apogee.downloader.apogee_astronn(dr=None, flag=None)[source]
- Download the apogee_astroNN file (catalog of astroNN stellar parameters, abundances, distances and orbital parameters from combined spectra)
from astroNN.apogee import apogee_astronn

local_path_to_file = apogee_astronn(dr=16)
Red Clumps of SDSS Value Added Catalogs
Introduction: http://www.sdss.org/dr16/data_access/value-added-catalogs/?vac_id=apogee-red-clump-rc-catalog
Data Model (DR16): https://data.sdss.org/datamodel/files/APOGEE_RC/cat/apogee-rc-DR16.html
- astroNN.datasets.apogee.load_apogee_rc(dr=None, unit='distance', extinction=True)[source]
Load apogee red clumps (absolute magnitude measurement)
- Parameters
- Returns
numpy array of ra, dec, array
- Return type
ndarrays
- History
- 2018-Jan-21 - Written - Henry Leung (University of Toronto)2018-May-12 - Updated - Henry Leung (University of Toronto)
from astroNN.apogee import apogee_rc

local_path_to_file = apogee_rc(dr=16)
Or you can use load_apogee_rc() to load the data by
from astroNN.datasets import load_apogee_rc

# unit can be 'distance' for distance in parsec, 'absmag' for k-band absolute magnitude
# 'fakemag' for astroNN's k-band fakemag scale
RA, DEC, array = load_apogee_rc(dr=16, unit='distance', extinction=True)  # extinction only takes effect when unit is not 'distance'
APOKASC in the Kepler Fields
from astroNN.datasets import load_apokasc

ra, dec, logg = load_apokasc()

# OR if you want the gold and basic standard separately
gold_ra, gold_dec, gold_logg, basic_ra, basic_dec, basic_logg = load_apokasc(combine=False)
APOGEE Distance Estimations
Introduction: http://www.sdss.org/dr14/data_access/value-added-catalogs/?vac_id=apogee-dr14-based-distance-estimations
Data Model (DR14): https://data.sdss.org/datamodel/files/APOGEE_DISTANCES/apogee_distances.html Data Model (DR16): https://data.sdss.org/datamodel/files/APOGEE_STARHORSE/apogee_starhorse.html
- astroNN.apogee.apogee_distances(dr=None, flag=None)[source]
Download the APOGEE Distances VAC catalogue (APOGEE Distances for DR14, APOGEE StarHorse for DR16/17)
from astroNN.apogee.downloader import apogee_distances

local_path_to_file = apogee_distances(dr=14)
- astroNN.datasets.load_apogee_distances(dr=None, unit='distance', cuts=True, extinction=True, keepdims=False)[source]
Load apogee distances (absolute magnitude from stellar model)
- Parameters
dr (int) – Apogee DR
unit (string) –
which unit you want to get back
”absmag” for absolute magnitude
”fakemag” for fake magnitude
”distance” for distance in parsec
cuts (Union[boolean, float]) – Whether to cut bad data (negative parallax and percentage error more than 20%), or a float to set the threshold
extinction (bool) – Whether to take extinction into account, only affect when unit is NOT ‘distance’
keepdims (boolean) – Whether to preserve indices the same as APOGEE allstar DR14, no effect when cuts=False, set to -9999 for bad indices when cuts=True keepdims=True
- Returns
numpy array of ra, dec, array, err_array
- Return type
ndarrays
- History
- 2018-Jan-25 - Written - Henry Leung (University of Toronto)2021-Jan-29 - Updated - Henry Leung (University of Toronto)
Or you can use load_apogee_distances() to load the data by
from astroNN.datasets import load_apogee_distances

# unit can be 'distance' for distance in parsec, 'absmag' for k-band absolute magnitude
# 'fakemag' for astroNN's k-band fakemag scale
# cuts=True to cut out those unknown values (-9999.) and measurement error > 20%
RA, DEC, array, err_array = load_apogee_distances(dr=14, unit='distance', cuts=True, keepdims=False)
Cannon’s allstar
Data Model (DR14): https://data.sdss.org/datamodel/files/APOGEE_REDUX/APRED_VERS/APSTAR_VERS/ASPCAP_VERS/RESULTS_VERS/CANNON_VERS/allStarCannon.html
- astroNN.apogee.allstar_cannon(dr=None, flag=None)[source]
Download the allStarCannon file (catalog of Cannon stellar parameters and abundances from combined spectra)
from astroNN.apogee import allstar_cannon

local_path_to_file = allstar_cannon(dr=14)
Mini Tools for LAMOST data
astroNN.lamost
module is designed for dealing with LAMOST DR5.
LAMOST DR5 is not a public data release yet, so this module only provides a limited amount of tools to deal with the spectra. If you do not have the data, astroNN will not provide any LAMOST DR5 data nor functions to download them.
LAMOST Data Policy: http://www.lamost.org/policies/data_policy.html
LAMOST DR5 Homepage: http://dr5.lamost.org/
LAMOST DR5 Data Model: http://dr5.lamost.org/doc/data-production-description
LAMOST Spectra Wavelength Solution
- astroNN.lamost.wavelength_solution(dr=None)[source]
To return the wavelength solution
- Parameters
dr (Union(int, NoneType)) – data release
- Returns
wavelength solution array
- Return type
ndarray
- History
2018-Mar-15 - Written - Henry Leung (University of Toronto)
You can retrieve LAMOST spectra wavelength solution by
from astroNN.lamost import wavelength_solution
lambda_solution = wavelength_solution(dr=5)
Pseudo-Continuum Normalization of LAMOST Spectra
- astroNN.lamost.pseudo_continuum(flux, ivar, wavelength=None, L=50, dr=None)[source]
Pseudo-Continuum normalise a spectrum by dividing by a Gaussian-weighted smoothed spectrum.
- Parameters
- Returns
Continuum normalized flux and flux uncertainty
- Return type
ndarray
from astroNN.lamost import pseudo_continuum
# spectra_errs refers to the inverse variance array provided by LAMOST
# spectra can be multiple spectra at a time
norm_spec, norm_spec_err = pseudo_continuum(spectra, spectra_errs, dr=5)
Load LAMOST DR5 catalogue
- astroNN.lamost.load_allstar_dr5()[source]
Open LAMOST DR5 allstar
- Returns
fits file opened by astropy
- Return type
astropy.io.fits.hdu.hdulist.HDUList
- History
2018-Jun-17 - Written - Henry Leung (University of Toronto)
from astroNN.lamost import load_allstar_dr5
fits_file = load_allstar_dr5()
fits_file[1].header # print file header
Mini Tools for Gaia data
Note
astroNN only contains a limited amount of necessary tools. For a more comprehensive python tool to deal with Gaia data, please refer to Jo Bovy’s gaia_tools
astroNN.gaia
module provides a handful of tools to deal with astrometry and photometry.
The mission of the GAIA spacecraft is to create a dynamic, three-dimensional map of the Milky Way Galaxy by measuring
the distances, positions and proper motion of stars. To do this, the spacecraft employs two telescopes, an imaging
system, an instrument for measuring the brightness of stars, and a spectrograph. Launched in 2013, GAIA orbits the Sun
at Lagrange point L2, 1.5 million kilometres from Earth. By the end of its five-year mission, GAIA will have mapped well
over one billion stars—one percent of the Galactic stellar population.
ESA Gaia satellite: http://sci.esa.int/gaia/
Gaia Data Downloader
The astroNN Gaia data downloaders always act as functions that return the path of the downloaded file(s), downloading the file first if it does not exist locally. If the file cannot be found on the server, astroNN will generally return False as the path.
Load Gaia DR2 - Apogee DR14 matches
- astroNN.gaia.gaiadr2_parallax(cuts=True, keepdims=False, offset=False)[source]
Load Gaia DR2 - APOGEE DR14 matches, indices corresponds to APOGEE allstar DR14 file
- Parameters
cuts (Union[boolean, float]) – Whether to cut bad data (negative parallax and percentage error more than 20%), or a float to set the threshold
keepdims (boolean) – Whether to preserve indices the same as APOGEE allstar DR14, no effect when cuts=False, set to -9999 for bad indices when cuts=True keepdims=True
offset (Union[boolean, float, str]) –
Whether to correct the Gaia DR2 zero point offset
False to assume no offset correction
True to assume 52.8-4.21(G-12.2)
”leungbovy2019” for leung & bovy 2019 offset correction
a float to assume a float offset globally
- Returns
numpy array of ra, dec, parallax, parallax_error
- Return type
ndarrays
- History
2018-Apr-26 - Written - Henry Leung (University of Toronto)
from astroNN.gaia import gaiadr2_parallax

# To load Gaia DR2 - APOGEE DR14 matches, indices correspond to the APOGEE allstar DR14 file
ra, dec, parallax, parallax_error = gaiadr2_parallax(cuts=True, keepdims=False, offset=False)
Gaia DR1 TGAS Downloader and Loader
- astroNN.gaia.tgas(flag=None)[source]
Get path to the Gaia TGAS DR1 files, download if files not found
- Returns
List of file path
- Return type
- History
2017-Oct-13 - Written - Henry Leung (University of Toronto)
To download TGAS DR1 (note that TGAS is only available in DR1):
from astroNN.gaia import tgas

# Download TGAS DR1 to GAIA_TOOLS_DATA; returns the list of paths to those files
files_paths = tgas()
To load Gaia TGAS
- astroNN.gaia.tgas_load(cuts=True)[source]
To load useful parameters from multiple TGAS DR1 files
- Parameters
cuts (Union[boolean, 0.2]) – Whether to cut bad data (negative parallax and percentage error more than 20%, or a custom cut percentage)
- Returns
Dictionary of parameters
- Return type
- History
2017-Dec-17 - Written - Henry Leung (University of Toronto)
from astroNN.gaia import tgas_load

# Load the TGAS DR1 files and return a dictionary of ra(J2015), dec(J2015), pmra, pmdec, parallax, parallax error, g-band mag
# cuts=True to cut bad data (negative parallax and percentage error more than 20%)
output = tgas_load(cuts=True)

# output dictionary
output['ra']  # ra(J2015)
output['dec']  # dec(J2015)
output['pmra']  # proper motion in RA
output['pmdec']  # proper motion in DEC
output['parallax']  # parallax
output['parallax_err']  # parallax error
output['gmag']  # g-band mag
Gaia_source DR1 Downloader
There is no plan to support DR2 Gaia Source; please refer to Jo Bovy's https://github.com/jobovy/gaia_tools
from astroNN.gaia import gaia_source

# Download gaia_source DR1 to GAIA_TOOLS_DATA; returns the list of paths to those files
files_paths = gaia_source(dr=1)
Anderson et al 2017 Improved Parallax from Data-driven Stars Model
Anderson2017 is described in here: https://arxiv.org/pdf/1706.05055
Please be advised that starting from 26 April 2018, anderson2017 in astroNN is reduced to parallaxes cross-matched with APOGEE DR14 only. If you see this message, anderson2017 in this astroNN version is the reduced one. Moreover, anderson2017 will be removed in the future.
from astroNN.gaia import anderson_2017_parallax

# To load the improved parallax
# Both parallax and para_err are in mas
# cuts=True to cut bad data (negative parallax and percentage error more than 20%)
ra, dec, parallax, para_err = anderson_2017_parallax(cuts=True)
fakemag (dummy scale)
fakemag is an astroNN dummy scale primarily used to preserve the Gaussian standard error from Gaia. astroNN always assumes there is no error in the apparent magnitude measurement.
\(L_\mathrm{fakemag} = \varpi 10^{\frac{1}{5}m_\mathrm{apparent}} = 10^{\frac{1}{5}M_\mathrm{absolute}+2}\), where \(\varpi\) is parallax in mas
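For illustration, a direct implementation of the formula above (a minimal sketch of the conversion, not astroNN's own implementation):

def to_fakemag(parallax_mas, apparent_mag):
    # L_fakemag = parallax [mas] * 10^(apparent magnitude / 5)
    return parallax_mas * 10.0 ** (0.2 * apparent_mag)

print(to_fakemag(2.0, 10.0))  # a 2 mas parallax at apparent magnitude 10 gives fakemag = 200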
You can get a sense of the fakemag scale from the following plot

Coordinates Matching between catalogs xmatch
- astroNN.datasets.xmatch.xmatch(ra1, dec1, ra2, dec2, epoch1=2000.0, epoch2=2000.0, pmra2=None, pmdec2=None, maxdist=2)[source]
Cross-matching between arrays by RA/DEC coordinates
- Parameters
ra1 (ndarray) – 1d array for the first catalog RA
dec1 (ndarray) – 1d array for the first catalog DEC
ra2 (ndarray) – 1d array for the second catalog RA
dec2 (ndarray) – 1d array for the second catalog DEC
epoch1 (Union([float, ndarray])) – Epoch for the first catalog, can be float or 1d array
epoch2 (Union([float, ndarray])) – Epoch for the second catalog, can be float or 1d array
pmra2 (ndarray) – RA proper motion for second catalog, only effective if epoch1 not equals epoch2
pmdec2 (ndarray) – DEC proper motion for second catalog, only effective if epoch1 not equals epoch2
maxdist (float) – Maximum distance in arcsecond
- Returns
numpy array of ra, dec, separation
- Return type
ndarrays
- History
- 2018-Jan-25 - Written - Henry Leung (University of Toronto)2021-Jan-29 - Updated - Henry Leung (University of Toronto)
Here is an example
from astroNN.datasets import xmatch
import numpy as np

# Some coordinates for cat1, J2000.
cat1_ra = np.array([36.,68.,105.,23.,96.,96.])
cat1_dec = np.array([72.,56.,54.,55.,88.,88.])

# Some coordinates for cat2, J2000.
cat2_ra = np.array([23.,56.,222.,96.,245.,68.])
cat2_dec = np.array([36.,68.,82.,88.,26.,56.])

# Using maxdist=2 arcsecond separation threshold, because it's the default, so not shown here
# Using epoch1=2000. and epoch2=2000., because it's the default, so not shown here
# Because both datasets are J2000., there is no need to provide pmra and pmdec which represent proper motion
idx_1, idx_2, sep = xmatch(ra1=cat1_ra, dec1=cat1_dec, ra2=cat2_ra, dec2=cat2_dec)

print(idx_1)
>>> [1 4 5]
print(idx_2)
>>> [5 3 3]
print(cat1_ra[idx_1], cat2_ra[idx_2])
>>> [68. 96. 96.], [68. 96. 96.]

# What happens if we swap cat_1 and cat_2
idx_1, idx_2, sep = xmatch(ra1=cat2_ra, dec1=cat2_dec, ra2=cat1_ra, dec2=cat1_dec)

print(idx_1)
>>> [3 5]
print(idx_2)
>>> [4 1]
print(cat1_ra[idx_2], cat2_ra[idx_1])
>>> [96. 68.], [96. 68.]  # xmatch can't find all the matches
APOGEE Spectra with Convolutional Neural Net - ApogeeCNN
- class astroNN.models.apogee_models.ApogeeCNN(lr=0.005)[source]
Class for Convolutional Neural Network for stellar spectra analysis
- History
2017-Dec-21 - Written - Henry Leung (University of Toronto)

Although in theory you can feed any 1D data to astroNN neural networks, this tutorial will only focus on spectra analysis.
from astroNN.models import ApogeeCNN
from astroNN.datasets import H5Loader
# Load the train data from dataset first, x_train is spectra and y_train will be ASPCAP labels
loader = H5Loader('datasets.h5')
loader.load_err = False
x_train, y_train = loader.load()
# And then create an instance of Convolutional Neural Network class
cnn_net = ApogeeCNN()
# You don't have to specify the task because it's 'regression' by default. But if you are doing classification, you can set task='classification'
cnn_net.task = 'regression'
# Set max_epochs to 10 for a quick result. You should train more epochs normally
cnn_net.max_epochs = 10
cnn_net.train(x_train, y_train)
Here is a list of parameters you can set; if you don't set them, the defaults will be used
ApogeeCNN.batch_size = 64
ApogeeCNN.initializer = 'he_normal'
ApogeeCNN.activation = 'relu'
ApogeeCNN.num_filters = [2, 4]
ApogeeCNN.filter_len = 8
ApogeeCNN.pool_length = 4
ApogeeCNN.num_hidden = [196, 96]
ApogeeCNN.max_epochs = 250
ApogeeCNN.lr = 0.005
ApogeeCNN.reduce_lr_epsilon = 0.00005
ApogeeCNN.reduce_lr_min = 0.0000000001
ApogeeCNN.reduce_lr_patience = 10
ApogeeCNN.target = 'all'
ApogeeCNN.l2 = 1e-7
ApogeeCNN.input_norm_mode = 1
ApogeeCNN.labels_norm_mode = 2
Note
You can disable astroNN data normalization via ApogeeCNN.input_norm_mode=0
as well as ApogeeCNN.labels_norm_mode = 0
and do normalization yourself. But make sure you don’t normalize labels with MAGIC_NUMBER
(missing labels).
After the training, you can use cnn_net in this case and call test method to test the neural network on test data. Or you can load the folder by
from astroNN.models import load_folder
cnn_net = load_folder('astroNN_0101_run001')
# Load the test data from dataset, x_test is spectra and y_test will be ASPCAP labels
loader2 = H5Loader('datasets.h5')
loader2.load_combined = False
x_test, y_test = loader2.load()
pred = cnn_net.test(x_test) # pred contains denormalized result aka. ASPCAP labels prediction in this case
astroNN.models.ApogeeCNN does not have uncertainty analysis features.
You can calculate the jacobian, which represents the derivative of the output with respect to the input, to see which parts of the input the output is sensitive to.
# Calculate jacobian first
jacobian_array = cnn_net.jacobian(x_test, mean_output=True)
Note
You can access Keras model methods like model.predict via (in the above tutorial) cnn_net.keras_model (Example: cnn_net.keras_model.predict())
Example Plots using aspcap_residue_plot


ASPCAP labels prediction using CNN vs The Cannon 2
Warning
Please refer to Bayesian Neural Network for the most updated result: http://astronn.readthedocs.io/en/latest/neuralnets/apogee_bcnn.html

Example Plots using jacobian


APOGEE Spectra with Bayesian Neural Net - ApogeeBCNN
- class astroNN.models.apogee_models.ApogeeBCNN(lr=0.0005, dropout_rate=0.3)[source]
Class for Bayesian convolutional neural network for stellar spectra analysis
- History
2017-Dec-21 - Written - Henry Leung (University of Toronto)

Although in theory you can feed any 1D data to astroNN neural networks, this tutorial will only focus on spectra analysis.
from astroNN.models import ApogeeBCNN
from astroNN.datasets import H5Loader
# Load the train data from dataset first, x_train is spectra and y_train will be ASPCAP labels
loader = H5Loader('datasets.h5')
loader.load_combined = True
loader.load_err = True
x_train, y_train, x_err, y_err = loader.load()
# And then create an instance of Bayesian Convolutional Neural Network class
bcnn_net = ApogeeBCNN()
# You don't have to specify the task because it's 'regression' by default. But if you are doing classification, you can set task='classification'
bcnn_net.task = 'regression'
# Set max_epochs to 10 for a quick result. You should train more epochs normally, especially with dropout
bcnn_net.max_epochs = 10
bcnn_net.train(x_train, y_train, x_err, y_err)
Here is a list of parameters you can set; if you don't set them, the defaults will be used
ApogeeBCNN.batch_size = 64
ApogeeBCNN.initializer = 'he_normal'
ApogeeBCNN.activation = 'relu'
ApogeeBCNN.num_filters = [2, 4]
ApogeeBCNN.filter_len = 8
ApogeeBCNN.pool_length = 4
ApogeeBCNN.num_hidden = [196, 96]
ApogeeBCNN.max_epochs = 100
ApogeeBCNN.lr = 0.005
ApogeeBCNN.reduce_lr_epsilon = 0.00005
ApogeeBCNN.reduce_lr_min = 0.0000000001
ApogeeBCNN.reduce_lr_patience = 10
ApogeeBCNN.target = 'all'
ApogeeBCNN.l2 = 5e-9
ApogeeBCNN.dropout_rate = 0.2
ApogeeBCNN.length_scale = 0.1 # prior length scale
ApogeeBCNN.input_norm_mode = 3
ApogeeBCNN.labels_norm_mode = 2
Note
You can disable astroNN data normalization via ApogeeBCNN.input_norm_mode=0
as well as ApogeeBCNN.labels_norm_mode=0
and do normalization yourself. But make sure you don’t normalize labels with MAGIC_NUMBER (missing labels).
After the training, you can use bcnn_net in this case and call test method to test the neural network on test data. Or you can load the folder by
from astroNN.models import load_folder
bcnn_net = load_folder('astroNN_0101_run001')
# Load the test data from dataset, x_test is spectra and y_test will be ASPCAP labels
loader2 = H5Loader('datasets.h5')
loader2.load_combined = False
loader2.load_err = False
x_test, y_test = loader2.load()
# pred contains denormalized result aka. ASPCAP labels prediction in this case
# pred_std is a list of uncertainty
# pred_std['total'] is the total uncertainty (standard deviation) which is the sum of all the uncertainties
# pred_std['predictive'] is the predictive uncertainty predicted by bayesian neural net
# pred_std['model'] is the model uncertainty from dropout variational inference
pred, pred_std = bcnn_net.test(x_test)
astroNN.models.ApogeeBCNN uses Bayesian deep learning, which provides uncertainty analysis features.
You can also calculate the jacobian, which represents the derivative of the output with respect to the input, to see which parts of the input the output is sensitive to.
# Calculate jacobian first
jacobian_array = bcnn_net.jacobian(x_test, mean_output=True)
Note
You can access Keras model methods like model.predict via (in the above tutorial) bcnn_net.keras_model (Example: bcnn_net.keras_model.predict())
ASPCAP Labels Prediction
Internal model identifier for the author: astroNN_0321_run002
Training set (30067 spectra + separate 3340 validation spectra): Starflag=0 and ASPCAPflag=0, 4000<Teff<5500, 200<SNR
Testing set (97723 spectra): Individual Visit of the training spectra, median SNR is around SNR~100
Using astroNN.models.ApogeeBCNN with default hyperparameter
Ground Truth is ASPCAP labels.
Label | Median of residue | astropy mad_std of residue
---|---|---
Al | -0.003 | 0.042
Alpha | 0.000 | 0.013
C | 0.003 | 0.032
C1 | 0.005 | 0.037
Ca | 0.002 | 0.022
Co | -0.005 | 0.071
Cr | -0.001 | 0.031
fakemag | 3.314 | 16.727
Fe | 0.001 | 0.016
K | -0.001 | 0.032
Log(g) | 0.002 | 0.048
M | 0.003 | 0.015
Mg | 0.001 | 0.021
Mn | 0.003 | 0.025
N | -0.002 | 0.037
Na | -0.006 | 0.103
Ni | 0.000 | 0.021
O | 0.004 | 0.027
P | 0.005 | 0.086
S | 0.006 | 0.043
Si | 0.001 | 0.022
Teff | 0.841 | 23.574
Ti | 0.002 | 0.032
Ti2 | -0.009 | 0.089
V | -0.002 | 0.059
Median Absolute Error of prediction at three different low SNR levels.
Label | SNR ~ 20 | SNR ~ 40 | SNR ~ 60
---|---|---|---
Al | 0.122 dex | 0.069 dex | 0.046 dex
Alpha | 0.024 dex | 0.017 dex | 0.014 dex
C | 0.088 dex | 0.051 dex | 0.037 dex
C1 | 0.084 dex | 0.054 dex | 0.041 dex
Ca | 0.069 dex | 0.039 dex | 0.029 dex
Co | 0.132 dex | 0.104 dex | 0.085 dex
Cr | 0.082 dex | 0.049 dex | 0.037 dex
fakemag | Not Calculated | Not Calculated | Not Calculated
Fe | 0.070 dex | 0.035 dex | 0.024 dex
K | 0.091 dex | 0.050 dex | 0.037 dex
Log(g) | 0.152 dex | 0.085 dex | 0.059 dex
M | 0.067 dex | 0.033 dex | 0.023 dex
Mg | 0.080 dex | 0.039 dex | 0.026 dex
Mn | 0.089 dex | 0.050 dex | 0.037 dex
N | 0.118 dex | 0.067 dex | 0.046 dex
Na | 0.119 dex | 0.110 dex | 0.099 dex
Ni | 0.076 dex | 0.039 dex | 0.027 dex
O | 0.076 dex | 0.046 dex | 0.037 dex
P | 0.106 dex | 0.082 dex | 0.077 dex
S | 0.072 dex | 0.052 dex | 0.041 dex
Si | 0.076 dex | 0.042 dex | 0.024 dex
Teff | 74.542 K | 41.955 K | 29.271 K
Ti | 0.080 dex | 0.049 dex | 0.037 dex
Ti2 | 0.124 dex | 0.099 dex | 0.092 dex
V | 0.119 dex | 0.080 dex | 0.064 dex
ASPCAP Labels Prediction with >50% corrupted labels
Internal model identifier for the author: astroNN_0224_run004
The setting is the same as above, but more labels are manually corrupted to ensure the modified loss function is working fine.
52.5% of the total training labels are corrupted to -9999 (4.6% of the total labels are -9999. from ASPCAP), while the testing set is unchanged.
Label | Median of residue | astropy mad_std of residue
---|---|---
Al | 0.003 | 0.047
Alpha | 0.000 | 0.015
C | 0.005 | 0.037
C1 | 0.003 | 0.042
Ca | 0.002 | 0.025
Co | 0.001 | 0.076
Cr | 0.000 | 0.033
fakemag | -0.020 | 5.766
Fe | 0.001 | 0.020
K | 0.001 | 0.035
Log(g) | -0.002 | 0.064
M | 0.002 | 0.019
Mg | 0.003 | 0.025
Mn | 0.003 | 0.030
N | 0.001 | 0.043
Na | -0.004 | 0.106
Ni | 0.001 | 0.025
O | 0.004 | 0.031
P | 0.004 | 0.091
S | 0.006 | 0.045
Si | 0.001 | 0.026
Teff | -0.405 | 31.222
Ti | 0.003 | 0.035
Ti2 | -0.012 | 0.092
V | 0.002 | 0.063
ASPCAP Labels Prediction with limited amount of data
Internal model identifier for the author: astroNN_0401_run001
The setting is the same, including the neural network, but the number of training data points is limited to 5000 (4500 for training, 500 for validation), with the validation set completely separated. The testing set is the same without any limitation.
Label | Median of residue | astropy mad_std of residue
---|---|---
Al | -0.002 | 0.051
Alpha | 0.001 | 0.017
C | -0.002 | 0.040
C1 | -0.003 | 0.046
Ca | -0.003 | 0.027
Co | -0.006 | 0.080
Cr | 0.000 | 0.036
fakemag | 18.798 | 30.687
Fe | -0.004 | 0.022
K | -0.003 | 0.038
Log(g) | -0.005 | 0.064
M | -0.004 | 0.020
Mg | -0.002 | 0.026
Mn | -0.002 | 0.033
N | -0.003 | 0.053
Na | -0.026 | 0.121
Ni | -0.003 | 0.026
O | -0.003 | 0.033
P | 0.001 | 0.097
S | -0.003 | 0.047
Si | -0.003 | 0.028
Teff | -1.348 | 33.202
Ti | -0.004 | 0.037
Ti2 | -0.017 | 0.097
V | -0.005 | 0.065
Example Plots using aspcap_residue_plot


Example Plots using jacobian


APOGEE Spectra with Censored Bayesian NN - ApogeeBCNNCensored
- class astroNN.models.apogee_models.ApogeeBCNNCensored(lr=0.0005, dropout_rate=0.3)[source]
Class for Bayesian censored convolutional neural network for stellar spectra analysis [specifically APOGEE DR14 spectra only]
Described in the paper: https://ui.adsabs.harvard.edu/abs/2019MNRAS.483.3255L/abstract
- History
2018-May-27 - Written - Henry Leung (University of Toronto)

ApogeeBCNNCensored can only be used with Apogee spectra with 7,514 pixels
from astroNN.models import ApogeeBCNNCensored
from astroNN.datasets import H5Loader
# Load the train data from dataset first, x_train is spectra and y_train will be ASPCAP labels
loader = H5Loader('datasets.h5')
loader.load_combined = True
loader.load_err = False
loader.target = ['teff', 'logg', 'C', 'C1', 'N', 'O', 'Na', 'Mg', 'Al', 'Si', 'P', 'S', 'K',
'Ca', 'Ti', 'Ti2', 'V', 'Cr', 'Mn', 'Fe','Co', 'Ni']
x_train, y_train, x_err, y_err = loader.load()
# And then create an instance of Apogee Censored Bayesian Convolutional Neural Network class
bcnncensored_net = ApogeeBCNNCensored()
# Set max_epochs to 10 for a quick result. You should train more epochs normally, especially with dropout
bcnncensored_net.max_epochs = 10
bcnncensored_net.train(x_train, y_train, x_err, y_err)
Here is a list of parameters you can set; if you don't set them, the defaults will be used
ApogeeBCNNCensored.batch_size = 64
ApogeeBCNNCensored.initializer = 'he_normal'
ApogeeBCNNCensored.activation = 'relu'
ApogeeBCNNCensored.num_filters = [2, 4]
ApogeeBCNNCensored.filter_len = 8
ApogeeBCNNCensored.pool_length = 4
# number of neurone for [old_bcnn_1, old_bcnn_2, aspcap_1, aspcap_2, hidden]
ApogeeBCNNCensored.num_hidden = [128, 64, 32, 8, 2]
ApogeeBCNNCensored.max_epochs = 50
ApogeeBCNNCensored.lr = 0.005
ApogeeBCNNCensored.reduce_lr_epsilon = 0.00005
ApogeeBCNNCensored.reduce_lr_min = 0.0000000001
ApogeeBCNNCensored.reduce_lr_patience = 10
ApogeeBCNNCensored.target = 'all'
ApogeeBCNNCensored.l2 = 5e-9
ApogeeBCNNCensored.dropout_rate = 0.2
ApogeeBCNNCensored.length_scale = 0.1 # prior length scale
ApogeeBCNNCensored.input_norm_mode = 3
ApogeeBCNNCensored.labels_norm_mode = 2
Note
You can disable astroNN data normalization via ApogeeBCNNCensored.input_norm_mode=0
as well as ApogeeBCNNCensored.labels_norm_mode=0
and do normalization yourself. But make sure you don’t normalize labels with MAGIC_NUMBER (missing labels).
After the training, you can use bcnncensored_net in this case and call test method to test the neural network on test data. Or you can load the folder by
from astroNN.models import load_folder
bcnncensored_net = load_folder('astroNN_0101_run001')
# Load the test data from dataset, x_test is spectra and y_test will be ASPCAP labels
loader2 = H5Loader('datasets.h5')
loader2.load_combined = False
loader2.load_err = False
loader2.target = ['teff', 'logg', 'C', 'C1', 'N', 'O', 'Na', 'Mg', 'Al', 'Si', 'P', 'S', 'K',
'Ca', 'Ti', 'Ti2', 'V', 'Cr', 'Mn', 'Fe','Co', 'Ni']
x_test, y_test = loader2.load()
# pred contains denormalized result aka. ASPCAP labels prediction in this case
# pred_std is a list of uncertainty
# pred_std['total'] is the total uncertainty (standard deviation) which is the sum of all the uncertainties
# pred_std['predictive'] is the predictive uncertainty predicted by bayesian neural net
# pred_std['model'] is the model uncertainty from dropout variational inference
pred, pred_std = bcnncensored_net.test(x_test)
bcnncensored_net.aspcap_residue_plot(pred, y_test, pred_std['total'])
You can calculate the jacobian, which represents the derivative of the output with respect to the input, to see which parts of the input the output is sensitive to.
# Calculate jacobian first
jacobian_array = bcnncensored_net.jacobian(x_test, mean_output=True)
# Plot the graphs
bcnncensored_net.jacobian_aspcap(jacobian=jacobian_array, dr=14)
Architecture
The architecture of this neural network is as follow.

Why Censored NN for APOGEE Spectra analysis?
Internal model identifier for the author: astroNN_0529_run010
It caught our attention that the ApogeeBCNN neural network found no spread in \([Al/H]\) in the \(M13\) globular cluster (literature showing a spread in \([Al/H]\): https://arxiv.org/pdf/1501.05127.pdf), which may imply a problem in ApogeeBCNN: it found strong correlations between elements but is not actually measuring them individually.

It becomes clear when we plot the training set \([Al/H]\) vs \([Mg/H]\) as follows: \([Al/H]\) and \([Mg/H]\) are strongly correlated and ApogeeBCNN is just measuring \([Al/H]\) as some kind of \([Mg/H]\), and is fooled by \(M13\) because \(M13\) has a spread in \([Al/H]\) but not \([Mg/H]\). In other words, the region of \([Mg/H, Al/H]\) parameter space occupied by \(M13\) is not covered by the training set.

So a Censored Neural Net is proposed to solve the issue by encouraging the neural network to look at the ASPCAP window regions.
And it seems like this solved the issue: now the neural network shows a spread in \([Al/H]\) but not \([Mg/H]\)

With this censored neural network, plotting the training set indeed shows a little more spread

ASPCAP Labels Prediction
Internal model identifier for the author: astroNN_0529_run010
Training set and Testing set is exactly the same as APOGEE Spectra with Bayesian Neural Net - ApogeeBCNN
Training set (30067 spectra + separate 3340 validation spectra): Starflag=0 and ASPCAPflag=0, 4000<Teff<5500, 200<SNR
Testing set (97723 spectra): Individual Visit of the training spectra, median SNR is around SNR~100
Using astroNN.models.ApogeeBCNNCensored with default hyperparameter
Ground Truth is ASPCAP labels.
Label | Median of residue | astropy mad_std of residue
---|---|---
Al | -0.002 | 0.047
C | 0.000 | 0.033
C1 | 0.000 | 0.044
Ca | 0.001 | 0.024
Co | -0.002 | 0.072
Cr | -0.006 | 0.033
Fe | -0.003 | 0.019
K | -0.001 | 0.036
Log(g) | 0.006 | 0.049
Mg | -0.002 | 0.021
Mn | -0.004 | 0.032
N | -0.004 | 0.035
Na | -0.014 | 0.118
Ni | -0.003 | 0.023
O | 0.001 | 0.033
P | 0.001 | 0.100
S | 0.000 | 0.048
Si | -0.002 | 0.024
Teff | 2.310 | 23.296
Ti | -0.001 | 0.035
Ti2 | -0.006 | 0.090
V | -0.002 | 0.067
APOGEE Spectra with Bayesian NN and Gaia offset calibration - ApogeeDR14GaiaDR2BCNN
- class astroNN.models.apogee_models.ApogeeDR14GaiaDR2BCNN(lr=0.001, dropout_rate=0.3)[source]
Class for Bayesian convolutional neural network for APOGEE DR14 Gaia DR2
- History
2018-Nov-06 - Written - Henry Leung (University of Toronto)

ApogeeDR14GaiaDR2BCNN can only be used with Apogee spectra with 7,514 pixels
from astroNN.models import ApogeeDR14GaiaDR2BCNN
from astroNN.datasets import H5Loader
# Load the train data from dataset first, x_train is spectra and y_train will be ASPCAP labels
loader = H5Loader('datasets.h5')
loader.load_combined = True
loader.load_err = False
loader.target = ['Ks-band fakemag']
x_train, y_train, x_err, y_err = loader.load()
# And then create an instance of the ApogeeDR14GaiaDR2BCNN class
apogee_gaia_bcnn = ApogeeDR14GaiaDR2BCNN()
# Set max_epochs to 10 for a quick result. You should train more epochs normally, especially with dropout
apogee_gaia_bcnn.max_epochs = 10
apogee_gaia_bcnn.train(x_train, y_train, x_err, y_err)
Here is a list of parameters you can set; if you don't set them, the defaults will be used
ApogeeDR14GaiaDR2BCNN.batch_size = 64
ApogeeDR14GaiaDR2BCNN.initializer = 'he_normal'
ApogeeDR14GaiaDR2BCNN.activation = 'relu'
ApogeeDR14GaiaDR2BCNN.num_filters = [2, 4]
ApogeeDR14GaiaDR2BCNN.filter_len = 8
ApogeeDR14GaiaDR2BCNN.pool_length = 4
# number of neurone for [old_bcnn_1, old_bcnn_2, offset_hidden_1, offset_hidden_2]
ApogeeDR14GaiaDR2BCNN.num_hidden = [162, 64, 32, 16]
ApogeeDR14GaiaDR2BCNN.max_epochs = 50
ApogeeDR14GaiaDR2BCNN.lr = 0.005
ApogeeDR14GaiaDR2BCNN.reduce_lr_epsilon = 0.00005
ApogeeDR14GaiaDR2BCNN.reduce_lr_min = 0.0000000001
ApogeeDR14GaiaDR2BCNN.reduce_lr_patience = 10
ApogeeDR14GaiaDR2BCNN.target = 'all'
ApogeeDR14GaiaDR2BCNN.l2 = 5e-9
ApogeeDR14GaiaDR2BCNN.dropout_rate = 0.2
ApogeeDR14GaiaDR2BCNN.input_norm_mode = 3
ApogeeDR14GaiaDR2BCNN.labels_norm_mode = 2
Note
You can disable astroNN data normalization via ApogeeDR14GaiaDR2BCNN.input_norm_mode=0
as well as ApogeeDR14GaiaDR2BCNN.labels_norm_mode=0
and do normalization yourself. But make sure you don’t normalize labels with MAGIC_NUMBER (missing labels).
After the training, you can use apogee_gaia_bcnn in this case and call test method to test the neural network on test data. Or you can load the folder by
from astroNN.models import load_folder
apogee_gaia_bcnn = load_folder('astroNN_0101_run001')
# Load the test data from dataset, x_test is spectra and y_test will be ASPCAP labels
test_data = ......
# pred contains denormalized result aka. fakemag prediction in this case
# pred_std is a list of uncertainty
# pred_std['total'] is the total uncertainty (standard deviation) which is the sum of all the uncertainties
# pred_std['predictive'] is the predictive uncertainty predicted by bayesian neural net
# pred_std['model'] is the model uncertainty from dropout variational inference
pred, pred_std = apogee_gaia_bcnn.test(test_data)
# Calculate jacobian
jacobian_array = apogee_gaia_bcnn.jacobian(x_test, mean_output=True)
Architecture
The architecture of this neural network is as follow.

Convolutional Variational Autoencoder - ApogeeCVAE
Warning
Information here is obsolete; the following code may not run properly with the latest astroNN commit
- class astroNN.models.apogee_models.ApogeeCVAE[source]
Class for Convolutional Autoencoder Neural Network for stellar spectra analysis
- History
2017-Dec-21 - Written - Henry Leung (University of Toronto)

It is a 9 layered convolutional neural net (2 convolutional layers->2 dense layers->latent space->2 dense layers->2 convolutional layers)
You can create ApogeeCVAE via
from astroNN.models import ApogeeCVAE
# And then create an object of the ApogeeCVAE class
cvae_net = ApogeeCVAE()
APOGEE Spectra Analysis
Although in theory you can feed any 1D data to astroNN neural networks, this tutorial will only focus on spectra analysis.
from astroNN.models import ApogeeCVAE
from astroNN.datasets import H5Loader
# Load the train data from dataset first, x_train is spectra and y_train will be ASPCAP labels
loader = H5Loader('datasets.h5')
x_train, y_train = loader.load()
# And then create an object of the Convolutional Variational Autoencoder class
cvae_net = ApogeeCVAE()
# Set max_epochs to 10 for a quick result. You should train more epochs normally, especially with dropout
cvae_net.max_epochs = 10
cvae_net.train(x_train)
After the training, you can use cvae_net in this case and call the test method to test the neural network on test data. Or you can load the folder by
from astroNN.models import load_folder
cvae_net = load_folder('astroNN_0101_run001')
# Load the test data from dataset, x_test is spectra and y_test will be ASPCAP labels
loader2 = H5Loader('datasets.h5')
loader2.load_combined = False
x_test, y_test = loader2.load()
VAE is a special case. You can either use test_encoder(x_test) to get the value in latent space or use test(x_test) to get spectra reconstruction
# Get latent space representation
latent_space_value = cvae_net.test_encoder(x_test)
# Get spectra reconstruction
spectra_recon = cvae_net.test(x_test)
Note
You can access Keras model methods like model.predict via cvae_net.keras_model in the above tutorial (example: cvae_net.keras_model.predict())
Example plots of the latent space using VAE.plot_latent()
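If you prefer to build such a plot yourself instead of using plot_latent(), here is a minimal sketch using latent_space_value from above. It assumes latent_dim is at least 2 and that y_test holds a label to colour the points by; both are assumptions for illustration only.
import matplotlib.pyplot as plt

# scatter the first two latent dimensions, coloured by one ASPCAP label (first column of y_test here)
plt.figure(figsize=(8, 6))
sc = plt.scatter(latent_space_value[:, 0], latent_space_value[:, 1], c=y_test[:, 0], s=3)
plt.colorbar(sc, label='ASPCAP label')
plt.xlabel('Latent dimension 1')
plt.ylabel('Latent dimension 2')
plt.show()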
Example Plots on spectra reconstruction
import matplotlib.pyplot as plt

# reconstruct the test spectra and compare them to the originals
x_re = cvae_net.test(x_test)
fig = plt.figure(figsize=(20, 15), dpi=150)
plt.plot(x_test[0], linewidth=0.9, label='APOGEE spectra')
plt.plot(x_re[0], linewidth=0.9, label='Reconstructed spectra by VAE')
plt.xlabel('Pixel', fontsize=25)
plt.ylabel('Normalized flux', fontsize=25)
plt.legend(loc='best', fontsize=25)
plt.tick_params(labelsize=20, width=1, length=10)

Encoder-decoder for APOGEE and Kepler - ApokascEncoderDecoder

ApokascEncoderDecoder can only be used with APOGEE spectra with 7,514 pixels and Kepler PSD with 2,092 data points. Both numbers are hardcoded into the model.
Please refer to the paper https://ui.adsabs.harvard.edu/abs/2023arXiv230205479L/abstract and https://github.com/henrysky/astroNN_ages for details
from astroNN.models import ApokascEncoderDecoder
from astroNN.datasets import H5Loader
# Load the train data from dataset first, x_train is spectra and y_train will be ASPCAP labels
loader = H5Loader('datasets.h5')
loader.load_combined = True
loader.load_err = True
x_train, y_train, x_err, y_err = loader.load()
# And then create an instance of the ApokascEncoderDecoder class
ved = ApokascEncoderDecoder()
# You don't have to specify the task because it's 'regression' by default, but if you are doing classification you can set task='classification'
ved.task = 'regression'
# Set max_epochs to 10 for a quick result. You should train more epochs normally, especially with dropout
ved.max_epochs = 10
ved.train(x_train, y_train, x_err, y_err)
Here is a list of parameters you can set; any you leave unset will use their defaults
ved.batch_size = 128
ved.initializer = 'glorot_uniform'
ved.activation = 'relu'
ved.num_filters = [32, 64, 16, 16]
ved.filter_len = [8, 32]
ved.pool_length = 2
ved.num_hidden = [16, 16]
ved.latent_dim = 5
ved.max_epochs = 100
ved.lr = 0.005
ved.reduce_lr_epsilon = 0.00005
ved.reduce_lr_min = 0.0000000001
ved.reduce_lr_patience = 10
ved.target = 'PSD'
ved.l2 = 5e-9
ved.input_norm_mode = 2
ved.labels_norm_mode = 0
Note
You can disable astroNN's data normalization by setting ApokascEncoderDecoder.input_norm_mode=0
as well as ApokascEncoderDecoder.labels_norm_mode=0
and doing the normalization yourself. But make sure you don't normalize labels that are set to MAGIC_NUMBER (missing labels).
After training, you can call the test method on ved (in this example) to test the neural network on test data, or you can load the saved folder with
from astroNN.models import load_folder
ved = load_folder('astroNN_0101_run001')
# Load the test data from dataset, x_test is APOGEE spectra
# something here
# pred contains the denormalized result, i.e. the Kepler PSD prediction in this case
pred = ved.test(x_test)
# methods like predict_encoder() and predict_decoder() are also available
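A hedged sketch of how those extra methods could be used; the method names follow the comment above, but the exact signatures are an assumption and may differ between astroNN versions.
# encode the APOGEE spectra into the latent space (assumed signature)
latent = ved.predict_encoder(x_test)
# decode the latent vectors back into Kepler PSD predictions (assumed signature)
psd_recon = ved.predict_decoder(latent)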
StarNet (arXiv:1709.09182)
- class astroNN.models.apogee_models.StarNet2017[source]
To create StarNet from S. Fabbro et al. (2017), arXiv:1709.09182. astroNN implements the exact architecture with the same default parameters as the StarNet paper
- History
2017-Dec-23 - Written - Henry Leung (University of Toronto)

StarNet2017 is an astroNN neural network implementation of the paper (arXiv:1709.09182); StarNet2017 inherits from astroNN's CNNBase class defined in astroNN.models.NeuralNetBases
You can create StarNet2017 via
from astroNN.models import StarNet2017
from astroNN.datasets import H5Loader
# Load the train data from dataset first, x_train is spectra and y_train will be ASPCAP labels
loader = H5Loader('datasets.h5')
loader.load_err = False
x_train, y_train = loader.load()
# And then create an object of the StarNet2017 class
starnet = StarNet2017()
# Set max_epochs to 10 for a quick result. You should train more epochs normally
starnet.max_epochs = 10
starnet.train(x_train, y_train)
Note
Default hyperparameters are the same as in the original StarNet paper
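Testing works the same way as the other astroNN CNN models; here is a minimal sketch. The run folder name is a placeholder, and x_test stands for test spectra prepared the same way as the training data above.
from astroNN.models import load_folder

# load the trained StarNet2017 run folder back (replace with your actual folder name)
starnet = load_folder('astroNN_0101_run001')

# x_test prepared like x_train above; pred will contain the denormalized ASPCAP labels
pred = starnet.test(x_test)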
Cifar10 with astroNN
Here is a Cifar10 example using astroNN
from keras.datasets import cifar10
from keras import utils
import numpy as np
from astroNN.models import Cifar10CNN
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
y_train = utils.to_categorical(y_train, 10)
y_test = utils.to_categorical(y_test, 10)
y_train = y_train.astype(np.float32)
x_train = x_train.astype(np.float32)
x_test = x_test.astype(np.float32)
y_test = y_test.astype(np.float32)
net = Cifar10CNN()
net.max_epochs = 10
net.train(x_train, y_train)
# Load the folder back
from astroNN.models import load_folder
# Replace with correct name
cnn = load_folder('astroNN_0114_run001')
prediction = cnn.test(x_test)
print(prediction)
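As a quick, unofficial sanity check you can convert those class predictions into an accuracy score, assuming prediction has the same one-hot shape as y_test.
import numpy as np

# compare predicted classes against the one-hot encoded ground truth
accuracy = np.mean(np.argmax(prediction, axis=1) == np.argmax(y_test, axis=1))
print(f'Test accuracy: {accuracy:.3f}')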
Acknowledging astroNN
Here is a list of publications using astroNN
- Publications using astroNN