Getting Started

astroNN is developed on GitHub, and you can download it from its GitHub repository.

The easiest way to install it, however, is via pip (astroNN on PyPI):

pip install astroNN

For the latest version, you can clone the latest commit of astroNN from GitHub:

git clone --depth=1 https://github.com/henrysky/astroNN

Then open a command line window in the package folder and run the following command to install:

python -m pip install .

or, for a development (editable) install:

python -m pip install -e .
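
After any of the install commands above, it can help to confirm which version ended up in the active environment. The following is a minimal sketch that uses only the Python standard library; installed_version is a hypothetical helper name, not part of astroNN, and it assumes the distribution name is astroNN as in the pip command above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "astroNN" is the distribution name used in the pip command above.
    version = installed_version("astroNN")
    if version is None:
        print("astroNN is not installed in this environment")
    else:
        print(f"astroNN {version} is installed")
```

Because the check reads package metadata instead of importing astroNN, it does not load Tensorflow and returns quickly.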

Prerequisites

The latest version of Anaconda is recommended; even if you cannot use the latest version, using Anaconda is still highly recommended.

Python 3.8 or above
Tensorflow (the latest version is recommended)
Tensorflow-Probability (the latest version is recommended)
CUDA and CuDNN (optional)
graphviz and pydot are required to plot the model architecture
scikit-learn, tqdm, pandas, h5py and astroquery are required for astroNN functions

Tensorflow and Tensorflow-Probability are rapidly developing packages, and astroNN depends heavily on Tensorflow. astroNN therefore supports only the last two official versions of these packages (i.e. the latest and the second-latest versions are included in the test suite). Using the latest versions of Tensorflow and Tensorflow-Probability is generally recommended. The versions currently supported (i.e. included in the test suite) are:

Tensorflow 2.16.x (corresponds to Tensorflow-Probability 0.24.x)
Tensorflow 2.15.x (corresponds to Tensorflow-Probability 0.23.x)
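
The pairing above can be checked programmatically before running anything expensive. This is only an illustrative sketch: the table is copied from the two entries listed above, and series and is_tested_pair are hypothetical helper names, not astroNN functions.

```python
# Pairs copied from the list above: Tensorflow series -> Tensorflow-Probability series.
SUPPORTED_PAIRS = {"2.16": "0.24", "2.15": "0.23"}


def series(version):
    """Reduce a version string such as '2.16.1' to its 'major.minor' series."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def is_tested_pair(tf_version, tfp_version):
    """Return True if the two versions match one of the pairs listed above."""
    return SUPPORTED_PAIRS.get(series(tf_version)) == series(tfp_version)
```

In practice the two arguments would be tensorflow.__version__ and tensorflow_probability.__version__. The table has to be updated by hand whenever the supported versions change.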

Note

Due to bugs in Tensorflow 1.12.x (https://github.com/tensorflow/tensorflow/issues/22952), 1.14.x (https://github.com/tensorflow/tensorflow/issues/27543) or 2.5.x (https://github.com/tensorflow/tensorflow/pull/47957), you have to patch a few lines in order for astroNN to work properly. You can patch Tensorflow by running the following code:

from astroNN.config import tf_patch

tf_patch()

You can also unpatch Tensorflow to undo changes made by astroNN by running the following code

from astroNN.config import tf_unpatch

tf_unpatch()

For instructions on how to install Tensorflow, please refer to the official website: Installing TensorFlow

Recommended system requirements:

64-bit operating system
CPU that supports AVX2 (List of CPUs: https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2)
16GB RAM or above
NVIDIA graphics card (optional, GTX 10 series or above) or Apple Silicon
(If using an NVIDIA GPU) At least 4GB of VRAM on the GPU

Using astroNN on Google Colab

To use the latest commit of astroNN on Google Colab, you can copy and paste the following:

!pip install tensorflow
!pip install tensorflow_probability
!pip install git+https://github.com/henrysky/astroNN.git

Basic FAQ

My hardware or software cannot meet the prerequisites, what should I do?

The hardware and software requirements are only estimates. It is entirely possible to run astroNN without meeting them, but you generally need a Python version that Tensorflow supports (see Prerequisites above) and mid-to-high end hardware.

Can I contribute to astroNN?

You can contact me (Henry: henrysky.leung [at] utoronto.ca) or refer to the Contributor and Issue Reporting guide.

I have found a bug in astroNN

Please try the latest commit of astroNN first. If the issue persists, please report it at https://github.com/henrysky/astroNN/issues

I keep receiving warnings on APOGEE and Gaia environment variables

If you are not dealing with APOGEE or Gaia data, please ignore those warnings. If an error is raised that prevents you from using some astroNN functionality, please report it as a bug at https://github.com/henrysky/astroNN/issues

If you don’t want those warnings to be shown again, go to astroNN’s configuration file and set environmentvariablewarning to False
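
The same change can be scripted instead of editing the file by hand. The sketch below assumes the configuration file is a standard INI file that Python's configparser can read and write, and that the key lives in the [Basics] section as in the default file shown in the Configuration File section; set_env_warning is a hypothetical helper, not an astroNN function. Note that configparser drops any comments when it rewrites a file.

```python
import configparser


def set_env_warning(config_file, enabled):
    """Set environmentvariablewarning in the [Basics] section of config_file."""
    parser = configparser.ConfigParser()
    parser.read(config_file)
    # Create the section if the file is missing or does not contain it yet.
    if not parser.has_section("Basics"):
        parser.add_section("Basics")
    parser["Basics"]["environmentvariablewarning"] = str(enabled)
    with open(config_file, "w") as handle:
        parser.write(handle)
```

For the default location this would be called as set_env_warning(os.path.expanduser("~/.astroNN/config.ini"), False), after importing os.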

I have installed pydot and graphviz but still fail to plot the model

If you are encountering this issue, please uninstall both pydot and graphviz and run the following commands:

pip install pydot
conda install graphviz

Then, if you are using a Mac, run the following command:

brew install graphviz

If you are using Windows, go to https://graphviz.gitlab.io/_pages/Download/Download_windows.html to download the Windows package and add the package to the PATH environment variable.
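
A quick way to see which half of the setup is missing is to check separately for the pydot Python package and for the Graphviz dot executable on PATH. This is a minimal diagnostic sketch using only the standard library; plotting_prerequisites is a hypothetical helper, not part of astroNN.

```python
import importlib.util
import shutil


def plotting_prerequisites():
    """Report whether pydot is importable and Graphviz's `dot` is on PATH."""
    return {
        # True if Python can locate the pydot package in this environment.
        "pydot": importlib.util.find_spec("pydot") is not None,
        # True if the Graphviz command-line program `dot` is found on PATH.
        "dot": shutil.which("dot") is not None,
    }


if __name__ == "__main__":
    for name, found in plotting_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If dot is reported missing even though Graphviz is installed, the PATH environment variable is the most likely cause, as described for Windows above.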

Configuration File

The astroNN configuration file is located at ~/.astroNN/config.ini and contains a few astroNN settings.

Currently, the default configuration file should look like this:

[Basics]
magicnumber = -9999.0
multiprocessing_generator = False
environmentvariablewarning = True

[NeuralNet]
custommodelpath = None
cpufallback = False
gpu_mem_ratio = True

magicnumber refers to the Magic Number that represents missing labels/data; the default is -9999. Please do not change this value if you rely on APOGEE data before DR16. If you prefer np.nan, as most other people might, you can simply set magicnumber = nan

multiprocessing_generator refers to whether to enable multiprocessing in the astroNN data generator. The default is False except on Linux and MacOS.

environmentvariablewarning refers to whether you will be warned when the APOGEE and Gaia environment variables are not set.

custommodelpath refers to the path(s) of the folder(s) containing custom models (.py files); multiple paths can be separated by ;. The default value is None, meaning no additional path will be searched when loading a model. For example, use /users/astroNN/custom_models/;/local/some_other_custom_models/ if you have self-defined models in those locations.

cpufallback refers to whether to force astroNN to use the CPU. It has no effect if you are using tensorflow instead of tensorflow-gpu.

gpu_mem_ratio refers to GPU memory management. Set it to True to allocate memory dynamically (the astroNN default), enter a float between 0 and 1 to set the maximum ratio of GPU memory to use, or set it to None to let Tensorflow pre-occupy all available GPU memory, which is Tensorflow's designed default behavior.
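
To make the accepted values concrete, the sketch below shows one way the raw strings from the file could be interpreted. It is not astroNN's own parsing code; the three function names are hypothetical, and treating the allowed gpu_mem_ratio range as greater than 0 and at most 1 is an assumption, since the text only says "between 0 and 1".

```python
def parse_magicnumber(raw):
    """Convert the magicnumber entry to a float; the string 'nan' becomes NaN."""
    return float(raw)


def parse_custommodelpath(raw):
    """Split the custommodelpath entry on ';' into a list of folder paths."""
    if raw.strip() == "None":
        return []
    return [part for part in raw.split(";") if part.strip()]


def parse_gpu_mem_ratio(raw):
    """Map the gpu_mem_ratio entry to True, None, or a float ratio."""
    text = raw.strip()
    if text == "True":
        return True  # allocate GPU memory dynamically
    if text == "None":
        return None  # let Tensorflow pre-occupy all available GPU memory
    ratio = float(text)
    # Assumed bounds: the description above only says "between 0 and 1".
    if not 0.0 < ratio <= 1.0:
        raise ValueError(f"gpu_mem_ratio must be between 0 and 1, got {ratio}")
    return ratio
```

For example, parse_custommodelpath applied to the two-folder example above yields a list with the two folder paths, and parse_gpu_mem_ratio("0.5") yields the float 0.5.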

If, for whatever reason, you want to reset the configuration file:

from astroNN.config import config_path

# astroNN will reset the config file if the flag = 2
config_path(flag=2)

Folder Structure for astroNN, APOGEE, Gaia and LAMOST data

This code depends on environment variables and folders for APOGEE, Gaia and LAMOST data. The environment variables are

  • SDSS_LOCAL_SAS_MIRROR: top-level directory that will be used to (selectively) mirror the SDSS Science Archive Server (SAS)

  • GAIA_TOOLS_DATA: top-level directory under which the Gaia data will be stored.

  • LASMOT_DR5_DATA: top-level directory under which the LAMOST DR5 data will be stored.

How to set environment variables on different operating systems: Guide here
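
Before loading any survey data, you can check from Python which of these variables are visible to the current process. This is a minimal standard-library sketch; unset_data_env_vars is a hypothetical helper, and the variable names are copied verbatim from the list above.

```python
import os

# Names copied verbatim from the list of environment variables above.
DATA_ENV_VARS = ("SDSS_LOCAL_SAS_MIRROR", "GAIA_TOOLS_DATA", "LASMOT_DR5_DATA")


def unset_data_env_vars(environ=None):
    """Return the names of the data environment variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in DATA_ENV_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = unset_data_env_vars()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All data environment variables are set")
```

The optional environ argument accepts any mapping, which makes the check easy to try out without touching the real environment.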

$SDSS_LOCAL_SAS_MIRROR/
├── dr14/
│   ├── apogee/spectro/redux/r8/stars/
│   │   ├── apo25m/
│   │   │   ├── 4102/
│   │   │   │   ├── apStar-r8-2M21353892+4229507.fits
│   │   │   │   ├── apStar-r8-**********+*******.fits
│   │   │   │   └── ****/
│   │   ├── apo1m/
│   │   │   ├── hip/
│   │   │   │   ├── apStar-r8-2M00003088+5933348.fits
│   │   │   │   ├── apStar-r8-**********+*******.fits
│   │   │   │   └── ***/
│   │   ├── l31c/l31c.2/
│   │   │   ├── allStar-l30e.2.fits
│   │   │   ├── allVisit-l30e.2.fits
│   │   │   ├── 4102/
│   │   │   │   ├── aspcapStar-r8-l30e.2-2M21353892+4229507.fits
│   │   │   │   ├── aspcapStar-r8-l30e.2-**********+*******.fits
│   │   │   │   └── ****/
│   │   │   └── Cannon/
│   │   │       └── allStarCannon-l31c.2.fits
└── dr13/
    └── *similar to dr14 above/*


$GAIA_TOOLS_DATA/
└── Gaia/
    ├── gdr1/tgas_source/fits/
    │   ├── TgasSource_000-000-000.fits
    │   ├── TgasSource_000-000-001.fits
    │   └── ***.fits
    └── gdr2/gaia_source_with_rv/fits/
        ├── GaiaSource_2851858288640_1584379458008952960.fits
        ├── GaiaSource_1584380076484244352_2200921635402776448.fits
        └── ***.fits

$LASMOT_DR5_DATA/
└── DR5/
    ├── LAMO5_2MS_AP9_SD14_UC4_PS1_AW_Carlin_M.fits
    ├── 20111024
    │   ├── F5902
    │   │   ├──spec-55859-F5902_sp01-001.fits.gz
    │   │   └── ****.fits.gz
    │   └── ***/
    ├── 20111025
    │   ├── B6001
    │   │   ├──spec-55860-B6001_sp01-001.fits.gz
    │   │   └── ****.fits.gz
    │   └── ***/
    └── ***/
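
As an illustration of how the DR14 tree above maps to a concrete file location, the sketch below assembles the path of an apStar file from its location folder and APOGEE ID. It only mirrors the layout shown above; dr14_apstar_path is a hypothetical helper, not an astroNN function, and astroNN's own routines may resolve paths differently.

```python
from pathlib import PurePosixPath


def dr14_apstar_path(sas_root, location_id, apogee_id, telescope="apo25m"):
    """Build the apStar file path implied by the DR14 tree shown above."""
    return (
        PurePosixPath(sas_root)
        / "dr14" / "apogee" / "spectro" / "redux" / "r8" / "stars"
        / telescope
        / str(location_id)
        / f"apStar-r8-{apogee_id}.fits"
    )


if __name__ == "__main__":
    # "/data/sas" stands in for the value of $SDSS_LOCAL_SAS_MIRROR.
    print(dr14_apstar_path("/data/sas", 4102, "2M21353892+4229507"))
```

PurePosixPath is used here so that the result looks the same on every operating system; pathlib.Path would be the natural choice when the path is actually opened on the local machine.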

Note

The APOGEE and Gaia folder structure should be consistent with the APOGEE and gaia_tools Python packages by Jo Bovy, which are tools for dealing with APOGEE and Gaia data.

A dedicated project folder is recommended for running astroNN. Always run astroNN from the root of the project folder, so that astroNN creates the folder for every neural network you run in the same place, as shown below:

_images/astronn_master_folder.PNG