What type of star is it? — on cAInvas

Identify the type of star using its characteristics and neural networks.

Photo by Alex Kunchevsky for OUTLΛNE on Dribbble

Classification of stars based on their characteristics is called stellar classification.

Here we classify them into 6 classes — Brown Dwarf, Red Dwarf, White Dwarf, Main Sequence, Supergiant, Hypergiant.

Implementation of the idea on cAInvas — here!

Dataset

On Kaggle by Deepraj Baidya | Github

The dataset covers 240 stars and took 3 weeks to put together, with most of the values collected from the web. Missing values were calculated manually using equations from astrophysics.

The dataset is a CSV file with characteristics of a star, such as luminosity, temperature, colour, and radius, that help classify it into one of the 6 classes listed above.

There are 6 input attributes and 1 target column.

Snapshot of the dataset

Preprocessing

Looking into the values of the ‘Star color’ column —

‘Star color’ column values

There are many shades of colour mentioned, some similar, like Yellowish White and White-Yellow, and several different spellings of blue-white.

We can identify 5 basic colours in the given list: blue, white, yellow, orange, red. Let’s rewrite the column as 5 binary columns, one per basic colour, so that a value like Blue-white sets both the blue and white columns (a multilabel encoding).

https://gist.github.com/AyishaR/75d9511272d2b57c7332704fc48f8f44
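The gist above holds the notebook’s code; a minimal sketch of the idea, assuming the dataframe is loaded as df and the column is named ‘Star color’ as in the snapshot, could look like this:

import pandas as pd

# the 5 basic colours identified above
basic_colors = ['blue', 'white', 'yellow', 'orange', 'red']

for color in basic_colors:
    # 1 if the (lowercased) value mentions this basic colour, else 0;
    # 'Blue-white' therefore sets both the blue and white columns
    df[color] = df['Star color'].str.lower().str.contains(color).astype(int)

df = df.drop(columns=['Star color'])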

Now, the ‘Spectral Class’ column —

‘Spectral Class’ column values

The values in this column do not have any ordinal relationship, so they are one hot encoded using the get_dummies() function of the pandas library, with the drop_first parameter set to True. This means that if there are n categories in the column, n-1 indicator columns are returned instead of n: each value becomes an array of n-1 entries, where the first category is encoded as all 0s and every other category has a single 1 in the column corresponding to it.

https://gist.github.com/AyishaR/d00424356dafbb21fc84c0ca7bdc6166
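As a sketch of this step (the actual code is in the gist), with the column name taken from the snapshot above:

# n spectral classes -> n-1 indicator columns; the first class becomes all 0s
spectral = pd.get_dummies(df['Spectral Class'], prefix='Spectral', drop_first=True)
df = pd.concat([df.drop(columns=['Spectral Class']), spectral], axis=1)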

The target column has 6 class values that, again, do not have any ordinal relationship. They are one hot encoded using the get_dummies() function of the pandas library, this time with drop_first set to False so that all 6 columns are kept.

https://gist.github.com/AyishaR/53648c99ccf0cf75bdfcceb855a66b43
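A sketch of this step, assuming the target column is named ‘Star type’ as in the Kaggle dataset:

# keep all 6 indicator columns for the target
y = pd.get_dummies(df['Star type'], drop_first=False)
X = df.drop(columns=['Star type'])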

Let’s look at the summary statistics of the dataset, and the inferences they support, using the df.describe() function.

Snapshot of df.describe()

The ranges of values and the standard deviations of the attributes differ widely across the dataset. This can result in uneven weightage of the attributes while training the model, with larger-valued attributes dominating.

The dataset is split into train, validation, and test sets before standardization (mean = 0, sd = 1), so that statistics of the validation and test data do not leak into the training set.

https://gist.github.com/AyishaR/b39c186e82ee066d7cf5e00a4dc06261
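A sketch of an 80-10-10 split that yields the sample counts below, using sklearn’s train_test_split (the random seed is an assumption):

from sklearn.model_selection import train_test_split

# 80% train, then split the remaining 20% equally into validation and test
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.2, random_state=13)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=13)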

The train set has 192 samples; the validation and test sets have 24 samples each.

The StandardScaler class of the sklearn.preprocessing module is used to implement this. The instance is fit on the training data and then used to transform the train, validation, and test data.

https://gist.github.com/AyishaR/fe0079a79d363bb80d96895a2a915605
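Sketched out, with the splits from above:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)   # fit on the train split only
X_val = scaler.transform(X_val)           # reuse the train statistics
X_test = scaler.transform(X_test)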

The model

The model is a simple one with 3 Dense layers: 2 with ReLU activation functions, and a final layer with a softmax activation that outputs probability values for the 6 classes (non-negative values summing to 1).

As it is a classification problem where the targets are one-hot encoded, the model is compiled using the categorical cross-entropy loss function. The Adam optimizer is used and the accuracy of the model is tracked over epochs.

The EarlyStopping callback of the keras.callbacks module monitors the validation loss and stops training if it does not decrease for 5 consecutive epochs. Setting the restore_best_weights parameter to True ensures that the weights from the epoch with the lowest validation loss are restored to the model.

https://gist.github.com/AyishaR/c10ed90f1fdf1d9c5611b0b9a3b6bab0
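A sketch of the setup described above; the hidden-layer widths and epoch count are assumptions, not the notebook’s exact values:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(32, activation='relu'),    # width is an assumption
    layers.Dense(16, activation='relu'),    # width is an assumption
    layers.Dense(6, activation='softmax'),  # probabilities for the 6 classes
])

model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

cb = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5,
                                   restore_best_weights=True)

history = model.fit(X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=256, callbacks=[cb])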

The model was trained with a learning rate of 0.001 and achieved ~100% accuracy on the test set.

Test set evaluation

The metrics

Plot of the losses
Plot of the accuracies

Prediction

Defining the classes according to their array indices —

Class name definition
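Sketched as a list, in the 0-5 label order used by the Kaggle dataset:

class_names = ['Brown Dwarf', 'Red Dwarf', 'White Dwarf',
               'Main Sequence', 'Supergiant', 'Hypergiant']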

Let’s perform predictions on random test data samples —

Prediction on a random test data sample
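A sketch of such a prediction, using the class_names list defined above:

import numpy as np

i = np.random.randint(len(X_test))        # pick a random test sample
probs = model.predict(X_test[i:i+1])[0]   # softmax output for that sample
print('Predicted:', class_names[np.argmax(probs)])
print('Actual:   ', class_names[np.argmax(np.asarray(y_test)[i])])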

deepC

The deepC library, compiler, and inference framework is designed to enable and run deep learning neural networks on small form-factor devices such as micro-controllers, eFPGAs, CPUs, and other embedded devices like the Raspberry Pi, Odroid, Arduino, SparkFun Edge, RISC-V boards, mobile phones, and x86 and ARM laptops, among others.

Compiling the model using deepC —

Code: deepC compilation
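On cAInvas, this is typically a single invocation of the deepCC compiler on the saved model; the filename here is an assumption:

# save the trained Keras model, then compile it with deepCC
model.save('star_classification.h5')

# in the cAInvas notebook: !deepCC star_classification.h5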

Head over to the cAInvas platform (link to the notebook given earlier) and check out the predictions made by the compiled .exe file!

Credits: Ayisha D