We extract features from the processed images using different pre-trained models and train a corresponding shallow network on each set of features. We sample the dataset without replacement into two buckets: 80\% for training and 20\% for testing. For training, we use the Adam optimization algorithm \citealt{kingma2014adam} with an initial learning rate (\(\alpha\)) of \(10^{-4}\). The training dataset is enlarged several-fold by image augmentation, as described in Section \ref{717120}. The trained model is evaluated on validation data generated on the fly, and the reported results are obtained on the independent test dataset.
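
The split and the optimizer step described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names \texttt{split\_indices} and \texttt{adam\_step}, the fixed seed, and the dataset size of 1000 are hypothetical, and the values \(\beta_1 = 0.9\), \(\beta_2 = 0.999\), \(\epsilon = 10^{-8}\) are the defaults suggested in the Adam paper, assumed here because the text only specifies \(\alpha = 10^{-4}\).

```python
import math
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Cut shuffled indices into disjoint train/test buckets.

    Shuffling once and slicing is sampling without replacement:
    every index lands in exactly one bucket.
    The seed is an assumption, used only for reproducibility.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(round(train_fraction * n_samples))
    return indices[:n_train], indices[n_train:]


def adam_step(theta, grad, m, v, t, alpha=1e-4,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    alpha = 1e-4 matches the learning rate stated in the text;
    beta1, beta2 and eps are assumed paper defaults.
    t is the 1-based step counter.
    """
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Hypothetical dataset of 1000 images: 800 for training, 200 for testing.
train_idx, test_idx = split_indices(1000)

# A single update from theta = 1.0 with gradient 2.0.
theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
```

On the first step the bias-corrected moments reduce to the gradient and its square, so the parameter moves by roughly \(\alpha\) in the direction opposite to the gradient, regardless of the gradient's magnitude.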