TY - JOUR
T1 - Filter pruning by image channel reduction in pre-trained convolutional neural networks
AU - Chung, Gi Su
AU - Won, Chee Sun
N1 - Publisher Copyright:
© 2020, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2021/8
Y1 - 2021/8
N2 - There are domain-specific image classification problems such as facial emotion and house-number classifications, where the color information in the images may not be crucial for recognition. This motivates us to convert RGB images to gray-scale ones with a single Y channel to be fed into the pre-trained convolutional neural networks (CNN). Now, since the existing CNN models are pre-trained by three-channel color images, one can expect that some trained filters are more sensitive to colors than brightness. Therefore, adopting the single-channel gray-scale images as inputs, we can prune out some of the convolutional filters in the first layer of the pre-trained CNN. This first-layer pruning greatly facilitates the filter compression of the subsequent convolutional layers. Now, the pre-trained CNN with the compressed filters is fine-tuned with the single-channel images for a domain-specific dataset. Experimental results on the facial emotion and Street View House Numbers (SVHN) datasets show that we can achieve a significant compression of the pre-trained CNN filters by the proposed method. For example, compared with the fine-tuned VGG-16 model by color images, we can save 10.538 GFLOPs computations, while keeping the classification accuracy around 84% for the facial emotion RAF-DB dataset.
AB - There are domain-specific image classification problems such as facial emotion and house-number classifications, where the color information in the images may not be crucial for recognition. This motivates us to convert RGB images to gray-scale ones with a single Y channel to be fed into the pre-trained convolutional neural networks (CNN). Now, since the existing CNN models are pre-trained by three-channel color images, one can expect that some trained filters are more sensitive to colors than brightness. Therefore, adopting the single-channel gray-scale images as inputs, we can prune out some of the convolutional filters in the first layer of the pre-trained CNN. This first-layer pruning greatly facilitates the filter compression of the subsequent convolutional layers. Now, the pre-trained CNN with the compressed filters is fine-tuned with the single-channel images for a domain-specific dataset. Experimental results on the facial emotion and Street View House Numbers (SVHN) datasets show that we can achieve a significant compression of the pre-trained CNN filters by the proposed method. For example, compared with the fine-tuned VGG-16 model by color images, we can save 10.538 GFLOPs computations, while keeping the classification accuracy around 84% for the facial emotion RAF-DB dataset.
KW - CNN filter compression
KW - Facial emotion classification
KW - Image channel reduction
KW - Network pruning
UR - http://www.scopus.com/inward/record.url?scp=85088519888&partnerID=8YFLogxK
U2 - 10.1007/s11042-020-09373-9
DO - 10.1007/s11042-020-09373-9
M3 - Article
AN - SCOPUS:85088519888
SN - 1380-7501
VL - 80
SP - 30817
EP - 30826
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 20
ER -