We present LBW-Net, an efficient optimization-based method for quantization and training of low bit-width convolutional neural networks (CNNs). Specifically, we quantize the weights to zero or powers of 2 by minimizing the Euclidean distance between full-precision weights and quantized weights during backpropagation (weight learning). We characterize the combinatorial nature of the low bit-width quantization problem. For 2-bit (ternary) CNNs, the quantization of N weights can be done by an exact formula in O(N log N) complexity. When the bit-width is 3 or above, we further propose a semi-analytical thresholding scheme with a single free parameter for quantization that is computationally inexpensive. The free parameter is then determined by network retraining and object detection tests. LBW-Net has several desirable advantages over full-precision CNNs, including considerable memory savings, energy efficiency, and faster deployment. Our experiments on the PASCAL VOC dataset show that, compared with its 32-bit floating-point counterpart, the 6-bit LBW-Net is nearly lossless in object detection tasks, and can even perform better in real-world visual scenes, while empirically enjoying more than 4x faster deployment.
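To make the ternary case concrete, below is a minimal NumPy sketch of an exact O(N log N) projection of a weight vector W onto the ternary set, i.e. solving min over alpha > 0 and Q in {-1, 0, +1}^N of ||W - alpha*Q||^2 by sorting magnitudes. It illustrates the kind of sort-based closed-form quantizer the abstract refers to; the paper's exact formula may differ in detail, and the function name ternary_project is ours.

```python
import numpy as np

def ternary_project(w):
    """Exact projection of w onto {-alpha, 0, +alpha}^N in O(N log N).

    For a fixed support of size t, the optimal scale alpha is the mean of
    the t largest magnitudes, and the residual distortion is minimized by
    maximizing (sum of top-t magnitudes)^2 / t over t.
    """
    idx = np.argsort(-np.abs(w))            # indices by descending magnitude
    a = np.abs(w)[idx]                      # sorted magnitudes
    csum = np.cumsum(a)                     # prefix sums of magnitudes
    t = np.arange(1, len(a) + 1)
    scores = csum ** 2 / t                  # objective gain for support size t
    t_star = int(np.argmax(scores)) + 1     # optimal support size
    alpha = csum[t_star - 1] / t_star       # optimal scale: mean of top-t* |w_i|
    q = np.zeros_like(w)
    sel = idx[:t_star]
    q[sel] = np.sign(w[sel])                # keep signs of the t* largest entries
    return alpha * q, alpha

# Usage: quantize a random weight vector and inspect the distortion.
w = np.random.randn(1000).astype(np.float32)
w_q, alpha = ternary_project(w)
print(alpha, np.linalg.norm(w - w_q))
```

For bit-widths of 3 and above, the codebook contains several powers of 2 and the joint choice of scale and code assignments becomes combinatorial, which is what motivates the paper's semi-analytical thresholding scheme with a single free parameter.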
Supported by NSF grants DMS-1522383 and IIS-1632935, and ONR grant N00014-16-1-2157.
Penghang Yin, Email: email@example.com; Shuai Zhang, Email: firstname.lastname@example.org; Yingyong Qi, Email: email@example.com; Jack Xin, Email: firstname.lastname@example.org.