SphereFace+ is released under the MIT License (refer to the LICENSE file for details).
Motivated by the observation that the classifier weights represent the centers of their respective classes, we propose SphereFace+, which applies Minimum Hyperspherical Energy (MHE) to SphereFace to effectively enhance inter-class feature separability. Our experiments verify that MHE improves inter-class feature separability and further boosts SphereFace's performance on face recognition. Our paper is available at arXiv (SphereFace+ is described in Section 5.2 of the main paper).
As stated in our paper, SphereFace+ uses a mini-batch approximation of the original MHE loss (see Section 3.6 in the paper) to reduce the cost of computing pairwise similarities (i.e., kernels) among the large number of classifiers in the final output layer.
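To illustrate the idea, here is a minimal NumPy sketch (not the repo's Caffe implementation) of a mini-batch MHE regularizer: only the classes that appear in the current mini-batch contribute energy terms, each paired against every other classifier weight. The Euclidean `1/x**s` kernel and the `eps` constant are assumptions made for this sketch; the paper also studies logarithmic and angular variants.

```python
import numpy as np

def mhe_minibatch(W, batch_labels, s=2, eps=1e-4):
    """Mini-batch approximation of Minimum Hyperspherical Energy.
    Instead of summing the pairwise energy over all C*(C-1)/2 classifier
    pairs, only the classes present in the current mini-batch contribute,
    each paired against every other classifier.
    W: (C, d) classifier weight matrix; batch_labels: labels in the batch.
    """
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)   # project onto the sphere
    energy = 0.0
    labels = np.unique(batch_labels)
    for y in labels:
        d = np.linalg.norm(Wn[y] - Wn, axis=1)          # distances to all classes
        d = np.delete(d, y)                             # drop the self-pair
        energy += np.mean(1.0 / (d ** s + eps))         # assumed Euclidean kernel
    return energy / len(labels)
```

Minimizing this term pushes the classifier weights apart on the hypersphere: nearly collinear weights yield a much larger energy than well-separated ones.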
If you find SphereFace+ useful in your research, please consider citing the following paper:
@InProceedings{LiuNIPS18,
title={Learning towards Minimum Hyperspherical Energy},
author={Liu, Weiyang and Lin, Rongmei and Liu, Zhen and Liu, Lixin and Yu, Zhiding and Dai, Bo and Song, Le},
booktitle={NIPS},
year={2018}
}
and the original SphereFace:
@InProceedings{Liu2017CVPR,
title = {SphereFace: Deep Hypersphere Embedding for Face Recognition},
author = {Liu, Weiyang and Wen, Yandong and Yu, Zhiding and Li, Ming and Raj, Bhiksha and Song, Le},
booktitle = {CVPR},
year = {2017}
}
- Matlab
- Caffe and matcaffe (see: Caffe installation instructions)
- MTCNN (see: MTCNN - face detection & alignment) and Pdollar toolbox (see: Piotr's Image & Video Matlab Toolbox)

Attention: if you use other CUDA or cuDNN versions, the training process may fail frequently.
Recursively clone the SphereFace-Plus repository. We'll call the directory that you cloned SphereFace-Plus into $SPHEREFACE_PLUS_ROOT. The installation basically follows SphereFace.
Build Caffe and matcaffe
cd $SPHEREFACE_PLUS_ROOT/tools/caffe-sphereface
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
make all -j8 && make matcaffe
If you have any questions about installing Caffe with cuDNN 6.0, refer to Caffe issue #1325.
After successfully completing the installation, you are ready to run all the following experiments.
Preprocessing is the same as in SphereFace.
Note: In this part, we assume you are in the directory $SPHEREFACE_PLUS_ROOT/preprocess/
Download the training set (CASIA-WebFace) and the test set (LFW) and place them in data/.
mv /your_path/CASIA_WebFace data/
./code/get_lfw.sh
tar xvf data/lfw.tgz -C data/
Please make sure that the directory data/ contains both datasets.
Detect faces and facial landmarks in the CASIA-WebFace and LFW datasets using MTCNN (see: MTCNN - face detection & alignment).
# In Matlab Command Window
run code/face_detect_demo.m
This will create a file dataList.mat in the directory result/.
Align faces to a canonical pose using similarity transformation.
# In Matlab Command Window
run code/face_align_demo.m
This will create two folders (CASIA-WebFace-112X96/ and lfw-112X96/) in the directory result/, containing the aligned face images.
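The alignment step warps each face so that its five MTCNN landmarks line up with a fixed 112x96 template. As a rough illustration (a NumPy sketch of the idea, not the Matlab cp2tform call the repo actually uses), a 2-D similarity transform can be fitted by least squares; the template coordinates below are the canonical SphereFace values, and the detected landmarks are hypothetical.

```python
import numpy as np

# Canonical 112x96 landmark template used by SphereFace (left eye, right eye,
# nose tip, left mouth corner, right mouth corner).
TEMPLATE = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041],
])

def estimate_similarity(src, dst):
    """Least-squares similarity transform (scale + rotation + translation)
    mapping src points onto dst points. Returns a 2x3 affine matrix.
    Model: x' = a*x - b*y + tx,  y' = b*x + a*y + ty."""
    n = src.shape[0]
    A = np.zeros((2 * n, 4))
    rhs = dst.reshape(-1)                        # [x1', y1', x2', y2', ...]
    A[0::2, 0] = src[:, 0]; A[0::2, 1] = -src[:, 1]; A[0::2, 2] = 1
    A[1::2, 0] = src[:, 1]; A[1::2, 1] =  src[:, 0]; A[1::2, 3] = 1
    a, b, tx, ty = np.linalg.lstsq(A, rhs, rcond=None)[0]
    return np.array([[a, -b, tx],
                     [b,  a, ty]])

# Hypothetical 5-point landmarks detected by MTCNN for one face.
landmarks = np.array([[103.0, 115.0], [148.0, 113.0], [125.0, 140.0],
                      [107.0, 165.0], [145.0, 163.0]])
M = estimate_similarity(landmarks, TEMPLATE)
# M could then be passed to an affine warp, e.g. cv2.warpAffine(img, M, (96, 112)).
```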
Note: In this part, we assume you are in the directory $SPHEREFACE_PLUS_ROOT/train/
Get a list of training images and labels.
mv ../preprocess/result/CASIA-WebFace-112X96 data/
# In Matlab Command Window
run code/get_list.m
The aligned face images in the folder **CASIA-WebFace-112X96/** are moved from the preprocess folder to the train folder. A list CASIA-WebFace-112X96.txt is created in the directory data/ for the subsequent training.
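Conceptually, the list generation walks the identity folders and writes one "<image path> <integer label>" line per image. A hypothetical Python equivalent (the folder layout, one sub-folder per identity, is assumed; `get_list.m` is the authoritative version):

```python
import os

def write_list(root, out_path):
    """Write one '<image path> <label>' line per image, assigning one
    integer label per identity folder (a sketch of what get_list.m does)."""
    with open(out_path, "w") as f:
        for label, ident in enumerate(sorted(os.listdir(root))):
            ident_dir = os.path.join(root, ident)
            for img in sorted(os.listdir(ident_dir)):
                f.write(f"{os.path.join(ident_dir, img)} {label}\n")
```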
Download all pretrained models from Google Drive | BaiduYunDisk and move them into $SPHEREFACE_PLUS_ROOT/train/pretrained_model/. We initialize our network with these pretrained models to better compute inter-class distances.
Pretrained Models | Single | Double | Triple | Quadruple |
---|---|---|---|---|
ACC | 96.22% | 98.87% | 98.93% | 99.27% |
Train the SphereFace+ model.
For m = 4
bash train_sfplus.sh
We use 2 GPUs for training. If you want to use only one GPU, set iter_size: 2 in code/sfplus/sfplus_solver.prototxt and edit train_sfplus.sh accordingly. After training, a model sfplus_model_iter_8000.caffemodel and a corresponding log file sfplus_train.log are placed in the directory result/.
For m = 1
bash train_m_single.sh
For m = 2
bash train_m_double.sh
For m = 3
bash train_m_triple.sh
See more training details in the Training Notes.
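The iter_size setting works because Caffe accumulates gradients over iter_size forward/backward passes before each weight update, so halving the GPU count while doubling iter_size preserves the effective batch size. A quick sanity check of the arithmetic (the per-GPU batch size used here is hypothetical):

```python
def effective_batch(batch_per_gpu, iter_size, num_gpus):
    """Caffe accumulates gradients for iter_size iterations before each
    solver update, so the effective batch size is the product below."""
    return batch_per_gpu * iter_size * num_gpus

# 2 GPUs with iter_size = 1  vs  1 GPU with iter_size = 2: same effective batch.
assert effective_batch(128, 1, 2) == effective_batch(128, 2, 1)
```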
Note: In this part, we assume you are in the directory $SPHEREFACE_PLUS_ROOT/test/
Get the pair list of LFW (view 2).
mv ../preprocess/result/lfw-112X96 data/
./code/get_pairs.sh
Make sure that the LFW dataset and pairs.txt are in the directory data/.
Extract deep features and test on LFW.
matlab -nodisplay -nodesktop -r evaluation
Finally we get the accuracy on LFW.
Attention: You can also test sfplus_model_iter_7000.caffemodel by changing test/code/evaluation.m.
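Conceptually, the evaluation scores each LFW pair by the cosine similarity of the two deep features and reports 10-fold cross-validated accuracy over the view-2 pairs: for each fold, the threshold is chosen on the other nine folds and applied to the held-out fold. A simplified Python sketch of that protocol (not the repo's Matlab code; fold handling is simplified):

```python
import numpy as np

def cosine_score(f1, f2):
    """Cosine similarity between two deep feature vectors."""
    return float(f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2)))

def tenfold_accuracy(scores, same, n_folds=10):
    """LFW view-2 style protocol: for each fold, pick the threshold that is
    best on the other nine folds, then measure accuracy on the held-out fold."""
    scores, same = np.asarray(scores), np.asarray(same)
    folds = np.array_split(np.arange(len(scores)), n_folds)
    accs = []
    for fold in folds:
        mask = np.ones(len(scores), bool)
        mask[fold] = False                      # train = the other nine folds
        thresholds = np.unique(scores[mask])
        best_t = max(thresholds,
                     key=lambda t: np.mean((scores[mask] > t) == same[mask]))
        accs.append(np.mean((scores[fold] > best_t) == same[fold]))
    return float(np.mean(accs))
```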
For m = 4, we go through the entire pipeline 10 times. The accuracies on LFW are shown below, and we release model #7.
Experiment | #1 | #2 | #3 | #4 | #5 | #6 | #7(released) | #8 | #9 | #10 |
---|---|---|---|---|---|---|---|---|---|---|
ACC | 99.23% | 99.30% | 99.25% | 99.28% | 99.20% | 99.27% | 99.35% | 99.18% | 99.33% | 99.28% |
Released Training Log & Model File Google Drive | BaiduYunDisk
For m = 1, we go through the entire pipeline 5 times. The accuracies on LFW are shown below, and we release model #3.
Experiment | #1 | #2 | #3(released) | #4 | #5 |
---|---|---|---|---|---|
ACC | 97.48% | 97.32% | 97.48% | 97.18% | 97.53% |
Released Training Log & Model File Google Drive | BaiduYunDisk
For m = 2, we go through the entire pipeline 8 times. The accuracies on LFW are shown below, and we release model #3.
Experiment | #1 | #2 | #3(released) | #4 | #5 | #6 | #7 | #8 |
---|---|---|---|---|---|---|---|---|
ACC | 98.95% | 98.98% | 99.05% | 99.08% | 98.90% | 99.02% | 99.05% | 98.83% |
Released Training Log & Model File Google Drive | BaiduYunDisk
For m = 3, we go through the entire pipeline 8 times. The accuracies on LFW are shown below, and we release model #5.
Experiment | #1 | #2 | #3 | #4 | #5(released) | #6 | #7 | #8 |
---|---|---|---|---|---|---|---|---|
ACC | 98.93% | 99.05% | 99.08% | 99.05% | 99.08% | 98.90% | 99.13% | 99.00% |
Released Training Log & Model File Google Drive | BaiduYunDisk
All models can be found on Google Drive | BaiduYunDisk.
Pretraining is a very effective way to ease training.
As one can see from our implementation, we use the pretrained model from the original SphereFace and finetune it with the new SphereFace+ loss. This effectively reduces the training difficulty of the new loss and consistently improves the results.
Finetuning the CASIA-pretrained model on new datasets can further stabilize training.
When using our model on a new dataset, consider finetuning the CASIA-trained models on that dataset.
Lixin Liu and Weiyang Liu
Questions can also be left as issues in the repository. We will be happy to answer them.