# multi-task-learning

**Repository Path**: hackerTeam2019/multi-task-learning

## Basic Information

- **Project Name**: multi-task-learning
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2019-10-09
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Smile detection, gender and age estimation using Multi-task Learning

In the original paper [DEX: Deep EXpectation](https://www.vision.ee.ethz.ch/en/publications/papers/proceedings/eth_biwi_01229.pdf), the authors achieved remarkable results in estimating a person's age from a single image. Let's see how much accuracy we can get (not much, I guess) with limited resources and a self-designed architecture. Beyond age, we also classify gender and smile using a multi-task training technique.

## Dependencies

* Python 3.x
* OpenCV 2
* tensorflow
* numpy
* pandas
* matplotlib
* scipy
* seaborn
* [MTCNN](https://github.com/ipazc/mtcnn) for face detection

## Usage

### Download datasets

First download the [GENKI4K](https://github.com/ipazc/mtcnn) and [IMDB-WIKI](https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/) (cropped face version) datasets.

### Preprocess data

First run [imdb-preprocess.ipynb](./imdb-preprocess.ipynb). This step preprocesses the IMDB-WIKI dataset, filtering out images that contain more than one face, have a NaN gender, or have an invalid age. The result is saved in a pickle file.

Next run [prepare-data.ipynb](./prepare-data.ipynb). This splits both datasets into training and test parts. The IMDB-WIKI dataset is split into two separate datasets, one for gender and one for age, so we end up with 3 datasets: smile, gender, and age. Each image is run through the MTCNN network to detect and crop the face, resized to 48x48, and converted to grayscale. All datasets are then saved to pickle files.
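The three filters applied to the IMDB-WIKI metadata can be illustrated with a minimal sketch. It is not the code in `imdb-preprocess.ipynb`: the record layout, the `is_valid_record` helper, and the 0 to 100 age range are assumptions made for illustration, and a non-NaN `second_face_score` is assumed to mean that a second face was detected.

```python
import math

def is_valid_record(record, min_age=0, max_age=100):
    """Return True if a metadata record passes all three filters.

    The field names and the age range are illustrative assumptions,
    not values taken from the notebook.
    """
    # More than one face: a second detection score is present (not NaN).
    if not math.isnan(record["second_face_score"]):
        return False
    # Missing gender label.
    if math.isnan(record["gender"]):
        return False
    # Invalid age: outside the assumed plausible range.
    return min_age <= record["age"] <= max_age

# Hypothetical records, one per rejection reason plus one valid record.
nan = float("nan")
records = [
    {"path": "a.jpg", "gender": 1.0, "age": 31, "second_face_score": nan},
    {"path": "b.jpg", "gender": nan, "age": 45, "second_face_score": nan},
    {"path": "c.jpg", "gender": 0.0, "age": 27, "second_face_score": 2.4},
    {"path": "d.jpg", "gender": 0.0, "age": 154, "second_face_score": nan},
]

kept = [r["path"] for r in records if is_valid_record(r)]
print(kept)  # only a.jpg passes all three filters
```

Keeping the checks in one predicate makes it easy to count how many images each rule removes before the result is pickled.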
### Training

Run [CNN2Head_train.ipynb](./CNN2Head_train.ipynb), which uses exponential decay to reduce the learning rate over time. Set your dataset folder paths and training parameters in [const.py](./const.py). TensorBoard logs are saved in the `summary` folder, and the training result of each batch is written to [log.csv](./log.csv).

```python
WEIGHT_INIT = 0.01
NUM_TASKS = 3
IMG_SIZE = 48
INIT_LR = 0.01
USE_BN = True
BN_DECAY = 0.99
EPSILON = 0.001
WEIGHT_DECAY = 0.01
DECAY_STEP = 1000
DECAY_LR_RATE = 0.95
BATCH_SIZE = 128
USE_GPU = True
SAVE_FOLDER = './save/current/'
NUM_EPOCHS = 2000
DROP_OUT_PROB = 0.5
```

### Evaluation

Run [evaluate.ipynb](./evaluate.ipynb) to evaluate the trained model on the test datasets.

### Testing

For a single image, see [test.ipynb](./test.ipynb) for more examples.

![demo single file](./screenshots/predict.png)

For a webcam demo, run [demo.py](./demo.py).
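As a note on the training configuration: the effect of `INIT_LR`, `DECAY_STEP` and `DECAY_LR_RATE` on the learning rate can be sketched in plain Python. This assumes the standard exponential decay formula, `lr = INIT_LR * DECAY_LR_RATE ** (step / DECAY_STEP)`. Whether the notebook uses the continuous or the staircase variant is not stated here, so both are shown, and `decayed_learning_rate` is a hypothetical helper rather than a function from this repository.

```python
# Values copied from const.py as listed in the Training section.
INIT_LR = 0.01
DECAY_STEP = 1000
DECAY_LR_RATE = 0.95

def decayed_learning_rate(global_step, staircase=False):
    """Learning rate after `global_step` training steps.

    With staircase=True the exponent is truncated to an integer, so the
    rate drops once every DECAY_STEP steps instead of continuously.
    """
    if staircase:
        exponent = global_step // DECAY_STEP
    else:
        exponent = global_step / DECAY_STEP
    return INIT_LR * DECAY_LR_RATE ** exponent

# Print the schedule at a few steps for both variants.
for step in (0, 500, 1000, 5000, 20000):
    print(step, decayed_learning_rate(step), decayed_learning_rate(step, staircase=True))
```

With these constants the rate is multiplied by 0.95 every 1000 steps, so it shrinks slowly compared with the initial value of 0.01.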
### TODO

* Add augmentation to balance labels and reduce overfitting
* Improve accuracy

### References and Acknowledgments

This code is part of my GR project at HUST University. The following papers and code were consulted:

1. Dinh Viet Sang, Le Tran Bao Cuong, and Do Phan Thuan. 2017. Facial Smile Detection Using Convolutional Neural Networks. In The 9th International Conference on Knowledge and Systems Engineering (KSE 2017). 138–143.
2. Dinh Viet Sang, Le Tran Bao Cuong, and Pham Thai Ha. Multi-task learning for smile detection, emotion recognition and gender classification. December 2017.
3. [IMDB-WIKI: trying a small model for age classification](https://surfertas.github.io/deeplearning/2017/04/18/imdbwiki.html)
4. [MultiTask-MergeDataset - Le Tran Bao Cuong](https://gitlab.com/ltbclqd2805/MultiTask-MergeDataset)