TR2017-140

Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks

- Zhang, Z., Brand, M., "Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks", Advances in Neural Information Processing Systems (NIPS), December 2017.
  @inproceedings{Ziming2017dec,
  author = {Zhang, Ziming and Brand, Matthew},
  title = {Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks},
  booktitle = {Advances in Neural Information Processing Systems (NIPS)},
  year = 2017,
  month = dec,
  url = {https://www.merl.com/publications/TR2017-140}
  }
MERL Contact:
- Matthew Brand
Research Areas:

Artificial Intelligence, Computer Vision, Machine Learning

Abstract:

By lifting the ReLU function into a higher dimensional space, we develop a smooth multi-convex formulation for training feed-forward deep neural networks (DNNs). This allows us to develop a block coordinate descent (BCD) training algorithm consisting of a sequence of numerically well-behaved convex optimizations. Using ideas from proximal point methods in convex analysis, we prove that this BCD algorithm will converge globally to a stationary point with R-linear convergence rate of order one. In experiments with the MNIST database, DNNs trained with this BCD algorithm consistently yielded better test-set error rates than identical DNN architectures trained via all the stochastic gradient descent (SGD) variants in the Caffe toolbox.
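The abstract's two main ideas can be illustrated with a toy sketch. It is not the paper's algorithm or its Tikhonov formulation. It relies on one standard fact, that ReLU is the projection onto the nonnegative numbers, i.e. max(0, x) = argmin over u >= 0 of (u - x)^2; whether this is exactly the lifting the authors use is an assumption here. The objective, the ridge weight `lam`, and every function name below are hypothetical choices made for illustration. Each block update is an exact convex minimization, so the objective cannot increase from one step to the next.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def lifted_relu_bruteforce(x):
    # Brute-force argmin over a grid of u >= 0 of (u - x)^2 for a scalar x.
    # Illustrates that ReLU can be read as a constrained convex problem.
    grid = np.linspace(0.0, 5.0, 5001)
    return grid[np.argmin((grid - x) ** 2)]

def objective(W, U, X, T, lam):
    # Toy multi-convex objective (an assumption, not the paper's):
    # convex in U for fixed W, and convex in W for fixed U.
    return (0.5 * np.sum((U - W @ X) ** 2)
            + 0.5 * np.sum((U - T) ** 2)
            + 0.5 * lam * np.sum(W ** 2))

def bcd(X, T, lam=1e-2, iters=20, seed=0):
    # Alternate exact minimization over the two blocks U and W.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((T.shape[0], X.shape[0]))
    U = relu(W @ X)
    gram = X @ X.T + lam * np.eye(X.shape[0])
    history = [objective(W, U, X, T, lam)]
    for _ in range(iters):
        # U-block: elementwise quadratic with the constraint u >= 0;
        # the minimizer is the midpoint of W @ X and T, clipped at zero.
        U = np.maximum(0.5 * (W @ X + T), 0.0)
        # W-block: ridge regression, W = U X^T (X X^T + lam I)^{-1}.
        W = np.linalg.solve(gram, X @ U.T).T
        history.append(objective(W, U, X, T, lam))
    return W, U, history

# Synthetic data: targets generated by a random ReLU layer (hypothetical setup).
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 50))
T = relu(rng.standard_normal((4, 3)) @ X)
W, U, history = bcd(X, T)
```

Inspecting `history` shows the objective value after each full sweep. The monotone decrease follows from exact block minimization in this toy problem and says nothing about the convergence rate proved in the paper.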

Related Publication

Zhang, Z., Brand, M., "Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks", arXiv, November 2017.


@article{Zhang2017nov2,
author = {Zhang, Ziming and Brand, Matthew},
title = {Convergent Block Coordinate Descent for Training Tikhonov Regularized Deep Neural Networks},
journal = {arXiv},
year = 2017,
month = nov,
url = {https://arxiv.org/abs/1711.07354}
}

MERL Contact:

Matthew Brand
