In the past decade, sparsity-driven methods have led to substantial improvements in the capabilities of numerous imaging systems. While traditionally such methods relied on analytical models of sparsity, such as total variation (TV) or wavelet regularization, recent methods are increasingly based on data-driven models such as dictionary-learning or convolutional neural networks (CNN). In this work, we propose a new trainable model based on the proximal operator for TV. By interpreting the popular fast iterative shrinkage/thresholding algorithm (FISTA) as a CNN, we train the filters of the algorithm to minimize the error over a training data-set. Experiments on image denoising show that by training the filters, one can substantially boost the performance of the algorithm and make it competitive with other state-of-the-art methods.