A repository of papers accompanying nPlan's machine learning paper club at Google Campus.
At the next meeting, [07/11/2019], Gary presents: Cobb, A. D., Roberts, S. J., & Gal, Y. (2018). Loss-calibrated approximate inference in Bayesian neural networks. arXiv preprint arXiv:1805.03901.

Papers suggested for future meetings:

- Daniely, A., Lazic, N., Singer, Y., & Talwar, K. (2016). Short and deep: Sketching and neural networks.
- Frankle, J., & Carbin, M. (2018). The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635.
- Zhou, H., Lan, J., Liu, R., & Yosinski, J. (2019). Deconstructing lottery tickets: Zeros, signs, and the supermask. arXiv preprint arXiv:1905.01067.
For those new to machine learning, these are some recommended reading materials:
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
- Goldberg, Y. (2016). A primer on neural network models for natural language processing. Journal of Artificial Intelligence Research, 57, 345-420.
- Wu, Z., Pan, S., Chen, F., Long, G., Zhang, C., & Yu, P. S. (2019). A comprehensive survey on graph neural networks. arXiv preprint arXiv:1901.00596.
The wide and deep model implementation that Carlos presented can be found at https://github.com/caledezma/wide_deep_model. Why not download it, play with it, and let us know your findings at paper club?
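If you want a feel for the architecture before digging into the repo, here is a minimal sketch of the wide & deep idea in Keras. To be clear, this is not the code from the repository above: the feature names, layer sizes, and toy data are invented purely for illustration.

```python
# Minimal wide & deep sketch (illustrative only; all names and sizes are made up).
import numpy as np
from tensorflow.keras import Model, layers

n_wide, n_deep = 20, 10  # hypothetical feature dimensions

# Two inputs: "wide" (e.g. sparse/crossed features) and "deep" (dense features).
wide_in = layers.Input(shape=(n_wide,), name="wide_features")
deep_in = layers.Input(shape=(n_deep,), name="deep_features")

# Deep component: a small MLP that learns non-linear feature interactions.
h = layers.Dense(32, activation="relu")(deep_in)
h = layers.Dense(16, activation="relu")(h)

# The wide component feeds straight into the output, alongside the deep
# representation, so the two parts are trained jointly.
out = layers.Dense(1, activation="sigmoid")(layers.concatenate([wide_in, h]))

model = Model(inputs=[wide_in, deep_in], outputs=out)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Toy run on random data, just to show the expected input structure.
X_wide = np.random.rand(64, n_wide)
X_deep = np.random.rand(64, n_deep)
y = np.random.randint(0, 2, size=(64, 1))
model.fit([X_wide, X_deep], y, epochs=1, verbose=0)
```

The intuition from the paper is that the wide part memorizes specific feature co-occurrences while the deep part generalizes to unseen combinations; training them jointly lets each part compensate for the other's errors.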
We regularly record the presentations made during the Meetup (subject to the presenter's and attendees' approval). These videos are then uploaded to our YouTube channel so that those who can't attend can still benefit from the presentations. If you'd like to stay up to date with the presentations, just hit the subscribe button!
The papers that have been (and will be) discussed in Paper Club meetings are:
- [07/11/2019] Gary presents: Cobb, A. D., Roberts, S. J., & Gal, Y. (2018). Loss-calibrated approximate inference in Bayesian neural networks. arXiv preprint arXiv:1805.03901.
- [31/10/2019] Arvid presents (slides): Chapelle, O., & Li, L. (2011). An empirical evaluation of Thompson sampling. In Advances in Neural Information Processing Systems (pp. 2249-2257).
- [24/10/2019] Ivan presents: Chelombiev, I., Houghton, C., & O'Donnell, C. (2019). Adaptive estimators show information compression in deep neural networks. arXiv preprint arXiv:1902.09037.
- [10/10/2019] Ivan presents: Saxe, A. M., Bansal, Y., Dapello, J., Advani, M., Kolchinsky, A., Tracey, B. D., & Cox, D. D. (2018). On the information bottleneck theory of deep learning. In International Conference on Learning Representations.
- [03/10/2019] Ivan presents: Shwartz-Ziv, R., & Tishby, N. (2017). Opening the black box of deep neural networks via information. arXiv preprint arXiv:1703.00810.
- [26/09/2019] Carlos presents (with demo): Cheng, H. T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., ... & Anil, R. (2016, September). Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems (pp. 7-10). ACM.
- [19/09/2019] Alan presents: Mosca, A., & Magoulas, G. D. (2018). Distillation of deep learning ensembles as a regularisation method. In Advances in Hybridization of Intelligent Methods (pp. 97-118). Springer, Cham.
- [12/09/2019] Carlos presents: Papernot, N., McDaniel, P., Wu, X., Jha, S., & Swami, A. (2016, May). Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP) (pp. 582-597). IEEE.
- [05/09/2019] Carlos presents: Frosst, N., & Hinton, G. (2017). Distilling a neural network into a soft decision tree. arXiv preprint arXiv:1711.09784.
- [29/08/2019] Alan presents: Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
- [22/08/2019] Gary presents: Lee, J., Lee, I., & Kang, J. (2019). Self-attention graph pooling. arXiv preprint arXiv:1904.08082.
- [15/08/2019] Vahan presents: Yao, L., Mao, C., & Luo, Y. (2019, July). Graph convolutional networks for text classification. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, pp. 7370-7377).
- [08/08/2019] Carlos presents: Wu, F., Zhang, T., Souza Jr, A. H. D., Fifty, C., Yu, T., & Weinberger, K. Q. (2019). Simplifying graph convolutional networks. arXiv preprint arXiv:1902.07153.
- [25/07/2019] Arvid presents: Enßlin, T. A., Frommert, M., & Kitaura, F. S. (2009). Information field theory for cosmological perturbation reconstruction and nonlinear signal analysis. Physical Review D, 80(10), 105005.
- [18/07/2019] Gary presents: Zhang, G., Wang, C., Xu, B., & Grosse, R. (2018). Three mechanisms of weight decay regularization. arXiv preprint arXiv:1810.12281.
- [11/07/2019] Auke presents: Oord, A. V. D., Li, Y., & Vinyals, O. (2018). Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748.
- [04/07/2019] François presents: Kool, W., van Hoof, H., & Welling, M. (2019). Stochastic beams and where to find them: The Gumbel-top-k trick for sampling sequences without replacement. arXiv preprint arXiv:1903.06059.
- [20/06/2019] Vahan presents: Kipf, T. N., & Welling, M. (2016). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907.
- [13/06/2019] Alessio presents: Dobriban, E., & Liu, S. (2018). A new theory for sketching in linear regression. arXiv preprint arXiv:1810.06089.
- [06/06/2019] François presents: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998-6008).
- [30/05/2019] Arvid presents: Schölkopf, B., Smola, A., & Müller, K. R. (1998). Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation, 10(5), 1299-1319.
- [23/05/2019] Auke presents: Alaa, A. M., & van der Schaar, M. (2018). AutoPrognosis: Automated clinical prognostic modeling via Bayesian optimization with structured kernel learning. arXiv preprint arXiv:1802.07207.
- [16/05/2019] Carlos presents: Dhamija, A. R., Günther, M., & Boult, T. (2018). Reducing network agnostophobia. In Advances in Neural Information Processing Systems (pp. 9175-9186).
- [09/05/2019] Naman presents: Geifman, Y., & El-Yaniv, R. (2017). Selective classification for deep neural networks. In Advances in Neural Information Processing Systems (pp. 4878-4887).
- [02/05/2019] Gary presents: Gal, Y., & Ghahramani, Z. (2016, June). Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning (pp. 1050-1059).
- [25/04/2019] Vahan presents: Lakshminarayanan, B., Pritzel, A., & Blundell, C. (2017). Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems (pp. 6402-6413).
- [18/04/2019] Vahan presents: Vyas, A., Jammalamadaka, N., Zhu, X., Das, D., Kaul, B., & Willke, T. L. (2018). Out-of-distribution detection using an ensemble of self supervised leave-out classifiers. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 550-564).
- [11/04/2019] Carlos presents: Bendale, A., & Boult, T. E. (2016). Towards open set deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1563-1572).
- [04/04/2019] Arvid presents: Reshef, D. N., Reshef, Y. A., Finucane, H. K., Grossman, S. R., McVean, G., Turnbaugh, P. J., ... & Sabeti, P. C. (2011). Detecting novel associations in large data sets. Science, 334(6062), 1518-1524.
- [28/03/2019] Joao presents: Chen, B., Medini, T., & Shrivastava, A. (2019). SLIDE: In defense of smart algorithms over hardware acceleration for large-scale deep learning systems. arXiv preprint arXiv:1903.03129.
- [21/03/2019] Joao presents: Oord, A. V. D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., ... & Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. arXiv preprint arXiv:1609.03499.
- [14/03/2019] Vahan presents: Wright, J., Ganesh, A., Rao, S., Peng, Y., & Ma, Y. (2009). Robust principal component analysis: Exact recovery of corrupted low-rank matrices via convex optimization. In Advances in Neural Information Processing Systems (pp. 2080-2088).
- [07/03/2019] Vahan presents: Candes, E. J., Romberg, J. K., & Tao, T. (2006). Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8), 1207-1223.
- [28/02/2019] Arvid presents: Dietterich, T. G., & Bakiri, G. (1994). Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2, 263-286.
- [21/02/2019] Gary presents: Mnih, A., & Kavukcuoglu, K. (2013). Learning word embeddings efficiently with noise-contrastive estimation. In Advances in Neural Information Processing Systems (pp. 2265-2273).
- [14/02/2019] Carlos presents: Ziko, I., Granger, E., & Ayed, I. B. (2018). Scalable Laplacian K-modes. In Advances in Neural Information Processing Systems (pp. 10062-10072).
- [07/02/2019] Carlos presents: Wang, W., & Carreira-Perpiñán, M. A. (2014). The Laplacian K-modes algorithm for clustering. arXiv preprint.
- [31/01/2019] Gary presents: Hoffer, E., Hubara, I., & Soudry, D. (2017). Train longer, generalize better: Closing the generalization gap in large batch training of neural networks. In Advances in Neural Information Processing Systems (pp. 1731-1741).
- [24/01/2019] Alessio presents: McInnes, L., & Healy, J. (2018). UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426.
- [17/01/2019] Chris presents: Maaten, L. V. D., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579-2605.
- [10/01/2019] Carlos presents: Chen, T. Q., Rubanova, Y., Bettencourt, J., & Duvenaud, D. (2018). Neural ordinary differential equations. arXiv preprint arXiv:1806.07366.
- [20/12/2018] Gary presents: Wilson, A. C., Roelofs, R., Stern, M., Srebro, N., & Recht, B. (2017). The marginal value of adaptive gradient methods in machine learning. In Advances in Neural Information Processing Systems.
- [13/12/2018] Carlos presents: Lin, H., & Jegelka, S. (2018). ResNet with one-neuron hidden layers is a universal approximator. In Advances in Neural Information Processing Systems.
- [06/12/2018] Auke presents: Ulyanov, D., Vedaldi, A., & Lempitsky, V. (2018). Deep image prior. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 9446-9454).
- [29/11/2018] Vahan presents: Zhang, C., Bengio, S., Hardt, M., Recht, B., & Vinyals, O. (2016). Understanding deep learning requires rethinking generalization. arXiv preprint arXiv:1611.03530.
- [22/11/2018] Gary presents: Smith, S. L., Kindermans, P. J., Ying, C., & Le, Q. V. (2017). Don't decay the learning rate, increase the batch size. arXiv preprint arXiv:1711.00489.
- [15/11/2018] Joao presents: Bai, S., Kolter, J. Z., & Koltun, V. (2018). An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271.
- [01/11/2018] Vahan presents: Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183-202.
- [18/10/2018] Carlos presents: Howard, J., & Ruder, S. (2018). Universal language model fine-tuning for text classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.
- [11/10/2018] dos Santos, C., & Gatti, M. (2014). Deep convolutional neural networks for sentiment analysis of short texts. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers.