| # | Title | Venue | Year |
|---|-------|-------|------|
| 1 | An Empirical Study of Example Forgetting during Deep Neural Network Learning | ICLR | 2019 |
| 2 | Robustness May Be at Odds with Accuracy | ICLR | 2019 |
| 3 | Critical Learning Periods in Deep Networks | ICLR | 2019 |
| 4 | Learning Robust Representations by Projecting Superficial Statistics Out | ICLR | 2019 |
| 5 | Classification from Positive, Unlabeled and Biased Negative Data | ICLR | 2019 |
| 6 | Select Via Proxy: Efficient Data Selection For Training Deep Networks | ICLR | 2019 |
| 7 | Using Pre-Training Can Improve Model Robustness and Uncertainty | ICML | 2019 |
| 8 | On Learning Invariant Representations for Domain Adaptation | ICML | 2019 |
| 9 | Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks | ICML | 2019 |
| 10 | Gradient Descent Finds Global Minima of Deep Neural Networks | ICML | 2019 |
| 11 | When Samples Are Strategically Selected | ICML | 2019 |
| 12 | The Odds are Odd: A Statistical Test for Detecting Adversarial Examples | ICML | 2019 |
| 13 | Bias Also Matters: Bias Attribution for Deep Neural Network Explanation | ICML | 2019 |
| 14 | Escaping Saddle Points with Adaptive Gradient Methods | ICML | 2019 |
| 15 | Parameter-Efficient Transfer Learning for NLP | ICML | 2019 |
| 16 | Visualizing the Loss Landscape of Neural Nets | NeurIPS | 2018 |
| 17 | Modern Neural Networks Generalize on Small Data Sets | NeurIPS | 2018 |
| 18 | Generative modeling for protein structures | NeurIPS | 2018 |
| 19 | On Binary Classification in Extreme Regions | NeurIPS | 2018 |
| 20 | The Description Length of Deep Learning models | NeurIPS | 2018 |
| 21 | L1-regression with Heavy-tailed Distributions | NeurIPS | 2018 |
| 22 | Dynamic Network Model from Partial Observations | NeurIPS | 2018 |
| 23 | Learning Invariances using the Marginal Likelihood | NeurIPS | 2018 |
| 24 | How SGD Selects the Global Minima in Over-parameterized Learning: A Dynamical Stability Perspective | NeurIPS | 2018 |
| 25 | On the Local Minima of the Empirical Risk | NeurIPS | 2018 |
| 26 | Human-in-the-Loop Interpretability Prior | NeurIPS | 2018 |
| 27 | Processing of missing data by neural networks | NeurIPS | 2018 |
| 28 | Maximum-Entropy Fine Grained Classification | NeurIPS | 2018 |
| 29 | Deep Structured Prediction with Nonlinear Output Transformations | NeurIPS | 2018 |
| 30 | Large Margin Deep Networks for Classification | NeurIPS | 2018 |
| 31 | Towards Understanding Learning Representations: To What Extent Do Different Neural Networks Learn the Same Representation | NeurIPS | 2018 |
| 32 | Norm matters: efficient and accurate normalization schemes in deep networks | NeurIPS | 2018 |
| 33 | Query K-means Clustering and the Double Dixie Cup Problem | NeurIPS | 2018 |
| 34 | Bilevel learning of the Group Lasso structure | NeurIPS | 2018 |
| 35 | Loss Functions for Multiset Prediction | NeurIPS | 2018 |
| 36 | Active Learning for Non-Parametric Regression Using Purely Random Trees | NeurIPS | 2018 |
| 37 | Model compression via distillation and quantization | ICLR | 2018 |
| 38 | The power of deeper networks for expressing natural functions | ICLR | 2018 |
| 39 | Decision Boundary Analysis of Adversarial Examples | ICLR | 2018 |
| 40 | On the Information Bottleneck Theory of Deep Learning | ICLR | 2018 |
| 41 | Sensitivity and Generalization in Neural Networks: an Empirical Study | ICLR | 2018 |
| 42 | Generating Wikipedia by Summarizing Long Sequences | ICLR | 2018 |
| 43 | Can Neural Networks Understand Logical Entailment? | ICLR | 2018 |
| 44 | Towards Reverse-Engineering Black-Box Neural Networks | ICLR | 2018 |
| 45 | The High-Dimensional Geometry of Binary Neural Networks | ICLR | 2018 |
| 46 | Detecting Statistical Interactions from Neural Network Weights | ICLR | 2018 |
| 47 | The Implicit Bias of Gradient Descent on Separable Data | ICLR | 2018 |
| 48 | Learning how to explain neural networks: PatternNet and PatternAttribution | ICLR | 2018 |
| 49 | GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models | ICML | 2018 |
| 50 | Which Training Methods for GANs do actually Converge? | ICML | 2018 |
| 51 | Nonoverlap-Promoting Variable Selection | ICML | 2018 |
| 52 | An Alternative View: When Does SGD Escape Local Minima? | ICML | 2018 |
| 53 | Stability and Generalization of Learning Algorithms that Converge to Global Optima | ICML | 2018 |
| 54 | Scalable Deletion-Robust Submodular Maximization: Data Summarization with Privacy and Fairness Constraints | ICML | 2018 |
| 55 | On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization | ICML | 2018 |
| 56 | Escaping Saddles with Stochastic Gradients | ICML | 2018 |
| 57 | Deep Asymmetric Multi-task Feature Learning | ICML | 2018 |
| 58 | GNN Explainer: A Tool for Post-hoc Explanation of Graph Neural Networks | KDD | 2018 |