Shi Feng

I am a Ph.D. candidate in Computer Science at the University of Maryland, advised by Jordan Boyd-Graber.

My research interests include the interpretation and realistic evaluation of models for natural language processing, and learning through interaction and/or with humans in the loop.


  • Calibrate Before Use: Improving Few-shot Performance of Language Models
    Tony Z. Zhao*, Eric Wallace*, Shi Feng, Dan Klein, Sameer Singh
    preprint [arxiv]
    A more effective way to use GPT-3/2.

  • Customizing Triggers with Concealed Data Poisoning
    Eric Wallace*, Tony Z. Zhao*, Shi Feng, Sameer Singh
    NAACL 2021 [arxiv] [blog]
We devise a training data poisoning attack that lets an adversary plant trigger phrases in a model.

  • Universal Adversarial Triggers for Attacking and Analyzing NLP
    Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
    EMNLP 2019 [arxiv] [blog]
We discover special phrases that cause GPT-2 to generate toxic content whenever we prepend them to a sentence. Similarly, we find phrases that cause misclassification in reading comprehension, natural language inference, and sentiment classification.

  • Misleading Failures of Partial-input Baselines
    Shi Feng, Eric Wallace, Jordan Boyd-Graber
    ACL 2019 (short) [arxiv]
Partial-input baselines (e.g., hypothesis-only models for SNLI) are useful sanity checks of dataset quality. But what does it really mean to pass these checks? Are the "hard" examples we identify really hard? We highlight potential pitfalls of using these baselines for dataset quality control, on both artificial and real datasets.

  • Quizbowl: The Case for Incremental Question Answering
    Pedro Rodriguez, Shi Feng, Mohit Iyyer, He He, Jordan Boyd-Graber
    In submission [arxiv]
Quizbowl is a factoid QA dataset that challenges both computers and humans in unique ways. It is also a fruitful platform that has led to many exciting research projects. This is our latest version of the definitive guide to Quizbowl.

  • Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation
    Sahil Singla, Eric Wallace, Shi Feng, Soheil Feizi
    ICML 2019 [arxiv]
Many model explanations use gradients and implicitly make a local first-order approximation of the model. Being approximations, they can't match the model perfectly. We relax this first-order assumption and enable the interpretation to capture inter-feature dependencies. We also propose a method to efficiently compute it for ReLU models and analyze when adding high-order information helps.

  • Trick Me If You Can: Human-in-the-loop Generation of Adversarial Examples for Question Answering
    Eric Wallace, Pedro Rodriguez, Shi Feng, Jordan Boyd-Graber
    TACL 2019 [arxiv]
    Crafting adversarial examples for NLP is difficult. Most existing work uses limited perturbations of the input. Can we create more diverse and less templated adversarial examples? One direction, demonstrated in this paper, is to put a human in the loop. Importantly, we use model interpretations to guide the human writers.

  • What can AI do for me: Evaluating Machine Learning Interpretations in Cooperative Play
    Shi Feng, Jordan Boyd-Graber
    IUI 2019 [arxiv] [try it out!]
    In many real-world scenarios, the best way to deploy a machine learning system is not to replace humans completely, but to have them work together. How can we make this cooperation more effective? In this paper, we explore how different interpretations assist this cooperation, with both experts and non-experts.

  • Pathologies of Neural Models Make Interpretation Difficult
    Shi Feng, Eric Wallace, Alvin Grissom II, Mohit Iyyer, Pedro Rodriguez, Jordan Boyd-Graber
    EMNLP 2018 (oral) [arxiv] [press] [talk] [slides]
    Many interpretation methods in NLP derive feature importance from model confidence. But there are known issues with the confidence of deep neural networks: without calibration, DNNs are over-confident, and they make high-confidence incorrect predictions on adversarial examples. How do these issues affect confidence-based interpretation methods? We use input reduction to expose pathological model predictions and (partly) answer this question.
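    The core loop of input reduction can be sketched in a few lines. This is a rough sketch of the idea only: `toy_predict` and `toy_confidence` below are hypothetical stand-ins for a real model, and the paper's version scores words with the model's own confidence and gradients.

    ```python
    def input_reduction(words, confidence, predict):
        """Iteratively drop the word whose removal hurts confidence least,
        stopping just before the prediction would change."""
        pred = predict(words)
        while len(words) > 1:
            # try removing each word; keep the removal with the highest confidence
            candidates = [words[:i] + words[i + 1:] for i in range(len(words))]
            best = max(candidates, key=confidence)
            if predict(best) != pred:
                break  # any further removal would flip the prediction
            words = best
        return words

    # Hypothetical toy "model": positive iff "good" appears in the input
    toy_predict = lambda ws: "pos" if "good" in ws else "neg"
    toy_confidence = lambda ws: 1.0 if "good" in ws else 0.0

    reduced = input_reduction("the movie was surprisingly good".split(),
                              toy_confidence, toy_predict)
    # the reduced input keeps the prediction while discarding most words
    ```

    With a real model, the surviving words are often nonsensical to humans, which is exactly the pathology the paper exposes.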

  • Interpreting Neural Networks with Nearest Neighbors
    Eric Wallace*, Shi Feng*, Jordan Boyd-Graber
    BlackboxNLP @ EMNLP 2018 [arxiv] [blog]
    Can we create inherently interpretable NLP models? We show this is possible on various tasks: it turns out you can replace the softmax layer with a deep k-nearest-neighbor search without sacrificing accuracy, and the resulting model is inherently interpretable.
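    The deep-kNN idea can be sketched with plain numpy. The toy 2-D features below are hypothetical stand-ins for a trained network's hidden representations; the paper computes these from the model's final layer.

    ```python
    import numpy as np

    def knn_predict(train_feats, train_labels, x, k=3):
        """Classify x by majority vote over its k nearest training
        representations; the neighbors themselves serve as the explanation."""
        dists = np.linalg.norm(train_feats - x, axis=1)
        nearest = np.argsort(dists)[:k]
        # majority vote; the nearest examples double as an interpretation
        label = np.bincount(train_labels[nearest]).argmax()
        return label, nearest

    # Hypothetical 2-D "hidden representations" for two classes
    feats = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
    labels = np.array([0, 0, 1, 1])
    label, neighbors = knn_predict(feats, labels, np.array([0.05, 0.1]), k=3)
    ```

    The returned neighbors are concrete training examples a user can inspect, which is what makes the prediction interpretable.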

  • The UMD Neural Machine Translation Systems at WMT17 Bandit Learning Task
    Amr Sharaf, Shi Feng, Khanh Nguyen, Kianté Brantley, Hal Daumé III
    WMT @ EMNLP 2017 [pdf]
    Can you adapt a machine translation system without ground-truth translation pairs from the target domain? We show how to use reinforcement learning algorithms to adapt an MT system using weak feedback.

  • Improving Attention Modeling with Implicit Distortion and Fertility for Machine Translation
    Shi Feng, Shujie Liu, Nan Yang, Mu Li, Ming Zhou, Kenny Q. Zhu
    COLING 2016 [pdf]
    One of the first attempts at adding structure and inductive biases to the attention mechanism for machine translation.


  • June 1 2020 Intern @ Salesforce Research
  • Apr 25 2019 NLP Highlights Podcast on interpretation of NLP models
  • Mar 2019 Invited talks at UPenn, UCSD, UCI
  • Best reviewer award @ EMNLP 2018, 2020
  • Summer 2018 Research Intern @ Microsoft Research