

There are papers trying to detect implicit hate speech, e.g., by reasoning and fine-grained classification. For example, a corpus collected from social media and online forums, if not carefully preprocessed, could become the source of hate speech in our next chatbot. This is a direct way for NLP researchers to improve society, and the impacts on society should never be forgotten. Making society a better place is one of the more important goals in areas with wide industrial applications (e.g., sentiment classification, text mining, and question answering).

Many papers on dialogue systems evaluate with existing metrics, and there are already many of them, including expensive human evaluation. Still, a best paper runner-up (Mathur et al., 2020) discusses the impacts of, e.g., outliers and heteroskedasticity on common evaluation metrics for machine translation systems, and similar critiques apply to evaluation methods for generated dialog as much as to generated translations. While the number of papers grows exponentially, the community is far from settling on agreements about, e.g., what the qualities of a good dialog system are.
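
To see concretely why outliers matter for metric evaluation, here is a minimal sketch with made-up system-level scores (the numbers are illustrative, not data from Mathur et al., 2020): adding one clearly worse system to a cluster of otherwise similar systems can turn a near-zero Pearson correlation between a metric and human judgments into a very high one.

```python
# Toy illustration of outlier-inflated metric-human correlation.
# All scores below are invented for demonstration purposes.
from scipy.stats import pearsonr

# Hypothetical system-level scores for six closely clustered systems.
metric_scores = [0.31, 0.33, 0.32, 0.34, 0.30, 0.33]
human_scores = [0.52, 0.61, 0.48, 0.55, 0.58, 0.50]

# Within the cluster, the metric barely tracks human judgment.
r_without, _ = pearsonr(metric_scores, human_scores)
print(f"Pearson r without outlier: {r_without:.2f}")  # close to 0

# Add a single clearly worse "outlier" system; the extreme point
# dominates the computation and inflates the correlation.
r_with, _ = pearsonr(metric_scores + [0.10], human_scores + [0.20])
print(f"Pearson r with outlier:    {r_with:.2f}")  # around 0.9
```

The same quick check, run on any dialog metric against human ratings, shows why the critique carries over from translation to dialog evaluation.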

The number of publications relevant to dialog grew from about 20 in 2018 to 40-ish in 2019 and 80-ish in 2020.

There are various ML methods applied to NLP systems, e.g., VAE, dropout, consistency loss, regularization, curriculum learning, and analysis methods inspired by information theory. I also saw many papers about Transformers with structural improvements. We need to understand what is happening in NLP, and linguistics and cognitive psychology need attention. This blog presents these trends, and my notes are attached below. Thanks to the format, I was able to listen to many talk sessions that would otherwise have been held in parallel, and to observe some trends from the ~150 papers I took notes on.
