Contextual Answer Validation

Baksi, Arkadeep

Date: 2022-07

Abstract:

Answer generation for a question, given a context, has gained tremendous popularity in the NLP research space. Benchmark datasets like SQuAD[9] have propelled this research, and recent years have seen many transformer-based models achieve state-of-the-art (SOTA) results on Question Answering tasks, even beating human-level accuracy. However, the second step of a Question Answering system, Contextual Answer Validation, is a much less explored area in NLP. For the past few years India has seen tremendous growth in the Edtech industry. These Edtech firms are sitting on a gold mine of data, primarily in the Question Answering space. As a result, there is a growing demand for automatic Answer Validation systems that can replace human evaluation and automate the process. Demand for such systems also exists in the chatbot space, where they can detect junk or spam responses and smooth the overall chatbot experience. In our work we attempted the answer validation problem with the additional constraints that the answer is a single sentence long and contains at least 10 words. However, because no dataset matching these constraints was available, we had to generate synthetic data based on the SQuAD dataset. We built our model, inspired by paraphrase detection, and fine-tuned it on various datasets combined with the synthetic data we generated. On final evaluation, our model reached an accuracy of 0.83 even on the highly complex PAWS dataset, which typically contains examples with high lexical overlap.