Transfer learning in NLP

January 24, 2018

Sebastian Ruder wrote on his blog about the prospects of neural networks in NLP. He thinks that few-shot and transfer learning will have a huge impact in this area. I found his arguments convincing, so I’m now experimenting with few-shot learning.

There are a lot of good datasets for few-shot learning in computer vision, e.g. Omniglot or any face recognition dataset. The closest analogues in NLP are conversation and question answering datasets. But evaluating results on these tasks is tricky: we can’t check whether an arbitrary text answers a question without a human judge, because it’s impossible to align all texts with a set of predefined labels the way it is done in Omniglot.

There is, however, an NLP task that fits the few-shot learning approach: Named Entity Linking, or NEL for short. In NEL we have to assign the exact concept link to each mention in a text. You can think of concepts as Wikipedia articles. The number of possible concepts is very large and grows over time, so it’s impractical to apply regular classification techniques, which assume a fixed set of classes. This makes NEL a perfect task for few-shot learning.

I prepared a WikiLinks-based dataset. It consists of entity mentions with the corresponding Wikipedia links. There is also a kNN baseline, which gives 70% accuracy. The kNN uses only bag-of-words features, without any word2vec or synonyms.

My next step was to try well-known neural architectures for text matching. That deserves a detailed description, so the next few posts will be about it.
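To make the kNN baseline idea more concrete, here is a minimal sketch of nearest-neighbour entity linking over bag-of-words vectors. It is not the actual baseline code: the function names (`bag_of_words`, `cosine`, `knn_link`), the whitespace tokenizer, the cosine similarity, and the toy examples are all assumptions made for illustration.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_link(context, labeled, k=3):
    """Return the concept that is most common among the k labeled
    contexts closest to `context` (hypothetical helper)."""
    query = bag_of_words(context)
    scored = sorted(
        ((cosine(query, bag_of_words(text)), concept) for text, concept in labeled),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(concept for _, concept in scored[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data: (mention context, concept it links to).
labeled = [
    ("jaguar is a big cat that lives in the rainforest", "Jaguar_(animal)"),
    ("the jaguar hunts prey in the jungle at night", "Jaguar_(animal)"),
    ("jaguar unveiled a new luxury car model", "Jaguar_Cars"),
    ("the jaguar sedan has a powerful engine", "Jaguar_Cars"),
]

print(knn_link("a jaguar was seen hunting in the rainforest", labeled, k=1))
```

The useful property for few-shot learning is that a new concept needs no retraining: adding a few labeled contexts for it to `labeled` is enough for it to become a possible answer.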