NLP(76)
[2022-01-21] Today's Natural Language Processing
Improving Neural Machine Translation by Bidirectional Training
We present a simple and effective pretraining strategy -- bidirectional training (BiT) for neural machine translation. Specifically, we bidirectionally update the model parameters at the early stage and then tune the model normally. To achieve bidirectional updating, we simply reconstruct the training samples from "src$\rightarrow$tg..
2022.01.21
[2022-01-20] Today's Natural Language Processing
Syntax-based data augmentation for Hungarian-English machine translation
We train Transformer-based neural machine translation models for Hungarian-English and English-Hungarian using the Hunglish2 corpus. Our best models achieve a BLEU score of 40.0 on Hungarian-English and 33.4 on English-Hungarian. Furthermore, we present results on an ongoing work about syntax-based augmentation for neural ma..
2022.01.20
[2022-01-19] Today's Natural Language Processing
This Must Be the Place: Predicting Engagement of Online Communities in a Large-scale Distributed Campaign
Understanding collective decision making at a large scale and elucidating how community organization and community dynamics shape collective behavior are at the heart of social science research. In this work we study the behavior of thousands of communities with millions of active members. ..
2022.01.19
[2022-01-18] Today's Natural Language Processing
NLP in Human Rights Research -- Extracting Knowledge Graphs About Police and Army Units and Their Commanders
In this working paper we explore the use of an NLP system to assist the work of Security Force Monitor (SFM). SFM creates data about the organizational structure, command personnel and operations of police, army and other security forces, which assists human rights researchers, journalist..
2022.01.18
[2022-01-17] Today's Natural Language Processing
How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generation
Graph neural networks (GNNs), as a group of powerful tools for representation learning on irregular data, have manifested superiority in various downstream tasks. With unstructured texts represented as concept maps, GNNs can be exploited for tasks like document retrieval. Intrigued by how ca..
2022.01.17
[2022-01-14] Today's Natural Language Processing
How Does Data Corruption Affect Natural Language Understanding Models? A Study on GLUE datasets
A central question in natural language understanding (NLU) research is whether high performance demonstrates the models' strong reasoning capabilities. We present an extensive series of controlled experiments where pre-trained language models are exposed to data that have undergone specific corruption..
2022.01.14