NLP(76)
-
[2022-02-03] Today's Natural Language Processing
Towards a Theoretical Understanding of Word and Relation Representation
Representing words by vectors, or embeddings, enables computational reasoning and is foundational to automating natural language tasks. For example, if word embeddings of similar words contain similar values, word similarity can be readily assessed, whereas judging that from their spelling is often impossible (e.g. cat /feli..
2022.02.03 -
[2022-01-28] Today's Natural Language Processing
Transfer Learning Approaches for Building Cross-Language Dense Retrieval Models
The advent of transformer-based models such as BERT has led to the rise of neural ranking models. These models have improved the effectiveness of retrieval systems well beyond that of lexical term matching models such as BM25. While monolingual retrieval tasks have benefited from large-scale training collections such..
2022.01.28 -
[2022-01-27] Today's Natural Language Processing
Modeling Multi-level Context for Informational Bias Detection by Contrastive Learning and Sentential Graph Network
Informational bias is widely present in news articles. It refers to providing one-sided, selective or suggestive information of specific aspects of certain entity to guide a specific interpretation, thereby biasing the reader's opinion. Sentence-level informational bias detection is..
2022.01.27 -
[2022-01-26] Today's Natural Language Processing
Bias in Automated Speaker Recognition
Automated speaker recognition uses data processing to identify speakers by their voice. Today, automated speaker recognition technologies are deployed on billions of smart devices and in services such as call centres. Despite their wide-scale deployment and known sources of bias in face recognition and natural language processing, bias in automated speaker r..
2022.01.26 -
[2022-01-25] Today's Natural Language Processing
Context-Tuning: Learning Contextualized Prompts for Natural Language Generation
Recently, pretrained language models (PLMs) have made exceptional success in language generation. To leverage the rich knowledge encoded by PLMs, a simple yet powerful mechanism is to use prompts, in the form of either discrete tokens or continuous embeddings. In existing studies, manual prompts are time-consuming an..
2022.01.25 -
[2022-01-24] Today's Natural Language Processing
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation
Neural machine translation (NMT) needs large parallel corpora for state-of-the-art translation quality. Low-resource NMT is typically addressed by transfer learning which leverages large monolingual or parallel corpora for pre-training. Monolingual pre-training approaches such as MASS (MAsked Sequence to Seq..
2022.01.24