[2021-12-27] Today's Natural Language Processing

2021. 12. 27. 17:40 paper-of-the-day


Are E2E ASR models ready for an industrial usage?

 

The Automated Speech Recognition (ASR) community experiences a major turning point with the rise of the fully-neural (End-to-End, E2E) approaches. At the same time, the conventional hybrid model remains the standard choice for the practical usage of ASR. According to previous studies, the adoption of E2E ASR in real-world applications was hindered by two main limitations: their ability to generalize on unseen domains and their high operational cost. In this paper, we investigate both above-mentioned drawbacks by performing a comprehensive multi-domain benchmark of several contemporary E2E models and a hybrid baseline. Our experiments demonstrate that E2E models are viable alternatives for the hybrid approach, and even outperform the baseline both in accuracy and in operational efficiency. As a result, our study shows that the generalization and complexity issues are no longer the major obstacle for industrial integration, and draws the community's attention to other potential limitations of the E2E approaches in some specific use-cases.

 

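The ASR (Automated Speech Recognition) community is going through a major shift with the rise of fully neural (End-to-End, E2E) approaches. At the same time, the conventional hybrid model is still the standard choice for practical ASR use. According to earlier studies, two main limitations have held back the adoption of E2E ASR in real-world applications: its ability to generalize to unseen domains and its high operational cost. This paper examines both drawbacks through a comprehensive multi-domain benchmark of several contemporary E2E models and a hybrid baseline. The experiments show that E2E models are viable alternatives to the hybrid approach and even outperform the baseline in both accuracy and operational efficiency. As a result, the study shows that generalization and complexity are no longer the main obstacles to industrial integration, and it draws the community's attention to other potential limitations of E2E approaches in some specific use cases.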

 

 

Automatic Product Copywriting for E-Commerce

 

Product copywriting is a critical component of e-commerce recommendation platforms. It aims to attract users' interest and improve user experience by highlighting product characteristics with textual descriptions. In this paper, we report our experience deploying the proposed Automatic Product Copywriting Generation (APCG) system into the this http URL e-commerce product recommendation platform. It consists of two main components: 1) natural language generation, which is built from a transformer-pointer network and a pre-trained sequence-to-sequence model based on millions of training data from our in-house platform; and 2) copywriting quality control, which is based on both automatic evaluation and human screening. For selected domains, the models are retrained and updated daily with the latest training data. In addition, the model is also used as a real-time writing assistant tool on our live broadcast platform. The APCG system has been deployed in this http URL since Feb 2021. By Sep 2021, it had generated 2.53 million product descriptions and improved the overall averaged click-through rate (CTR) and conversion rate (CVR) by 4.22% and 3.61%, respectively, compared to baselines on a year-on-year basis. The accumulated Gross Merchandise Volume (GMV) generated by our system increased by 213.42% compared to the figure in Feb 2021.
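The abstract reports its gains as percentages but does not say whether they are relative changes or absolute percentage-point differences. As a minimal sketch, assuming the relative reading, the arithmetic would look like the following. The function name `relative_change_pct` and all input numbers are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_change_pct(baseline: float, current: float) -> float:
    """Return the change of `current` over `baseline` as a percentage of `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


# Hypothetical click-through rates (assumed values, not from the paper).
baseline_ctr = 0.0500
new_ctr = 0.0521
print(f"relative CTR lift: {relative_change_pct(baseline_ctr, new_ctr):.2f}%")

# Under the relative reading, a 213.42% increase means the new value is
# (1 + 2.1342) times the baseline, i.e. a bit more than triple.
print(f"growth factor for +213.42%: {1 + 213.42 / 100:.4f}x")
```

Under this reading, a 4.22% CTR improvement is a small multiplicative lift over the baseline rate, while the GMV figure corresponds to more than a threefold increase over the Feb 2021 number.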

 

 

 

 

TFW2V: An Enhanced Document Similarity Method for the Morphologically Rich Finnish Language

 

Measuring the semantic similarity of different texts has many important applications in Digital Humanities research such as information retrieval, document clustering and text summarization. The performance of different methods depends on the length of the text, the domain and the language. This study focuses on experimenting with some of the current approaches to Finnish, which is a morphologically rich language. At the same time, we propose a simple method, TFW2V, which shows high efficiency in handling both long text documents and limited amounts of data. Furthermore, we design an objective evaluation method which can be used as a framework for benchmarking text similarity approaches.
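The abstract does not describe how TFW2V works, so the sketch below is not that method. It only illustrates one common baseline for document similarity, cosine similarity over TF-IDF vectors, written in plain Python. The helper names `tfidf_vectors` and `cosine`, the smoothed IDF formula, and the toy Finnish sentences are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vec = {}
        for term, count in counts.items():
            tf = count / len(tokens)
            # Smoothed inverse document frequency (an assumed variant).
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy documents, whitespace-tokenized (illustrative only).
docs = [
    "kissa istuu talossa".split(),
    "kissa nukkuu talossa".split(),
    "auto ajaa kadulla".split(),
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # shares "kissa" and "talossa"
print(cosine(vecs[0], vecs[2]))  # no shared tokens
```

Note that plain token matching treats inflected forms such as "talo" and "talossa" as unrelated terms, which is one reason surface-level methods can struggle on a morphologically rich language like Finnish.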

 

 

 

 
