Paper
1 March 2023 Heterogeneous models ensemble for Chinese grammatical error correction
Yeling Liang, Lin Li
Proceedings Volume 12588, International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2022); 125880F (2023) https://doi.org/10.1117/12.2667512
Event: International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2022), 2022, Chongqing, China
Abstract
Grammatical error correction (GEC) aims to automatically identify and correct grammatical errors in a sentence. Neural machine translation (NMT) models are the mainstream approach to the GEC task. However, these models require a large amount of data to be adequately trained, and the variety of grammatical errors and the dependencies between errors in a sentence make it difficult for a single NMT model to correct multiple errors at once. In this work, we propose an ensemble approach for heterogeneous models that integrates rule-based, NMT, and pre-trained language model-based GEC models through a recurrent generation approach. The approach can exploit the strengths of each model and cover a wider range of errors in a sentence. We also mitigate the scarcity of task-specific data for the GEC task through data augmentation. We conduct extensive experiments on the NLPCC2018 shared task dataset to demonstrate the effectiveness of our proposed methods; our approach reaches an F0.5 value of 37.26, outperforming the best model in the shared task.
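The abstract does not spell out how the recurrent generation step is implemented, so the following is only a minimal sketch of one plausible reading: the output of each corrector is fed to the next, and the whole chain is repeated until the sentence stops changing. The names `recurrent_correct`, `rule_based`, `toy_seq2seq`, and `max_rounds` are hypothetical, and the two toy correctors are simple string replacements standing in for the rule-based, NMT, and pre-trained language model-based systems. The `f_beta` helper shows the standard F-beta formula behind the reported F0.5 metric, which weights precision more heavily than recall.

```python
from typing import Callable, Sequence

# A corrector maps a (possibly erroneous) sentence to a corrected sentence.
Corrector = Callable[[str], str]


def recurrent_correct(sentence: str,
                      correctors: Sequence[Corrector],
                      max_rounds: int = 3) -> str:
    """Chain heterogeneous correctors, repeating until the output is stable.

    Assumption: this loop is an illustrative reading of "recurrent
    generation", not the authors' published procedure.
    """
    current = sentence
    for _ in range(max_rounds):
        previous = current
        for correct in correctors:
            # Each model sees the output of the one before it.
            current = correct(current)
        if current == previous:
            # No model changed anything in this round, so stop early.
            break
    return current


def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta=0.5 favours precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy stand-ins for real GEC systems (hypothetical, for illustration only).
def rule_based(s: str) -> str:
    # Collapse an accidentally doubled particle.
    return s.replace("的的", "的")


def toy_seq2seq(s: str) -> str:
    # Fix a common misspelling of "应该" ("should").
    return s.replace("因该", "应该")


corrected = recurrent_correct("我因该去的的学校", [rule_based, toy_seq2seq])
```

In this toy run each corrector handles a different error type, which is the intuition behind combining heterogeneous models: no single stand-in fixes both errors, but the chain does.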
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Yeling Liang and Lin Li "Heterogeneous models ensemble for Chinese grammatical error correction", Proc. SPIE 12588, International Conference on Artificial Intelligence, Virtual Reality, and Visualization (AIVRV 2022), 125880F (1 March 2023); https://doi.org/10.1117/12.2667512
KEYWORDS
Error control coding
Data modeling
Education and training
Error analysis
Performance modeling
Transformers
Semantics
