A Trigram Statistical Language Model Algorithm for Chinese Word Segmentation
Jun Mao (1), Gang Cheng (1), Yanxiang He (1) and Zehuan Xing (2)
(1) Computer School, Wuhan University, Wuhan 430072, P. R. China
(2) Department of Linguistics, Central China Normal University, Wuhan 430079, P. R. China
Abstract
We address the problem of segmenting Chinese text into words and propose an algorithm based on a trigram statistical
language model. We discuss why a statistical language model is well suited to Chinese word segmentation and present
the segmentation algorithm. In particular, we address the search problem, which often causes a trigram model to
perform poorly. Finally, we discuss the identification of out-of-vocabulary (OOV) words and merge it into the
trigram-based method to improve segmentation accuracy.
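
To make the idea concrete, the following is a minimal sketch of trigram-scored segmentation. It is not the authors' algorithm, whose details are not given in the abstract. It assumes a tiny hand-segmented toy corpus, a simple linear interpolation of trigram, bigram and unigram estimates with made-up weights, and a crude OOV fallback in which any single character may stand as a word with a small floor probability. The names TrigramModel, segment, L3, L2, L1 and FLOOR are all hypothetical. The search is a Viterbi-style dynamic program that keeps one best path per (position, previous word, current word) state, which is one common way to keep trigram search tractable.

```python
import math
from collections import Counter

# Assumed interpolation weights and OOV floor (illustrative values only).
L3, L2, L1, FLOOR = 0.6, 0.3, 0.1, 1e-6


class TrigramModel:
    """Counts 1-, 2- and 3-grams over a corpus of already segmented sentences."""

    def __init__(self, corpus):
        self.counts = Counter()
        self.vocab = set()
        for sent in corpus:
            self.vocab.update(sent)
            padded = ["<s>", "<s>"] + list(sent) + ["</s>"]
            for n in (1, 2, 3):
                for i in range(len(padded) - n + 1):
                    self.counts[tuple(padded[i:i + n])] += 1
        # Total number of unigram tokens (padding symbols included).
        self.total = sum(c for k, c in self.counts.items() if len(k) == 1)

    def logp(self, w, u, v):
        """Interpolated log-probability of word w after the context (u, v)."""
        c = self.counts

        def ratio(num, den):
            return num / den if den else 0.0

        p = (L3 * ratio(c[(u, v, w)], c[(u, v)])
             + L2 * ratio(c[(v, w)], c[(v,)])
             + L1 * ratio(c[(w,)], self.total)
             + FLOOR)  # the floor keeps unseen (OOV) words from scoring zero
        return math.log(p)


def segment(model, text, max_len=4):
    """Return the highest-scoring segmentation of text under the model."""
    n = len(text)
    # best[pos] maps (previous word, current word) -> (score, back-pointer).
    best = [dict() for _ in range(n + 1)]
    best[0][("<s>", "<s>")] = (0.0, None)
    for i in range(n):
        for (u, v), (score, _) in best[i].items():
            for j in range(i + 1, min(n, i + max_len) + 1):
                w = text[i:j]
                # Multi-character candidates must be known words;
                # single characters are always allowed as an OOV fallback.
                if j - i > 1 and w not in model.vocab:
                    continue
                s = score + model.logp(w, u, v)
                key = (v, w)
                if key not in best[j] or s > best[j][key][0]:
                    best[j][key] = (s, (i, u, v))
    # Choose the final state, including the end-of-sentence transition.
    key = max(best[n], key=lambda k: best[n][k][0] + model.logp("</s>", *k))
    words, pos = [], n
    while pos > 0:
        words.append(key[1])
        i, u, v = best[pos][key][1]
        pos, key = i, (u, v)
    return words[::-1]


# Toy segmented corpus (invented for illustration).
corpus = [["研究", "生命", "起源"], ["研究生", "学习"], ["生命", "起源"]]
model = TrigramModel(corpus)
print(segment(model, "研究生命起源"))
```

In this toy setting the string 研究生命起源 is ambiguous between 研究 / 生命 / 起源 and 研究生 / 命 / 起源; the trigram context is what separates the two readings. A realistic system would need a much larger corpus, a principled smoothing method, and a proper OOV identification step in place of the single-character fallback.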