ArticleAuthors: Trieu, Hai-Long; Nguyen, Phuong-Thai; Nguyen, Le-Minh (2015)
The sentence alignment approach proposed by Moore (2002), M-Align, is an effective method that achieves relatively high performance by combining length-based and word-correspondence features. Despite its high precision, however, M-Align usually has low recall, especially when dealing with the sparse data problem. We propose an algorithm that retains the advantages of M-Align while overcoming this weakness of the baseline method by using a new feature in sentence alignment: word clustering. Experiments show that recall improves over the baseline method by up to 30%, while precision remains reasonable.