THUMT: An Open Source Toolkit for Neural Machine Translation

Introduction

THUMT is a data-driven machine translation system developed by the Natural Language Processing Group at Tsinghua University.

Machine translation is a natural language processing task that aims to translate between natural languages automatically using computers. Recent years have witnessed the rapid development of end-to-end neural machine translation, which has become the new mainstream method in practical MT systems.

Built on top of Theano, THUMT is an open-source toolkit for neural machine translation.

User Manual

This user manual describes how to install and use THUMT.

Documentation

This documentation provides detailed information about the functions in THUMT.

Downloads

Stable version

Link                Size  Description                                                              Date
THUMT-v1.01.tar.gz  530K  The package contains the source code of the system and example datasets  2017-07-02

Latest version

Please visit GitHub to obtain the latest version.

History

Version  Size  Updates        Date
v1.01    530K  Bug fix        2017-07-02
v1.0     529K  First version  2017-06-20

License

The source code is dual-licensed. The open-source license is BSD-3-Clause, which allows free use for research purposes. For commercial licensing, please email thumt17@gmail.com.

Citation

Please cite the following paper:
Jiacheng Zhang, Yanzhuo Ding, Shiqi Shen, Yong Cheng, Maosong Sun, Huanbo Luan, Yang Liu. 2017. THUMT: An Open Source Toolkit for Neural Machine Translation. arXiv:1706.06415.

Development Team

Project leaders: Maosong Sun, Yang Liu, Huanbo Luan

Project members: Jiacheng Zhang, Yanzhuo Ding, Shiqi Shen, Yong Cheng

Contact

If you have questions, suggestions, or bug reports, please email thumt17@gmail.com.

FAQ

Q: Does THUMT support the latest version of Theano?
A: Yes. THUMT also supports Theano 0.9.0, released on 2017-03-20. We have noticed a small problem with building the optimizer under this version. Fortunately, this error does not affect running THUMT, and we are working on a fix.



© 2017 Natural Language Processing and Computational Social Science Lab, Tsinghua University