ABOUT ME:


Dien Dinh
  • I am currently the Director of the Computational Linguistics Center at the University of Science, Vietnam National University of HCM City (website: www.clc.hcmus.edu.vn).

  • I am also an Associate Professor (since 2007) in the Knowledge Engineering Department, Faculty of Information Technology, University of Science, Vietnam National University of HCMC.

  • Languages: English (C1), Chinese (A2), Korean (A1)

  • My Resume is available HERE.

  • E-mail: ddien@fit.hcmus.edu.vn


Year
Education
2002-2005:

PhD in Comparative Linguistics (excellent rank) at the University of Social Sciences and Humanities, Vietnam National University of HCMC, with the dissertation “Building and Exploiting the English-Vietnamese Parallel Corpus”, supervised by Prof. Nguyen Duc Dan.

1997-2002:

PhD in Computer Science (excellent rank) at University of Sciences, Vietnam National University of HCMC with the dissertation titled “English-Vietnamese Machine Translation using Bitext Transfer Learning” supervised by Prof. Hoang Kiem and Prof. Eduard Hovy (USC, US).

1998-2001:

MA in Comparative Linguistics at University of Social Sciences and Humanities, Vietnam National University of HCMC.

1993-1996:

MSc in Computer Science at University of Natural Science, Vietnam National University of HCMC.

1988-1993:

Engineer in Electronics at Polytechnic University, Vietnam National University of HCMC.

1984-1988:

BSc in Physics at University of Natural Sciences, Vietnam National University of HCMC.

Year
Award
2015-2017:

Congratulatory Certificates from the Director of Vietnam National University, Ho Chi Minh City.

1999:

Merit Medal from the Vietnam Blind Association.

1998:

Named one of the Top-10 Exemplary Youths of Vietnam (field: Scientific Research) and awarded a Congratulatory Certificate from the Prime Minister of Vietnam.

Year
Project
2017:

Multilingual Electronic Dictionaries; teaching English to blind children.

2016:

Korean-Vietnamese Parallel Corpus, for SYSTRAN corporation.

2015:

OALD8 (Oxford Advanced Learner’s Dictionary, 8th edition) with Vietnamese translation, in cooperation with Oxford University Press, UK.

2014-2015:

English-Vietnamese/Korean-Vietnamese Parallel Corpus, Korean-Vietnamese Dictionaries, for Samsung Electronics, Korea.

2009-2013:

Vietnamese Annotated Corpora, English-Vietnamese/Chinese-Vietnamese Parallel Corpora, and Bilingual Dictionaries for I2R (Institute for Infocomm Research), Singapore.

1995-2015:

Pocket electronic dictionaries covering 16 languages (trademark KimTuDien), in cooperation with the GSL company in Hong Kong.

1998:

English-Vietnamese Dictionary (in Braille) for the blind.


I am interested in automatically building annotated parallel corpora and exploiting them for Machine Translation, Computer-Assisted Translation, Contrastive Linguistics, and teaching Vietnamese to foreigners. To achieve these goals, we have prepared the necessary language resources as follows:

  • Building (manually) large Vietnamese corpora (approx. 300k sentences, 7M words) with linguistic annotations, e.g.: word segmentation, POS, NER, etc.
  • Building (manually) large Vietnamese-related Parallel Corpora (approx. 200k - 2000k pairs of sentences, 8M-80M words) for English/Chinese/Korean-Vietnamese.
  • Building (manually) large Vietnamese-related bilingual dictionaries (approx. 50k – 150k entries), e.g.: English/French/Chinese/Japanese/Korean/German/Russian-to-Vietnamese and vice versa.
  • Building (semi-automatically) a large unannotated Vietnamese corpus (approx. 14M sentences, 440M morpho-syllables, 340M words)
  • Building (manually) monolingual Vietnamese dictionaries (approx. 40k entries) with fields: orthography, POS, grammar, meaning, examples, frequency, etc.
  • Translating (manually) OALD8 (Oxford Advanced Learner’s Dictionary, 8th ed.) and LLOCE (Longman Lexicon of Contemporary English) into Vietnamese.
  • Building (manually) Vietnamese-WordNet (manually translating the English WordNet 3.0 into Vietnamese and adding new Vietnamese concepts).
  • Building Vietnamese-NLP tools, e.g.: Word Segmenter (accuracy: 99%), POS-tagger (98%), NER (95%), etc.


In addition, we have been upgrading the above language resources by:

  • Collecting (automatically) Vietnamese-related Parallel Texts from available sources (e.g. TED, online news, e-books, etc.)
  • Translating (manually) popular annotated corpora into Vietnamese, e.g. PTB (Penn Treebank), SUSANNE, SEMCOR, BTEC, etc.

 

TEACHING:

Year
Teaching at the University
1997-now:

Teaching Natural Language Processing; supervising master's and PhD students at the University of Science, VNU-HCMC.

2006-now:

Teaching Lexicography, Corpus Linguistics, Computational Linguistics, and Vietnamese for foreigners; supervising master's and PhD students at the University of Social Sciences and Humanities, VNU-HCMC.

2015-now:

Teaching Contrastive Linguistics; supervising master's students at Sai Gon University (SGU) in HCM City.

SUPERVISING:

  1. Computer Science:
    1. Trần Thanh Phước, “Tích hợp tri thức từ vựng trong dịch máy Hoa - Việt” (Integrating the lexical knowledge into the Chinese-Vietnamese Machine Translation), PhD thesis, University of Science, VNU-HCM, 2018.
    2. Đỗ Đức Hào, “Gán nhãn ngôn ngữ cho song ngữ Anh-Việt theo tiếp cận học bán giám sát” (Linguistic Tagging for the English-Vietnamese Bilingual Corpus using the Semi-Supervised Learning Approach), MSc thesis, University of Science, VNU-HCM, 2017.
    3. Trần Văn Tri, “Dịch tự động WordNet từ tiếng Anh sang tiếng Việt dựa vào từ điển Oxford Anh-Việt” (WordNet automatic translation from English into Vietnamese using English-Vietnamese Oxford Dictionary), MSc thesis, University of Science, VNU-HCM, 2017.
    4. Dương Văn Đeo, “Chiếu nhãn ngữ pháp phụ thuộc từ tiếng Anh sang tiếng Việt trong ngữ liệu song ngữ Anh - Việt” (Projecting the Dependency Grammar Labels from English into Vietnamese in the English-Vietnamese Parallel Corpus), MSc thesis, University of Science, VNU-HCM, 2017.
    5. Võ Văn Trị, “Chiếu nhãn cây cú pháp từ tiếng Anh sang tiếng Việt trong ngữ liệu song ngữ Anh - Việt” (Projecting the Syntactic Trees from English into Vietnamese in the English-Vietnamese Parallel Corpus), MSc thesis (in progress), University of Science, VNU-HCM, 2017.
    6. Trần Trung Hiến, “Gán nhãn vai trò ngữ nghĩa cho ngữ liệu song ngữ Anh - Việt” (Semantic Role Labeling in the English-Vietnamese Parallel Corpus), MSc thesis (in progress), University of Science, VNU-HCM, 2017.
    7. Lê Phước Nghĩa, “Gán nhãn ngữ nghĩa cho từ tiếng Anh trong song ngữ Anh-Việt” (Semantic tagging for English words in the English-Vietnamese Parallel Corpus), MSc thesis (in progress), University of Science, VNU-HCM, 2017.
    8. Lê Tuấn Thu, “Chiếu nhãn đồng tham chiếu từ Anh sang Việt trong ngữ liệu song ngữ Anh-Việt” (Projecting the Co-Reference Labels from English into Vietnamese in the English-Vietnamese Parallel Corpus), MSc thesis (in progress), University of Science, VNU-HCM, 2017.
    9. Hoàng Khuê, “Công cụ gióng hàng từ trong ngữ liệu đa ngữ song song” (A visual tool for Word Alignment in the Multi-Lingual Parallel Corpus), MSc thesis (in progress), University of Science, VNU-HCM, 2017.
    10. Lương An Vinh, “Xây dựng Mô hình đo độ khó văn bản tiếng Việt” (Building a Model for Measuring the Vietnamese Text Readability), PhD thesis (in progress), University of Science, VNU-HCM, 2016.
    11. Lê Văn Hiếu, “Xây dựng công cụ hỗ trợ dịch thuật Anh - Việt dựa trên mã nguồn mở” (Building a Computer-Assisted-Translation tool for English-Vietnamese translation using open source), MSc thesis, University of Science, VNU-HCM, 2016.
    12. Huỳnh Quang Đức, “Gán nhãn ngữ nghĩa cho song ngữ Anh - Việt” (Semantic Tagging in the English-Vietnamese Parallel Corpora), MSc thesis, University of Science, VNU-HCM, 2015.
    13. Nguyễn Hồng Bửu Long, “Gán nhãn thực thể trong song ngữ Anh - Việt” (Named Entity Tagging in the English-Vietnamese Parallel Corpora), MSc thesis, University of Science, VNU-HCM, 2014.
  2. Linguistics:
    1. Phạm Thị Xuân Hân, “Phân tích hệ thuật ngữ tiếng Hoa chuyên ngành Công nghệ Thông tin (so sánh với tiếng Việt)” (Analyzing the Information Technology Terminologies in Chinese (in comparison with Vietnamese)), MA thesis, University of Social Sciences & Humanities, VNU-HCM, 2017.
    2. Trương Thị Hồng, “Áp dụng độ khó của văn bản trong việc xây dựng ngữ liệu giáo trình tiếng Việt cho người nước ngoài” (Applying the Text Readability in Building the Corpora of Vietnamese Textbooks for Foreigners), MA thesis, Sai Gon University, HCMC, Vietnam, 2017.
    3. Nguyễn Thị Như Điệp, “Các yếu tố ngôn ngữ ảnh hưởng đến độ khó của văn bản tiếng Việt” (Linguistic Factors affecting the Vietnamese Text Readability), PhD thesis (in progress), University of Social Sciences & Humanities, VNU-HCM, 2016.
    4. Trần Lê Tâm Linh, “Phân tích những lỗi dịch thuật của phần mềm Google Translate” (Analyzing the translation errors of Google Translate), PhD dissertation, University of Social Sciences & Humanities, VNU-HCM, VN, 2016.
    5. Phạm Thị Kim Uyên, “Xây dựng bộ luật văn phạm tiếng Việt theo ngôn ngữ hình thức” (Building a Vietnamese grammar ruleset in formal language), MA thesis, University of Social Sciences & Humanities, VNU-HCM, VN, 2015.
    6. Nguyễn Phạm Thiên Nhi, “Xây dựng bộ nhãn từ loại tiếng Việt trong ngành Ngôn ngữ học Máy tính” (Building a Vietnamese POS tagset in the Computational Linguistics), MA thesis, University of Social Sciences & Humanities, VNU-HCM, VN, 2010.
  3. TESOL (Teaching English to Speakers of Other Languages):
    1. Nguyễn Thị Thu, “Using the Web in teaching ESP Reading to the 2nd year English non-majored Students at the People’s Police University”, MA thesis, University of Social Sciences & Humanities, VNU-HCM, 2010.
    2. Đỗ Thị Mai Hương, “Applying FrameNet to Teaching the Usage of English verbs”, MA thesis, University of Social Sciences & Humanities, VNU-HCM, 2009.
    3. Nguyễn Thị Xuyên, “A Corpus-based Approach to Teaching ESP for Information Technology Students”, MA thesis, University of Social Sciences & Humanities, VNU-HCM, 2007.
Year
OTHER
2006:

Head of the Vietnamese NLP group participating in the Multilingual Text Mining for BioCaster project at NII (National Institute of Informatics), Tokyo.

1996:

Participated in the localization project “Windows 95 for Vietnamese” at Microsoft Corp. in Redmond, WA, USA.

No.
Paper
1

Phuoc Tran, Dien Dinh, Tan Lê, and Long H. B. Nguyen. “Linguistic-Relationships-Based Approach for Improving Word Alignment”, ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2017. DOI: 10.1145/3133323 (SCI-E).

2

Phuoc Tran, Dien Dinh, and Hien T. Nguyen. “Improving Word Alignment Based on Named Entity”, International Journal of Innovative Computing, Information and Control – ICIC Express Letters, Part B: Applications, Volume 8, Issue 7, July 2017. (Scopus)

3

Nguyen Le Thanh, Dien Dinh. “English-Vietnamese Cross-Language Paraphrase Identification Method”, 2017 8th International Symposium on Information and Communication Technology (SoICT 2017), Nha Trang, Vietnam, 2017. DOI: 10.1145/3155133.3155187.

4

An-Vinh Luong, Diep Nguyen, Dien Dinh. “Examining the Text-length Factor in Evaluating the Readability of Literary Texts in Vietnamese Textbooks”, 2017 9th International Conference on Knowledge and Systems Engineering (KSE), Hue, Vietnam, 2017, pp. 36-41. DOI: 10.1109/KSE.2017.8119431

5

Dien Dinh, 김위정, Diep N. “Exploiting the Korean – Vietnamese Parallel Corpus in teaching Vietnamese for Koreans”, Interdisciplinary Study on Language Communication in Multicultural Society, the Int’l Conference of ISEAS/BUFS, May 2017, pp.11-23.

6

Ngoc Tan Le, Long Nguyen, Damien Nouvel, Fatiha Sadat, and Dien Dinh. “Tandam: Named entity recognition in twitter messages in French”, in CAP, 2017.

7

Ngoc Tan Le, Long Nguyen, Alexsandro Fonseca, Fatma Mallek, Billal Belainine, Fatiha Sadat, and Dien Dinh. “Reconnaissance des entités nommées dans les messages twitter en français”, in CAP, 2017.

8

Long H. B. Nguyen, Dien Dinh, and Phuoc Tran, “An Approach to Construct a Named Entity Annotated English-Vietnamese Bilingual Corpus”, ACM Trans. Asian Low-Resour. Lang. Inf. Process. 16, 2, Article 9 (October 2016), 17 pages. DOI: https://doi.org/10.1145/2990191

9

Phuoc Tran, Dien Dinh, and Long H. B. Nguyen, “Word Re-Segmentation in Chinese-Vietnamese Machine Translation”, ACM Trans. Asian Low-Resour. Lang. Inf. Process. 16, 2, Article 12 (November 2016), 22 pages. DOI: https://doi.org/10.1145/2988237

10

Phuoc Tran, Dien Dinh, and Hien Nguyen. “A Character-Level-Based and Word-Level-Based Approach for Chinese-Vietnamese Machine Translation”, Computational Intelligence and Neuroscience, Volume 2016 (2016), Article ID 9821608, DOI: 10.1155/2016/9821608 (SCI-E)

11

Nguyen Le Thanh, Toan Nguyen Xuan and Dien Dinh, “Vietnamese plagiarism detection method”, The Seventh International Symposium on Information and Communication Technology (SoICT ’16), December 08-09, 2016, Hochiminh City, Vietnam, © 2016 ACM. ISBN 978-1-4503-4815-7/16/12

12

NTH.Nhung, L.Q.Vinh, N.Q.Minh, D.Dien (2015). “A general approach for word reordering in English-Vietnamese-English Statistical Machine Translation”, Int’l Journal on Artificial Intelligence Tools, ISSN: 0218-2130, DOI: 10.1142/S0218213015500244. (SCI)

13

T.Phuoc, D.Dien (2014), “A novel approach for handling unknown word problem in Chinese – Vietnamese machine translation”, International Journal of Computational Linguistics and Chinese Language Processing (IJCLCLP), Vol.19, No.1, March 2014, pp. 1-10, ISSN: 1027-376X.

14

Ngoc Tan Le, Ngoc Tien Le, Dien Dinh (2013), “An Approach of Chunk Alignment for French – Vietnamese Bilingual Corpora”, IJCSI International Journal of Computer Science Issues, Vol. 10, Issue 2, No 3, March 2013, pp. 111-117.

15

Phuoc Tran, Dien Dinh, Tan Le, Thao Nguyen (2013), “Handling Organization Name Unknown Word in Chinese – Vietnamese Machine Translation”, presented at Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2013 IEEE RIVF International Conference on, pp. 242-247, 10-13 Nov. 2013.

16

Quoc Hung Ngo, Dinh Dien, Winiwarter, W., “A Hybrid Method for Word Segmentation with English – Vietnamese Bilingual Text”, presented at Control, Automation and Information Sciences (ICCAIS), 2013 International Conference on, pp. 48-52, 25-28 Nov. 2013.

17

Long M. Truong, Tru H. Cao, Dien Dinh (2013), “Towards Vietnamese Entity Disambiguation”, in Proc. the Fifth International Conference KSE 2013, Volume 2, Part II, pp. 285-294.

18

Phuoc T., Linh T., Dien D. (2013), “Resolving Named Entity Unknown Word in Chinese – Vietnamese Machine Translation”, presented at The Fifth International Conference on Knowledge and Systems Engineering (KSE 2013, 17-19 October, Ha Noi, Vietnam, Volume 2), pp. 273-284

19

Giang Thanh Nguyen, Dien Dinh (2012), “Improving English – Vietnamese Word Alignment Using Translation Model”, presented at Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on, 2012, pp. 1-4

20

Quy Nguyen, An Nguyen, Dien Dinh, “An Approach to Word Sense Disambiguation in English – Vietnamese-English Statistical Machine Translation”, presented at Computing and Communication Technologies, Research, Innovation, and Vision for the Future (RIVF), 2012 IEEE RIVF International Conference on, pp.1-4, Feb. 27 2012-March. 1 2012.

21

Hien V. M., Dien D., Nhung N.T.H. (2012), “A Fast Decoder Using Less Memory”, in Proc. 4th International Conference on Knowledge and Systems Engineering (KSE 2012), pp. 173-180, August 17-18, 2012, Danang.

22

An N., Dien D. (2012), “A Vietnamese Part-of-speech Tagging based on Maximum Entropy Bidirectional Dependency Network Model”, in Proc. VLSP-workshop, RIVF’ 12 – The 9th IEEE International Conference in Computer Science: Research, Innovation and Vision of the Future 2012, pp. 28-33, HCMC, Vietnam, Feb. 2012.

23

Phuoc T.T., Dien D. (2012), “Identifying and Reordering Preposition in Chinese – Vietnamese Machine Translation”, in Proc. VLSP-workshop, RIVF’ 12 – The 9th IEEE International Conference in Computer Science: Research, Innovation and Vision of the Future 2012, pp. 41-46, HCMC, Vietnam, Feb. 2012.

24

Phuoc Tran, Dien Dinh, “Identifying and reordering prepositions in Chinese – Vietnamese machine translation”, First International Workshop on Vietnamese Language and Speech Processing (VLSP), in conjunction with the 9th IEEE-RIVF Conference on Computing and Communication Technologies (RIVF 2012), pp. 41-46, 2012.

25

Nghiem Quoc Minh, Dinh Dien and Nguyen Thi Ngoc Mai (2008), “Improving Vietnamese POS-Tagging by Integrating a Rich Feature Set and Support Vector Machines”, in Proc. RIVF’ 08 – The 6th IEEE International Conference in Computer Science: Research, Innovation and Vision of the Future 2008, HCMC, Vietnam, Jul. 2008.

26

Hoang Cong Duy Vu, Ngo Thi Kim Mai and Dinh Dien (2008), “A Dependency-based Word Reordering Approach for Statistical Machine Translation”, in Proc. RIVF’ 08 – The 6th IEEE International Conference in Computer Science: Research, Innovation and Vision of the Future 2008, HCMC, Vietnam, Jul. 2008.

27

Tran Quoc Tri, Pham Thi Xuan Thao, Ngo Quoc Hung, Dinh Dien, Nigel Collier (2007), “Named Entity Recognition in Vietnamese Documents”, Journal of “Progress in Informatics”, NII (National Institute of Informatics), Tokyo, Japan, Vol. 2007, No.4, pp.1-9

28

Hoang Cong Duy Vu, Nguyen Le Nguyen, Ngo Quoc Hung, Dinh Dien (2007), “A comparative study for Vietnamese text classification methods”, in Proc. RIVF’ 07 – The 5th IEEE International Conference in Computer Science: Research, Innovation and Vision of the Future 2007, HCMC, Vietnam, Mar. 2007.

29

Nigel Collier, Ai Kawazoe, Mika Shigematsu, Kiyosu Taniguchi, Lihua Jin, John McCrae, Dinh Dien, Quoc Hung, Koichi Takeuchi, Asanee Kawtrakul (2007), “Ontology-driven influenza surveillance from Web rumours”, in Proc. the 2007 Options for the Control of Influenza VI (Options), pp. 225-226, Toronto, Ontario, Canada.

30

Nigel Collier, Ai Kawazoe, Lihua Jin, Mika Shigematsu, Dinh Dien, Roberto Barrero, Koichi Takeuchi, Asanee Kawtrakul (2007), “A multilingual ontology for infectious disease outbreak surveillance: rationale, design and challenges”, Journal of Language Resources and Evaluation, Springer Netherlands, Volume 40, Issue 3, pp 405-413, 2006. (SCI)

31

Thao Pham T. X., Ai Kawazoe, Dinh Dien and Nigel Collier (2007), “Construction of a Vietnamese corpora for named entity recognition”, in Proc. Recherche d’Information Assistee par Ordinateur (RIAO 2007), Carnegie Mellon University, Pittsburgh, PA, USA, 2007, pp 719-724.

32

Hoang Cong Duy Vu, Nguyen Le Nguyen, Dinh Dien and Nigel Collier (2007), “Topic-based Vietnamese News Document Filtering in the BioCaster Project”, in Proc. The 6th International Conference on Advanced Language Processing and Web Information Technology, Luoyang, China, Aug. 2007.

33

Pham Thi Xuan Thao, Tran Quoc Tri, Dinh Dien, Nigel Collier (2007), “Named entity recognition in Vietnamese using classifier voting”, ACM Transactions on Asian Language Information Processing, vol.6, no.4, Article No.1-18. (SCI)

34

Dinh Dien, Vu Thuy (2006), “A maximum entropy approach for Vietnamese word segmentation”, in Proc. 4th IEEE International Conference on Computer Science – Research, Innovation and Vision of the Future 2006 (RIVF’06). Ho Chi Minh City, Vietnam, Feb 12-16, 2006, pp 247 – 252.

35

Dinh Dien, Hoang Kiem (2005), “State of the Art of Machine Translation in Vietnam”, AAMT Journal – Special issue – MT Summit X, Oct.2005, report IV, pp. 14-15.

36

Dinh Dien (2005), “A Knowledge-based approach for English – Vietnamese Machine Translation”, in Proc. PAN-ASIATIC Linguistics (the 5th International Symposium on Languages and Linguistics), HCMC, Vietnam, vol.2, pp. 118 – 125.

37

Dien Dinh (2005), “Building an Annotated English – Vietnamese parallel Corpus”, MKS: A Journal of Southeast Asian Linguistics and Languages, Vol.35, pp. 21-36.

38

Dien Dinh, Kiem Hoang (2003), “POS-Tagger for English – Vietnamese Bilingual Corpus”, in Proc. HLT-NAACL Workshop: Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, Edmonton, Canada, 6/2003, pp. 88-95.

39

Dien Dinh, Kiem Hoang, Eduard Hovy (2003), “BTL: a Hybrid Model in the English – Vietnamese Machine Translation System”, in Proc. the MT Summit IX, Louisiana, USA, 2003, pp. 87-94.

40

Dien Dinh, Thuy Ngan, Xuan Quang, Chi Nam (2003), “A hybrid approach to word-order transfer in the English – Vietnamese Machine Translation System”, in Proc. the MT Summit IX, Louisiana, USA, 2003, pp. 79-86.

41

Dien Dinh and Kiem Hoang (2002), “Bilingual corpus and word sense disambiguation in the English-to-Vietnamese Machine Translation”, Proceedings of the 1st APIS, Bangkok, Thailand, pp.8-15.

42

Dien Dinh (2002), “Building a training corpus for word sense disambiguation in the English-to-Vietnamese Machine Translation”, in Proc. Workshop on Machine Translation in Asia, COLING-02, Taiwan, 9/2002, pp. 26-32.

43

Dien Dinh (2002), “Cognitive Linguistics Approach to Vietnamese Noun Compounds”, MKS: A Journal of Southeast Asian Linguistics and Languages, Vol.32, pp. 145-161.

44

Dien Dinh, Kiem Hoang (2001), “An approach to parsing Vietnamese Noun Compound”, in Proc. IWPT’01 (The 7th International Workshop on Parsing Technologies), 10/2001, Beijing, China, pp. 213-216.

45

Dien Dinh, Kiem Hoang, Toan Nguyen Van (2001), “Vietnamese Word Segmentation”, in Proc. NLPRS’01 (The 6th International Conference on Natural Language Processing Pacific Rim Symposium), Tokyo, Japan, 11/2001, pp. 749-756.