Corpora

  1. CLC_EVC – English-Vietnamese bilingual corpus.

    Download Samples

  2. CLC_FVC – French-Vietnamese bilingual corpus.

    Download Samples

  3. CLC_KVC – Korean-Vietnamese bilingual corpus.

    Download Samples

  4. CLC_LVC – Lao-Vietnamese bilingual corpus.

    Download Samples

  5. CLC_VCC – Vietnamese-Chinese bilingual corpus.

    Download Samples

  6. CLC_VTB – Vietnamese treebank corpus.

    Download Samples

  7. CLC_BTEC – Basic Travel Expression Corpus.

    A multilingual speech corpus of tourism-related sentences, similar to those typically found in phrasebooks for tourists traveling abroad.

    Download Samples

  8. Specification documents

    Vietnamese word segmentation
    Vietnamese POS Tagset
    Vietnamese NER Tagset