China builds Mongolian language corpus

2016-01-22 13:03 | Xinhua | Editor: Wang Fan

A Mongolian language database containing 80 million words has been launched, after ten years of collection and research, the Inner Mongolia Academy of Social Sciences said.

The Mongolian corpus is part of a 200-million-word set of corpora covering languages used by ethnic minorities in northern and northeastern China, including the Daur, Ewenk and Oroqen languages. The project is slated for completion in 20 years.

The compilers identified 97 locations across eight Chinese provincial regions that have a Mongolian population, as well as five provinces and cities in Mongolia and in the Buryat Republic and the Republic of Kalmykia in Russia. They collected 4,192 hours of oral data from 6,725 Mongolian speakers as well as over 4,000 hours of written data.

The corpora project aims to help protect disappearing ethnic languages and will be a precious linguistic resource, according to the academy.

The project has two stages. The first stage, the Mongolian corpus, is finished and the second stage, the database for the other three languages, is under way.
