Mentamizh Word Processor
Tamil Word processor with all utilities.
To develop R&D tools for text processing (corpus cleaner, normalizer, parsers, and linguistic annotators), speech processing (speech synthesizer), and image processing, based on natural language understanding (NLU) and generation (NLG), computational linguistics, and language and speech technology.
To develop language technology applications ranging from corpus analyzers to human-machine interaction.
To break the barriers between digitally savvy and digitally deprived people (the "digital divide") in India.
To develop adequate computer processing for Indian languages, particularly Tamil, so that they obtain their due place in the global information society.
18 types of Unicode Keyboard Drivers.
MS Office add-ins: plugins for Microsoft Word, PowerPoint, Excel, and Publisher.
ASCII-to-Unicode conversion for 16 fonts.
Spell checker for written Tamil, for accurate language correction.
Automatic sandhi checker for written Tamil.
An automatic hyphenation tool based on linguistic principles.
Word-class tagger that creates annotated text.
Experience our Exquisite E-Dictionary.
Robust corpus enriched with extensive samples.
From Phonology to Semantics, tools for language analysis.
Reach us for cleaned, transliterated, or annotated data.
To represent any natural language on a computer, the key component is the keyboard driver. The number of characters, or graphemes, varies from one language to another.
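To illustrate why graphemes matter for Tamil input, here is a minimal Python sketch that compares Unicode code points with user-perceived characters. The function name split_graphemes is hypothetical, and the rule it uses (attach every combining mark to the preceding base letter) is a simplifying assumption; it is not the full Unicode grapheme cluster algorithm and it does not treat conjuncts as single units.

```python
import unicodedata


def split_graphemes(text):
    """Group each base character with the combining marks that follow it.

    Simplified rule (an assumption for illustration): any character whose
    Unicode category is Mn (nonspacing mark) or Mc (spacing mark) is attached
    to the previous cluster. Tamil vowel signs and the pulli fall in these
    categories.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in ("Mn", "Mc"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# The word "Tamil" in Tamil script: tha, ma, vowel sign i, zha, pulli
word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"

print(len(word))                   # 5 code points
print(len(split_graphemes(word)))  # 3 graphemes
```

A keyboard driver has to produce these code point sequences from keystrokes, which is one reason the layout design differs from language to language.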
Cross-linguistic NLP tools, such as morphological, syntactic, and semantic analyzers for natural languages, as well as corpus development and corpus analysis tools, will be made available.
Production of linguistically motivated computational models of language understanding.
Development of e-learning materials to teach Tamil and other Indian languages; development of e-dictionaries, e-grammars, and other necessary computer-oriented materials for e-language teaching.
An application software program to create, edit, save, and print documents; it includes a spell checker, grammar checker, dictionaries, etc. for a language.
Tools necessary for generating natural language text from the content that a speaker or writer wants to express in a natural language.
NLP helps computers understand human natural language.
The corpus will be a collection of text and speech samples, which should be processed further.
For both rule-based and AI-based NLP tools for natural languages, a methodologically developed corpus is essential.
For the development of cross-linguistic analysis, the texts of languages should be transliterated.
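As a rough illustration of what transliteration involves for Tamil, the following Python sketch romanizes a handful of letters. The tables are a tiny illustrative subset loosely following ISO 15919 conventions, and the function name transliterate is hypothetical; this is not the scheme or the tool used by the product.

```python
# Tiny illustrative tables (assumption: a small subset, not a complete scheme).
CONSONANTS = {"\u0b95": "k", "\u0ba4": "t", "\u0ba8": "n",
              "\u0bae": "m", "\u0bb4": "\u1e3b"}        # ka, tha, na, ma, zha
VOWEL_SIGNS = {"\u0bbe": "\u0101", "\u0bbf": "i", "\u0bc1": "u"}  # aa, i, u
INDEPENDENT_VOWELS = {"\u0b85": "a", "\u0b86": "\u0101", "\u0b87": "i"}
PULLI = "\u0bcd"  # removes the inherent vowel "a" from a consonant


def transliterate(text):
    """Romanize Tamil text; characters outside the tables pass through."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            if nxt == PULLI:                 # bare consonant
                out.append(CONSONANTS[ch])
                i += 2
            elif nxt in VOWEL_SIGNS:         # consonant + explicit vowel
                out.append(CONSONANTS[ch] + VOWEL_SIGNS[nxt])
                i += 2
            else:                            # consonant + inherent "a"
                out.append(CONSONANTS[ch] + "a")
                i += 1
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(transliterate("\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"))  # tamiḻ
```

The key design point is that a Tamil consonant carries an inherent vowel, so the converter must look one character ahead before deciding what to emit.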
The corpus may be available in two formats: plain text and linguistically annotated text.
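The difference between the two formats can be sketched in Python as follows. The word/TAG layout, the tag labels, and the function names to_annotated and to_plain are assumptions chosen for illustration; the actual annotation format of the corpus is not specified here.

```python
def to_annotated(tokens):
    """Serialize (word, tag) pairs as 'word/TAG' items separated by spaces."""
    return " ".join(f"{word}/{tag}" for word, tag in tokens)


def to_plain(annotated):
    """Strip the tags again, recovering the plain text."""
    return " ".join(item.rsplit("/", 1)[0] for item in annotated.split())


# A three-word Tamil sentence ("I read a book") with illustrative tags.
tokens = [("நான்", "PRON"), ("புத்தகம்", "NOUN"), ("படித்தேன்", "VERB")]

annotated = to_annotated(tokens)
plain = to_plain(annotated)

print(annotated)
print(plain)
```

Plain text is enough for statistical training, while the annotated form carries the linguistic labels that rule-based tools and evaluation rely on.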
It is a strong assumption of linguistics that all natural languages are highly structured and that this structure can be captured by proper rules. With these tools for analysis, NLP may proceed.
Alternatively, instead of adding linguistic knowledge to the corpus, if the corpus is large enough, the computer can learn the linguistic features of a language on its own. Such language models, known as large language models (LLMs), can be built using probability and deep learning techniques. If necessary, an LLM may be fine-tuned for a specific purpose.
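The probabilistic idea behind language modelling can be shown with a toy count-based bigram model in Python. This is only a sketch of the probability component: it is not a large language model and uses no deep learning. The function names and the two-sentence sample corpus are hypothetical.

```python
from collections import Counter


def train_bigram(sentences):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))   # adjacent pairs
    return contexts, bigrams


def bigram_prob(contexts, bigrams, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    if contexts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / contexts[prev]


# Hypothetical two-sentence corpus, for illustration only.
corpus = ["the corpus is large", "the corpus is annotated"]
contexts, bigrams = train_bigram(corpus)

print(bigram_prob(contexts, bigrams, "the", "corpus"))  # 1.0
print(bigram_prob(contexts, bigrams, "is", "large"))    # 0.5
```

With more text, such estimates become more reliable, which is the intuition behind the claim that a sufficiently large corpus lets the computer pick up linguistic regularities without hand-written rules.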