CantoInput

About

What is it?

CantoInput is a freely available, Unicode-based Chinese input method (IME) that lets you type both traditional and simplified characters using Cantonese romanization. Both the Yale and Jyutping romanization systems are supported. A Mandarin Pinyin mode is also available.

Why does the world need another Chinese input method?

While there already exist excellent phonetic input methods based on Mandarin Pinyin pronunciation, there is a general lack of support for Cantonese. As a Cantonese learner, I was frustrated by the difficulty of typing Chinese, especially Cantonese-specific colloquial characters. Most existing Cantonese input methods require a Chinese version of Windows and operate using non-Unicode encodings such as BIG5 or GB, while non-phonetic methods such as Cangjie have a very steep learning curve. I originally wrote this program in 2006 for my own personal use but decided to make it freely available since I felt that other Cantonese speakers and learners might also find it useful. It’s still pretty basic, but hopefully I’ll have time to add more features in the future.

System Requirements and Installation

CantoInput will run on any operating system with a relatively recent Java runtime environment. I have tested it on Windows and Linux, and have been told that it works on the Mac as well. If you don’t already have Java installed, you can download it for free from: http://java.com

After extracting the ZIP file to a directory of your choice, you’ll see an executable JAR file, CantoInput.jar. On Windows, you should be able to start it by double-clicking the file. I keep a shortcut to it on my desktop for easy access. It can also be started from a command prompt in that directory with the following command:

java -jar CantoInput.jar

Usage

First, make sure you have a Chinese font installed. You may need to experiment with several fonts to find one that looks reasonable and contains glyphs for all the characters you need. To change the font used by CantoInput, choose “Select Font” from the Settings menu. On Windows, I recommend either SimSun or MingLiU, both of which should be available after you enable East Asian language support through the Control Panel / Regional and Language Options.

Usage should be fairly intuitive if you have used other input methods before. After selecting your preferred input method and whether you want traditional or simplified characters from the Settings menu, simply start typing the pronunciation of a character. For example, type “ngo” to enter the character for “I” in Cantonese. You’ll see the letters “ngo” in the lower left corner, with a list of possible character choices to the right. You can select one of these characters by typing the corresponding number. To select the first character in the list, press Space.

You’ll also notice a page index in the box in the lower right. This tells you that there are more possible choices available. To see these, press one of the following keys: “]”, “=”, “.”, “PgDn”, the Down arrow key, or the Right arrow key. To move back to the previous list of characters, press one of the following keys: “[”, “-”, “,”, “PgUp”, the Up arrow key, or the Left arrow key.

To enter Chinese text in another program, simply copy and paste from the CantoInput window.

CantoInput also supports a large number of compound (i.e., multi-character) words. For example, to input the Cantonese word for “we”, type “ngodei”.
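
To make the lookup behaviour described above more concrete, here is a minimal Java sketch of a romanization table with paged candidates. It is not CantoInput’s actual source code. The class name CandidateLookupSketch, its method names, the page size, and the tiny sample table (including the order of the candidates) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; NOT taken from CantoInput's source. */
public class CandidateLookupSketch {
    // Hypothetical sample data: romanization -> candidates, most frequent first.
    private final Map<String, List<String>> table = new HashMap<>();
    private final int pageSize; // assumed value; the real page size is not documented

    private String typed = "";
    private int page = 0;

    public CandidateLookupSketch(int pageSize) {
        this.pageSize = pageSize;
        table.put("ngo", Arrays.asList("我", "餓", "鵝", "臥", "俄"));
        table.put("ngodei", Arrays.asList("我哋")); // compound word: "we"
    }

    /** Replaces the typed romanization and returns to the first page. */
    public void type(String romanization) {
        typed = romanization;
        page = 0;
    }

    private List<String> allCandidates() {
        List<String> found = table.get(typed);
        return found != null ? found : Collections.<String>emptyList();
    }

    /** Number of pages needed for the current candidates (0 if there are none). */
    public int pageCount() {
        return (allCandidates().size() + pageSize - 1) / pageSize;
    }

    /** Candidates visible on the current page. */
    public List<String> currentPage() {
        List<String> all = allCandidates();
        int from = Math.min(page * pageSize, all.size());
        int to = Math.min(from + pageSize, all.size());
        return new ArrayList<>(all.subList(from, to));
    }

    /** Moves forward one page, staying put on the last page. */
    public void nextPage() {
        if (page + 1 < pageCount()) {
            page++;
        }
    }

    /** Moves back one page, staying put on the first page. */
    public void previousPage() {
        if (page > 0) {
            page--;
        }
    }

    /** Picks a candidate by its 1-based number on the current page, or null if out of range. */
    public String select(int number) {
        List<String> visible = currentPage();
        return (number >= 1 && number <= visible.size()) ? visible.get(number - 1) : null;
    }

    /** Equivalent of pressing Space: picks the first visible candidate. */
    public String selectFirst() {
        return select(1);
    }

    public static void main(String[] args) {
        CandidateLookupSketch ime = new CandidateLookupSketch(3);
        ime.type("ngo");
        System.out.println(ime.currentPage() + " page 1 of " + ime.pageCount());
        ime.nextPage();
        System.out.println(ime.currentPage());
        ime.type("ngodei");
        System.out.println(ime.selectFirst());
    }
}
```

In this sketch a compound word is just another key in the same table, so “ngodei” is looked up as a whole rather than split into syllables. Whether CantoInput stores its data this way is not stated here. Save the file as UTF-8 so that the Chinese string literals compile correctly.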

To toggle between Chinese and English mode, press “Ctrl-Enter”. You can then type directly into the text area.

Credits

A quick mention of the sources I used for determining Chinese character frequency and romanization while compiling the data files:

Romanization is based on data from Aaron Chan’s excellent HanConv utility: http://www.icycloud.tk

Chinese character frequency is based on 1993-1994 Usenet statistics compiled by Shih-Kun Huang at the National Chiao Tung University in Taiwan. The list is available at Chih-Hao Tsai’s site: http://technology.chtsai.org/charfreq/

Compound word data is derived from CEDICT, Copyright (C) 1997, 1998 Paul Andrew Denisowski: http://www.mandarintools.com/cedict.html

I’ve also made numerous manual tweaks to these data sets to better accommodate Cantonese colloquial characters and usage.

Disclaimer

This program may be freely distributed, and is provided ‘as is’ without warranty of any kind.

Written by jburket

February 28, 2009 at 9:45 pm
