The NIE Corpus of Spoken Singapore English

David Deterding and Low Ee Ling


In recent years, much research in phonetics, grammar and lexicography has used computer-based corpora of speech. One advantage of using such corpora is that various researchers can access the same data and therefore make direct comparisons of their findings. Furthermore, it eliminates the need for each researcher to collect and painstakingly transcribe a completely new set of data. Typical corpora that have been used in the analysis of English include the International Corpus of English (ICE) (Greenbaum 1996) and the Machine-Readable Spoken English Corpus (MARSEC) (Roach et al. 1993).

Data from Singapore forms one component of the ICE corpus, known as ICE-SIN, and some useful findings have emerged from this project (e.g. Ni & Ler 2000, Ooi 2001). Although ICE-SIN does include a substantial amount of spoken data such as conversations, phone calls, business transactions, demonstrations, and unscripted and scripted speech (Ni & Ler 2000: 170), there is a need to collect more spoken data, and this has resulted in the Grammar of Singapore English Corpus (GSEC) funded by the National University of Singapore (NUS). However, as these recordings were made in natural conditions, they contain a great deal of extraneous noise and many unintelligible or untranscribable segments (Zhu 1999). Such recording conditions certainly encourage natural speech, which is valuable for the analysis of patterns of grammar and word usage, but they also mean that detailed acoustic/phonetic analysis is often difficult or even impossible.

The main goals of the NIE Corpus of Spoken Singapore English (NIECSSE) are:

Two types of data have been recorded: interviews, where the subjects discuss past vacations and future plans; and read speech, principally subjects reading the North Wind and the Sun text, a standard passage widely used for phonetic research (IPA 2000).

Recording Conditions

All recordings were made in the Phonetics Laboratory at NIE. The laboratory is quiet but not soundproofed. In all cases, a high-quality Shure SM48 dynamic microphone was positioned just a few inches from the lips of the subjects. (For the interviews, the interviewer's voice is much quieter: it is comprehensible and has been fully transcribed, but the focus of the data is intended to be the subjects, not the interviewer.)

All recordings were made directly onto a computer using CSL software from KAY. The sampling rate was 22050 Hz, to ensure a high-quality recording. This is exactly half the most common sampling rate for music on compact discs (44100 Hz), but it is rather higher than that generally adopted for recorded speech (Hayward 2000: 68). It allows an analysis up to the Nyquist frequency of 11025 Hz, which is more than adequate for the complete description of speech.
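The relationship between the sampling rate and the Nyquist frequency can be checked with a few lines of Python. This is only an illustrative sketch: the function and constant names are invented here and are not part of the corpus or of the CSL software.

```python
# Illustrative names only; none of these come from the corpus tools.
CD_RATE_HZ = 44100          # common sampling rate for music on compact disc
SAMPLING_RATE_HZ = 22050    # sampling rate reported for the recordings


def nyquist_frequency(sampling_rate_hz):
    """Return the highest frequency that can be represented without aliasing."""
    return sampling_rate_hz / 2


print(nyquist_frequency(SAMPLING_RATE_HZ))   # 11025.0
print(SAMPLING_RATE_HZ * 2 == CD_RATE_HZ)    # True
```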

The recordings were saved using the standard .WAV format to ensure that the data can easily be used by other researchers without the need to download any special software.
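As a rough illustration of why the .WAV format is convenient, the sketch below reads the header of a WAV file using only the Python standard library. Because the corpus files are not reproduced here, it first writes a one-second test tone to a temporary file. The file name, the 16-bit mono format and the helper names are all assumptions made for the example, not details taken from the corpus.

```python
import math
import os
import struct
import tempfile
import wave


def write_test_tone(path, rate=22050, seconds=1.0, freq_hz=440.0):
    """Write a mono 16-bit sine tone (an assumed format) for demonstration."""
    n_samples = int(rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n_samples)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # mono (assumption)
        wav.setsampwidth(2)      # 2 bytes = 16 bits per sample (assumption)
        wav.setframerate(rate)
        wav.writeframes(frames)


def describe_wav(path):
    """Return the basic header details of a WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sampling_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }


# "example.wav" is a hypothetical file name used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "example.wav")
write_test_tone(path)
info = describe_wav(path)
print(info)
```

To inspect a downloaded recording instead, describe_wav can be called with the path to that file.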

All interviews lasted for five minutes. At the end of each recording, the subject was asked if any of the material should be deleted, and a few subjects did ask for some of what they had said to be removed. A few of the interviews are therefore shorter than five minutes.


The subjects are educated Singaporeans, most of them trainee teachers at NIE. All of them speak English well, and for many of them, English is their best language, although nearly all speak at least one other language fluently. Brief biographical details, including age, ethnic group and languages spoken, are included with the data of each speaker.

The interviewer is a British lecturer at NIE, and had taught most of the speakers, in some cases for a number of courses, before the interviews. Because they were talking to their lecturer, and also because they were acutely aware that they were being recorded, the speakers will have been using a style of speech among the most formal of their repertoire (Pakir 1991). On the diglossic model of Gupta (1992), the students will have been using their H(igh) variety and not the colloquial variety ("Singlish") that is more likely to be found in informal conversations between friends.

Similarly, for the read texts, the subjects will have been using their most formal style. Although there is a question about the naturalness of read speech, there are substantial advantages in having prepared texts that allow the comparison of the same material from different speakers. The intention is to expand this section of the data, with recordings of data specially designed to analyse specific features of pronunciation (such as vowel length, voicing of fricatives, consonant cluster simplification, and use of dental fricatives).

Some recordings of young British English speakers are also included. Although this is not the main focus of the corpus, it may be useful to provide a comparison: if a feature of speech is claimed to be special to Singapore, it may be valuable to check whether this feature also occurs with British English speakers. These speakers were recorded under exactly the same conditions as the Singaporean speakers.


The NIECSSE is available on-line at:

The interviews have been cut into segments of between 20 seconds and one minute in length, and the transcription (in HTML format) is kept along with the speech file. The size of each speech file is up to 2 MB. With a broadband Internet link, these files can be downloaded in a few seconds. With a slower link, they may take rather longer to download.

Each of the recordings of the North Wind and the Sun passage is also about 2 MB in size.
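The size of an uncompressed speech file can be estimated from the sampling rate and the duration. The sketch below assumes 16-bit mono PCM audio and ignores the small WAV header. The bit depth and number of channels are not stated above, so the figures are indicative only, and the function name is invented for the example.

```python
SAMPLING_RATE_HZ = 22050   # sampling rate reported for the recordings


def pcm_size_bytes(seconds, rate_hz=SAMPLING_RATE_HZ, bytes_per_sample=2, channels=1):
    """Estimate the size of uncompressed PCM audio, excluding the file header.

    bytes_per_sample=2 (16-bit) and channels=1 (mono) are assumptions.
    """
    return int(rate_hz * bytes_per_sample * channels * seconds)


for seconds in (20, 45):
    size = pcm_size_bytes(seconds)
    print(f"{seconds} s: {size} bytes ({size / 1_000_000:.2f} MB)")
```

Under these assumptions, a 20-second segment comes to 882000 bytes (a little under 1 MB) and a 45-second segment to 1984500 bytes (roughly 2 MB).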

Fellow researchers are welcome to use these recordings for research purposes, with suitable acknowledgements.


Work on this corpus is supported by a grant from NIE Research Project RP 11/99 LEL: An Acoustic Analysis of Singapore English with special reference to its pedagogical applications.

(SAAL Quarterly, Nov 2001, pp.2-5)