The challenge that language variation poses to language description – a LIP recap

The July edition of LIP was led by David Gil from the Max Planck Institute for the Science of Human History, Jena. The night was well attended with representatives from all the usual suspects: University of Melbourne, Monash, and La Trobe. Attendees this month also came from the University of New England, ANU, as well as from SOAS, London and NTNU, Norway. The evening’s discussion centred on issues related to Malay and Indonesian languages and varieties, but also included discussion of language documentation and description in general.

We began the evening by discussing the linguistic situation in the Riau Islands in Indonesia. After moving to Singapore over 20 years ago, David found that, when travelling in the neighbouring islands, the main variety of Indonesian spoken there ‘was not behaving how grammars said it should’. Working with the Riau Indonesian variety involved not only documenting and describing the grammar, but also justifying and delimiting the language in terms of who speaks it, where, when and to whom. Working with a previously unrecognized variety turned out to involve considerable sociolinguistic work.

The linguistic situation in Indonesia is different to that of the Anglophone world. Varieties of Indonesian are not necessarily mutually intelligible standard varieties. Instead, speakers are often bi- or multidialectal in Indonesian as well as multilingual. In some regions there may be as many as half a dozen varieties of Indonesian in everyday use. We discussed the necessity for a holistic approach to language description. In highly multilingual situations, such as that in the Riau Islands, linguists need to be familiar not only with the sociolinguistic context of the language being worked with, but also with the sociolinguistic context of the contact language(s) and varieties. As David put it, we need to look at the language as multidimensional, looking not just at the geographic latitude and longitude but also at who is speaking, to whom and when.

This led to a discussion about particular considerations and issues surrounding working with undescribed and underdescribed languages, with a focus on regions where varieties of Malaysian and Indonesian are spoken. In some cases it can be difficult to convince others that what is being spoken is even its own variety or language, especially when that variety is socially stigmatised. In the case of Riau Indonesian, it has, at times, been difficult to convince others of its existence as it is not listed in Ethnologue and had not been previously mentioned in the literature. Also, some speakers’ grammaticality judgements were based on standard Indonesian and did not align with Riau textual data.

The discussion of speakers providing grammaticality judgements at odds with text data led to a debate on the merits and uses of elicitation, experimental and naturalistic data. One topic discussed was the difference between what is technically grammatically acceptable in a language and what is actually a likely construction that one would find in naturalistic speech. Elicitation can easily be influenced by the linguist’s own language background and biases or by regional contact and prestige languages. This can lead to grammars that look very similar to those of the contact language(s) of the region. One example from Riau Indonesian was recursion, which is technically grammatical but not present in naturalistic corpus data. How does one choose what to include and exclude, and should these decisions be the job of those writing grammars? While many of us hope to address and integrate questions relating to frequency vs. grammaticality judgements and sociolinguistics in our work, it was also mentioned that time and resource constraints can require us to prioritise certain factors over others. One linguist working their entire career with just one language community is unlikely to do justice to all or even most areas of the language.

David also spoke about the different sorts of data he needed when working with a very isolating language like Riau Indonesian vs. working with the more synthetic Roon language (also Austronesian). In his experience, far more elicitation was required when working with Roon, in order to document its extensive verbal morphology, than with Riau Indonesian. Another attendee spoke about how they are now using a large corpus with good metadata, originally collected for more traditional descriptive purposes, to answer sociolinguistic questions. Corpus metadata can also be a useful tool to help delimit language and variety boundaries. As has been discussed at previous gatherings, presenting elicitation and text data as a dichotomy is not helpful given their varied and complementary purposes. If both kinds of data are not used, then we can end up with a ‘blind men groping at an elephant’ situation.

We concluded the evening with a slight digression, considering whether standard Indonesian is the most appropriate variety of the language to be taught to second language learners in places like Australia. One sentiment was that colloquial regional varieties are more natural than a standard variety that is in some ways the ‘other’ and derivative of the colloquial varieties. In this context, Jakarta Indonesian could potentially serve as a better option for Indonesian language learners, as it is a more widely understood and used lingua franca within the archipelago than standard Indonesian.

