Corpus Linguistic Studies of Standard Bahamian English: A Comparative Study of Newspaper Usage

While several studies have investigated the unique features of Bahamian Creole, there have been to date no academic studies analysing Standard Bahamian English, the language of formal communication in the Bahamas. This study fills this gap in the literature. Using the methods of corpus linguistics, the study presents some of the unique features of Standard Bahamian English in comparison to other international variants of English, specifically British, American, and Jamaican English. For methodological purposes the texts analyzed are limited to the genre of newspaper reportage. Features analyzed include keywords and word counts.


INTRODUCTION
Studies of Bahamian English have, up to now, focused almost exclusively on the spoken creole rather than the written or prestige/acrolectal forms of usage.
Scholarly studies have largely attempted either to tease out the distinct features of spoken Bahamian Creole (BC) (Hackert, 2004; Donnelly, 1997) or to present the unique issues faced by speakers of BC as they learn to compose in Standard English (Bain, 2005). In several of these studies, the goal has been to understand the place of BC among the Caribbean creoles and/or to support pan-linguistic theories of creole development (this is especially the case in the work of Holm, 1989).
As Sand (1999) notes, throughout the Caribbean, "the interest of linguistic research has mostly been directed towards the study of the 'pure' creole rather than educated usage" (p. 13). While this focus on basilectal forms of usage is understandable, given that the basilectal forms have unique features, it is clear that educated registers have been neglected by researchers. Introducing her work on a Bahamian component of the International Corpus of English, Hackert (2010) makes a convincing case for the importance of collecting information on Standard English used within creole-speaking communities. What's more, if The Bahamas ever becomes keen on standardizing its own national variety of English, it appears unlikely that more creolized usages could ever become the core of a codified national variant, as basilectal forms remain stigmatized among many educated speakers.
Sand (1999) further notes: "… the first linguists who finally tackled the question of an emerging regional standard of English in the Caribbean were teachers … who had experienced first-hand the growing gap between actual usage and the postulated norm concerning the language use of teachers and students alike within the classroom" (pp. 3-4). Donnelly (1997), herself an educator, distinguishes between students' home language, Bahamian Dialect (BD), and educated Bahamian speech, which she calls Standard Bahamian English (StBE), both of which differ from international Standard English (StE). For the purposes of this study, I shall use these terms.
Although BC is the preferred term among linguists for the mother tongue of most Bahamians, as it captures the fact that the language developed out of the contact of English and myriad African languages, Bahamian Dialect is the preferred term among the Bahamian population, partly because many Bahamians are not totally convinced of the uniqueness of their tongue, and also because Bahamians associate the notion of creole with the stigmatized Haitian minority (Hackert, 2004). In this study, the terms BC and BD are understood as interchangeable.
Indeed, it is arguable that the language of formal communication in The Bahamas, StBE, what we might also call "ZNS English",² is a unique variety, distinct from BC, the mother tongue of most Bahamians, and distinct from other international forms of English. However, to date there exist no formal descriptions of StBE.
The aim of this study is to tease out some of the distinctive features of StBE, if only in the frequencies of usage of certain forms, and to gauge StBE's affinities with other international forms of English. The international variants focused on in this study are: British English, the former international standard for English in postcolonial West Indian nations and the language of the later colonial project in The Bahamas;³ American English, the language of The Bahamas' close and economically powerful neighbour, a neighbour whose culture many Bahamians complain has swamped traditional Bahamian forms of expression; and, finally, Jamaican English, the most-studied Caribbean variant of English, which is also the language of the country that dominates the cultural landscape of the English-speaking Caribbean.
To date the most extensive catalogue of the unique terms of BC is Holm's masterful but now somewhat dated (1982) volume Dictionary of Bahamian English (DBE). The current study differs from the currently available guides and dictionaries describing Bahamian English (DBE or Glinton-Meicholas' Talkin' Bahamian series) in that it focuses on formal rather than on creole or rural usage. (DBE does not focus exclusively on creole or rural usage.) While Allsopp and Allsopp's Dictionary of Caribbean Usage (2003/1996) does include attention to formal and educated registers, much of the book focuses on creole and mesolectal forms. Moreover, the volume gives only limited attention to Bahamian English. The most comprehensive studies of the socio-historical development of BC are Holm (1982), Lawlor (1996), Holm (1989), Hackert (2004), Hackert and Huber (2007), and Hackert and Holm (2009).

METHOD
This study uses the methods of corpus linguistics. Simply stated, corpus linguistics means linguistic research in which evidence is gleaned solely from a body of texts (the corpus). The texts in the corpus can be either written texts or transcribed versions of spoken conversations. Although corpus linguistic methods have existed for a long time, they have in recent decades been reinvigorated by the rise of computer technology, as computers can scan large corpora much more quickly and accurately than humans. Another motivation for corpus linguistic research is that it avoids appeals to speakers' intuitions of well-formedness, which have been the primary source of evidence in Chomskyan linguistics, the approach to the study of language that has claimed and held the banner of 'scientific' linguistics throughout much of the last half-century. Corpus linguistics, on the other hand, avoids the "armchair" examples, formulated to support a specific point, that dominate in Chomskyan linguistics.
Accordingly, corpus linguistic scholars have claimed that its evidence is more valuable and less artificial. In general, corpus linguists pride themselves on the fact that their methods are quantitative and empirical. However, this preference for quantitative measures may make the findings of corpus linguistic research appear to outsiders as naïve or banal empiricism.
The closest previous study to this article in terms of method is Andrea Sand's (1999) book, Linguistic Variation in Jamaica: A Corpus-Based Study of Radio and Newspaper Usage. However, the present study does not include evidence from radio transcripts.
The corpus of StBE I have developed consists of newspaper reportage from the front page and business sections. As StBE's unique patterns of usage appear to be largely statistical (that is, in terms of word or syntactic pattern frequencies rather than unique terminology), I have used the same range of texts in the reference corpora. Indeed, consistency in genres is particularly important when looking more for the frequency of usage of particular forms than for unique lexical items.
I have chosen newspaper articles as the genre of focus in this study for several reasons, most relating to methodological practicalities. First of all, corpus linguistic studies are made much easier when the source material is available originally in electronic form, as the material need not be transcribed. What's more, as a small developing nation, The Bahamas still has a somewhat limited body of freely-accessible material on the web. Newspapers are one of the few sources of substantial prose available freely in electronic form.
Moreover, much of the information available on the web about The Bahamas consists of descriptions by outsiders for an audience of foreign tourists. That is to say, newspapers are one of the few electronic discourses about The Bahamas that Bahamian writers actually control.
The corpus of newspaper reportage I have developed comes solely from newspaper articles in The Bahamas' three mainstream newspapers printed in Nassau, New Providence Island (The Nassau Guardian, The Nassau Tribune, The Bahama Journal) and the two main newspapers of the two most-populous Family Islands: Grand Bahama (The Freeport News, an arm of The Nassau Guardian) and Abaco (The Abaconian). All of the 300 articles in the corpus were originally published between August and November 2009. The articles were electronically copied from the newspapers' websites and saved in .txt format for analysis.
I have used three reference corpora for the study, as shown in Table 1. For British English I have used the newspaper reportage component of the British National Corpus Baby edition (BNC-Baby). The BNC is a 100 million word corpus of late 20th-century English usage in England. The BNC-Baby is a 4 million word sample of the entire BNC developed so that researchers can perform robust analyses of the BNC on their home computers.
For American English, I have used the newspaper reportage component of the Brown Corpus, a now somewhat dated body of texts published originally in 1961 that is, however, still frequently used for research today. (The Brown Corpus is packaged with the BNC-Baby.) For the Jamaican English component, I have used the newly-developed Jamaican component of the International Corpus of English (ICE). As the newspaper reportage component of the ICE-JA is somewhat limited, I have supplemented the corpus with current articles from The Jamaica Observer and The Jamaica Gleaner.
The primary corpus linguistic software I relied upon for this study is AntConc (2007), a commonly-used freeware package developed by Laurence Anthony of Waseda University in Japan. AntConc allows for analysis of word counts, keywords (words determined to occur at a much higher frequency in a particular corpus relative to a particular reference corpus), and collocations (the co-occurrence of certain terms). I have also made extensive use of the freeware Simple Concordance Program (2009) developed by Alan Reed. The Simple Concordance Program is particularly useful for whole-corpus statistics (e.g., word counts).
An assumption, although certainly not an entirely unproblematic one, is that the writing of mainstream newspapers in The Bahamas can stand in for standard, educated usage. Although I have done my best to verify that the features I discuss are not merely a result of the stylistic tics of the editors of the few Bahamian newspapers, I cannot be entirely certain that these elements aren't present. I look forward to extending my analysis of StBE to include spoken formal Bahamian English and other genres in StBE.
For each language corpus, I have developed a subcorpus of approximately 3,000 words consisting purely of direct quotes from the articles in the main corpus. The purpose of the quote subcorpora is to factor out problems that may stem from certain corpora having more and longer quotes than others. For example, first-person personal pronouns occur at a greater rate in quotes in newspapers than in the main body of articles, and as I note later, the frequency of personal pronouns is one of the more remarkable elements of the StBE corpus.
It is important to keep in mind that the Bahamian corpus includes more, and longer, direct quotes than the other corpora (especially the British and American corpora). The British English corpus is much larger than the other corpora due to the large number of texts available in the BNC-Baby package.
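The article does not say how the direct quotes were pulled out of the articles. As an illustration only, the following minimal Python sketch assumes that quotes are delimited by straight or curly double quotation marks; the names `QUOTE_RE` and `extract_quotes` are hypothetical and are not part of AntConc or the Simple Concordance Program.

```python
import re

# Assumption: a direct quote is any run of text enclosed in straight (")
# or curly (\u201c ... \u201d) double quotation marks.
QUOTE_RE = re.compile(r'["\u201c]([^"\u201c\u201d]+)["\u201d]')


def extract_quotes(text):
    """Return the quoted passages found in one article, in order."""
    return [passage.strip() for passage in QUOTE_RE.findall(text)]


def quote_word_count(text):
    """Count whitespace-separated words inside all quotes of an article."""
    return sum(len(passage.split()) for passage in extract_quotes(text))
```

A sketch of this kind would still need manual checking, since scare quotes and titles in quotation marks would also be captured.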

RESULTS
While written StBE does not appear to differ greatly from other forms of English in terms of its lexicon (terminology), it does differ substantively in terms of the frequency of several important lexical and grammatical function words. This is not to say that uniquely Bahamian terms do not exist in StBE; it is merely to say that these uniquely Bahamian terms were not judged by the corpus linguistic software to be frequent enough to be keywords. A catalogue of the uniquely Bahamian terms used in StBE, both stemming from BC and not, could be the object of a later study.
Sand (1999) notes that written Standard Jamaican English (StJE) newspaper reporting often includes Jamaican Creole (JC) terms, especially for the transcription of quotes (e.g., mi taking the place of I in StE). Moreover, StJE newspaper writing routinely includes distinctly Jamaican terms such as ganja for marijuana. The transcription of JC terms in Jamaican newspaper writing indicates that journalists recognize, and perhaps even accept, the fact that the language of their informants differs substantively from StE; Jamaican journalists acknowledge orthographically that their informants speak something other than StE. However, phonetic transcription of BC terms appears to be largely absent from the corpus of StBE. Although quotes from sources are often long in the StBE corpus, these quotes rarely include strongly BC features.

Spellings
An easily gauged feature of the StBE corpus is the frequency of British and American spellings. Table 2 shows the number of tokens (corpus linguists' more precise name for 'words') of various British and American spelling variants in the StBE corpus. These findings support the intuition that many in The Bahamas have that American and British spellings can often be used interchangeably, even in formal writing. It is interesting to note, however, that, in general, The Tribune favors British spellings, while The Guardian uses American spellings. The Bahama Journal, however, uses both. All of this suggests that advocates of neither British nor American spellings can properly claim dominance in current Bahamian formal writing.
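A tally of spelling variants of this kind can be sketched in a few lines of Python. The variant pairs below are illustrative assumptions, not the items actually reported in Table 2, and the function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical (British, American) spelling pairs chosen for illustration;
# they are not taken from Table 2.
SPELLING_PAIRS = [
    ("colour", "color"),
    ("centre", "center"),
    ("programme", "program"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def spelling_counts(text, pairs=SPELLING_PAIRS):
    """Map each British variant to (British count, American count)."""
    counts = Counter(tokenize(text))
    return {british: (counts[british], counts[american])
            for british, american in pairs}
```

Running `spelling_counts` separately on the articles from each newspaper would allow the kind of per-paper comparison described above.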

Keywords
Corpus linguistic programs determine the keywords of particular corpora by comparing the frequency of words in one corpus versus another (the reference corpus), which is used as a baseline. Words that place higher on the ordered lists of keywords produced by the software have greater "keyness". The keywords of a particular corpus are terms that occur at a considerably higher rate in the original corpus versus the reference corpus, indicating more frequent use of the terms in the corpus and perhaps in the register or language under study as a whole. Of greater interest is the distribution of words that are not merely Bahamian cultural words or words relating to immediate events during the period the corpus texts cover. These terms are especially of interest when they are determined to be keywords when all three other corpora are used as reference corpora. The greater frequency of such terms cannot be understood to be the result of the influence of British, American, or Jamaican language and culture; they can be understood to perhaps be the result of distinctively Bahamian preferences for usage.
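The article does not state which keyness statistic was used. One widely used option in corpus software is the log-likelihood statistic, and the sketch below assumes that measure purely for illustration; the function names and the dictionary-based input format are hypothetical.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word, given its frequency in the
    study corpus and the reference corpus and the size of each corpus."""
    total = size_study + size_ref
    combined = freq_study + freq_ref
    expected_study = size_study * combined / total
    expected_ref = size_ref * combined / total
    score = 0.0
    for observed, expected in ((freq_study, expected_study),
                               (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


def rank_keywords(study_counts, ref_counts):
    """Rank words that are relatively more frequent in the study corpus.

    Both arguments map words to raw counts and must be non-empty.
    """
    size_study = sum(study_counts.values())
    size_ref = sum(ref_counts.values())
    scored = []
    for word, freq in study_counts.items():
        ref_freq = ref_counts.get(word, 0)
        if freq / size_study > ref_freq / size_ref:
            scored.append(
                (word, log_likelihood(freq, size_study, ref_freq, size_ref)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A word's position in the list returned by `rank_keywords` corresponds to the keyword rank discussed in this section, under the stated assumption about the statistic.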

Table 4 shows the rankings of several particularly interesting keywords for StBE with different reference corpora using the AntConc package. The rankings of 1 and 3 for said with the British corpus and the American corpus as respective reference corpora mean that in ordered lists of keywords, said places first and third with the two different reference corpora. However, said ranks significantly lower, but still relatively high (as the 23rd keyword), when the Jamaican corpus serves as reference. Indeed, the prevalence of said in the Jamaican corpus (7.5 per 1000 words) is much closer to the rate of said in the StBE corpus.
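Rates such as the 7.5 occurrences of said per 1000 words cited above are normalized frequencies, which make corpora of different sizes comparable. A minimal sketch of the calculation follows; the tokenizer is a simplifying assumption and will not match the token counts produced by AntConc or the Simple Concordance Program exactly.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def rate_per_thousand(word, tokens):
    """Occurrences of `word` per 1000 tokens of the corpus."""
    if not tokens:
        return 0.0
    return 1000 * tokens.count(word.lower()) / len(tokens)
```

For instance, a word that appears 15 times in a 2,000-token corpus has a rate of 7.5 per 1000 words.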
Part of the reason said occurs at a greater rate in the Bahamian corpus no doubt stems from the simple fact that there are more quotes in the StBE corpus (see Table 1), with the primary function of said in journalistic writing being to introduce quotes. Nevertheless, the greater rate of said in the StBE corpus can, in part, be attributed to Bahamian newspaper writing preferring said over other reporting verbs (e.g., told, added, announced). As Table 5 suggests, the StBE corpus uses said both at greater frequency and as a greater percentage of the total reporting verbs than the reference corpora.
As Tables 3 and 4 suggest, certain personal pronouns, especially first person nominative personal pronouns (we and to a lesser extent I), occur at a greater rate in the StBE corpus than in the reference corpora.
Table 7 shows the prevalence of personal pronouns per 1,000 words in all of the main corpora. It seems possible that the greater rate of the definite article the in the StBE quote sub-corpus can in part be attributed to the greater rate of use of nominalized verbs in Bahamian formal English. For example, "We had hoped the recovery would be sooner, but our data says otherwise" (Lightbourne, 2009, para. 3). This stems from the Bahamian preference for formal constructions. Sand (1999) notes a similar phenomenon in StJE resulting from the greater rate of -ing nominalizations (p. 131).
One might speculate (and I emphasize speculate) that the greater rate of we in the main Bahamian corpus and the quote sub-corpus may stem from the Bahamian people's sense of shared purpose resulting from the fact that they share small islands. Indeed, the StBE corpus includes many examples of direct quotes in which the Bahamian speaker uses we to emphasize shared purpose and shared sacrifice. For example: "If we make less, we need to spend less and look at managing it and making sacrifices. We should be urged to give up something that we normally would not have gave up" (Noel, 2009, para. 6).⁴ However, as one might expect, the most frequent form of usage of we in the StBE corpus and the StBE quote subcorpus occurs in quotes when the quoted individual speaks for an organisation, for example: "Of course one would naturally expect that BEC would minimize such occurrences and we would hope that would be the case. However, there are times when an individual is disconnected on a Friday and that's what we're trying to avoid" (McCartney, 2009, p. A8).

Formalism in the Bahamian corpus
As Sand (1999) notes in her analysis of Jamaican newspaper and radio usage, StJE demonstrates frequent preference for highly formal terminology; some Jamaicans have a "deeply ingrained appreciation of formal language" (p. 110).⁵ Sand refers to this preference for formalisms and archaisms as "colonial lag, where the former colony retains more formal features while the media usage in Great Britain is moving toward a more informal style" (p. 109).
A similar preference for highly formal terms is clear in the StBE corpus. Like Jamaican (Sand, 1999, p. 104), StBE demonstrates preference for the legalistic persons (instead of people or individuals) even in non-legal settings. For example: "The amount covered all fees associated with the Miss Universe pageant including all the transportation by air, sea and land for several hundred persons involved in the pageant and all the hosting costs across all the islands of The Bahamas visited by contestants, media and staff" (Deveaux, 2009, para. 2).
Similarly, StBE frequently uses the legalistic term statement both when reporting on legal proceedings (e.g., "According to the oral statement, Brown said he saw some girls who were at the picnic communicating with these male outsiders near the men's bathroom" [Davis, 2009, p. A1]) and when reporting on serious (or not-so-serious) non-legal issues.
There is some evidence to suggest that Bahamians are highly sensitive to social titles (see Table 9), as several social titles (e.g., Ms., Dr., and Mr.) occur at a higher rate in the StBE corpus in comparison to the reference corpora.
The prominence of Ms., the title used for women of uncertain marital status, is, however, the most pronounced, suggesting that StBE speakers carefully avoid incorrect references to women's marital status. While one might be keen to read the greater rate of Mr. over Mrs. in the Bahamian, Jamaican and British corpora as evidence of continued patriarchy in those countries, one should notice that Mrs. occurs at a greater rate than Mr. in the American corpus before jumping to conclusions. One should, furthermore, consider that the texts in the American corpus stem originally from the 1960s, when the United States was a much more male-dominated country than it is today.
According to Coleman (1997), a greater prevalence of passive voice constructions is an important feature of legal writing. Indeed, it is arguable that Bahamians and Jamaicans alike routinely invoke the passive voice in order to increase the formality (or at least the seeming formality) of their spoken and written language.
As Tables 3 and 4 demonstrate, the StBE corpus uses that at a higher rate than the reference corpora. In fact, that ranks as the second highest keyword in the StBE corpus when both the British corpus and the American corpus are used as reference, and third when the Jamaican corpus is used.
In order to investigate further the prevalence of that in the StBE corpus, it is important to recognize that there are several different grammatical forms of that. To narrow down what this greater rate stems from, one must first determine which of the five main forms of that appear at a greater rate in the Bahamian corpus. Following Burchfield's (2000) update of Fowler's Modern English Usage, there are at least five different grammatical forms of that, including: the conjunction (what is elsewhere referred to as the complementizer form), joining one clause to another: "Brown told Stubbs that he and the girl fell to the ground" (Davis, 2009, p. A9); the relative pronoun, a pronoun introducing a relative clause that points to a referent: "most of the accounts that have been disconnected are not in excess of $1,500" (McCartney, 2009, p. A8); the demonstrative pronoun, a pronoun pointing to an intended referent: "However, there are times when an individual is disconnected on a Friday and that's what we're trying to avoid" (McCartney, 2009, p. A8); the demonstrative adjective, a determiner with a referent: "if they have vacation time, they'll be paid for that vacation time" (Dames, 2009, p. BR1); and, finally, the demonstrative adverb, indicating the degree of an adjective or adverb: "the boxers from The Bahamas were not that talented" (Johnson, 2009, p. C1).
Aware that StBE demonstrates affinity for overly formal usage, we might propose that the greater frequency of that in the StBE corpus results from Bahamians' preference for the relative pronoun that over which in restrictive relative clauses with non-human antecedents, a common preference for usage expressed by prescriptive grammarians (e.g., Safire, 2005).⁶ This hypothesis requires some empirical verification. Table 11 shows the frequency of the different forms of that based on a random sample and analysis of 200 that tokens in each corpus. Although that occurs at a greater rate in the Bahamian corpus across all of the various forms of the word, it is in the relative pronoun and the demonstrative adjective forms that the most striking differences in frequency exist. As for that over which, there appears to be some evidence to support the hypothesis that Bahamians preferentially choose that over which, as which occurs least frequently in the StBE corpus in comparison to the other corpora (see Table 12). However, the difference between the rates of which in the various corpora appears too small to make a definitive statement about the issue.
We might try to attribute the greater rate of that in the StBE corpus to the higher rate of quotes (see Table 1) in comparison to the reference corpora (e.g., that following reporting verbs such as said or added). This does not, however, appear to be the case. The total number of collocates of that with reporting verbs is only 567, too small to account for the keyness of that in the StBE corpus. What's more, reporting verbs collocating with that only lead to an increase in the conjunction/complementizer form of that, not the other forms.
A second minor class word that appears as a high-ranking StBE keyword with all three reference corpora is the ever-so-tricky to. While there are many uses for to, the main division in the myriad forms and uses of to is between its infinitive particle ("I want to go to the store") and its preposition ("I went to the house") forms. To attempt to account for the difference in to tokens among the different corpora, one must first try to narrow down how the to tokens are used in the different corpora. Table 13 shows the distribution of infinitive particle and prepositional forms of to in each corpus based on a random sample of 200 to tokens in each group of texts. As Table 13 suggests, the StBE corpus uses to at a higher rate than the reference corpora for both the infinitive particle and the preposition forms. The best explanation for the greater rate of to in both forms appears to be Bahamians' esteem for periphrastic or roundabout phrasing. Circumlocutions such as the following, often using the passive voice, occur commonly in the StBE corpus: "Pratt-Taylor said parents interested in having their children receive meals through the program are advised to speak with her at the school" (Swain, 2009, p. A9). Compare this with the decidedly less roundabout active voice construction: Pratt-Taylor suggests parents … speak with her. All of this suggests that study of preposition use in StBE may be an interesting place for future research.
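The sampling step behind the counts of the different forms of that can be sketched as follows. This is an illustration under stated assumptions: the article does not describe how the 200 tokens were drawn or how per-1000-word rates were derived from them, so the seeded random draw, the context window, and the extrapolation formula are all hypothetical, as are the function names.

```python
import random


def sample_concordance(tokens, target, sample_size=200, window=5, seed=0):
    """Draw a random sample of occurrences of `target` and return each
    with `window` tokens of context on either side, for manual
    classification by grammatical form."""
    positions = [i for i, token in enumerate(tokens) if token == target]
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    chosen = sorted(rng.sample(positions, min(sample_size, len(positions))))
    return [" ".join(tokens[max(0, i - window): i + window + 1])
            for i in chosen]


def estimated_rates(label_counts, total_hits, corpus_size):
    """Extrapolate hand-assigned labels from the sample to rates per
    1000 words, assuming the sample is representative of all hits."""
    sampled = sum(label_counts.values())
    return {label: 1000 * (count / sampled) * total_hits / corpus_size
            for label, count in label_counts.items()}
```

Each concordance line would be labelled by hand (for example as conjunction, relative pronoun, or demonstrative), and the label counts passed to `estimated_rates` together with the total number of hits and the corpus size.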

CONCLUSION
As Taylor (2001) notes, a continuing problem for English educators in the Caribbean is that there do not exist clearly defined national or regional standards of usage distinct from American and British usage. The "people in the Caribbean are not perceived to own the English they speak and write" (p. 109) and thus "it is impossible for the educator to proceed as she or he ought to, using the important notions of appropriateness and context" (p. 109). If teachers' prescriptions of usage are always someone else's prescriptions of usage, it is difficult for educators to give clear messages about usage and correctness of certain forms.
Like other Caribbean nations, The Bahamas faces an interesting and challenging road ahead as it searches for standards that can properly describe and define its own unique variant of English. These standards may come in the form of an orthography for the creole or they may come as standards of usage for formal Bahamian English writing. Of course, the current study attempts only to be descriptive rather than prescriptive. Nonetheless, I hope that the findings presented here may eventually support a dictionary or usage guide for Standard Bahamian English.

Table 1
Basic Corpora Information

Table 2
Frequency of British & American Spellings in the StBE Corpus

Table 3 is a ranked list of StBE keywords with each of the reference corpora. Of course, most of the keywords produced by the software in a study of StBE are terms unique to The Bahamas (e.g., Bahamas, Bahamian, and various Bahamian last names such as Rolle). The particularly interesting terms I have put in bold.

Table 3
Top 25 Keywords of StBE with Different Reference Corpora

Table 4
Rank of StBE Keywords with Different Reference Corpora

Table 5
Reporting Verb Statistics Based on Sample of 200 Verbs

Table 7
Personal Pronouns per 1000 Words

When the material within quotes in each of the corpora is compared, personal pronouns such as I, we, and it prove to be high-ranking keywords, as demonstrated in Table 8.

Table 8
Keywords of Bahamian Quote Sub-Corpus with Different Reference Corpora

Table 9
Social Titles per 1000 Words

Table 10
Percent Passive Voice Constructions in Different Corpora

Table 11
Different Forms of that per 1000 Words

As Table 11 demonstrates, the StBE corpus uses that at a greater rate than all of the other corpora for all of the different forms of that.