A drastically condensed version appeared in Journal of Sociolinguistics 3/1, 1998, 128-139.

 

Language and society:

The real and the ideal in linguistics, sociolinguistics, and corpus linguistics

 

ROBERT DE BEAUGRANDE

abstract

The relations between real language and real society have often been marginalised in modern linguistics, which either has idealised language to be a stable and uniform system and disconnected it from society for motives of theoretical rigour and purity, or else has idealised the society of speakers to be stable and uniform as well. The emergence of sociolinguistics was thus delayed and was beset by uncertainties about its theoretical foundations and practical methods. Work with very large corpora of real language data now offers us major opportunities for a fresh assessment of our conceptions of “language” and of its relation to society. Some current and future implications are discussed and illustrated with data from the Bank of English at the University of Birmingham.

 

The unity of the social milieu and the unity of the immediate social event of communication are conditions absolutely essential [for] a language-speech fact. [But] the organised social milieu [and] the immediate social communicative situation are in themselves extremely complicated and involve hosts of multifaceted and multifarious connections, not all of which are equally important for the understanding of linguistic facts, and not all of which are constituents of language.

— Valentin N. Vološinov (1973 [orig. 1929]:47)

 

1. “Language,” “social,” and “society” in influential discourses of “modern linguistics”

 

1.1 Replacing real language with ideal language

 

For most people, the social aspects of language and its central roles in society should be readily obvious. But “modern linguists” have, from the early stages of their science, nurtured a deep-lying uncertainty about whether and how those aspects and roles should be taken into account. They were doubtless uneasy about the “multifaceted and multifarious connections” like those envisioned by Vološinov, whose critique was suppressed in the Soviet Union and ignored in the West until recently.

If we examine some influential discourses of early linguists, e.g., in frequently cited authoritative books, we might detect a range of positions like these:

 

(A) The social basis of language can be firmly acknowledged, and an active co-operation can be advocated between linguistics and social science or sociology, and possibly ethnography or anthropology as well. The work of Firth, Halliday, and Pike would fit here, e.g., when Firth (1957 [orig. 1936]:75) “stressed” “the very fine distinctions in speech behaviour, determined by typical recurrent social situations.”

(B) The social basis of language can be candidly acknowledged, but arguments can be advanced to show why linguistics should be programmatically independent of social science or sociology. Saussure’s Cours of lectures delivered in 1909-11 and published from student notes in 1916 would be a pivotal instance we shall return to in a moment.

(C) The social basis of language can be curtly acknowledged, but nowhere reflected in linguistic theory, as when Hjelmslev’s Prolegomena opened by declaring that “language” is “the ultimate and deepest foundation of human society” (1969 [orig. 1943]:3), but went on to propose a “linguistic theory” making no reference whatsoever to this “foundation.”

(D) A partial or temporary disconnection between language and society can be favoured on the assumption that a reconnection in a later stage will not encounter serious problems. This position has been quite pervasive but has usually remained implicit, so that its problematic status has not been adequately explored.

(E) The disconnection of language from society and social science can be expressly asserted and defended as a matter of scientific principle. Such has been a theme of Chomsky’s middle and late work, e.g., when he stated that “very few theoretical proposals have been made” for “theories concerning the study of language in society” (1977:54).

(F) The social basis of language can be quietly left unacknowledged, e.g., when Chomsky’s early Syntactic Structures (1957) simply never mentioned “society” or a single “social” factor.

 

This range of positions on language and society does not appear to constitute a coherent historical sequence, partly because linguistics has manifested little sustained sense of its own history and historicity, i.e., its place within the evolution of society and the latter’s institutions (Beaugrande 1997b); and partly because “modern science” has often been idealised to be a disinterested search for general truths in studious detachment from the fluctuating concerns and pressures of day-to-day social life.

Still, we might discern some general trends. On the whole, early modern linguistics favoured guarded or non-committal acknowledgements of the social basis of language. Society was episodically invoked as the basis or source of the regularity, uniformity, and self-sufficiency the linguists felt a “language” must have in order to constitute a valid object of scientific inquiry. More recently, the discipline has become polarised between programmatic claims that, in principle, language either should or else should not be disconnected from society for purposes of investigation. The relation between language and society has indeed come to constitute a central dividing line for an intricate network of decisions about theory and practice within a science of language, even (or especially) when linguistics declined to address it explicitly.

A divisive scenario for modern linguistics was already prepared by Saussure’s (1966 [orig. 1916]:232) resounding credo that “the true and unique object of linguistics is language studied in and for itself”: “langue” situated in a pair of rigid dichotomies against “parole” (“speaking”) and “langage” (“speech”). Such a move doubtless seemed highly strategic when linguistics was anxious to establish and justify itself as a discipline, whence the staunch support from whole schools and generations of linguists. But if the term “language” indeed refers to “language by itself,” then it simply does not refer to “language” as humans actually encounter it, which is always language in society, even when society is represented by a group of professional linguists, an issue we shall return to later on. The participial modifier “studied” in the Saussurian credo glossed over any reservations about whether language can be and should be “studied in and for itself.”

Saussure also abetted the ambiguity between a real “language” like English versus “language” in the abstract: an ideal construction underlying all real languages (cf. 4.1). Speculating that “all idioms embody certain fixed principles that the linguist meets again and again in passing from one to another,” he counselled us to “determine what is universal in them,” even though he also vowed that “each idiom is a closed system” (1966:99, 23). In precisely this context, he dejectedly remarked that “the ideal, theoretical form of a science is not always the one imposed upon it by the exigencies of practice; in linguistics, these exigencies” “account for the confusion that now predominates in linguistic research” (1966:99).

A thorough examination of some influential discourses of theoretical linguistics (in Beaugrande 1991), with “theoretical linguistics” taken here to be the accredited “scientific” discipline that deliberates on the nature and properties of language, has led me to conclude that the search for “the ideal, theoretical form of a science” has fomented the paradoxical enterprise of seeking scientific accreditation by replacing real language with ideal language (Beaugrande 1997c). Attempting to circumvent or bypass the “exigencies of practice” has encouraged projects that unwittingly just trade one mode of “confusion” for another. The resolve to describe “language by itself” as a uniform and static system manoeuvred Saussurian linguistics into pursuing the peculiar question of “what would language look like when the members of a society were not using it?,” without properly considering whether such a question might have no rational answer. As a close corollary, linguists have been rendered intensely self-conscious about which issues, factors, and so on, are either properly “linguistic” or else “external” and “extra-linguistic,” where further confusion has arisen from the portentous ambiguity of “linguistic” (and its direct translations) meaning “pertaining to language” versus “acknowledged by linguistics.”

The irony was perhaps too rich for Saussurian linguists and their doctrinaire successors to digest: the more pressure they exerted upon “language” to isolate its “true and unique self,” the vaguer both the term and the concept became. Just because “language by itself” cannot be encountered, neither can we determine exactly where its borders should be drawn and what should go inside or outside. A predictable recourse has been the unadventurous principle: “when in doubt, put it outside.”

For similar reasons, we may have difficulty determining when the very term “language” may have ceased to refer to what it would mean for most members of the society, including most scientists outside linguistics. The term may rather refer to a self-validating ideal system which linguistics feels authorised to construct. Since idealisations are by definition “abstracted away” from real data, the role of real data in constructing or validating a “theory of language” has been a continuing source of confusion. Debates in theoretical linguistics have often seemed to revolve around the invidious contention that “my idealisation is better than yours!”

A science “investigating” an ideal system it has to construct on its own is likely to be defensive, as we can surmise from some rhetorical moves by well-known linguists. They have reassured us that “idealisation is inevitable” (Lyons 1977:586) or even that “idealisation” “is the sole means of proceeding rationally” (Chomsky 1977:54) (cf. 3.1). Here, “rationality” too has acquired a peculiar meaning. Whilst defending “idealisation,” Lyons vowed “it is pointless to argue that there is no such thing as a homogeneous language-system underlying the language-behaviour of the whole language-community — this is true but irrelevant” (1977:586ff). With comparable equanimity, Chomsky, who has famously declared that “linguistic theory is primarily concerned with an ideal speaker-hearer in a completely homogeneous speech-community, who knows its language perfectly” (1965:4), cheerfully granted that the “speaker of an idealised system does not exist in the real world” (1977:192). What can be so “rational” about a science declaring that “there is no such thing” as its own object of investigation, and that its “primary concern” is a human being who “does not exist in the real world”? Instead of its usual meaning, “rational” would seem to mean “based on rationalism,” the “philosophic doctrine that reason alone is a source of knowledge and is independent of experience” (Random House Webster’s  p. 1119), as famously argued by Descartes.

Other defensive moves have worked in the reverse direction by suggesting that studying language in society would be the irrational enterprise. Saussure’s (1966:14, 9, 11) own optimistic declaration that “language is a well-defined object in the heterogeneous mass of speech facts” was accompanied by his grim reservation that we won’t find it by examining those “facts”: “speech cannot be studied,” nor indeed can it be “put in any category of human facts, for we cannot discover its unity.” Saussure’s proceedings evidently prevented him from seeing “the unity of the social milieu and the unity of the immediate social event” invoked by Vološinov in my opening quote. The tenor was the same when Chomsky (1965:4, 20) declared that “observed use of language” “surely cannot constitute the subject-matter of linguistics, if this is to be a serious discipline”; and that “sharpening the data by objective test is a matter of small importance for the problems at hand.”

Again, we see the quest for the “ideal, theoretical form of a science” attempting to circumvent or bypass the “exigencies of practice.” For Chomsky (1957:52), “it is unreasonable [i.e., not “rational” in his special Cartesian meaning] to demand of linguistic theory that it provide” “methods of analysis that an investigator might actually use, if he had the time, to construct a grammar of a language from the raw data”; “it is very questionable that this goal is attainable in any interesting way.” His objection is circular: the “demand” is “unreasonable” if the “investigator” is really an idealiser who has no intention of expending the “time to construct a grammar” from “data,” and who vows to “never consider the question of how one might have arrived at the grammar,” because “questions of this sort are not relevant to the programme of research we have outlined above; one may arrive at a grammar by intuition, guess-work, all sorts of partial methodological hints, reliance on past experience, etc.” (1957:56).

Circular too was the denial that “useful procedures of analysis” could be “formulated rigorously, exhaustively, and simply enough to qualify as practical and mechanical” (1957:56): such “procedures” are obviously not “practical” when the “analysis” is actually a process of converting real data into ideal data (cf. section 2.2). And such is precisely the function of the “analysis” and “description” by means of “derivation,” “transformation,” “formalisation,” and so on: these operations propose to “explain” or “account for” data by getting rid of them in favour of data whose “structures” and “features” the linguist is authorised to invent. The operations are made to seem innocuous by avoiding real data from social discourse and using isolated invented sentences, where a good share of the idealising has been anticipated by the inventors.

And circular yet again were the denials that “elaborate and complex analytic procedures” could “provide answers for many important questions about the nature of linguistic structure”; and that “reliable operational criteria for the deeper and more important notions of linguistic theory” “will ever be forthcoming,” just because “knowledge of the language, like most facts of interest and importance, is neither presented for direct observation nor extractable from data” (Chomsky 1957:53; 1965:18f). These “important questions” and “deeper notions” had been deliberately formulated to be wholly inaccessible to “analytic procedures,” “direct observation,” and “extraction from data.” Our real question here should be what makes these “notions” so “deep” and “important” at all.

A “linguistic theory” that doesn’t provide “methods of analysis” can expediently assume that “language” is given in advance (cf. section 3.1). After Chomsky decided to “consider a language to be a set (finite or infinite) of sentences,” his thematic resolve was to “assume that the set of sentences is somehow given in advance” (1957:13, 85, 103, 18, 54). Circular yet again: the set is not “given” at all, even if we assume, against the grain of other formulations (e.g. Chomsky 1957: 23f; 1965: 16, 142) that the set is not “infinite” but “finite.” What is given for any real language is a very large set, finite but open, of discourse data (cf. section 4).

Saussure’s above-quoted notion of “fixed” universal “principles” that “all idioms embody” now returns as the call for “a theory of linguistic structure in which the descriptive devices utilised in particular grammars are presented and studied abstractly, with no specific reference to particular languages”; “each grammar is related to the corpus of sentences in the language it describes in a way fixed in advance for all grammars by a given linguistic theory” (1957:5, 14). For a “theory” of this kind, we would indeed be “unreasonable to demand methods of analysis that an investigator might actually use,” let alone “objective tests for sharpening the data.” By “not referring to particular languages” and by “fixing all grammars in advance,” the “theory” has become wholly independent of data; and the replacement of real language with ideal language is ordained.

Anyone who has not yet registered that the term “language” being used here does not refer to “language” as the term is normally used should take note when a formalist announces in his inaugural lecture for a university chair that “linguistics is not about language, or languages, it is about grammar” (Smith 1983:4). Perhaps he intended a magisterial admonition for hold-outs who, like myself, still believe, nay insist, that linguistics is about language. But he could have saved himself the trouble, since the linguistics he favoured has worked so hard to establish that “language” and “grammar” both “refer to” the same thing. He could have far more aptly said: “our kind of linguistics is not about what most people mean by ‘language’ or ‘languages’; it is about what we mean by ‘language,’ namely, ‘grammar.’”

 

1.2 Replacing real society with ideal society

 

Saussure did not deny the social basis of language, but he did invoke it in non-committal ways that implicitly marginalised it. His empty assertion that “the concrete object of linguistic science is the social product deposited in the brain of each individual” (1966:23) presented a wholly inaccessible “object” as a “social product” whilst skipping over the social questions about how it might have gotten “deposited in the brain” and whether and why (to keep his Swiss banking metaphor) some specific social groups might get smaller or larger “deposits.”

His most significant invocation of the “social” ironically accompanied his most famous idealisation: “in separating language [langue] from speaking [parole] we are at the same time separating what is social from what is individual” (cf. 3.2); “language” “is the social side of speech” and “exists only by virtue of a sort of contract signed by the members of a community” (1966:14), where we might well ask what social obligations the “contract” would stipulate. The “social” was also enlisted for the idealised stability of “language” in Saussure’s “synchronic” viewpoint, viz.: “of all social institutions, language is the least amenable to initiative; it blends with the life of society, and the latter, inert by nature, is a prime conservative force”; and because “language” is “a product of both the social force and time, no one can change anything in it” (1966:74, 76). This unexplained “social force” allowed Saussure to waffle by acknowledging that “evolution is inevitable” whilst maintaining that “no individual, even if he willed it, could modify” the language “in any way,” and that “the community itself cannot control so much as a single word” (1966:76, 71). The central control got consigned instead to “arbitrariness,” which further blotted out all the social and individual motivations Saussure’s conception of “language” had declared “external.”

A similar waffling was performed shortly after by Sapir (1921:206, 221): “language” “is probably the most self-contained, the most massively resistant of all social phenomena”; yet “language” “is the most fluid of mediums.” He too attributed the complex but tidy order of language to some unexplained social force, or, in his favourite term, “drift,” viz: “back of the face of the history are powerful drifts that move language, like other social products, to balanced patterns” (1921:122). Later on, we find Chomsky (1965:59) asserting that “the structure of particular languages may very well be largely determined by factors over which the individual has no conscious control and concerning which society may have little choice or freedom”; but his own account invoked “principles of neurological organisation” plus the human “capacity to acquire knowledge,” these two factors uniting in his well-known “language acquisition device.” As I have documented elsewhere in detail, such invocations of neurology and biology signal the intent to convert linguistics from a social science into a natural science without working out the details (Beaugrande 1997d).

But for the present discussion, we should emphasise that the same move effectively bypassed social factors by moving onto a plane where total uniformity (and along with it, the validation for the theory) gets imposed by biological necessity, recalling Saussure’s already cited “social product deposited in the brain.” Thus, Chomsky (1991:66) appealed to “a highly determinate, very definite structure of concepts and of meaning that is intrinsic to our nature; and as we acquire language or other cognitive systems these things just kind of grow in our minds, the same way we grow arms and legs.” As for Saussure’s “deposits,” no explanation was given of why this process might not work out well for specific social groups; the implication is rather that it must work the same for everybody. And “rationalism” in the philosophic sense of “knowledge being independent of experience” becomes both the mode of explanation and the phenomenon to be explained. We then need not be surprised by the otherwise wildly irrational denial that “information regarding situational context” “plays any role in how language is acquired, once the mechanism is put to work and the task of language learning [sic; should be: acquisition] is undertaken by the child” (Chomsky 1965:33).

Still less should we be surprised that influential linguists have suggested a reciprocity whereby language derives its uniformity from society whilst helping to keep society uniform. For Bloomfield (1933:42), “the close adjustment among individuals which we call society” “is based on language.” For Sapir (1921:148), “something like an ideal linguistic entity dominates the speech habits of members of each group,” so that “the sense of unlimited freedom which each individual feels in the use of his language is held in leash by a tacitly directing norm,” and “the individual’s variations” “are silently ‘corrected’ or cancelled by the consensus of usage.” Chomsky (1965:3) could then portray “the position of the founders of modern general linguistics” to have been, as we noted, that “linguistic theory is primarily concerned with an ideal speaker-hearer in a completely homogeneous speech-community, who knows its language perfectly.” Here, not just the term “language” but also the terms “speaker” or “community” carry special meanings and refer to abstract idealisations. Just as we saw “language” getting separated from real data, the community gets separated from real speakers.

With reality safely out of the way, the validation of the “theory” can be built right into the terminology. Then, a “theory of language” is automatically valid because “language” is defined to be precisely identical with the “theory” and vice versa. The same holds for both “theory of grammar” and “grammar of language.” What any of the three terms actually refers to in a human society has remained strategically vague, since after all “there is no such thing” as “language” in this sense (Lyons); whatever it is, all three terms refer to it.

Such is exactly the import of “using the term ‘grammar’ with a systematic ambiguity to refer, first, to the native speaker’s internally represented ‘theory of his language’ and, second, to the linguist’s account of this”; and of “using the term ‘theory’ — in this case ‘theory of language’ rather than ‘theory of a particular language’ — with a systematic ambiguity to refer both to the child’s innate predisposition to learn a language of a certain type and to the linguist’s account of this” (Chomsky 1965:25). These two “ambiguities,” which sustain a third one (already noted for Saussure’s discourse) between “language” and  “a particular language,” oblige anyone using the “terms” to take it as given that “the native speaker” does hold an “internally represented theory of his language,” that the “child” does have an “innate predisposition,” and, best of all, that “the linguist” does have the valid “account.” The terms are defined in ways calculated to forestall inopportune questions.

But where does the “linguist” get the “account,” once real language and real speakers have been replaced with idealisations, and once we have disowned “methods of analysis that an investigator might actually use” (1.1)? The popular but problematic answer: by “constructing a description, and, where possible, an explanation, for the enormous mass of unquestionable data concerning the linguistic intuition of the native speaker, often himself” (Chomsky 1965:20, my emphasis). The chief (though rarely noticed) problem stems from the discourse of those same linguists emphatically denying that the “speaker of a language,” who has “mastered and internalised a generative grammar, is aware of the rules of the grammar or even” “can become aware of them” (Chomsky 1965:8). Such denials, though they were presumably intended to bolster the defences against real speakers and real data, should justly apply to linguists whenever they assume the role of native speakers. Otherwise, they would be purporting to hold super-human powers for “becoming aware” of the “perfect knowledge” constituting the “grammar” of the “ideal speaker-hearer.” Such super-powers would ostensibly be conferred by an academic degree in “theoretical linguistics”; and we would need to investigate just how degree programmes could achieve so momentous a result.

The mutual and parallel idealising of language and society may have been strategies for evading the problems inherent in linguists being members of both society at large and of the specialised society of academic linguistics. They are socially positioned and implicated in respect to language but have been encouraged since Saussure’s time by the decorum of “science” to proceed as if they were positioned outside of language. Manoeuvring for such a positioning would tend to alienate the linguists from the society and, through sheer theoretical bootstrapping, to position language outside of itself by making the term “language” mean something other than the socio-semiotic system (to use Halliday’s term) they themselves use in their ordinary lives and in their professional work. At advanced stages, this process proliferates theories whose respective merits or validity (“adequacy,” “power,” etc.) can never be conclusively determined because the competing theorists mean incompatible things by the term “language” but do not deal with the matter.

This impasse has fomented a procession of moves whereby the discourse of theoretical linguists has acknowledged that real “language” differs from their own ideal image, yet has cheerfully proceeded as if the differences were irrelevant for scientific inquiry. Linguists have long recognised that a language consists of multiple dialects yet treated it as a single uniform “standard”; they have noted the importance of language change whilst describing the language as a static (or “synchronic”) system; and they have declared the spoken language of the whole society (or community) to be the primary or even the sole concern whilst drawing both the theoretical and the methodological orientation from written language. Since the “ideal speaker-hearer” is simulated by the theoretical linguist, the “language” and “grammar” can quietly incorporate the features of the dialect of white, male, middle-class academics (cf. Cameron 1992).

The truly rational solution (if we use “rational” in its ordinary sense) is neither to ordain that “idealisation is inevitable” (Lyons) and briskly go on speculating about an ideal speaker who admittedly “doesn’t exist in the real world” (Chomsky); nor to flatly “reject idealisation,” as Chomsky (1977: 58f) has accused “sociology” and “sociolinguistics” of trying to do. Instead, we can rationally inquire how the varying conceptions of “language” in respective social groups might entail definable classes of idealisations, such as Sapir’s above-cited “ideal linguistic entity dominating the speech habits of members of each group,” and what social consequences result (4.1; 4.2). Idealisation would finally come under investigation as a constellation of socially real processes adapting to the goals of groups of real speakers: story-tellers, film and television actors, advertisers, politicians, bureaucrats, administrators, teachers and learners of language (or their parents), compilers of dictionaries or grammar-books, and, yes, linguists.

 

2. The order of language

 

2.1 Moving through the “levels”

 

The notion that linguistics can disconnect language from society for purposes of investigation should also be understood within the evolution of the discipline through the “levels” into which language was subdivided during early research. If we arranged the progression of levels according to the respective size and constituency of their theoretical units (as proposed for instance by Bloomfield 1933), we might have “phonemes - morphemes - lexemes - syntagmemes” corresponding to the respective practical units of sounds - word parts/words - words - phrases/clauses. To be sure, this progression is not clear-cut, e.g., about whether words match morphemes or lexemes; nor was it distinctly reflected in the evolution actually documented in the major discourses of the discipline. But it does shed light upon the enduring aspirations of linguistics to reapply successful methods of analysis and description from one level to another.

In early research, “phonology” plus “phonetics” confirmed the aspirations of modern linguistics to discover an ideal theoretical and uniform system of stable and deterministic underlying units plus a set of well-defined practical methods for the analysis of language sounds in terms of “phonemes,” each described by its “features.” Linguists working in phonology and phonetics candidly acknowledged that, in practice, the members of a society actually pronounce any one sound within a range of variations; indeed that, in fine detail, each production is a unique event. But (much as with Sapir’s “silent corrections”) these variations could be safely discounted as irrelevant to the stable and deterministic status of the underlying “phoneme.” Real speakers proceed as if all its realisations were equivalent, so linguists are socially justified in doing the same.

The situation was already less reassuring in “morphology,” which adopted an outlook parallel to phonology by postulating a theoretical system of stable and deterministic form-units (the “morphemes”) persisting much like the system of sound-units (the “phonemes”). But for most languages that have developed a morphology, the system was plainly larger and less uniform. And the members of a society may differ widely in their conscious or unconscious awareness of such units, notably in a language like English, whose morphological repertory is overlaid by exuberant importations from Greek, Latin, and French. Throughout the Early Modern period, these importations were the mainstay for coinages in specialised or technical vocabulary, and have conferred social privileges upon those who could recognise their components, ranging from managing the pedantic menagerie of English orthography over to participating in socially important discourse on “expert” issues.

Still, morphology shared with phonology the decisive advantage of postulating form-units that correspond to recordable and discoverable segments of real language data. Also, morphology achieved its early key successes through extensive fieldwork with real speakers, where real language was observed in the social contexts of situation, whether or not the relevant factors would count as “linguistic” either as “pertaining to language” or as “acknowledged by linguistics” (section 1.1). Unless fieldwork linguists see clear counter-evidence, they can safely assume that the members of society are using the language in ways which represent the underlying morphological system.

Had the discipline of linguistics expressly been moving from smaller toward larger units and constituents, the study of word-parts as “morphemes” on the “level” of morphology would have logically been followed by the study of whole words as “lexemes” on the “level” of “lexicology.” But that “level” was horrendously incompatible with the established conception of “language” in being far from stable or uniform and in resisting a general description in terms of the tidy “units” and “features” that function so nicely in phonology. The lexicon of any real language represents the concepts and classifications for which diverse groups in a society provide motivations (4.1), such as the advances in technology that, as noted, have also powerfully affected the morphology of English. A direct consequence, for which modern linguistics was blankly unprepared, is that, apart from a few tidy “lexical fields,” neither the size nor the internal organisation of the system of lexemes could be consensually determined by established theoretical methods. Units fade out or fade in, and their meanings steadily evolve during the social practices in and accompanied by language (cf. section 4). Moreover, the members of a society indisputably differ among themselves in the knowledge of lexemes much more sharply than in their knowledge of phonemes and morphemes; indeed, the lexical store of any one speaker might well be unique. For all these reasons, the relative neglect of lexicon and lexicology in modern linguistics could be grasped as a further reflex of the reluctance to seriously acknowledge the rich diversity and detail that a society maintains within the real language it speaks.

Linguistics preferred to seek new successes on the “level” of “syntax,” whose arrangements of units in sequences appeared vastly more amenable to rigorous analysis and description than did the lexicon. But appearances were deceiving. When you are not just identifying and labelling sound-units or form-units but trying to describe or explain the mutual positions within whole arrays of units, you need to inquire why the units might have been chosen and arranged in particular ways. Most formalist “theories of syntax” in modern linguistics have assumed on principle that the language system or its “grammar” subsumes a system of “rules” determining which units are positioned where in which sequences. And, as we saw in section 1.2, some “theories” have directly equated “language” with such a rule-system or “grammar” in a further step toward unrelenting idealisation. Predictably, most theories also assumed on principle that this rule-system could be cleanly differentiated from the motivations of specific speakers or social groups when they put words in one order rather than another, witness the opening chapter title of Chomsky’s Syntactic Structures: “the independence of grammar.”

Ironically, none other than Saussure had long before aired a canny reservation against any such project during his ruminations aimed at excluding syntax from his concept of “language” (“langue”). He wrote: “in the syntagm there is no clear-cut boundary between the language fact, which is a sign of collective usage, and the fact that belongs to speaking and depends on individual freedom; in a great number of instances it is hard to classify a combination of units because both forces have combined in producing it, and they have combined in indeterminate proportions” (1966:125). In light of the present discussion, Saussure’s reservation implied that syntax could definitely not sustain the disconnection from society, and that the border between socially determined “collective usage” versus “individual freedom” would remain “indeterminate” in principle (cf. section 4).

Paying no heed to these implications, “syntax” embarked upon a radical replacement of real language with ideal language. As we have seen in 1.2, real society was correspondingly replaced with an ideal society who “knows the language perfectly”; and real speakers were declared unable to “become aware of the rules of the grammar.” The self-confident proponents of such a “theory of syntax” studiously failed to notice how they were putting themselves in an untenable social position, both in theory and in practice, as adepts holding super-human powers. They could exploit a long tradition of disregarding the implication of linguists being speakers of one or more languages whilst they scaled new heights in discounting both the “observed use of language” and the “methods of analysis” for dealing with it.

The “indeterminate” border between socially determined “collective usage” versus “individual freedom,” which had led Saussure to suspect that syntax would spread across both “language” and “speaking,” now vanished, because the “homogeneous community” and the “ideal speaker-hearer” are fully interchangeable and yet fully intangible. Holding “perfect knowledge of the language,” this “speaker” in theory knows everything about the language via “competence” and in practice says nothing in the language after having been “idealised” out of “performance” (cf. Chomsky 1965:4); perhaps “he” stands transfixed in “tacit introspection” upon that wondrous “infinity of sentences” he would be “competent” to say. “He” is thereby the “ideal” representative for the language when the members of a society are not using it: just what Saussurian linguistics set out to describe in the first place.

The most significant long-term effect of placing ideal language and ideal society at the centre of “syntax” has been the fragmentation of linguistic science through a dramatic proliferation of competing theories and models. The contention that “my idealisation is better than yours!” has become acutely polemic as the field underwent a severe breakdown in consensus:

 

there seem to be a great many approaches “on the market” whose interrelationships remain as poorly understood as ever. In fact, it is not easy to even determine which of the thirty-odd major syntactic frameworks that have appeared over the last forty years continue “alive.” [Some might] not have been “theories” at all, but just “formalisms” built in such a minimalistic way from the very beginning that practically no progress was possible in principle (Escribano 1993:229f)

 

Unintentionally, the “minimalist framework” still on the market (e.g. Abraham et al. [eds.] 1996) symbolises how “syntax” as understood in this conception of linguistic theory by no means constitutes a complete, stable, and deterministic system of “rules,” but merely a modest range of frozen islands which the grammar of a particular language happens to have accumulated (Beaugrande 1997a), and in many languages far fewer than in English, which, fittingly enough, has been most often subjected to formalist analysis.

A radical conclusion might be that the “level” or “component” of “syntax,” in the sense predominating over the last forty years, simply does not exist: it is a theoretical construct engendered by the peremptory resolve to disconnect “language by itself” from language in society. The disconnection is retraced and repeated all across the conceptions and terminologies, such as “competence” versus “performance,” “deep structure” versus “surface structure,” and “universal” versus “language-specific,” each pair offering the ideal in place of the real and implying the super-human powers of linguists over the rest of society.

Indeed, if “much of the actual speech observed consists of fragments and deviant expressions of a variety of sorts” (Chomsky 1965:201), then all the members of society, including linguists when they’re not on the job, are “deviant” speakers. In yet another rich and unintentional irony, we behold in a new guise the old disparagements cast upon everyday language by the self-appointed guardians and grammarians whose views had been indignantly rejected by early modern linguistics, notably by Bloomfield and Firth. Whereas the deviance had formerly been attributed to speakers being “ignorant,” “uneducated,” or “illiterate,” it would now be attributed to the failure of speakers, presumably distracted by social factors, to conform to the deterministic “grammaticalness” which the syntacticians confidently situated at the very base of “competence.”

 

2.2 Order and disorder

 

In their determination to establish the “perfect” order of “language by itself,” some influential linguists have evidently viewed the real language observable in society as a massive disorder. This view implied the remarkable corollary, which I have not yet found explicitly stated, that when the members of society use their language, it undergoes a special “catastrophe” in the technical sense of “catastrophe theory” (cf. Thom 1989), namely an abrupt transition from stable and integrative order to unstable and disintegrative disorder. I cannot conceive how such a system could operate at all, let alone with the impressive efficiency and precision we can observe in social interaction and communication. Speakers and hearers would be obliged to desperately convert language data back and forth between two totally disparate modes of order corresponding to ideal language and real language, respectively. And the failure of highly-trained formal linguists to agree upon how such conversions could be achieved, let alone to align ideal language with real language in any consensual way, already signals how implausible such a mode of operation would be for ordinary speakers.

A far more plausible conclusion would be that theoretical linguistics since Saussure has routinely attributed to “language” an inappropriate mode of order: the actual order of language elaborately supports the order of discourse without fully determining it. The transition from language into discourse specifies and applies numerous constraints that become decidable only on the plane of the actual discourse and not on the plane of the virtual system, as we shall see in section 4. Conversely, attempts to navigate a transition from discourse over to the abstract language system create a margin of undecidability; and precisely that margin forecloses the prospects for any deterministic formal syntax or rule-system that could “generate all the sentences of a language.” So the breakdown of consensus in linguistics and especially in syntax is a foreseeable outcome of the unproductive assumption that “language by itself” comprises its own complete set of “purely linguistic” constraints or “rules,” concerning which discourse, namely “actual speech” with its “heterogeneous mass” (Saussure) and its “fragments and deviant expressions” (Chomsky), is uninformative or downright misleading. The productive assumption would rather be that discourse is the domain wherein the constraints of the language are actualised, but also specified, modified, evolved, and so forth, in an ongoing dialectic with social interaction (4.1). If you discount that interaction and misrepresent the dialectic as a dichotomy, the language drifts out of control, and you may feel animated to start inventing an arbitrary and gratuitous apparatus of “rules” and “features” to re-impose control.

Now if, as I have suggested in section 1.1, the “analysis” and “description” by means of “derivation,” “transformation,” “formalisation,” and so on is in practice a process of converting real data into ideal data, then we would have an artificial transition from order into disorder in exactly the opposite direction from the one implied by Saussurian and Chomskyan linguistics. Such would be the outcome of the “idealisation” that Lyons (1977:588) has called “decontextualisation,” whereby “system-sentences” “are derived from utterances by elimination of all the context-dependent features.” Strictly speaking again, the results would not be “system-sentences” but uniformly meaningless strings of sounds or characters: the ultimate disorder of total entropy. In practice, the operation is never performed, because contexts are the ultimate basis for any linguist identifying the units and patterns of language. At most, linguists can pretend that the units and patterns are, like Chomsky’s monumental “set of sentences,” “somehow given in advance” (1.1). The “theoretical linguist” who consents to situate ideal language in the place of real language acquires, as a package deal, all its “deep” and “surface” entities suspended in a timeless context-free space. But when the same linguist sets about “eliminating context-dependent features from utterances,” the results are conspicuously not reliable or convergent, so we’re lucky they won’t be put to any use by real speakers in society (cf. 3.1).

 

3. Sociolinguistics between real language and ideal language

 

In the foregoing sections, I have essayed to sketch the complex array of issues and problems in the evolution of modern linguistics regarding the relations between language and society and between ideal language and real language. I shall use that background for examining some issues and problems in the field of sociolinguistics.

First of all, the background might help explain why the consolidation of a discipline of “sociolinguistics” was postponed for decades or confined to programmatic statements such as that of Currie (1952), who was, as far as I know, the first to use the term. Apparently, the decisive motive for its eventual emergence in the 1960s was not a shared perception among theoretical linguists that idealising language and marginalising social factors had seriously misrepresented human language; on the contrary, the 1960s witnessed a fresh burst of radical idealisations, as I have noted. Instead, the motive was to attenuate the worsening socio-economic problems and inequalities in the 1960s through institutional initiatives directed toward divergent language varieties within the society. When the dominant “Western economies” moved away from unskilled labour and factory production toward communication and information management, the actually prevailing variations among real languages or varieties were judged to be serious obstacles to “economic growth,” which seemed to call for a wider integration of the “working classes” and “minorities” by “improving” their language-dependent skills. Governmental institutions in the U.S., the U.K., and Western Germany, among others, decided to sponsor extensive research in the field that came to be called “sociolinguistics.”

The new discipline might have adopted several scenarios:

 

(a) The theories and methods of linguistics could be substantially retained whilst modifying some of the available terms and concepts to refer to “social” aspects or factors, e.g., “sociolect” and “idiolect” as two further constructs of “linguistic competence.”

(b) The theories and methods of linguistics could undergo cautiously regulated revisions to admit some socially relevant parameters of “variation” and restrict the uniformity and “homogeneity” assumed so far, e.g., by postulating “variable rules” alongside the usual “categorical rules.”

(c) Linguistics could be split apart into a “non-social” sector continuing as before and a “social” sector taking up a new programme for “sociolinguistics.”

(d) Linguistics could restore its focus upon fieldwork, which had continued along the margins of the mainstream, e.g., in the Summer Institute of Linguistics.

(e) Linguistics could stand aside and the research could be allotted to sociology proper.

(f) Linguistics and sociology as established so far would be combined.

(g) A novel discipline would be institutionalised, related to sociology and linguistics but developing new theories and methods.

 

To varying degrees, all of these scenarios have been favoured, at times in combination. But the willingness to deliberate and negotiate has been modest at best.

Linguistics might well have seemed a perplexing enterprise for social science and sociology, who would be nonplussed by announcements like “language exists perfectly within a collectivity” (Saussure 1966:14). Saussure himself had expressly posed the question, “must linguistics then be combined with sociology?” but had proceeded to suspend it by situating the concerns of sociology within an “external linguistics” tailored to subsume “everything” whose “exclusion” was “presupposed” by his “definition of language” (1966:6, 20).

Some later comments sound more defensive. Sapir blamed the “social sciences” for creating “the most powerful deterrent of all to clear thinking” by “instilling an evolutionary prejudice” that certain “familiar languages represent the highest development,” which modern linguistics sternly repudiated along with all “popular statements as to the poverty of expression to which primitive languages are doomed” (1921:123, 22). Chomsky was, as usual, bluntly dismissive: “most things in the social sciences” have “no intellectual depth” (1991:88).

On the other side, Firth (1957 [orig. 1935]:27) announced that “sociological linguistics is the great field for future research.” His own “schematic construct called ‘context of situation’” was intended to “make sure of the sociological component” (1957 [orig. 1950]:182). “We must take our facts from speech sequences verbally complete in themselves and operating in contexts of situation which are typical, recurrent, and repeatedly observable”; and these “contexts” should be “placed in sociological and linguistic categories within the wider context of culture” (1957 [orig. 1935]:35). Yet when Firth was conjecturing that it would be “much easier for a student of linguistics to acquire sufficient” “sociology” than for a “sociologist to acquire the necessary linguistic technique,” he recommended “building on the foundations of linguistics” more than “aiming at linguistic sociology” (1957 [orig. 1935]:28).

At all events, the term “sociolinguistics” has remained a signpost for the resolve that the field will be more “linguistics” than “sociology,” even though the former had long been indecisive about the relation between language and society. A troubling issue for sociolinguistics would be whether to maintain “theoretical linguistics” along with its conceptions of ideal language; or else to inaugurate a more “social linguistics” derived from real language.

 

3.1 Maintaining theoretical linguistics

 

To no one’s amazement, Chomsky has been a vociferous advocate of maintaining “theoretical linguistics” in the version he believes he can dominate. His distaste for a “social science” he cannot dominate has hardened into a grim conspiracy theory about “social and political analysis being produced to defend special interests rather than to account for the actual events,” and to create the “false impression” that “only intellectuals equipped with special training are capable of such analytic work” by “pretending to be engaged in an esoteric enterprise, inaccessible to simple people” (Chomsky 1977:4f). His majestic unawareness of having done precisely this in linguistics is a bit breath-taking; but he evidently feels compelled to use every means for defending his own enterprise against “theories concerning the study of language in society” (1977:54). He has gone so far as to allege that “the intellectually interesting, challenging, and exciting topics, in general, are close to disjoint from the humanly significant topics” (1991:88), with the marvellous corollary that the linguistics he favours would be all the more “interesting” for being “humanly insignificant.” And he made this corollary explicit too when he denounced the “real fallacy” in saying:

 

“I’m a linguist; therefore, in my time as a linguist I have to be socially useful.” That doesn’t make sense at all. […] your professional training as a linguist […] just doesn’t help you to be useful to other people. […] there is a lot of careerism in this. (Chomsky 1991:88).

 

Can this defiant repudiation of “social usefulness” have been provoked by swelling anxieties about the fate of the whole formalist programme in “theoretical linguistics,” which has been sustained all along by the frankest “careerism”?

When asked what “sociolinguistics” might do, Chomsky (1977:57) envisioned it “seeking to apply” “sociology to the study of language.” But his vision might astound many sociologists, since he again took ideal language to be given in advance (1.1). “The sole means of proceeding rationally” would be:

 

You study ideal systems, then afterwards you can ask yourself in what manner these ideal systems are represented and interact in real individuals. Perhaps sociolinguistics might come up with some sort of principle. (1977:54)

 

The term “study” can only mean here “invent and speculate about.” We might join him in being “sceptical” whether such a “study” could “draw much from or contribute much to sociology” or could “influence linguistic studies in some significant way” (Chomsky 1977:57, 192). Yet surely the “rational procedure” would be if anything just the reverse: to “study real systems” and “then afterwards” ask what mode of idealisation might lead to a suitable representation in terms of what those systems have in common (cf. section 4.1).

Chomsky’s strange vision accentuates still another rich irony: the “agreement,” diagnosed by a comprehensive survey of sociolinguistics like Dittmar’s (1976:132-3), that the “grammar model first proposed by Chomsky (1957, 1965) and later extended must be the starting point of all theoretical discussion” (as in Durbin and Micklin 1968; Kanngiesser 1972; Loflin 1970). A proximate step would be, as Chomsky’s own adumbrations intimated, to retain the heavily idealised deterministic, stable, and abstract notion of “language” whilst softening the assumption that language and society mutually render each other fully uniform.

Here, sociolinguistics might draw upon the conception of “dialects,” which has been prominent in historical philology and in fieldwork linguistics. Some early statements of linguists had episodically cited dialects among their reservations about how to delimit any language: “the dividing lines between languages, like those between dialects, are hidden in transitions,” and “it is impossible, even in our hypothetical examples, to set up boundaries between the dialects” (Saussure 1966:204); or “there is no absolute distinction to be made between dialect boundaries and language boundaries” (Bloomfield 1933:445). But these admissions were quickly left aside; Saussure couldn’t see that using “hypothetical examples” was the greatest obstacle against “setting up boundaries.” And his notion that “given free reign, a language has only dialects” makes you wonder how the language might be “reined” when he himself had decreed, as we saw, that “the community itself cannot control so much as a single word” (1966:195, 71) (1.2).

In another typical waffling, Lyons conceded that “a linguist” “will normally restrict his description to some pre-theoretically distinct dialect,” but still justified the “assumption” of an “overall system” which is “relatively neutral” about “differences of dialect, situation, medium, and chronological period” (1977:588). Just as we saw Saussure darkly vowing that “speech cannot be studied” “for we cannot discover its unity” (1.1), Lyons marginalised “language varieties” by arguing that “it would be absurd to hope to describe, or even to determine, all these differences within what we call, pre-theoretically, English” (1977:587). In light of the foregoing discussion, the irony of calling a real language a “pre-theoretical” entity should be exquisitely savoured along with the spice of Lyons vowing (in the same passage) that “idealisation” will rescue us from “absurdity,” after idealisation has regaled us with absurdities for decades.

Perhaps because the older conception of “dialect” was not deemed sufficiently “theoretical” (i.e., idealised), a contrastive pair of conceptions was introduced: the “idiolect” as “individually different speech behaviour, individual competence,” versus the “sociolect” as “speech behaviour specific to social groups, group-specific competence” (Dittmar 1976:133; cf. Decamp 1969). These conceptions might have spelled the end of the “homogeneous language community” and the “ideal speaker-hearer” (cf. Klein 1974), but some sociolinguists had other plans. Through a minimal “extension of generative grammar,” the over-arching “grammar” in the Chomskyan sense was said to “generate all the idiolects of the language and only these,” where each “idiolect” is (what else?) “an infinite set of sentences” and has its own “idiolectal grammar” comprising (what else?) “a specific finite set of rules of an individual speaker-hearer’s linguistic competence” (Decamp 1969:18). Described in these terms, the “individual” is every bit as “ideal” as Chomsky’s own “speaker,” albeit no longer in a “completely homogeneous speech-community,” and still “knows the language perfectly,” even if he may be the only one around who knows it. Chomsky’s (1965:25) much-quoted idea that “as a precondition for language learning,” a child “must possess a linguistic theory that specifies the form of the grammar of a possible human language” would imply that the speaker’s “theory of language” given as an “innate predisposition” (1.2) also specifies the over-arching “grammar” which “generates his own idiolect” as well as the others.
He might then be “multi-idiolectal” and prone to “idiolect-switching,” and his knowledge would encompass “multiple infinities” constrained only where the grammar stipulates which “idiolects” cannot be “generated.” The prospect arises of an extended “derivational history” wherein every sentence starts in the over-arching “grammar” and gets “generated” along into the “idiolectal grammar” before our “speaker” can say anything, although, if we use our terms strictly, as I pointed out in 2.1, the ideal speaker never does say anything because he has been “idealised” out of “performance,” quite apart from standing transfixed by at least one “infinity of sentences.”

Yet, as we saw, Chomsky’s “speaker” avowedly “does not exist in the real world” (1.1) and is fully interchangeable with the “homogeneous community” (2.1). In terms of practice, these two factors would be highly inauspicious for sociolinguistic research on “idiolects,” witness the absurdity of salvaging the “homogeneous community” by assigning it only one “speaker-hearer.” If, as Labov (1969:759) has surmised, “constructing complete grammars for idiolects” is a “fruitless task,” then chiefly because the established conception of “grammar” is too idealised to admit of individual specifications, and because many specifications of an idiolect would be not “grammatical” but lexical or “lexicogrammatical” (cf. 4.1). Proposals such as Decamp’s raise the troublesome prospect of complicating the relation between a real language in society and the ideal “language” of Chomskyan linguistics with multiple “grammars” whose “complete construction” remains far out of reach. The already muddled status of “methods of analysis for constructing a grammar,” which Chomsky had excused “linguistic theory” from “providing” (1.1), would become even more unruly.

In terms of theory, however, we might consider how the introduction of “idiolects” could bear on the problematic method, scrutinised above, of allowing the “ideal speaker” to be represented by a theoretical linguist whose “idiolect” might entail some untypical features, e.g., a proclivity to produce sample sentences like John is as sad as the book he read yesterday or is Brazil as independent as the continuum hypothesis? (Chomsky 1965:183). Such an “idiolect” (you might almost say “idiotolect”) is symptomatic for a highly unrepresentative “performance” calculated to provide reverse evidence for “competence” through “deviations from the rules” that would not be found in “the actual use of language” (cf. Chomsky 1965:4).

At all events, sociolinguistics has focused not upon idiolects but upon “sociolects” on the reasonable though still unproven assumption that these are fairly well-defined and not unmanageably numerous or individualised. A prominent and paradigmatic contrast, no doubt encouraged by the mandate to integrate minorities in the U.S., was drawn between “Standard English,” which had hitherto been quietly identified with “English” per se by generative linguists, versus “Black English Vernacular” (Labov, Cohen, Robins, & Lewis 1968) or “Negro Nonstandard English” (Loflin 1969). Once more, the usual conceptions of “grammar” and “competence” were retained, but now implicating a fresh decision between two prospects, each entailing its own problem. The more pessimistic prospect (already raised for “idiolects”) would be that each variety has its own separate and independent “grammar”; then the acquisition of the “standard” by speakers of a “non-standard” would require essentially learning a second language against substantial interference from the first. The more optimistic prospect would be that the varieties of English share their “grammar” in respect to “deep structure” or “competence” and differ only in their “surface structure” or “performance” (cf. Loflin 1969); the problem there would be that “deep” and “surface” or “competence” and “performance” were not conceptualised to underwrite concrete language programmes. In fact, if “universality is claimed” for “deep structures” (Chomsky 1965:118), then they are equally and necessarily accessible to all speakers, and such programmes would be pointless (cf. Beaugrande 1997e).

Some sociolinguists did remark that a language variety is more stable and orderly than would be suggested by the orthodox view of “performance” (quoted above) having many “deviations from the rules.” We might postulate a new level in between “competence” and “performance,” e.g., as “systematic performance” in contrast to “actualised performance,” or as a “contingency grammar” (cf. Houston 1969, 1970). Real language would be circuitously described as “neither a set of rules nor a set of sentences” but as “actual sound realisation which completes well-formed sentences with hesitation pauses, repetitions, ungrammatical sequences, anacolutha etc.” (Houston 1970:11). We witness here a demonstration of Chomsky’s own notion of starting from ideal and moving toward real, again as if the set of “grammatical sentences” were given in advance of all “actualised performance,” whereas the rational ordering which linguists follow in practice — if they still work with “sentences” at all — must be just the reverse: taking real language and idealising it into “grammatical sentences” or into Lyons’ “system-sentences” (2.2).

An alternative option for sociolinguistics, one not too far removed from these notions, would be to retain the uniformity of the language system across a whole society whilst relaxing the determinacy within the system. Instead of multiple “language varieties,” we could then have “variable rules” within one language (e.g. Labov 1969). Predictably, some linguists protested that “variable rules” could foster “drastic and undesirable changes in current theories” (Bickerton 1971:460). A key issue there would again be the illustrious dichotomy between “competence” versus “performance,” about which Labov (1969:759) did indeed feel “not sure whether this is a useful distinction in the long run,” fearing the “use of performance as a waste-basket category, in which all convenient [or inconvenient?] data on variation and change can be deposited.”

In retrospect, we should take special note of how the orthodox notion of “rules” was taken over even into programmes that otherwise departed quite dramatically from conventional linguistic theory, such as Hymes’ (1967) and Klein’s (1974). Back in sections 1.1 and 2.2, I aired the problem that rules, notably the “transformations” and “rewriting rules” we still see in these programmatic studies, may simply get rid of the data they are claimed to explain or account for. In particular, complex or variable data might get suspended by converting them into simple and uniform data even where variation was just what sociolinguistics set out to describe. Even odder would be the construction of “rules” to convert grammatical sentences into “ungrammatical” ones, as in Houston’s “contingency grammar,” since whatever the “rules” of “grammar” might do, generating “ungrammatical sentences” is surely the one thing they must not do.

So the status of the new types of “rules” remained somewhat evasive. For Labov et al. (1968: 88ff) (quoted in Dittmar 1976:134), “categorical rules are difficult to define, as they are never broken” and “are invisible to speakers”; and “variable rules” “are known to the analyst as a result of his investigation” whereas “normally speakers cannot make any direct pronouncements” about them. The very moves to postulate two different modes of “rules” already carried the reservation that both are “invisible” to speakers, which might remind us of Chomsky’s original denial (critiqued in 1.2) that the “speaker of a language” “is aware of the rules of the grammar or even” “can become aware of them.” A further problem impends if the concept of “categorical rules never being broken” might imply at least some domains of a language where “performance directly reflects competence,” which Chomsky (1965: 3f) has roundly declared “it obviously could not” “in actual fact,” though it could “under the idealisation” of the “speaker-hearer,” where we might wonder how to recognise the “direct reflection of an idealisation” when we see it. Such “categorical rules” would be empirically intractable if we could establish them only after demonstrating the impossibility of “breaking” them in an “infinite set of sentences” or even just in a corpus of real data so large that we can be reasonably certain we have covered all relevant cases; and I shall indicate in section 4 why we are still far from any such goal, although some of our corpora are several orders of magnitude larger than sociolinguistics could have envisioned during the stages examined here. A disturbing corollary would be that all rules may prove to be variable when we have enough further “results of the investigation”; and this would definitely lead to “drastic changes in current theories” (although my own proposals for shelving the concept of “rules” in section 4 will be considerably more drastic). 
Alternatively, we could define “rule-breaking” in some specialised terms, e.g., by unloading all “breakings” into the class of “errors,” which, virtually by definition, constitute negative confirmations of the “rules”; or by introducing strange “rules” whose sole function is to break other rules during “actualised performance” (Houston again). Either way, the border between “variations” and “rule-breakings” would remain empirically intractable, like that between “categorical rules” and “variable rules,” insofar as neither the idealised “grammar” nor the “ideal speaker-hearer” who “knows” it “exist in the real world” (Chomsky), whereas real speakers might be rendered unreliable or unrepresentative by their own “idiolects.”

These, then, are some problematic implications of those scenarios that would maintain the established conceptions of “theoretical linguistics” either in a separate non-social domain or else with some cautious revisions in “grammars,” “rules,” etc., adapted to “sociolinguistics.” The gravest source of problems has continued to be the ambition of sustaining ideal language and the “idealisations” which linguists since Saussure have expected would somehow make “language” into a “well-defined object in the heterogeneous mass of speech facts” but which, I submit, have cumulatively had just the opposite effect of keeping it ill-defined.

 

3.2 Inaugurating a social linguistics from real language

 

The converse scenario would be to inaugurate a genuinely “social linguistics” derived from the real systems of languages as we can observe them in society. For Labov (1970a), “it seemed natural enough that the basic data for any form of general linguistics would be language as it is used by native speakers communicating with each other in everyday life”; and Fishman (1971:9) expected a “real linguistics” to emerge as an “extended notion of speech analysis” “once it has been accepted that speech descriptions should take account of the social context” (cf. Dittmar 1976:131f). But we have surveyed a constellation of issues that would have to be resolved before we could expect a “real linguistics” to be “accepted” as “natural.”

Sociology exerted some pressure from the other direction in its efforts to adhere quite closely to “reality” and to be sceptical about high-level theorising. A symptomatic stance in “Western” sociology has been called “positivism,” purporting to “objectively describe” a society and “construing its work as ideologically neutral,” without providing “any useful analysis of the social, cultural, and political implications of its practice” (Pennycook 1994:138).

Insofar as the field of sociolinguistics was expected to actually alleviate language-related problems, positivism was hardly an appropriate stance. In the 1960s, social change would have seemed to be a highly constructive and welcome motor for unlimited “economic growth,” to which sociolinguistics could materially contribute. Yet its official mandates did not specify how projects for alleviating language-related problems might merely forestall genuine social change or make some minor cosmetic changes to help the current structure of society work more smoothly and to defuse potential conflicts along with their “social dynamite” (du Bois-Reymond 1971:40). “Pacifying the ghettos” (Dittmar 1976:ch. 7) would be the ideal evasion for not dealing with the fact that ghettos ought to be incompatible with a modern democracy.

To grasp the mandate of sociolinguistics within its wider context, I would  diagnose a pervasive discrepancy between theory and practice (Beaugrande 1997a, 1997e, 1997f). In “modern capitalist societies,” key terms like “social stability” and “economic progress” have theoretical meanings sharply at variance with their practical meanings; and assiduous effort goes into mystifying the variance. In theory, they designate the maintenance of a peaceful and orderly society, free of major crises and conflicts, where prosperity steadily rises for the benefit of all citizens. In practice, they designate the maintenance of conditions wherein the winners who really are benefiting do not get seriously challenged by the losers who are not, even when the winners are vastly less numerous than the losers, the gaps are wide and growing, and the real trends add up to a carefully concealed or denied economic shrinkage. The winners who benefit from the “flow of capital” are majestically indifferent to the social inequalities among the losers, who are constantly told by public media and “conservative” politicians to blame themselves alone (cf. Reich 1991).

Similarly, “civil rights,” “equal opportunity,” “free market,” and so on in theory designate the basic guarantees of a “capitalist democracy,” but in practice designate the mechanisms whereby the society can be “freely” reshuffled to suit the restless movement of “capital” (Martin & Schumann 1996; Ohmae 1996). In some stages of modern consumerism (e.g., the 1950s and 1960s), “economic growth” has meant spreading the capital around within a larger consumership who buys huge quantities of moderately-priced commodities; in others (e.g., the 1980s and 1990s), it has meant concentrating the capital within a smaller consumership who buys modest quantities of high-priced commodities, which are being multiplied by the runaway advances in expensive technologies with rapid turnovers,  and which can be swiftly distributed to a world-wide elite and proudly displayed as symbols that the whole society is improving its “modern way of life.” Whereas profits were formerly dispersed among workers in societies with strong unionisation and worker-benefit laws, profits are now being concentrated among the elite owners, managers, and shareholders of multinational corporations who withhold benefits from their workers and suppliers by operating wherever wages and raw-material prices are cheapest and labour laws are the weakest (Manley 1991; Reich 1991, 1993). By locating their headquarters in “offshore” tax havens and transferring their operating costs from place to place, these corporations pay little or nothing back into the social programmes of local governments, and even demand massive public subsidies for starting or maintaining production sites (Martin & Schumann 1996).

Back in the 1960s, the real “economic growth” in “capitalist democracies” made improvements in “civil rights” and “equal opportunity” seem affordable, indeed profitable, for integrating talented and industrious individuals from a wider spectrum of society. However, the integration was made contingent upon assimilating to the social order and accepting the allegiances and values of the “mainstream culture” (cf. Cross 1974). This contingency was duly reflected in the mandate for sociolinguistics: to investigate how a wider spectrum of the society could be included in “economic growth” on the condition of assimilating to the “standard language,” but not to attenuate language differences as sensitive factors in economic competition. Indeed, linguistic assimilation could be an excellent test for an individual’s diligence to subserve “economic growth” and the “mainstream culture” of its chief beneficiaries.

Under any conditions, scientists and academics tend to be anxious about vacating the serene position of ideological neutrality, particularly when they are pressured to consider whether and how society should be stabilised or transformed on the basis of their research. The anxiety would naturally be acute among sociolinguists, given the history of modern linguistics making it a foundational principle to renounce all traditional projects to change or “improve” language. Already in Saussure’s estimation, “no society” “has ever known language other than as a product inherited from preceding generations”; “we can conceive of a change only through the intervention of specialists, grammarians, logicians, etc., but experience shows us that all such meddlings have failed” (1966:71, 73).

Sociolinguists might have quietly suspected that they were being handed an ambivalent enterprise entailing a “moral dilemma,” to borrow a phrase from Paulston (1971). A provisional solution would be to view it as two distinct enterprises: (1) describing the linguistic status quo regarding language varieties and sociolects; and (2) designing programmes for interventions in the status quo. The first was where significant advances were achieved in describing language varieties and highlighting their major differences. But the second encountered substantial obstacles against recommending and implementing language changes through workable programmes. This dualistic solution had its precedents within sociology proper. There, the results could be exploited by social institutions to maintain the status quo by describing the “social order” as a set of “objectively given facts”; but those results were also a precondition for any realistic projects to transform the status quo (cf. Beaugrande 1997).

Modern linguistics had in fact sustained its own version of the disconnection between theories of equality and inclusion versus practices of inequality and exclusion (Beaugrande 1997c). As we have seen, the “collectivity” or “community” of speakers was conceived by linguists like Saussure and Chomsky to be “perfect” and “homogeneous,” possessing neither the “will” nor the “control” to shape or change the language. This counter-intuitive conception falls into place when we recognise that their term “language” refers to an ideal system, whereas speakers can only shape or change real systems. The next step in this reasoning converts both the “community” and the “speaker” into ideal beings who know the ideal system “perfectly” and are not troubled by the “incalculable accidents in the exercise of language (accidents de la parole)” (Hjelmslev 1969 [orig. 1943]:94). When Saussure (1966 [orig. 1916]:14) proposed to “separate language [langue] from speaking [parole]” in order to “separate social from individual,” he also vowed to be “separating [what] is essential from accidental” (cf. 1.2; 4.1). So the global and explicit inclusion in the “social” domain entailed the local and implicit exclusion of the “individual” real speaker, which ominously matched the strategies in “modern democracies” for crediting the society with the humane effects of the social order whilst blaming individuals for the inhumane effects. The social order is essentially democratic and fair, and only accidentally undemocratic and unfair, even when, as in the 1990s, the majority of the citizens rightly suspect they are being treated unfairly, e.g., in the now-familiar scenario where a profitable company suddenly gives its workers the choice between accepting lower wages or getting laid off.

The paradox of a fair society somehow totalling up from a mass of unfair “accidents” bears an eerie resemblance to the submerged paradox of a “perfect language” somehow totalling up from a “heterogeneous mass” of “fragments and deviant expressions.” The second paradox implies yet another absurdity: “language” being not just independent of “speaking,” but a wholly different type of system.

Such a paradox would be a debilitating heritage for sociolinguistics by suggesting “good” values for sociolects and “bad” values for idiolects. The next absurdities soon follow: since every idiolect is to a large extent based upon at least one sociolect, exactly those features which distinguish the idiolect must be the “bad” ones; and developing an idiolect would be like surrendering to “accidents” or acting in socially “deviant” ways.

To make substantive progress, sociolinguistics had to proceed on quite different assumptions: some sociolects (e.g., those of discriminated minorities) do carry low values, whilst some idiolects (e.g., those of popular rock stars) do carry high values. The institutional mandate for “assimilation” ominously encouraged sociolinguistics to accept differential values as given and permanent social facts. Depending on how the relations among sociolects and idiolects are defined (the options were compared in 3.1), “compensatory” language programmes would assign to individual speakers one of two tasks: either to switch their whole sociolects from a bad “non-standard” one to a good “standard” one; or else to strip away just the “bad” features of their own idiolects and paste on “good” features. Either task presupposed that an individual’s sociolect or idiolect is a matter of free personal choice, and that the specific features of a language variety and their relative values have been precisely and consensually defined; yet sociolinguistics has provided overwhelming evidence to the contrary (cf. Pennycook 1995; Phillipson 1992).

So the inclusive theory of “standardisation” has persisted alongside exclusive practices. Speakers who have assimilated are judged “qualified” for “upward social mobility,” whereas those who have not or could not are judged “unqualified” and perversely clinging to their “ignorance” and “illiteracy.” Public outcry over a supposed “literacy crisis” has diverted attention toward a hunt for scapegoats, usually among the language teachers, and away from the social bias that mistakenly sees illiteracy in what is actually the language diversity that leaps into view when schools and colleges finally adopt “open-door” policies (Beaugrande 1984).

In this ambience, the “deficit hypothesis” was highly likely to emerge, but just as likely to be misunderstood. In “capitalism,” whose very name announces that money is the primary factor in human society, speaking low-valued sociolects would be accounted a “deficit”: a pungently economic term intimating that you lack something comparable to capital, and can expect to come up short. The “free society” would in turn offer to “qualified” individuals the chance to “pay back” or “pay in” the “deficit” by assimilating to the “standard.”

The fairness of the social order would be proven by offering a language ladder to the “socially disadvantaged,” this last being a term which, like most of its counterparts (e.g. “culturally deprived” and “educationally deficient”), cautiously avoided mentioning economic inequality (cf. Dittmar 1976:85). Like our other social ladders, this one continues to have some missing or slippery rungs toward the bottom, so that learners whose home sociolects are closer to the “standard” will climb much more easily, while the rest are subjected to a protracted process of disconfirming their “language competence” (Beaugrande 1997c, 1997e). Ironically, learners with “non-standard” home sociolects have to be far more diligent to succeed; and even then can be suspected of having profited unfairly from “reverse discrimination.” This suspicion exactly matches the obsessive campaign of right-wing discourse to prove, by simple repetition, that any social programmes to aid disadvantaged minorities constitute gross violations of “freedom” and “equality” (Ref 19).

The chief architects of the “deficit hypothesis” could hardly have appreciated how they were being set up to furnish an alibi for the very social order whose fairness they resolutely intended to debunk with their own work. They were also being set up to fall unluckily into the gap between two incompatible modes of idealising language, one in value-laden prescriptive education and one in value-free descriptive linguistics. The “hypothesis” must have unpleasantly reminded linguists of those “popular statements” contrasting “developed languages” against “primitive languages doomed to poverty of expression,” for which Sapir had emphatically criticised the “social sciences” long before (3.1). Yet a linguistics that had disconnected language from society could hardly find new solutions for the problems of language education in societies where popular biases about specific languages and language varieties exert a very real impact upon social practices. Such a linguistics would face the task of trying to dislodge popular idealisations with its own “scientific” idealisations, which were singularly unadapted to the task.

Sociolinguistics thus landed awkwardly in between the old idealisations sustained mainly in disorganised practices in search of “good usage” and “correct grammar”; and the new idealisations sustained mainly in super-organised theories with no ambition to guide or transform practices. Bernstein’s (1961:169f) own characterisations of the “elaborated code” having “accurate grammatical order and syntax” and the “restricted code” having “unfinished sentences” and “poor syntactic form” landed somewhere between the “right” versus “wrong” of traditional grammar and the “grammatical” versus “ungrammatical” of theoretical linguistics.

If the formulation of the “deficit hypothesis” was fully predictable, so was the failure of the “remedial” language education proposed to “compensate” for the “deficit.” The failure was also a smug success for a curious spectrum of diverse groups: theoreticians who dismissed the hypothesis as “unscientific”; liberals who judged it “discriminatory” or “racist”; and conservatives in education and in “mainstream” society who accepted the hypothesis but insisted that projects for compensating the deficit would erode “standards” in the schools and encourage “unqualified” persons in the job market. Such was the fierce alliance that confronted the hapless architects of the “deficit hypothesis” and proceeded to excoriate them for both their theories and their practices.

Without wanting to be unfair, I cannot help wondering how far the sociolinguistic projects that maintained theoretical linguistics, which I sketched in section 3.1 ahead of the proper historical sequence, may have been animated by the furore that engulfed projects like Bernstein’s for developing a social linguistics derived from real language. The idealisations of generative linguistics and its technical terminology about an intangible “competence” might have offered a highly attractive alternative to the disputatious realities. Loflin’s (1970:29) counsel to “move away from the non-empirical approach” was pungently commented on by Dittmar (1976:149): “‘empirical’ means here: making deductions according to the principles of generative transformational theory”; and “‘evidence’” means “results” which “confirm principles that have been determined a priori” and which are “relatively independent of the existing empirical reality.”

The technicality of formalist sociolinguistics might exclude many policy-makers and educators from the discussion, and create one more publicly inaccessible domain of elitist knowledge (Eisenberg & Haberland 1972). Problematic or controversial social issues could be draped in convenient obscurity. Consider how this tactic was applied to Bernstein’s “concept of sociolinguistic code,” which he evidently intended to orient toward real language by “pointing to the social structuring of meanings and to their diverse but related contextual linguistic realisations” (Bernstein 1967:126). This “definition” got charged with “unacceptable circularity” between “speech code” and “system of social relations” defined in terms of set theory (Dittmar 1976:10; cf. Kanngiesser 1972:89) on the grounds that “no notion Bi ∈ S1 (i = 1, 2, …, n) may be defined by reference to a notion Bj ∈ S2 (j = 1, 2, …, m) whose construction presupposes Si (or subsystems of Si), and vice-versa” (Kanngiesser 1972:89). This imposing formula would presumably mean: “no notion in a set may be defined by reference to a notion whose construction presupposes it or parts of it, and vice-versa,” but my restatement in accessible language is already subtly different. But the “notions” of a theory about language and society are by no means as simple, well-bounded, and enumerable as the arbitrary symbols in set theory. The very nature of both language and society ensures that the major notions about either one richly interconnect with (and “presuppose”) each other. What might look like a circularity in the idealised logical system of sets and elements can be a reciprocal dialectic in a social system (2.2; 4.1). So this purported refutation merely retreated once more into ideal language in search of formalistic arguments seeking to evade a reorientation toward real language.

Unfortunately, even the powerful work by Labov and similar sociolinguists to establish the validity and value of those “non-standard” language varieties popularly construed to incur a “deficit” was lessened in its public impact by the formality and technicality of the presentation, e.g., about the “simplification of monomorphemic consonant clusters” or the “deletion of the copula be” (Labov et al. 1968; Labov 1972; Wolfram 1969). Terms like “simplification” and “deletion” imply that certain phonological or grammatical elements are (or were) there in English and that “Black English Vernacular” (“BEV”) alters or removes them, whereas surely Labov’s key point was that “BEV” is a system in its own right. The conception “monomorphemic consonant cluster” was derived from standardised English orthography and from a morphemic analysis which real speakers of “BEV” probably do not perform either consciously or non-consciously. The “copula” was postulated by analogy either to “standard English” or to other “grammatical environments” in “BEV,” on the assumption that it is a system-property or “rule” for all utterances and must get “deleted” wherever it does not appear. Yet Standard Russian has no such “copula” in the present tense, and linguists would seem Anglocentric or merely fanciful to assert that Russian speakers are going around “deleting” it for some unknown reason, as if under a secretive Dostoyevskyan compulsion; so too it could seem “standardocentric” to describe “BEV” this way. Such descriptions betray a residual orientation toward ideal language, whose “rules” are still clearly and formally defined but now with specific adjustments to accommodate “variability.”

To the extent that these sociolinguists retained the theoretical and terminological apparatus of formalist linguistics, their successful demonstrations of the “logic of non-standard” varieties (Labov 1970b) were likely to enlighten only those language professionals who already viewed the “deficit hypothesis” as critically as they did. In contrast, the people who saw an adaptive value in believing that speaking a “non-standard” variety causes a “confounding of reason and conclusion” (Bernstein 1961:169-70) or “a total lack of ability to use language as a device for acquiring and processing information” (Bereiter & Engelmann 1966:39) were themselves illogically “confounding reason and conclusion” and misunderstanding the cognitive interface of language, and were unlikely to “process the information” the studies had provided. Their views had not been based upon rational evidence to begin with, and would not be changed by it now.

At all events, the global recession from the mid-1970s to the present has effectively blunted the movements for linguistic equality. Literally all over the world, language has become the leading symbolic pretext for sustaining ethnic, racial, and gender-based inequalities that are no longer officially admissible as such. “Linguistic human rights” remain unprotected, even in relatively affluent regions with presentable “democracies”; the dimensions of the problems have been surveyed in Phillipson, Skutnabb-Kangas, & Rannut (1994). Today, languages and sociolects can tip the balance toward or against “economic survival” within the globalised “free market,” wherein ruthless competition ensures that very few agents are in reality free, except in the ghastly ironic sense that they are being pushed ever nearer to the extremity of having to work “for free.”

Meanwhile, the “conservatives” have succeeded in pushing their educational agenda whilst deploying bureaucratic tactics like budget cuts, administrative decrees, and discriminatory hirings and firings, to suppress egalitarian methods (Aronowitz & Giroux 1986). The language sector is being forcefully propelled backwards into prescriptive impositions of “standard language,” just as if sociolinguistics had not produced massive evidence questioning whether such an enterprise is either justifiable or feasible. Even linguistic minorities who ardently defend the value of their own language or sociolect are compelled to deal with its frank devaluation in daily practices by the holders of economic power.

In response, sociolinguistics has been moving well beyond the phonological and grammatical emphases of the early stages into the “paradigm of discourse sociolinguistics,” which promotes “critical discourse sociolinguistic analysis” across a broad spectrum of socially important domains, including education; “the results of our studies” might “make transparent inequality and domination,” “propose possibilities of change,” and “provide instruments for less authoritarian discourse” (Wodak 1996:6, 32, her emphases). “The adoption of ‘critical’ goals means, first and foremost, investigating verbal interactions with an eye to their determination by, and their effects on, social structures” in ways often not “apparent to participants”; “hence, ‘critique’ is essentially making visible the interconnectedness of things” (Fairclough 1995:36). “Sociology and linguistics, sociolinguistics and discourse theory intersect” in “addressing” the “general problem” of the “negotiation and construction of understanding” and in pursuing the “emancipatory claim” to “help remedy the inequalities” (Wodak 1996:6). Similarly, post-modern “cultural studies” are seeking to “offer the basis for creating new forms of knowledge by making language constitutive of the conditions for producing meaning as part of the knowledge/power relationship”; we do not “simply situate the analysis of language in the discourse of domination and subjugation” but seek to also “develop a ‘language of possibility’” and place a prominent “emphasis upon perceiving language as both an oppositional force and an affirmative force” (Giroux 1992:164, 167-68).

 

4. Language and society from the standpoint of very large corpora

 

In this final section, I shall explore some prospects for an unconventional and possibly radical reorientation toward real language as the latter is represented by the evidence in very large corpora of authentic data from text and discourse. In a genuinely data-driven linguistics or sociolinguistics along these lines, our conceptions of “language” would be markedly different from those assumed in most of theory-driven linguistics so far.

 


4.1 Language as a different type of system

 

Since corpus linguistics is still in the process of working out its full implications, my proposals must necessarily be exploratory and programmatic, and my demonstrations episodic. This section will outline a set of interlocking conceptions which, for clarity of presentation, are marked off in sequence with Roman numerals from I to XX.

 

I. A language comprises a set of standing constraints that persist on the plane of the system (e.g., the English article going before the noun, not after it) plus emergent constraints that are decided on the plane of the discourse (e.g., the lexical choices appropriate to a festive banquet speech). We can account for many recalcitrant problems in linguistics since Saussure as the outcome of attempts to describe language as a deterministic system constituted entirely by standing constraints: in Meillet’s (1903-04:641) well-known formulation, “un système très délicat et très compliqué où tout se tient rigoureusement et qui n’admet pas de modifications arbitraires et capricieuses” [“a very delicate and very complicated system where everything holds together rigorously and which admits no arbitrary and capricious modifications”]. Generations of “linguistic theories” have thereby aspired to a definitive completeness that was never remotely approached in the practices of analysis and description. The emergent constraints either got discounted as properties of a “heterogeneous mass of speech facts whose unity we cannot discover” (1.1); or else were overgeneralised to be standing constraints.

II. The constraints are not only linguistic, but also social and cognitive in nature (Beaugrande 1997a). Each of these three modes can be distinguished in specific cases: definite and indefinite articles in English for linguistic; performatives (e.g., promising, warning) for social; and self-directed actions (e.g., coughing, showering) versus other-directed actions (e.g., telephoning, slapping) for cognitive. But specific cases by no means suffice for the general conclusion that the three domains should be boxed up separately for the sake of scientific procedure. All three modes routinely interact in real discourse, and if our analyses of real data were required to break them apart, we would soon be right back in the fruitless quest for “language by itself.”

III. A dialectic obtains between the two sides previously construed to be dichotomies. The tri-modal interaction between standing constraints and emergent constraints co-ordinates the dialectical relations between social versus individual, homogeneity versus heterogeneity, competence versus performance, regularity versus innovation, and so on. The co-ordination maintains both poles of these relations on the side of real language, whereas the conventional blurry dichotomies have opposed one ideal pole against one real pole as a prelude to isolating the ideal, as we have repeatedly seen.

IV. The interactions among constraints vary along the parameter of delicacy, a concept due to the work of Halliday (e.g. 1961) and Hasan (e.g. 1987). The more “delicate” the constraints, the more strongly they tend to converge upon specific selections and combinations, right down to contexts where just one expression seems suitable. Typically, standing constraints are less “delicate” than emergent ones, but the proportions are unstable, since the standing constraints on idioms, fixed phrases, and so on are nonetheless quite delicate.

V. To assess degrees of “delicacy,” we should fully acknowledge the unity of the lexicogrammar as being more “delicate” toward the “lexical” end and less so toward the “grammatical” end (Hasan 1987). Large corpus data impressively demonstrate how grammatical patterns prefer certain types of lexical items, and how lexical items prefer certain grammatical patterns (cf. Francis 1993; Louw 1993; Sinclair 1996). Moreover, real language data follow standing and emergent constraints which are not readily classifiable as either lexical or grammatical, but only as both.

The diversity and richness of the lexicon, which have spurred the notion in conventional linguistics of the lexicon being merely unsystematic (2.2), can be reassessed as the product of a natural evolution to accommodate the diverse needs of social groups in ways unavailable to phonology and grammar. The motive of those groups is to constitute or regulate the relevant patches within the lexicon, and not to preserve the tidiness or exactitude of the lexicon as a whole. Nor could their efforts be devoted to maintaining consistency or completeness, even where the morphological material might seem clear. For example, a large and erudite dictionary of English reveals closely matching pairs of which just one succeeded, e.g., insert and exsert or exhume and inhume, which the Random House Webster’s of 1991 (pp. 696, 472, 468, 693) still lists with no indication that any of them might be at all unusual. According to the datings listed, the successful item appeared a century or so before the unsuccessful item, which was perhaps coined by some scholars whose ambitions to enhance the logic and symmetry of the lexicon were not shared by society.

VI. Instead of describing the lexicogrammar in terms of the “rules” and “features” in conventional theoretical linguistics, we can develop two Firthian conceptions: “colligation” for a “syntagmatic relation” and “mutual expectancy” among “elements” of “grammatical” “structure”; and “collocation” for “words” “presented in the company they usually keep” (cf. Firth 1968:186, 111, 182f, 106ff, 113). These two tendencies can be observed at many degrees of “delicacy” in large corpus data and constitute modes of order that cannot be distinctly seen, let alone validly described, on the basis of intuition or introspection about invented data.

VII. We can discard the idealised “grammatical competence” postulated by the formalists since Chomsky; despite their empty invocations of an “infinite set of sentences,” it is closed in theory because all rules are presumed to be known to the ideal speaker-hearer, and is remotely connected to practice, if at all. The “communicative competence” postulated by the ethnographers since Hymes is always open in theory to the recognition of new or more delicate constraints in real discourse. The limits are practical ones, determined by the range of discourse domains in which you happen to participate.

VIII. This open conception of “competence” can now be elaborated in terms of colligability and collocability: the predispositions of real speakers regarding the grammatical and lexical combinabilities within the language as an open and evolving system of linguistic, social, and cognitive constraints. Sociolinguistic investigations of sociolects or idiolects might weigh Firth’s (1968:195) surmise that “characteristic distributions in collocability” can constitute “a level of meaning in describing the English” of a “social group or even one person.”

IX. The openness hinges most vitally upon the complementarity between your passive competence, which suffices to understand the bulk of contemporary discourses; and your active competence, which is still only a fraction of the total language community’s. The differential between what you as a real speaker-hearer can understand versus what you are likely to find occasion to say is thus far more significant than implied by speculations about ideal language, e.g., that the “synthesis and analysis” that “speaker and hearer must perform” “are essentially the same” (Chomsky 1957:48). As I hope to illustrate in the next section, the differential is typically navigable in ways that effectively mediate between the partial version known to any single speaker and the language as known to a whole community. Contact with real discourse can readily activate convergences that are novel to you even when the converging constraints are familiar. If varying degrees of competence can be gauged by the capacity to navigate these frontiers of novelty, then your current competence can always be stimulated to expand by broadening the range of real data you encounter. But the process may not look much like “language learning” in the conventional sense of schooling and education, which emphasises the borders, limits and errors in performance (Beaugrande 1997c).

X. In place of the idealised notion of language having stable and determinate meanings based upon “reference,” “denotation,” “semantic features,” and so on, in the theorising of conventional semantics, we can develop a theory of adaptive meanings with two senses (Beaugrande 1997a). In the more familiar and linguistic sense, the meanings of expressions “adapt” to context. The total range of adaptation for the meaning of an expression typically corresponds to its frequency of use across multiple discourse domains, e.g. “place”; conversely, meanings that adapt only mildly usually go with expressions that are seldom used, e.g. “fetlock,” or are strictly technical, e.g. “hypoblast.” A monitor corpus to which new data are steadily added can also help us retrace the evolution of a given range of adaptations, as when a technical expression (like “black hole” coined by John Wheeler) gets popularised under new meanings, e.g., for the “offshore” financial institutions that conceal undeclared and untaxed financial assets already amounting to a trillion dollars in 1987, according to an estimate of the International Monetary Fund, on top of the two trillion dollars the IMF can register (Martin & Schumann 1996:94).

In the less familiar and more social sense, the participants in the discourse seek to determine those meanings that hold “adaptive value” for themselves. A straightforward example can be found in the term “culture,” as defined in a corpus-based dictionary like the Collins COBUILD (p. 345). “Culture” figures in inclusive discourses about “the ideas, customs, and art that are produced or shared by a particular society”; and in exclusive discourses about “the quality of being well-mannered and well-educated, especially when you have a good knowledge of the arts.” The exclusive meaning has a high adaptive value for the elites who can justify their high status by virtue of their “manners” and “education,” and a maladaptive value for the non-elites who cannot.

A principle for integrating the dual senses of the “adaptive theory” just proposed might be: whenever a language is actualised in discourse, the adaptation of meanings to context is controlled both by the regularities of the participants’ own sociolect and idiolect and by the respective adaptive values for the social evolution of those participants. Regularities and adaptive values may pull in incompatible directions: dominant participants may determine a meaning with a high adaptive value that is quite at variance with usual meanings, as when “democratic reform” was associated with “promoting anarchy” in data sample (24) shown in 4.2.

XI. The apparent heterogeneity and indeterminacy within the system viewed as an isolated abstraction actually constitute strategic bands of undecidability to be adapted on-line to the multifarious modes and motivations of discourse. Many alternatives, concurrences, ambiguities, and so on, are left undecided on the plane of the system precisely in order that the system can evolve and continually adapt to new contexts. The “rules” and “features” of formalist linguistics could be viewed as theoretical tools for squeezing out undecidability; but the latter merely moved over into the “rules” and “features” themselves. The implacable disputes among formalist linguists are a clear symptom of being unable in principle to decide precisely how any of their deterministic systems might be related to a real language like contemporary English. Those same systems have miscast as mere disorder the undecidability that is vital for the operation of real language (2.2).

XII. The relation between language, sociolect, and idiolect, which we have seen fomenting problems in sociolinguistics, might be modelled as a three-way dialectic for regulating the bands of “undecidability.” Some portion of the undecided constraints of the language are decided in any one sociolect; and some portion of the latter’s undecided constraints are decided in any one idiolect. The scale and parameters of these respective portions will have to be determined empirically from extensive corpus data before we can formulate the theoretical principles whereby language, sociolect, and idiolect either coincide or differ — how “homogeneous” or “heterogeneous” language in society might actually be.

XIII. Or again, we might find evidence of a four-way dialectic wherein these three systems interact with the discoursolect, being the current and episodic on-line system. I have not seen this term proposed so far in linguistics or sociolinguistics, where it might get promptly dismissed for collapsing the staid distinctions between “langue” and “parole” and between “competence” and “performance.” But if we grant that those distinctions have implied an implausible mismatch between order versus disorder, and that the transition from language into discourse must rather be a shift between two similar modes of order (2.2), the “discoursolect” could plausibly be the most specific systemic mode of order with the narrowest bands of undecidability. It could comprise the converging totality of “systemic” controls which are actualised in the discourse, and which ensure its “systemic” organisation despite the numerous and precise accommodations to local circumstances.

I might recall here the remark of Wellek and Warren (1956:152) that the relation between “langue” and “parole” might be parallel to the relation between the “literary work of art” and any one “individual realisation”; both the “system” and the “work” represent “a collection of conventions and norms whose workings and relations we can observe and describe as having a fundamental coherence and identity.” Their remark was directed to the problem of different readings of the “same” text in literary studies, a domain heavily preoccupied with questions of interpretive authority. But, like other formulations originally addressed only to literature, this one may be applicable to discourse in general: the discourse could be the actualisation of a system upon which the respective participants’ processing activities are based without performing identical operations or producing identical results. Still, adapting the terms “langue” and “parole” in this novel sense could invite unwelcome confusion.

Or again, I might recall the “text grammars” proposed in the early stage of “text linguistics” (e.g. van Dijk 1972). There too, literary discourse was a major concern, presumably because it seemed to present the greatest resistance against the normative idealisations that were fashionable in theoretical linguistics at the time. As in some sociolinguistic work cited in section 3.1 (e.g. DeCamp’s), the concept of “grammar” as a “formal device” was maintained, but this time with some more significant modifications. In order to “give a more adequate account of the systematic phenomena of natural language by describing and explaining more facts” and “providing more relevant generalisations,” van Dijk (1972:3) “introduced the concept of the text as the basic linguistic unit manifesting itself, as discourse, in verbal utterances” and possessing more “significant empirical reality” than does the “sentence.” He expressly called for a “theory of performance” to “formulate the regularities underlying specific uses (applications) of systematic rules by individual writers and readers” (1972:180). But he still assumed that “the heterogeneity of the studied empirical data” would “require several steps of idealisation and generalisation which bring any theoretical formulation rather far from the concrete empirical objects”; and he did not question that “formal theories like grammars will always be related to idealised language systems and thus will abstract from idiolectal and even dialectal differences” (1972:203, 189). The “speech community” would still be “based” on “standard language” “defined in sociolinguistic terms, e.g., as the language of a certain social (middle) class, normatively used in educational systems and prevalent in mass media” (1972:189). Reverting fully to orthodox linguistics, he suggested that the “standard” is “defined” by a “set of semantic, syntactic, and morpho-phonological rules also implicitly known by all speakers,” plus an “open vocabulary” (1972:191): the lexicon on the margins once again.

These moves situated the “text” as a theoretical unit upon a high plane of abstraction, where constraints are “formulated” in “rules forming and relating semantic structures with phonological structures of all the well-formed texts of language” (1972:11). Here too, the theory cannot deal with the emergent constraints decided on the plane of the discourse. Consider Lakoff’s (1968) study of “pronouns and reference,” proposing that reference in noun-phrases moves from most specific to least, as in invented data like (1). But real data like (2) can be found doing just the reverse.

 

(1) Napoleon arrived at the palace. The conqueror of Austria was in high spirits. I never saw such an elated man. He hardly ever stopped talking.

(2) Who should walk in but a venerable old man in whom His Grace immediately recognised one of the saints of the church, no other than the Right Reverend Sergius (Nikolai Leskov)

 

Lakoff’s proposal resembled philosophical logic in assuming that the identity of a referent should always be specified in higher detail before lower detail. But Leskov’s tale gains dramatic effect by gradually specifying the identity of the mysterious nocturnal visitor. My point would be not that Lakoff’s proposal is “wrong,” but that it was stated upon an unduly general plane, as so often occurs in extrapolating from logic to language.

XIV. By describing the dialectic in the sequence of language, sociolect, idiolect, and discoursolect in terms of decidability, we are not committed to the notion that each system is essentially a specification of the one to the left of it. The voluminous disputes over “standards” and “ungrammaticality,” whether in society, education, or linguistics, suggest that a system may also be a transformation, especially where a sociolect is asserting group identity, empowerment, and resistance against a dominant disempowering norm, or where an idiolect or a discoursolect is self-consciously striving for individualisation and innovation. Even opposition against a system is systemic, just as resistance against social norms is an intensely social act. The “style of the author” and the “style of the text,” so often invoked in literary studies, are emphatically constructs of order rather than disorder, even when the individual appears to eclipse the social, as befits the popular myth of the artist as a great misunderstood outsider or outcast who pushes and strains at the borders of language, presumably the exact opposite of the “ideal speaker-hearer” with his “infinite” gallery of banal “sentences” (2.1; 3.1).

As I noted for sociolinguistic research like Labov’s in 3.2, the notion of a “standard language” still implies some basis or framework in respect to which rules are said to “simplify” or “delete” certain features within a sociolect. This implication is not merely at variance with the sociolinguists’ own principle that each sociolect is a self-sufficient system, but also puts merely formal operations in the place of the adjustments real speakers make, conscious or not, when they shift between sociolects.

XV. An alternative account would be to reverse the sequence into “discoursolect - idiolect - sociolect - language” and to describe it as a progression among steadily higher degrees of idealisation. So far, sociolinguistics has tended to regard all of these (except the “discoursolect”) as ideal systems, with the uneasy implication that they are given in advance, witness Chomsky’s irrational advice to start right off “studying ideal systems” (3.1). Corpus research should allow us to describe all four of these systems on the basis of real language, and, in the process, to distinguish when “idealisation” is a real drift within the language or the activities of its speakers, and when it has been merely a privileged expediency of theoretical linguists to back away from real language.

XVI. Corpus research could also support a fresh balance between the dominant strategies of highlighting similarities in linguistics proper and highlighting differences in sociolinguistics. Saussure’s counsel to “determine what is universal” in all languages (1.1), which by now has been reflected in extensive research, may have originally been a reflex of seeking “language by itself” by retreating from the individual languages like English or French into “language” in the abstract. Large corpus data present a complex, fine-tuned picture of similarities and differences in the individual language which can shed light on the comparable picture of a society. The two pictures correspond in a meticulous dialectic whose contours only begin to emerge when our sample corpus is very large indeed.

XVII. Working with large corpora obviates all the old wafflings about whether to include or exclude the relation between language and society, because society is omnipresent. Corpus data are the products of a huge population of real speakers (or writers), and, particularly in mass media discourse, intended for a still huger population of audiences. Some mass media have already gathered empirical information about their audiences, which we could consult when constructing social profiles of participant groups.

XVIII. A cogent opportunity is open here for linguists to be re-integrated with the society by virtue of being members of the population for whom such discourses as those in the corpus were intended: newspaper and magazine readers, radio listeners, television viewers, and of course conversational participants. We can lay aside all implicit claims to hold super-human powers of “awareness” which were bestowed upon us by an advanced degree in “theoretical linguistics” and which authorise us to access the “perfect grammar” of the “ideal speaker-hearer” (1.2; 2.1). Instead, we can justify our claims to expertise by accumulating and applying insights about the regularities of large corpora of data produced from and for real speaker-hearers whom we ourselves resemble far more closely than we resemble that solitary “ideal” one who would seem to know everything and say nothing (section 2.1).

XIX. We can profit richly from the parallel opportunity to investigate linguistic analysis and description as specialised modes of social activity. Field studies of actual work with real data could help to make explicit the steps whereby linguists identify and test prospective regularities (cf. Sinclair 1996). We might finally get “intuition” and “introspection” into the scope of reportable data and determine the respective contributions of a linguist’s idiolect or sociolect to interpreting authentic language data and passing judgements upon whether specified types of speakers are more or less likely to say them in relevant social situations.

Here if anywhere, we could finally come to grips with “idealisation” and its myriad problems, some of which I have undertaken to portray. As I proposed at the end of section 1.2, our goal would not be to suppress the “idealisation” of language, but rather to investigate and regulate “idealisation” as a range of activities adapting to the values of respective social groups.

XX. One provisional account would be that language is a system designed to oscillate between real and ideal in all the ways a society might require: between instance and class, token and type, literal and metaphor, time and tense, indicative and conditional, proper noun and common noun, and so forth. Attempts to highlight the “system” of language would tend to be static second-order idealisations of the dynamic first-order idealisations prefigured in the design of language. Our challenge is to work on multiple levels of description and analysis while respecting rather than suspending the dynamics.

 

4.2 Some samples of corpus data

 

The conceptions surveyed in the foregoing section about language as a system can now be explored through some modest sets of corpus data. Obviously, these data can only be suggestive and far from conclusive, the more so as corpus linguists must consciously resist drawing unduly wide generalisations. What corpora actually supply is chiefly hypotheses which we may find interesting enough to pursue by means of other sources, such as data obtained by sociological and ethnographic methods.

The corpus data I shall use came from the world’s largest corpus of real language data, developed under the supervision of John McHardy Sinclair at the University of Birmingham under the names “Bank of English” and “COBUILD,” the latter being an acronym for the “Collins Birmingham University International Language Database.” By June 1996, it had reached the size of 323 million words of running text; but the data I shall be looking at were taken in July 1994, when it had reached the size of approximately 200 million words. The data have been assembled from contemporary spoken and written sources extending from the 1980s onward, such as: British and North American books; newspapers (Times, Independent, Guardian, Today, Wall Street Journal, New Scientist, Economist); magazines (e.g., Esquire, Good Housekeeping); ephemera such as letter-box mailings (e.g., YMCA appeal for homeless people, Friends of the Earth Tropical Rainforest Campaign); radio broadcasts (British Broadcasting Corporation in the UK and National Public Radio in the US); and recordings of spoken conversations.

The basic architecture of access seems deceptively simple: one or more key words are searched, and the occurrences are displayed toward the centre of data lines. The default line-length at COBUILD was 80 characters but could be expanded for more context; at around 250 characters, I found, most data become reliably clear. The hidden power of the access modes lies in signalling the interaction between the key word(s) and other selections whose co-occurrence is visibly far higher than chance, lending new force to Sinclair’s (1984:97) precept that “text is much more determined than is normally supposed” — far from an array of “incalculable accidents” (3.2).
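The display just described, with the key word centred in a fixed-width data line, can be approximated in a few lines of Python. What follows is only a minimal sketch under stated assumptions: the function name kwic and its parameters are hypothetical and are not taken from the COBUILD software, and the toy text merely reuses two of the data lines cited in this section.

```python
import re

def kwic(text, keyword, width=80):
    """Return key-word-in-context lines with the key word centred.

    `width` is the approximate line length in characters (80 was the
    default reported for COBUILD); this helper is a hypothetical sketch.
    """
    text = " ".join(text.split())          # collapse line breaks and runs of spaces
    half = (width - len(keyword)) // 2     # context characters on each side
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - half):match.start()]
        right = text[match.end():match.end() + half]
        # pad both sides so that every key word starts in the same column
        lines.append(left.rjust(half) + match.group(0) + right.ljust(half))
    return lines

# Toy input built from data lines cited in this section.
sample = ("whether job bias is widespread enough to warrant special protections. "
          "if the food shortage is severe enough to warrant breaking the embargo.")

for line in kwic(sample, "warrant", width=40):
    print(line)
```

Raising width toward 250 reproduces the expanded context mentioned above; a real concordancer would of course work on an indexed corpus rather than a single string.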

To see how linguistic, social, and cognitive constraints interact on various “linguistic levels” (in the sense of section 2.2), and how the “lexicogrammar” ranges in “delicacy” (4.1), I would cite one data sample for the English verb warrant (cf. Beaugrande 1996). Among the 228 occurrences in the Bank of English, fully 224 had third person subjects, versus just 4 in first person and 0 in the second person; and, within the third person, I found a mere handful of pronoun subjects he (6 occurrences), she (0), they (5), and it (7), all the rest being noun subjects. The most relevant constraints appeared to be social: actions and events rather than people are said to do the warranting by being in some way unusual or significant enough that a reaction might well be in order, and those who might be expected to do the reacting are saying why or why not they are going to, and how. Accordingly, the speaker (or, when the discourse is reported, its originator) was usually a person representing some social institution or authority, and the data signalled what kind: government, judiciary, military, sports, business, science, and medicine.

Some of the best-documented lexical constraints upon grammatical classes were both quite specific and quite orderly: for nouns, evidence (21 occurrences), investigation (12), trial (7), and punishment (5); for modifiers, enough (58), sufficient (27), serious (14), important (5), and severe (5). The predominance of legal discourse reflected here is reminiscent of the common uses of the noun warrant in legal terms like search warrant and death warrant, although these go back to the earlier meaning related to “certification” and (also etymologically) to “guarantee.”
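Counts of this kind can be obtained by tallying the words that fall within a small window on either side of the key word. The following is a minimal sketch, assuming a plain list of concordance lines as input; the function name collocates and the window size are hypothetical choices for illustration, and the four toy lines are data samples cited in this section, not the full set of 228 occurrences.

```python
from collections import Counter
import re

def collocates(lines, keyword, window=2):
    """Count words occurring within `window` tokens of `keyword`.

    A hypothetical helper for illustration; it makes no attempt to
    separate nouns from modifiers, which would require tagging.
    """
    counts = Counter()
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        for i, token in enumerate(tokens):
            if token == keyword:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts

# Toy input: four of the data lines quoted in this section.
sample_lines = [
    "the declines are too modest to warrant the phrase recession",
    "whether job bias is widespread enough to warrant special protections for gay",
    "if the food shortage is severe enough to warrant breaking the embargo",
    "the national objectives at stake warrant the deaths of US troops",
]

for word, n in collocates(sample_lines, "warrant").most_common(5):
    print(word, n)
```

Even on four lines the modifier enough recurs; whether such a co-occurrence is “far higher than chance” would need to be assessed against the word’s overall frequency in a corpus of realistic size.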

Not having persons as grammatical subjects can allow subtle evasions of human agency by the source or speaker, who may be smugly unaffected by a recession (3), a job bias (4), or a food shortage (5), and whose death does not depend on whether national objectives send US troops off to war for oil (6).

 

(3) the declines are too modest to warrant the phrase recession,” said Lewis

(4) whether job bias is widespread enough to warrant special protections for gay

(5) if the food shortage is severe enough to warrant breaking the embargo # This report

(6) the national objectives at stake warrant the deaths of US troops # Oil,

 

Human suffering appears here as a factor within the calculations of faceless authorities about whether human responses might (or more likely, might not) be warranted.

Corpus data like these keep directing our attention back to the social constraints on sets of choices, and at a degree of delicacy we would be unlikely to achieve with unaided intuition (Francis & Sinclair 1994). The same factor emerged from a set of queries I entered to explore the common usages of major technical terms in my volume of deliberations on the “foundations of a science of text and discourse” (Beaugrande 1997a). I had long been concerned with the subtle influence of non-technical usage upon technical terms, even in the work of meticulous scientists like Piaget (Beaugrande 1995-96); and now I took the opportunity to sharpen the focus.

One set of queries I entered was for the paired terms stability and instability, for which the data base returned a total of 31 occurrences each. Whereas a dictionary definition like “continuance without change” (Random House Webster’s, p. 1299) suggests that stability might be said of all manner of things or persons, the corpus data displayed it actually being said of just a few specific things and only once of a person (whose mental stability was being questioned).

The leading topic was the political or economic conditions within a society, where the dominant adaptive value appeared to lie in designating those conditions that resist (or should resist) significant change, whether they obtain at present (7) or are being sought for the future (8). Such stable conditions were brightly associated with peace (8, 10), prosperity (9), security (10, 12), and the new world order envisaged by the U.S. (8), yet some darker countercurrents can be detected. An authoritarian regime may be the agent, as in Malawi (7); an internally unstable system like the former Soviet Union (cf. sample (19) below) may contribute to external stability after the West decides it is not the threat it once was (10); stability may be paradoxically linked to pressuring for sweeping changes in a region like North Korea, whose undeniable stability is unpalatable to the West (11); and a regime may defend stability by imposing strict repressions with its security forces against workers seeking wage increases (12). The Indonesian case in sample (12) most clearly shows the adaptive value of using stability to mean the authoritarian maintenance of a state entailing substantial but unacknowledged factors of instability, such as horrendous inequalities in wealth and privilege.

 

(7) The oasis of stability of which Malawi’s leaders boast

(8) interests in a new world order, stability in the Balkans, peace on Cyprus

(9) prosperity in the region. Greater stability is based on economic prosperity

(10) The Soviet Union is not viewed as the threat it once was to Asian security and in that respect Soviet ideas about peace and stability in the region are now treated far more seriously

(11) China’s co-operation is the key to maintaining stability in Asia, pressuring North Korea

(12) Indonesia’s minister for politics and security has said that strict measures will be taken against people organising strikes because of the threat industrial action poses to national stability

 

Just how strict the measures can be was gruesomely revealed by the fate of two female union organisers who tried to direct a strike in an Indonesian factory: their tortured corpses were found in the factory’s garbage dump (International Herald Tribune, 18 March 1996). Bad publicity for human rights groups but good for investors: not long after, the German Congress of Industry and Commerce (Deutscher Industrie- und Handelstag) published a study praising the “political stability” and “especially good investment conditions” in Indonesia (Frankfurter Allgemeine, 21 May 1996).

The other main use in the data, though less common, concerned the value of money, e.g.:

 

(13) factor will be price and currency stability, along with general economic

(14) commitment to sterling’s stability in the ERM, there are fears that

 

We might infer a down-to-earth concern about the buying power of ordinary wages but for the other data where large and powerful interests were concerned and where attempts to raise wages were judged a threat to national stability (12). Also, the sterling’s stability in international money markets is a main concern not of ordinary wage-earners but of high-powered profiteers whose successful campaign against the British pound (and shortly after the lira, the peseta, and the franc) in 1992 cracked the stability of the European Monetary System and cost the taxpayers of Europe at 100 billion German marks (Martin & Schumann 1996:89), all perfectly legal within the “globalisation” of the money market.

The occurrences of instability in the corpus were mainly in foreboding contexts, e.g., about conditions instigated against the disapproval of Mr de Klerk in South Africa, mainly calls for racial equality (15); or about the arrests of people who organise students and young people in Burma (16), a place where accurate statements about the political situation would automatically be judged anti-government propaganda. Sinister contextual cues included urban discontent (17), economic backwardness (18), danger (19), corruption (20), racial conflict (21), bloodshed and poverty (22), chaos (23), and anarchy (24). The opposition between democratic reforms plus anarchy versus democracy plus stability in sample (24) displays a discoursal contradiction arising dramatically from the split between the theory versus the practice of democracy, as discussed in section 3.2. Sample (21) confirmed my surmise that the stability being defended in Indonesia in sample (12) is beset with inherent instability.

 

(15) and everything that instigates instability.” Mr de Klerk also made

(16) three people have been arrested in Rangoon for alleged anti-government activities. The radio said they were accused of organising students and young people to try to create instability and had published anti-government propaganda.

(17) was seen as a source of economic instability, urban discontent, and hence

(18) present economic backwardness and instability. As long as 80 million mainly

(19) danger they could use the current instability in the Soviet Union to gain

(20) corruption and political instability, as Latin America has shown.

(21) of possible racial conflict and instability in Indonesia.

(22) years of bloodshed, poverty, and instability.

(23) violent, but in such chaos and instability, and this country has seen

(24) He proclaims his goal as “democracy with stability” and is not interested in democratic reforms that promote “anarchy or instability.”

 

Taken together, these corpus data indicate that the dominant meanings of our key words are not just much more specific than their usual dictionary definitions would suggest, but are dynamically evolving within public discourse about social, economic, and political conditions which powerful interests may favour or disfavour. Stability may offer legitimacy to one undemocratic regime in capitalist Malawi (7) welcoming foreign investors and friendly with the apartheid regime in South Africa at the time, but not to another regime in socialist North Korea (11) closed to investors (who may be dreaming of another boom like the one in South Korea) and hostile to the United States. Also, stability may be the basis of prosperity (9), yet may be threatened when industrial workers take actions to attain prosperity (12) or when democratic reforms get proposed (24).

Within my own idiolect, the meanings of stability and instability are more elaborated than the meanings indicated either by dictionaries or by the regimes of Malawi and Burma. For me, stability designates conditions, however unstable in themselves, wherein powerful interests are firmly opposed to change, whereas instability designates whatever changes they oppose at any one moment. Those interests adaptively deploy the terms to invoke absolute and unquestionable values: good for stability, and evil for instability, to avoid probing the human consequences of either one. Even so, the implicit contradictions emerged in some data more clearly than I would have predicted, e.g., when a political leader solemnly associated democratic reforms with anarchy (24). I might conclude that the contradictions within a society will be reflected in its discourse even when powerful participants would gladly keep them hidden. Such a conclusion was in fact drawn by some inaugural figures in discourse analysis, such as Althusser and Foucault, but neither of them supported it with representative discourse data. For us, the conclusion remains a working hypothesis to be tested against quantities of data several orders of magnitude larger than the sampling just presented.

Socially significant data were also returned on a query for the terms multicultural and multiculturalism, whose linguistic evolution in public discourse should reflect the social evolution of the real phenomena in those English-speaking societies which have traditionally considered themselves monocultural and which are now facing a de facto multiculturalism. There, the adaptive meanings of the terms should hinge upon whether and how societies in economic recession will respect or exploit cultural differences when dividing up the benefits of a society wherein the inclusive theories of “equal opportunity,” “free market,” and so on, now designate the exclusive “freedom” of the upper 20% to squeeze a “growing” share out of the other 80%.

We could predict that multiculturalism will be featured in current discourse that either legitimates or contests the rapidly widening social inequalities. The leading strategy in the “conservative” right-wing press is to denounce multiculturalism as a grave danger to the mainstream culture, e.g. in the U.S. National Review:

 

(25) multiculturalism is far more than a radical ideology or misconceived educational reform: it is […] a systematic dismantling of America’s unitary national identity in response to unprecedented ethnic and racial transformation the debunking of multiculturalism must continue. (27 April 1992)

(26) many current public policies have an unmistakable tendency to deconstruct the American nation, [such as] official bilinguism and multiculturalism (22 June 1992)

 

By presenting itself as debunking — defined in the Random House Webster’s (p. 350) as “exposing as being false or exaggerated” — any discourse which affirms the relevance and value of multiculturalism, right-wing discourse seeks to mystify its own spurious accusations about deconstructing the American nation and to set the stage for continual confrontations.

The corpus data indicated that the prevalence of multiculturalism is at least generally acknowledged, viz.:

 

(27) and more fully portraying the multicultural nature of Britain’s society.

(28) just that we accept we are a multicultural society. MO2 Yes we are but

(29) there is no way back from today’s multicultural society to the ethnic

 

Again predictably, the divergent adaptive values of the various meanings of multiculturalism were contested between an opportunity to be welcomed, e.g. (30-34), versus a disruption to be deplored, e.g. (35-37). I italicize the contextual cues that, in my own intuition, indicate the respective discursive intentions.

 

(30) the virtues of a multiracial, multicultural society. At the outset it is

(31) elements of a new, more open, multicultural America # And you know,

(32) contribute to our strength as a multicultural society that welcomes diversity

(33) and co-operatively together in a multicultural country is one of the most valuable

(34) to recognise the developing multicultural make up of society as a pearl

(35) the complications of our multicultural society in ways that the young

(36) been replaced by the buzzword “multicultural” # which by definition separates

(37) of quality is sacrificed for multicultural equality

 

Data like (36-37) specify how right-wing discourse works to suggest that social and cultural differences are inevitably divisive, and that the differing groups are oddballs or troublemakers whose demands for equality lead to a loss of quality (37), with the usual implication that minorities are by nature “unqualified,” especially due to their “non-standard” language (cf. section 3.1).

The corpus data further indicated how language choices and discoursal strategies reflect social attitudes in contests over the adaptive meanings of linguistic expressions by systematically dramatising conflict (38-40) and by portraying any concern or respect for multiculturalism as a meek or prissy conformity (41-43).

 

(38) The fiercest battleground of the multicultural wars, however, involves the

(39) of reports on the struggle over multicultural education # US public schools

(40) controversy over the use of a multicultural curriculum in schools is

(41) strength. Here, any nebulous multicultural civility would be an evasion of

(42) Commissioning Editor for Multicultural Programmes, who slavishly

(43) of clichés about the necessity of multicultural diversity, the iniquity of Arts

 

Less frequently attested were contexts wherein multiculturalism figured as a resource for counterbalancing social conflict:

 

(44) a halt on violence. We call on a multicultural revolution of values in our

(45) seek peace and political stability, multicultural sensitivity, quality consciousness

(46) Fighting racial hatred with multicultural theater and music # The story

 

The domain of education, wherein cultural attitudes are decisively moulded for many young citizens, was predictably featured in its own contrast between ameliorative (47-48) and pejorative (49-51) contexts:

 

(47) the program emphasizes multicultural awareness and group cooperation

(48) the campus all the benefits of a multicultural environment. LTH In addition

(49) who are angry that the multicultural curriculum will teach children

(50) and want to throw out the entire multicultural curriculum

(51) position of having to do remedial multicultural education for roughly a third

 

The link to remedial in (51) builds again on the right-wing discourse strategy of insisting that measures to offer equality to minority cultures mean a sacrifice of quality (37).

The data also revealed an alternative strategy whereby multiculturalism, like so many other themes and issues, can feed the consumerism of a “modern society” eager for trendy innovations in fashions and the arts (52-54).

 

(52) sights sounds and energy of multicultural Britain. Join British soul

(53) institutions into a ferment of multicultural arts programs, featuring such

(54) and Tyson in the same heady, multicultural swirl. The travelling Cherry

From there, a small step leads to businesses enhancing their images by adapting their corporate identity (55), hiring directors of multicultural design (56), and sponsoring multicultural art museums (57). Their market strategies acknowledge minority groups at least as potential consumers (58-59).

(55) companies are likely to adopt a more multicultural corporate identity

(56) Daisy Chin-Lor, director of multicultural planning and design at Avon Products

(57) companies sponsoring multicultural art museums minorities (ethnic

(58) to 70 %. BDDP wants to create a multicultural advertising network. p In a

(59) leadership, the Tribune served a multicultural community that has grown to

 

In contrast, the term monocultural never appeared in the corpus as the prospective counterpart for a society centred on a single culture. I found only 4 occurrences, all in the agricultural meaning of “raising a single crop,” which, for the record, is also the only meaning given in the otherwise self-consciously progressive Random House Webster’s College Dictionary of 1991. This negative evidence fits the strategy, whose adaptive value has been explored in critical discourse analysis, of carefully leaving the mainstream invisible, as if it had no politics or special interests of its own but were simply the neutral zero grade or centre from which all differing cultural positions can be objectively viewed as deviant (cf. Giroux 1992; Fairclough 1995). The news media and their controllers evidently wish to focus public attention on multiculturalism as a free-standing phenomenon whose topical interest is highest for conflicts. In return, monoculturalism need not even be examined or defined, let alone objectively demonstrated to be superior. The diverse opponents of multiculturalism can keep their own ideologies of greed, selfishness, intolerance, and aggression comfortably out of public discourse.

The tiny range of corpus data I have presented in this section should nonetheless suggest why such data might be of great interest for sociolinguistics. We can see the fallacy of supposing, in the stolid tradition of Saussurian and Chomskyan linguistics, that corpus data are so disordered (“heterogeneous,” “deviant,” etc.) and “accidental” as to resist a description of language. They do resist premature idealisation; and that, I submit, is all to the good.

We may also begin to see how corpus data are orderly and motivated, but in rich and delicate ways which are accessible to our “competence” as speakers but not to our unaided “intuition” and “introspection” (4.1). In my own theoretical discourse, I have diagnosed a pervasive preference among scientists for theories and models that foreground stability and marginalise instability; I now find corresponding attitudes in public discourse about the organisation of society, even though most current societies are at least implicitly unstable. I can now feel motivated to explore other correspondences between scientific and public discourse as a corrective to the official value-free stance of positivist scientists.

Yet we may also see a more sobering factor: many socially relevant constraints will not be registered until the corpora are far larger than any we have now. Even the largest-ever Bank of English has only started to indicate what we can expect. Its evolution from 20 million to 200 million and then to 323 million words of running text has proven that the most vital gains through increases in size are not in frequency but in delicacy in the sense proposed in 4.1: not just how many occurrences we find but how much contextual information those occurrences can supply about interacting constraints. At any size, we will always have “patterns for which there is some evidence, but insufficient to make a conclusive case for significance” (Sinclair 1991:491). Bumping up the size may bring some of those patterns into focus, but will also throw up others which seem interesting for sociolinguistic research but which call for more evidence. For example, in my query to the Bank of English for indeterminate, the single most common attestation (4 out of 61 occurrences) was the age or years of women, whereas the age of men wasn’t found even once. These data may plausibly reflect the stereotypical evaluation of women by age, looks, hairstyle, dress, etc., but they too only suggest a tentative hypothesis for sociolinguistic and sociological research.
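A query of the kind just described can be approximated in a few lines of code. The following is only a minimal sketch, not the software actually used with the Bank of English: the function names (tokenize, kwic, collocates), the crude tokenizer, and the three-sentence sample text are illustrative assumptions invented for the occasion and not drawn from any corpus. The sketch retrieves each occurrence of a node word with a few words of context on either side and tallies the words found there.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def kwic(tokens, node, span=4):
    """Return (left context, node, right context) for each occurrence of node."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            lines.append((left, tok, right))
    return lines


def collocates(lines):
    """Count every word that appears within the span around the node word."""
    counts = Counter()
    for left, _, right in lines:
        counts.update(left)
        counts.update(right)
    return counts


# Invented sample text, standing in for real corpus data (an assumption).
sample = (
    "A woman of indeterminate age opened the door. "
    "The outcome remains indeterminate for now. "
    "She was of indeterminate years and wore a grey coat."
)

tokens = tokenize(sample)
lines = kwic(tokens, "indeterminate", span=3)
counts = collocates(lines)

for left, node, right in lines:
    print(" ".join(left).rjust(25), node.upper(), " ".join(right))
```

On real data, the tallies for words like age or years would still have to be checked line by line, since a raw count says nothing about whose age is at issue; that judgment requires reading the contexts.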

Sociolinguistics would also require the corpora to be more precisely differentiated for data-driven research on sociolects or idiolects, or even “discoursolects,” as advocated in 4.1. Such research would require parallel corpora whose size might depend chiefly upon the relative degrees of variation. When two varieties differ substantially, smaller corpora might support the research, at least for a time, e.g., the local varieties of English in Hong Kong, Singapore, Jamaica, Kenya, and so on, each represented by a million-word corpus within the International Corpus of English (ICE). But when two varieties differ more subtly, such as those clustered loosely around “Standard English” in the U.S., the corpora will need to be far larger. At the present stage, we cannot safely predict just how large; but we can safely predict that the question can be constructively posed when advances in the technology of computerised data bases soon enable us to work with much larger corpora and far more sophisticated software than we can now.

Sociolinguistics might also be interested in exploring the potential of corpus-browsing as a constructive activity for changing language attitudes. As argued in 4.1 and observed in my own work, a corpus can be a decisive resource for reconnecting the user to a large population of other language users and thus for testing, specifying, and modifying your personal intuitions. Corpus-browsing can re-open the practical limits on your “communicative competence” as a counterbalance to the negative experiences in language education.

Corpus data allow us to discard the sterile idealisation of seeking a set of “rules” for “assigning structural descriptions” to “all grammatical sentences” of an entire “language.” “Infinite sets” can exist only as ideas, and if we used our terms strictly, would undercut rather than support the proposed distinction between “grammatical” versus “ungrammatical.” A truly infinite set would contain all combinations and sequences, including ones extravagantly unlikely to occur, just as the infinite typing of chimpanzees in the well-known philosophers’ example would, in infinite time, produce the works of Shakespeare. In its strict meaning, “infinity” erases the borders not just between probable versus improbable, making our description meaningless, but also between possible versus impossible, making our description interminable. We could not validly construct a single sentence that could never occur in the set, however deviant it might seem to the intuition of the native speaker. So we can safely return the concept of infinity to its proper home in theoretical mathematics.

We can instead work from a very large corpus displaying only the finite set of combinations and sequences that have already occurred, and can explore how their contexts continually tune or reset the probabilities. Our theoretical explorations can inquire into the criteria for determining whether or how far a very large accessible corpus of determinate size can represent the far larger inaccessible corpus of indeterminate (but not infinite) size for the whole language. Of particular interest here, as I have remarked, is the openness of the native speaker’s competence for accommodating the evolution whereby new discourses are continually deciding new interactions among constraints or modifying some older ones.
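One simple way to picture how attested sequences set the probabilities is to estimate, from a finite sample, how often each word follows another. The sketch below is an illustration under strong simplifying assumptions, not a model proposed in this paper: only the immediately preceding word counts as context, and the function name bigram_probabilities and the eight-word toy sequence are hypothetical.

```python
from collections import Counter, defaultdict


def bigram_probabilities(tokens):
    """Estimate P(next | previous) as relative frequencies of attested bigrams."""
    following = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        following[prev][nxt] += 1
    return {
        prev: {nxt: n / sum(counter.values()) for nxt, n in counter.items()}
        for prev, counter in following.items()
    }


# Toy sequence invented for illustration; not corpus data.
toy = (
    "political stability economic stability "
    "political instability political stability"
).split()

probs = bigram_probabilities(toy)
print(probs["political"])
```

A sequence never attested, such as political followed by anarchy in this toy sample, simply has no entry. Adding further text would revise the estimates, which is the sense in which new discourse resets the probabilities.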

We can now also discard Saussure’s curious conviction that “no individual, even if he willed it, could modify” the language “in any way,” and that “the community itself cannot control so much as a single word.” Authentic discourse data refreshingly display how both the individual and the community can and do “modify” and “control,” although they are probably rarely aware of any explicit intention to do so, since their “communicative competence” remains open.

In all these ways, the contact with large corpus data offers multiple opportunities for sociolinguistics whose theoretical deliberations were formerly restricted or stymied by practical limits on our access to real language. Such opportunities seem all the more vital at a stage where the varieties and diversities of language are exerting such a decisive practical impact on human lives.

 

5. Conclusion and outlook: theory and practice again

 

I am only too conscious that the lines of argument in this paper are strongly at variance with long-standing commitments in language-related intuitions, especially “theoretical linguistics” and its more loyal clients in sociolinguistics. The allure of expedient idealisations is powerful, but the prices we have been paying are too high: stagnation of progress in both theoretical and practical research; breakdown of consensus; fragmentation and competition among gratuitous “minimalisms”; and evasion of social responsibility in the contest over language pedagogies and multicultural education. A stance of Kuhnian resignation, waiting for the generation of hard-core idealising “normal scientists” to retire, is dangerous. What must be retired is the free license to idealise language, and as soon as possible.

In return, we gain the challenge of exploring new terrains. In the words of the eminent neurologist Gerald Edelman (1992:65, 71), who has been assiduously seeking to put language upon a new biological and neurological basis:

 

The best time to be working in a science is when it is in a crisis state. It is then that one is prompted to think of a new way of looking at the data, or of a new theory or of a new technique to resolve an apparent paradox […] We are at the frontier, a place where boundaries shift, where, although amenities may be lacking, the sense of excitement is heightened.

 

References

 

Abraham, Werner et al. (eds.) (1996). Minimal ideas: Syntactic studies in the minimalist framework. Zaragoza: Pórtico Librerías.

 

Afheldt, Horst (1994). Wohlstand für niemand? Die Marktwirtschaft entläßt ihre Kinder. München: Kunstmann Verlag.

Aronowitz, Stanley, & Giroux, Henry. (1986). Education under siege: The conservative, liberal, and radical debate over schooling. South Hadley, MA: Bergin & Garvey.

Bailey, B.L. (1965). Toward a new perspective in Negro English dialectology. American Speech 40, 171-77.

Bakhtin, M.M. (1981). The dialogic imagination. Austin: University of Texas Press.

Barber, Benjamin (1996). Coca Cola und Heiliger Krieg: Wie Kapitalismus und Fundamentalismus Demokratie und Freiheit abschaffen. Bern: Scherz Verlag.

Beaugrande, R. de (1984a). Text production. Norwood, NJ.: Ablex.

______ (1991). Linguistic theory: The discourse of fundamental works. London: Longman.

______ (1994). Function and form in language theory and research: The tide is turning. Functions of Language 1/2, 163-200.

______  (1995-96). Special purpose language in the discourse of epistemology: The “genetic psychology” of Jean Piaget. Linguistica e letteratura 20-21, 227-259.

______  (1996). The ‘pragmatics’ of doing language science: The ‘warrant’ for large-corpus linguistics. Journal of Pragmatics, 25, 503-535.

______  (1997a). New foundations for a science of text and discourse. Greenwood, CT: Ablex.

______  (1997b). On history and historicity in modern linguistics: Formalism versus functionalism revisited. Functions of Language.

______ (1997c). Society, education, linguistics, and language: Inclusion and exclusion in theory and practice. Linguistics and Education.

______  (1997d). Performative speech acts in linguistic theory: The programme of Noam Chomsky. Journal of Pragmatics.

______  (1997e). Theory and practice in applied linguistics: Conflicting, estranged, or cyclical? Applied Linguistics.

______  (1997f). Theory versus practice in language planning and in the discourse of language planning. World Englishes.

Bereiter, Carl, & Engelmann, Siegfried (1966). Teaching disadvantaged children in the preschool. Englewood Cliffs: Prentice-Hall.

Bernstein, Basil (1961). Social structure, language, and learning. Educational Research 3:163-76.

______ (1964). Elaborated and restricted codes: Their social origins and some consequences. American Anthropologist 66/2:55-69.

______ (1967). Elaborated and restricted codes: An outline. International Journal of American Linguistics 33/2:126-33.

Bickerton, D. (1971). Inherent variability and variable rules. Foundations of Language 7:457-92.

Bisseret, Noelle. Class, language, and ideology. London: Routledge.

Bloomfield, Leonard (1933). Language. Chicago: University of Chicago Press.

Cameron, Deborah (1992). Feminism and linguistic theory. London: Macmillan.

Cherryholmes, Cleo (1988). Power and criticism: Poststructural investigations in education. New York: Teachers College Press.

Chomsky, Noam (1957). Syntactic structures. The Hague: Mouton.

______ (1965). Aspects of the theory of syntax. Cambridge: M.I.T. Press.

______ (1977). Language and responsibility. New York: Pantheon.

______ (1991). Language, politics, and composition. In Gary Olson and Irene Gale (eds.), Interviews: Cross-disciplinary perspectives on rhetoric and literacy. Carbondale: Southern Illinois UP. 61-95.

Cross, Patricia (1974). Beyond the open door. San Francisco: Jossey-Bass.

Crowley, Tony (1989). Standard English and the politics of language. Urbana: University of Illinois Press.

Currie, Haver C. (1952). A projection of sociolinguistics: The relationship of speech to social status. Southern Speech Journal 18:28-37.

Decamp, D. (1969). Toward a formal theory of sociolinguistics. Austin: University of Texas thesis.

Dijk, Teun van (1972). Some aspects of text grammars. The Hague: Mouton.

Dittmar, Norbert (1976). A Critical survey of sociolinguistics. New York: St. Martins.

Durbin, M., & Micklin, M. (1968) Sociolinguistics: Some methodological contributions from linguistics. Foundations of Language 4, 319-31.

Eisenberg, P., & Haberland, H. (1972). Das gegenwärtige Interesse an der Linguistik. Argument 72/3-4:326-49.

Goffman, E. (1964). The neglected situation. American Anthropologist 66:133-36.

Edelman, G. (1992). Brilliant air, bright fire: On the matter of the mind. New York: Basic Books.

Escribano, José (1993). On syntactic metatheory. Atlantis 15/1: 229-267.

Firth, John Rupert (1957). Papers in Linguistics 1934-1951. London: Oxford UP.

______ (1968). Selected Papers of J.R. Firth 1952-1959, ed. Frank R. Palmer. London: Longman.

Fishman, Joshua (1971). Sociolinguistics: A brief introduction. Rowley, MA: Addison-Wesley.

Fairclough, Norman (1995). Critical discourse analysis. London: Longman.

Francis, Gill (1993). A corpus-driven approach to grammar. In Mona Baker, Gill Francis, and Elena Tognini-Bonelli (eds.), Text and technology: In honour of John Sinclair. Amsterdam: Benjamins. 137-156.

______, & Sinclair, John McHardy (1994). I bet he drinks Carling Black Label: A riposte to Owen on corpus grammar. Applied Linguistics 15:190-200.

Giroux, Henry (1992). Border Crossings: Cultural Workers and the Politics of Education. London: Routledge.

______ (1988). Teachers as intellectuals. New York: Bergin and Garvey.

Halliday, Michael (1961). Categories of a theory of grammar. Word 17/3: 241-92.

Hasan, Ruqaiya (1987). The grammarian’s dream: Lexis as most delicate grammar. In Michael Halliday and Robin Fawcett (eds.), New Developments in Systemic Linguistics. London: Pinter. 184-211.

Hjelmslev, Louis (1969 [1943]). Prolegomena to a theory of language. Madison: University of Wisconsin Press.

Houston, S.H. (1969). A sociolinguistic consideration of the Black English of children in north Florida. Language 45:599-607.

______ (1970). Competence and performance in Child Black English. Language Sciences 12:9-14.

Hymes, Dell (1962). The ethnography of speaking. In Thomas Gladwin and William C. Sturtevant (eds.), Anthropology and human behavior. Washington D.C: Centre for Applied Linguistics, 13-53.

______ (1967). Models of the interaction of language and social setting. Journal of Social Issues 23:8-28.

Kanngiesser, S. (1972). Bemerkungen zur Soziolinguistik. In U. Engel & O. Schwenke (eds.), Gegenwartssprache und Gesellschaft. Düsseldorf: Schwann. 82-112.

Klein, Wolfgang (1974). Variation in der Sprache. Kronberg/Taunus: Scriptor.

Labov, William (1969). Contraction, deletion, and inherent variability of the English copula. Language 45:715-62.

______ (1970a). The study of language in its social context. Studium Generale 23:30-87.

______ (1970b). The logic of non-standard English. In James Alatis (ed.), 20th Georgetown University Round Table on Languages and Linguistics. Washington D.C: Center for Applied Linguistics. 1-43.

______ (1972). Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.

______;  Cohen, P;  Robins, C.; & Lewis, J. (1968). A study of the Non-Standard English of Negro and Puerto Rican speakers in New York City. Washington, D.C.: US Office of Health, Education, and Welfare.

Lakoff, George (1968). Pronouns and reference. Bloomington: Indiana University Linguistics Club.

Loflin, M.D. (1969). Negro Nonstandard English and Standard English: Same or different deep structure? Orbis 18:74-91.

_____ (1970). On the structure of the verb in a dialect of American English. Linguistics 59:145-28.

Louw, Bill (1993). Irony in the text or insincerity in the writer?: The diagnostic potential of semantic prosodies. In Mona Baker, Gill Francis, and Elena Tognini-Bonelli (eds.), Text and technology: In honour of John Sinclair. Amsterdam: Benjamins. 157-176.

Lyons, John (1977). Semantics. Cambridge: Cambridge University Press.

Manley, Michael (1991). The Poverty of Nations. London: Pluto Press.

Martin, Hans-Peter, & Schumann, Harald (1996). Die Globalisierungsfalle: Der Angriff auf Demokratie und Wohlstand. Reinbek: Rowohlt.

Meillet, Antoine (1903-04). Review of Michel Bréal, Science des significations. Année Sociologique 8:640-641.

Ohmae, Kenichi (1996). Der neue Weltmarkt: Das Ende des Nationalstaates und der Aufstieg der regionalen Wirtschaftszonen. Hamburg: Hoffmann und Campe.

Paulston, C.B. (1971). On the moral dilemma of the sociolinguist. Language Learning 21:175-81.

Pennycook, Alastair. (1995). The cultural politics of English as an International Language. London: Longman.

Phillipson, Robert, Tove Skutnabb-Kangas, & Mart Rannut (eds.) (1994). Linguistic human rights. Berlin: de Gruyter.

Phillipson, Robert (1992). Linguistic imperialism. Oxford: Oxford University Press.

Reich, Robert B. (1991). The resurgent liberal. New York: Vintage.

_____ (1993). The work of nations: Preparing for 21st-century capitalism. New York: Simon and Schuster.

Sapir, Edward (1921). Language. New York: Harcourt, Brace, & World.

Saussure, Ferdinand de (1969 [1916]). Course in general linguistics (transl. Wade Baskin). New York: McGraw-Hill.

Sinclair, John McHardy (1984). Naturalness in language use. In Lexis and lexicography. Singapore: National University Press, 96-104.

______ (1991). Shared knowledge. In James Alatis (ed.), Georgetown University Round Table on Languages and Linguistics 1991. Washington, D.C.: Georgetown University Press, 489-500.

______ (1996). What do we know about language, how do we get to know it, and what has all that got to do with language teaching? Paper at the International Conference on Analysis and Description: Applications to Language Teaching, at Lingnan College and at the Hong Kong University of Science and Technology, June 1996 (available on video from RdB).

Smith N.V. (1983). Speculative linguistics: An inaugural lecture. London: University College.

Thom, R. (1989 [orig. 1972]). Structural stability and morphogenesis (trans. D.H. Fowler). New York: Addison-Wesley.

Vološinov, Valentin Nikolaievich (1973 [1929]). Marxism and the philosophy of language. New York: Seminar.

Wellek, René, & Warren, Austin (1956). Theory of literature. New York: Harcourt, Brace, and World.

Wodak, Ruth (1996). Disorders of discourse. London: Longman.

Wolfram, W.A. (1969). A sociolinguistic description of Detroit Negro speech. Washington, D.C.: Center for Applied Linguistics.