Gender
Anna Kibort & Greville G. Corbett
1. What is 'gender'

'Gender' most commonly refers to classes of nouns within a language which are 'reflected in the behavior of associated words' (Hockett 1958:231). The term is used both for the particular classes of nouns (so, a language may have two or more genders) and for the whole grammatical category (so, a language may or may not have the category of gender).

Almost all languages have some grammatical means of dividing up their noun lexicon into distinct classes. Gender is one such device; other devices are frequently grouped under the term 'classifiers' and include noun classifiers, numeral classifiers, classifiers in possessive constructions, verb-incorporated classifiers (also referred to as classificatory noun incorporation), classificatory verbs, locative classifiers, and deictic classifiers. Distinguishing gender from classifiers is justified in Dixon (1982) and Corbett (1991:136-137), and examples can be found in Allan (1977) and Aikhenvald (2000). In some traditions genders are referred to as 'noun classes'.

The core of the gender system in any language is the gender assignment system, a set of rules according to which nouns are allotted to genders. From three possible assignment systems - based on the meaning of words, the form of words, or a combination of both - two gender assignment systems are attested in the world's languages: semantic gender assignment systems and semantic-and-formal gender assignment systems. In other words, there is always some semantic basis to gender classification, though genders can be semantically transparent to a greater or lesser extent.

In languages with a strict semantic gender assignment system the meaning of a noun is sufficient to determine its gender, for all or almost all nouns. An example of such a language is Bagvalal (North Caucasian; Kibrik et al. 2001:64-66), where nouns denoting male humans (and only those) are masculine, nouns denoting female humans are feminine, and all remaining nouns are neuter.
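The Bagvalal rules just described can be written out as a small decision procedure. The following is only a minimal sketch: the function name and the two input properties (`is_human`, `sex`) are hypothetical simplifications chosen for illustration, not part of any published analysis.

```python
from enum import Enum
from typing import Optional


class Gender(Enum):
    MASCULINE = "masculine"
    FEMININE = "feminine"
    NEUTER = "neuter"


def assign_gender_strict_semantic(is_human: bool, sex: Optional[str]) -> Gender:
    """Strict semantic assignment of the Bagvalal type (illustrative sketch).

    Only the meaning of the noun is consulted: male humans are masculine,
    female humans are feminine, and everything else is neuter.
    """
    if is_human and sex == "male":
        return Gender.MASCULINE
    if is_human and sex == "female":
        return Gender.FEMININE
    # The residue (animals, objects, abstractions) is neuter.
    return Gender.NEUTER
```

Note that in this sketch a noun denoting a male animal still comes out as neuter, because the masculine gender is reserved for male humans.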
In Kala Lagaw Ya (Australian, spoken on the Western Torres Strait Islands) nouns denoting males (and the moon, a mythical 'male entity') are masculine, and all others belong in the feminine gender (Bani 1987). In Diyari (Dieri; a now extinct South Australian language), nouns denoting female animates are in one gender, and all remaining nouns in another (Austin 1981). Strict semantic three-gender systems are also common in the Dravidian family (e.g. in Kannada and Tamil). Many languages have a predominantly semantic gender assignment system, where assignment of nouns to some genders, or some nouns to genders, is semantically transparent, but there are exceptions for which there is no readily available explanation. An example of such a language is Bininj Gun-Wok (a large group of related dialects spoken in Western Arnhem Land, Australia). It has four genders with a semantic core: masculine, feminine, vegetable, and neuter - but some semantic groups of nouns appear to be allocated to the genders arbitrarily. It has been claimed that many of such apparently random allocations can be explained once the cultural setting of the language is taken into account and gender assignment is seen as reflecting the worldview of the speakers (see e.g. the discussions of Dyirbal, Australian, Dixon 1972 and recent re-analysis in Plaster & Polinsky 2010; or Ojibwa and other Algonquian languages of North America). The most frequently found semantic distinctions on which gender assignment is based are male versus female (many Afroasiatic languages, East-Nilotic, Central Khoisan), human versus non-human (some Dravidian languages), rational (i.e. humans, gods, demons) versus non-rational (Tamil and other Dravidian languages), and animate versus inanimate (Siouan, North American). 
Other less usual criteria may include: non-flesh food (Dyirbal, Australian), insects (the Rikvani dialect of Andi, Nakh-Daghestanian), diminutives (various Bantu languages), places (also Bantu), and sometimes also shape and size (also Bantu; however, sometimes primarily sex-based genders may have additional shape- and size-related meanings). Furthermore, languages may combine these parameters in different ways (Aikhenvald 2006:463-464). In many languages, apart from the semantic basis for gender assignment which is at the core of the system, we find additional rules for assigning nouns to genders according to their form. The rules may access two types of information: phonological and morphological, and there may be combinations of such rules. A clear example of gender assignment depending on phonological information is found in Qafar (East Cushitic; Parker & Hayward 1985), where sex-differentiable nouns are assigned gender - masculine or feminine - according to their semantics, and all other nouns are assigned to the two genders depending on their phonological form. Typically, nouns denoting males and females fit with the phonological assignment rules, but in cases of conflict semantic rules take precedence. An example of a language with a morphological assignment system supplementing the semantic core is Russian. As in many other Indo-European languages, nouns denoting sex-differentiables in Russian are assigned masculine and feminine genders. However, the semantic residue is shared between all three Russian genders - masculine, feminine, and neuter - with the neuter not even receiving the majority. The gender assignment of the residue is determined by the morphological form of the nouns: there are four main inflectional classes in Russian, and the remaining nouns are assigned gender according to their inflectional class (with indeclinable nouns requiring separate, additional rules). 
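The interaction of semantic and morphological rules described for Russian can be sketched as an ordered rule set in which the semantic rules apply first. The mapping from inflectional class to gender used below (classes I to IV) and the sample class memberships in the usage note are assumptions added for illustration; indeclinable nouns, which need additional rules, are left out.

```python
from typing import Optional

# Assumed mapping from the four main inflectional classes to genders.
# This table is an illustrative assumption, not taken from the text above.
GENDER_BY_INFLECTIONAL_CLASS = {
    "I": "masculine",
    "II": "feminine",
    "III": "feminine",
    "IV": "neuter",
}


def assign_gender(inflectional_class: str, sex: Optional[str] = None) -> str:
    """Semantic-and-morphological assignment (illustrative sketch).

    Semantic rules for sex-differentiable nouns are tried first, so they
    take precedence in cases of conflict; the semantic residue is assigned
    gender by inflectional class.
    """
    if sex == "male":
        return "masculine"
    if sex == "female":
        return "feminine"
    return GENDER_BY_INFLECTIONAL_CLASS[inflectional_class]
```

Under these assumptions, a class II noun with no sex-differentiable referent comes out as feminine, whereas a class II noun denoting a male (a noun such as 'uncle' is assumed here to belong to class II) comes out as masculine, because the semantic rule is consulted before the morphological one.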
Again, many sex-differentiable nouns would be assigned correctly by the morphological assignment rules alone. However, in cases of conflict, semantic rules take precedence here, too. Apart from many other Indo-European languages, semantic-and-morphological assignment systems are found in Arabic (Cowell 1964:372-375) and Kuot (a language isolate of New Ireland, Papua New Guinea; Lindström 2002:147-164, 176-194).

Finally, even though in many languages most nouns are assigned to just one gender, in some languages different genders can be chosen to highlight a particular property of the referent. Manambu (a Sepik language of Papua New Guinea) has two genders: masculine gender including male referents, and feminine gender including females. But the choice of gender can vary and depend on other factors: if the referent is exceptionally long, or large, it is assigned masculine gender, and if it is small and round, it is feminine (Aikhenvald 2006:464). English provides more familiar examples of this phenomenon. There are double- and multiple-gender nouns, such as doctor, baby and nouns referring to animals (especially familiar animals such as pets). There are also hybrid nouns (which trigger different agreement forms depending partly on the type of target), such as ship and other 'boat nouns' which can take the personal pronoun she but not the relative who. For discussion of the types of nouns that deviate from a consistent agreement pattern, see Corbett (1991:180-184).

Gender systems are typically found in languages with a fusional or agglutinating (not an isolating) profile (Aikhenvald 2006:463). It is unusual, but possible, for a language to have both gender and classifiers. This has been found in Tariana (North Arawakan, Brazil; Aikhenvald 1994), Retuarã (Tucanoan, Colombia; Strom 1992:10-11, 34-36, 45-47), and Tidore (West Papuan, Indonesia; van Staden 2000:77-81).
Furthermore, Ngan'gityemerri (Daly; northern Australia) shows the development from generic classifiers into genders (Reid 1997), and a similar situation, but with a very large system, has been reported for Miraña (Bora, Witotoan, Colombia; Seifart 2004).

2. Expressions of 'gender'

In some languages there is a marker of gender on every noun, in others nouns bear no markers of gender. The former, perhaps less familiar, class of languages includes Bantu languages, Berber (especially North Berber - Kabyle, Tashelhit, possibly Tamazight - which mark just about all nouns for gender; Alexandra Aikhenvald personal communication), many Arawak languages of South America (e.g. Baniwa and Tariana; in these languages genders are marked on all nouns, sometimes with the exception of human nouns, and are also used in other classificatory functions, such as numeral classifiers; Aikhenvald 2007), and many languages in New Guinea.

However, no amount of marking on a noun can be taken as evidence that the language has gender. The evidence of gender comes from the agreement targets that show gender in the language. It is taken as the definitional characteristic of gender that some constituent outside the noun itself must agree in gender with the noun. In other words, a language has a gender system only if we find different agreements ultimately dependent on nouns of different classes (Corbett 1991:146ff; 2005:126).

Agreement can appear on: other words in the noun phrase (adjectives, determiners, demonstratives, numerals, etc., even focus particles), on the predicate of the clause, on an adverb, and - for some linguists - on an anaphoric pronoun outside the clause boundary. Languages often have portmanteau markers combining information about gender with number, person, case, or other features.
Barlow (1992:134-152) discusses the issue of the scope of agreement and concludes that there are no good grounds for distinguishing between agreement and antecedent-anaphor relations. If antecedent-anaphor relations are accepted as agreement, languages in which gender distinctions are absent from noun phrase modifiers and from predicates and in which free pronouns present the only evidence for gender, can be counted as having a (pronominal) gender system. Such languages are rare, and the best known example is English, which is typologically unusual in this respect (Corbett 2005:126; see also §6 below); another such language is Defaka (Niger-Congo, Nigeria; Jenewari 1983:103-106). Genders, understood as classes of nouns within a language which are 'reflected in the behavior of associated words' (Hockett 1958:231), are agreement classes which may be defined as follows (Corbett 1991:147-150; cf. Zaliznjak 1964:30; see also §4 below):
'Standing in the same morphosyntactic form' means that, crucially, the nouns have to have the same number and case. Unlike gender, these two features can often be justified without reference to agreement, only on the basis of the morphological material on the noun itself. Thus, when number and case are taken out of the equation, the remaining distinctions between agreement classes - if there are any - identify these classes of nouns as belonging to different genders. (Note that these are 'controller genders', which are the values of the morphosyntactic feature of gender, as opposed to 'target genders' which are generalisations about the patterns of forms).

3. The status of 'gender' as a feature

Gender is an indisputable morphosyntactic feature, since it is required for agreement. The realisation of the value for gender on the target is the canonical instance of the need for a syntactic rule of agreement (Corbett 2006:126). Gender is an inherent feature of nouns, and a contextual feature (determined through agreement) for any other elements that have to agree with the nouns in this feature (e.g. adjectives, verbs, etc.).

Typically, gender is lexically supplied and its value is fixed for the noun. However, on some nouns (multi-gendered nouns and hybrid nouns) gender can be a semantically selected feature, where one gender value is selected from a set of options. Therefore, the lexical entries of nouns in a gendered language must specify either that the noun has a fixed gender value or that it is capable of taking on different gender values as dictated by the semantics. An interesting question that arises with respect to the shape of lexical entries is what information exactly needs to be specified: the gender of a noun in a gendered language must be available, but it can be derived from other information - semantic, morphological, or phonological.
Some lexical oppositions which correspond to semantic distinctions similar to gender, but which are not instantiations of the morphosyntactic feature of gender, include semantically contrasting lexical items (as in Kanuri, Nilo-Saharan, Nigeria - which doesn't have a gender system but does have lexical items for 'boy' versus 'girl', for example), or lexical derivations (e.g. in English, the class of nouns ending in -ess).

4. The values of 'gender'

The noun inventory of a language is divided into different classes, or genders, according to the different agreements they take (for the notion of 'agreement class', based on Zaliznjak 1964, see §2 above). 'For two nouns to be in the same agreement class, they must take the same agreements under all conditions - that is, if we hold constant other features such as case and number. (...) If two nouns differ in their agreements when factors such as case and number are held constant, then they belong to two different agreement classes and normally they will belong to two different genders' (Corbett 2006:750). Note that the earlier Bantuist tradition treated nouns as being in different noun classes when singular and plural, and it is often stated that Bantu languages have twenty noun classes. Counted in the way outlined above, the number is typically between seven and ten.

For many languages, establishing agreement classes determines the number of genders straightforwardly. However, in languages with more complex gender systems, it may be necessary to separate out the classes into which nouns are divided (the controller genders) from the number of different genders marked on agreement targets (the target genders), and propose a different number of genders for the controllers and for the targets (Corbett 1991; see also the example of Romanian in the 'Problem cases' section below).
However, while controller genders are the values of the morphosyntactic feature of gender, target genders can be thought of as generalisations about the patterns of gender markers (as found on targets). Based on the analysis of a sample of 256 languages, Corbett (2005:127-129) reports that somewhat over half (144) have no gender system, two-gender systems are common (50 languages in the sample), three-gender systems are around half as common (26 languages), and four-gender systems about half as common again (12 languages). Larger systems, with five or more genders, represent a substantial minority (24 languages in the sample). A minimal gender system requires two genders, and this is the most common number of genders, found in most geographic areas where gender is found. For larger systems, the major source is Niger-Congo, where systems in excess of five genders are common, with the record-breaker Nigerian Fula having around twenty genders (the exact count depending on the dialect). Outside Niger-Congo, Arapesh (Papua New Guinea) has thirteen genders, and Ngan'gityemerri (northern Australia) arguably has fifteen genders (for references see Corbett 2005:127). Even though every language that has gender divides up its noun lexicon in a different way, the typical semantic distinctions that underlie gender distinctions have given rise to several common labels used for gender values (see also §6 below: a note on 'Gender labels and the correspondence problem'). However, because languages with strictly semantic gender assignment are rare, it is important to remember that for most languages gender labels with semantic denotation likely correspond to classes of nouns with membership determined by both semantic and formal criteria. For example (with thanks to Dunstan Brown for discussion of these):
Apart from using semantic labels for gender values, two other conventions are to use Arabic numerals, or Roman numerals. Roman and Arabic numerals are often used for languages for which there is a descriptive tradition involving use of the term 'noun class' instead of 'gender', in particular in languages of the Caucasus or Bantu languages; Arabic numerals are particularly useful where the number of genders is large (as in Bantu languages). If the 'noun classes' are involved in agreement systems, they are gender systems.

Roman and Arabic numerals may also be used in instances where another label is possible. For instance, in one language the gender to which nouns with human denotation are assigned might be called 'human', whereas in another language nouns with a similar denotation may be assigned to a gender with an arbitrary Arabic numerical label such as '1'. Similarly, in one language the gender to which nouns with male rational denotation are assigned might be called 'masculine', whereas in another language nouns with a similar denotation may be assigned to a gender with an arbitrary Roman numerical label such as 'I' (see e.g. Corbett 1991:25; again thanks to Dunstan Brown for discussion).

5. Oddly behaving gender markers
The following table lists affixal agreement forms marking verbs in Archi:
The personal pronouns zon 'I' and un 'you (singular)' take gender agreements corresponding to the gender of the speaker or addressee: male humans trigger gender I agreement, female humans - gender II agreement, and imaginary locutors of genders III and IV (e.g. a speaking cow and a speaking goat kid) trigger gender III and IV agreements, respectively. If Archi has no person feature, we should expect the same pattern of agreement, based on gender, to occur with personal pronouns in the plural. Indeed, this is what happens with the personal pronoun teb 'they'. It takes gender I/II agreement (the prefix b-) when the referents are human, and gender III/IV agreement (zero marking) when the referents are non-human. However, unexpectedly, the personal pronouns nen 'we' and žwen 'you (plural)' referring to humans do not take the gender-based I/II agreement marker (b-). Instead, they trigger zero marking, which we gloss as III/IV.PL as in the table above:
An analysis of Archi without a person feature requires a complication of the gender system, as we have to add four more genders to the system, each representing one of the four types of gender agreement required by the set of the personal pronouns. In this way, the pronouns are treated as unique lexical items. Furthermore, very complex and typologically odd resolution rules for agreement with conjoined phrases have to be proposed to account for agreement when one of the conjuncts is a pronoun. The alternative is to base the gender resolution rules for Archi on a general rule formulated in purely semantic terms (i.e. if there is at least one conjunct denoting a rational or rationals, gender I/II agreement will be used; otherwise gender III/IV will be used) and, rather than treating the personal pronouns as each being an exception in terms of gender, accept a person feature for Archi. In this way, the gender resolution rules in Archi are fairly usual, and the person resolution rule required (persons 1 and 2 > person 3) is standard, except for the interesting points that there is no distinction here between persons 1 and 2, and that it operates only in the plural. The following paradigm illustrates Archi agreement in person, showing how gender markers double as person agreement markers:
6. Problem cases
Because of these characteristics, the term 'gender' is most commonly used to refer to classes of nouns within a language which are 'reflected in the behavior of associated words' (Hockett 1958:231). This is also the definition adopted by Corbett (1991), who argues that in order to define gender we have to refer to the targets of agreement in gender, which allow us to justify the classification of nouns into genders. In the typology of grammatical features which underlies this Inventory (see the 'Feature Inventory' page), it has been possible to retain the special status of gender. However, the position of gender within the typology needs to be clarified in order to enable comparisons with other features.

As defined in Hockett (1958) and Corbett (1991), gender is exclusively a feature of agreement. Hence, the feature is referred to as 'gender' in a language if it concerns the classification of the nominal inventory of the language, but only if the inherently assigned gender values found on nouns are matched by contextually assigned gender values found on targets of agreement in gender. If a language has a system of nominal classification expressed through inflectional morphology, but the feature of nominal classification does not participate in agreement, it does not qualify as 'gender'. With respect to syntax, the status of such a feature is similar to the status of tense in most of the familiar languages: an inflectionally marked feature such as tense expresses a semantic or formal distinction, but is not relevant to syntax for the purposes of agreement or government. Syntax does not need to know the value of the inflectional noun classifier or inflectionally marked tense. Therefore, the distinction between inflectional noun classification and gender is that, while the former can only be a morphosemantic feature, gender can only be a morphosyntactic feature.
Furthermore, the semantics of the genders can differ dramatically, as say between Tamil (semantically assigned) and Slovene (semantically and formally assigned), though both have three genders. As was discussed in §1, gender systems in all languages do appear to have some semantic basis. For example, in Russian (as in many other Indo-European languages), for sex-differentiables, nouns denoting males are masculine, and those denoting females are feminine. However, the nouns not covered by these rules - the semantic residue - are not simply assigned to the neuter gender. Rather, the semantic residue is shared between the three genders, with the neuter gender not even receiving the majority. The following table, from Corbett (2006:752), shows that nouns with apparently similar semantics can be assigned to all three genders. The assignment of these nouns to gender classes is achieved on the basis of their morphological properties (specifically, their inflectional class):
Thus, when describing the feature inventory, it is not enough to list gender values, but we need some declaration of the gender system to which they belong.
However, difficulties arise in languages with more complex gender systems. If the distinction between controller and target genders is not considered, gender values may be presented as though the pattern was equally uncontroversial, but no indication is given about what the values really mean. Then, the number of genders in a particular language can be the subject of interminable dispute, or we find that similar situations are described differently by those working on different language families. This problem can be seen as another instance of the correspondence problem, where this time the feature values do not correspond intralinguistically, rather than crosslinguistically. A good example of a language whose gender system has been the source of continuing disagreement is Romanian (for references to the extensive literature on this topic see Corbett 1991:150). The argument, which has gone on for decades, is whether we have two genders or three. In terms of agreement classes, the situation is clear: there are three classes that should be set up as follows (where the agreement endings are typical allomorphs for each target gender):
However, simply to say that Romanian has three genders suggests that it is like German, Latin or Tamil, even though in each of these languages, intuitively, the situation is rather different. It can be seen that Romanian has three controller genders (i.e. the genders into which nouns are divided), and it has two target genders (i.e. the genders which are marked on adjectives, verbs, and so on, depending on the language) in both singular and plural. The gender system of Romanian can, then, be diagrammed as follows (illustrated with the agreement forms of the adjective bun- 'good'):

[Diagram: controller and target genders in Romanian]

The controller genders in Romanian (i.e. the lines labelled 'class I nouns', 'class II nouns', and 'class III nouns') are usually called 'masculine' (I), 'feminine' (II), and the disputed gender (III) is sometimes called 'neuter' and sometimes 'ambigeneric'. The latter is a useful term, provided it is used not to imply that there is no distinct gender but rather that the situation is different from the more common Indo-European three-gender system: Romanian does have neuter gender, but it is non-autonomous.

While there are many languages where the number of controller and target genders are the same, mismatches of the type that occurs in Romanian are common. Examples of several even more complex systems are given in Corbett 1991. The important point here is that the mismatches do not concern one or two odd exceptions within the category of nouns or agreeing forms, but they concern substantial parts of the lexicon of the language: the diagram above represents systematic correspondences occurring across the whole lexicon in Romanian. At present, the best account of such systems, which captures the generalisations with regard to gender agreement, involves proposing a different number of genders for controllers and for targets, as has been sketched for Romanian.
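The difference between counting controller genders and counting target genders can be made concrete with a short sketch. The adjective forms below (bun, bună, buni, bune) and their distribution over the three noun classes are filled in here as assumptions for illustration; the function names are likewise hypothetical.

```python
from collections import defaultdict

# Assumed agreement forms of bun- 'good' for each noun class, by number.
AGREEMENT = {
    "class I": {"sg": "bun", "pl": "buni"},
    "class II": {"sg": "bună", "pl": "bune"},
    "class III": {"sg": "bun", "pl": "bune"},
}


def controller_genders(agreement):
    """Group noun classes by their full agreement pattern (singular and plural)."""
    groups = defaultdict(list)
    for noun_class, forms in agreement.items():
        groups[(forms["sg"], forms["pl"])].append(noun_class)
    return dict(groups)


def target_genders(agreement, number):
    """List the distinct agreement forms found on targets in one number."""
    return sorted({forms[number] for forms in agreement.values()})


def is_non_autonomous(noun_class, agreement):
    """True if every form the class takes is shared with some other class."""
    return all(
        any(other != noun_class and agreement[other][number] == form
            for other in agreement)
        for number, form in agreement[noun_class].items()
    )
```

With this data, grouping by the whole pattern yields three controller genders, while each number taken separately shows only two target forms, and `is_non_autonomous` holds for class III alone, since it has no agreement form of its own.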
English has three forms of the singular personal pronoun (he, she and it) and two forms of the relative pronoun (who and which), which distinguish between masculine, feminine, neuter, personal, and nonpersonal nouns, respectively. The patterns of pronoun coreference for singular nouns give three consistent agreement patterns in English (in the plural only the distinction between personal and nonpersonal is preserved, i.e. they/who versus they/which) (Corbett 1991:180): who/he - masculine, who/she - feminine, and which/it - neuter. Payne (2006:713-714) extends these to four: who/he - personal masculine, who/she - personal feminine, which/it - nonpersonal neuter, and which/she - nonpersonal feminine (for the so called 'boat nouns', e.g. ship; note, however, that these can be analysed as hybrid nouns that trigger different agreement forms depending partly on the type of target - see Corbett 1991:180-184). And, at the extreme end, Quirk et al. (1985:314) propose nine 'gender classes' for singular nouns in English: male (brother), female (sister), dual (doctor), common (baby), collective (family), higher male animal (bull), higher female animal (cow), lower animal (ant), and inanimate (box). This classification results from an attempt to differentiate between all possible types of nouns which have different agreement possibilities based on pronoun coreference. However, such multiplication of noun classes is frequently unsatisfactory. Intralinguistically, it misses generalisations; and crosslinguistically, it makes similar systems appear more different than they really are (Corbett 1991:161). 
The most satisfying account for the given language should list consistent patterns of pronoun coreference (which often correspond to the traditional number of genders generally accepted for that language), with the additional extensions to the system identified as subgenders, overdifferentiated targets, inquorate genders, defective nouns, multiple-gender nouns, or hybrid nouns; it is also possible that the gender system of a language may be best analysed as combining two coexisting gender systems (Corbett 1991:161-188).
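The approach of listing consistent patterns first and treating the remainder separately can be sketched for the English data discussed above. The three consistent patterns are those cited from Corbett (1991:180); the pattern sets attributed to individual nouns (in particular that ship allows both which/it and which/she) are simplifying assumptions for illustration, and the function name is hypothetical.

```python
# Consistent agreement patterns: (relative pronoun, personal pronoun) -> gender.
CONSISTENT_PATTERNS = {
    ("who", "he"): "masculine",
    ("who", "she"): "feminine",
    ("which", "it"): "neuter",
}

# Assumed pronoun patterns for a few sample nouns (illustrative only).
NOUN_PATTERNS = {
    "brother": {("who", "he")},
    "sister": {("who", "she")},
    "box": {("which", "it")},
    "doctor": {("who", "he"), ("who", "she")},
    "ship": {("which", "it"), ("which", "she")},
}


def describe(noun: str) -> str:
    """Classify a noun without positing a new gender for every deviation."""
    patterns = NOUN_PATTERNS[noun]
    if len(patterns) == 1:
        (pattern,) = patterns
        return CONSISTENT_PATTERNS.get(pattern, "unlisted pattern")
    if all(pattern in CONSISTENT_PATTERNS for pattern in patterns):
        return "multiple-gender noun"
    return "needs separate treatment (e.g. as a hybrid noun)"
```

In this sketch doctor is reported as a multiple-gender noun and ship as requiring separate treatment, so the inventory of genders stays at three rather than growing with each additional pronoun pattern.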
Furthermore, the rules of pronoun usage in spoken Dutch make almost all nouns of the language appear to be 'hybrids' (Corbett 1991:183-184) which neither simply take the agreements of one consistent agreement pattern nor belong to two or more genders, but whose agreement form depends in part on the type of target. This is an unsatisfactory analysis, as hybrid nouns are expected to be isolated exceptions to the general rule that a noun consistently controls a particular gender value on all targets. This problem has been identified by Audring (forthcoming b). She notes that the hybridity account construes the situation entirely from the perspective of the noun. Hybridity is a nominal property, and the nouns are held responsible for all agreement values that appear on the targets. However, she argues that the Dutch data can be explained better from a pronominal perspective. The pronouns themselves have developed a semantic link between countability and gender. This has the consequence that some syntactic agreement configurations become dispreferred and are replaced by semantically motivated choices. Thus, the agreement patterns of Dutch reflect properties of pronouns rather than those of nouns.
Since the adjective has to express gender and number, in a situation where there is no controller that could dictate its gender and number, it shows 'default agreement', which is typically 'third person singular neuter'. Hence, instances of gendered adjectives which have no controller are not instances of government but of default agreement.

7. Key literature
REFERENCES
How to cite this entry: