Self-organization and Selection

in the Emergence of Vocabulary

Authors:

Jinyun Ke

Language Engineering Laboratory

Department of Electronic Engineering

City University of Hong Kong, Hong Kong

phone: (852)27887187

fax: (852)27887791

email: jyke@ee.cityu.edu.hk

James Minett

Department of Electronic Engineering

City University of Hong Kong, Hong Kong

Ching-Pong Au

Department of Chinese, Linguistics and Translation

City University of Hong Kong, Hong Kong

William S-Y Wang

Language Engineering Laboratory

Department of Electronic Engineering

City University of Hong Kong, Hong Kong

Number of text pages: 13

Number of ﬁgures: 7

Number of tables: 9

This is a preprint of an article accepted for publication in Complexity copyright (2002)

Abstract

Human language may have started from a consistent set of mappings between meanings

and signals. These mappings, referred to as the early vocabulary, are considered to be the

results of conventions established among the agents of a population. In this study, we report

simulation models for investigating how such conventions can be reached. We propose that

convention is essentially the product of self-organization of the population through interac-

tions among the agents; and that cultural selection is another mechanism that speeds up the

establishment of convention. Whereas earlier studies emphasized either one or the other of

these two mechanisms, our focus is to integrate them into one hybrid model. The combina-

tion of these two complementary mechanisms, i.e. self-organization and cultural selection,

provides a plausible explanation for cultural evolution, which progresses with a high

transmission rate. Furthermore, we observe that as the vocabulary approaches convergence there is a

uniform tendency to exhibit a sharp phase transition.

Summary

Language is one of the deﬁning characteristics speciﬁc to humans. It is well known that

language evolves continuously at a high rate. However, the answers to why and how human

language emerged and changes are still being pursued. In this study, we speculate that human

language may have started from a consistent set of mappings between meanings and signals.

These mappings are considered to be the results of conventions established among the indi-

viduals of a population. We use simulation models to investigate how such conventions can

be reached.

Keywords

language evolution, emergence, vocabulary, self-organization, selection

Introduction

Language is generally considered to be one of the most important characteristics that

differentiate humans from other species. The question of how fully-fledged human language

came into being has been pursued for centuries; various theories have been suggested and

many controversies exist [1]. Recently there has been growing interest in the study of lan-

guage emergence and change within the framework of complex adaptive systems [2]. In this

study, we focus on the emergence of vocabulary within a population of early humans using

simulation models, an approach often adopted in the study of complex adaptive systems.

Most would agree that the ﬁrst step in human language evolution was communication

systems using a number of holistic signals, like those found in primates and other animals

such as bees and birds [3]. These signals are not equivalent to the “names” or “words” as

we use these terms today. Each signal may refer instead to a complex mix of meanings. The

signal for “danger”, for example, may imply a cry of fear, a warning – “run, there is danger

coming”, as well as a general reference for the dangerous predator. Later, such a signal

would have been narrowed down to name a single object or class of objects, and used in a

non-situation speciﬁc fashion [4]. The realization of signals as symbols representing objects or

events is referred to as the “naming insight” [5]. In the language development of a normal child,

the naming insight comes so naturally that most parents, unless paying special attention, do not

notice the exact moment it occurs. In contrast, chimpanzees need intensive teaching

to learn to name [6].

How the naming insight occurred in the phylogeny of the hominid line is still a mystery.

However, we may assume that there was a stage during which early humans became equipped

with the naming insight. What interests us in this study is that once the naming insight was

achieved in the agents, each of whom might have his or her own way to name in an arbitrary

manner, how did the actual naming of objects or events become consistent across the entire

population? Some twenty centuries ago, two great philosophers, continents apart, arrived

at a similar observation that names are formed by convention [7]. Xunzi in China taught

that “names have no intrinsic appropriateness” and “names have no intrinsic reality” —the

appropriateness and reality of names are both given by convention. At about the same time

in Greece, Plato wrote that “any name which you give is the right one, and if you change

that and give another, the new name is as correct as the old.”

In this study, we use simulation models and mathematical analysis to explore the process

of how convention is achieved to form a coherent naming system—which we may refer to as

vocabulary.

In our models, at ﬁrst, a number of agents in a population each have their own

agent-speciﬁc way for naming a set of objects. Upon interacting with each other, some are led

to modify their naming systems. Agents may not be concerned with and may not care about

the general communication performance at the population level, but instead concentrate only

on their own communication performance with other agents. Without any explicit or implicit

design, however, a consistent common vocabulary develops as an emergent property of the

population. Our proposed model of vocabulary emergence shares the same spirit as the

“invisible hand” mechanism suggested by Keller [8], which has been echoed by Kirby: “the

local, individual actions of many speakers, hearers, and acquirers of language across time

and space conspire to produce non-local, universal patterns of variation” [9]. In fact such

phenomena have been widely studied in many complex systems in various disciplines and are

described as “self-organization” [10]. A system is deﬁned as self-organizing “if it acquires

a functional, spatial, or temporal structure without speciﬁc interference from the outside”

[11]. In recent studies of language emergence, there have been several reports explicitly

adopting this framework, which use computer simulation models to study the self-organization

mechanism of the emergence of vocabulary [12] and sound systems [13].

From an evolutionary point of view, self-organization plays a role in the synchronic and

horizontal interaction in a population. However, self-organization has not addressed the

evolution process in the diachronic aspect, i.e. from generation to generation. On the other

hand, natural selection has been generally considered as the major mechanism for evolution

in terms of gene transmission across generations, i.e. vertical transmission. In the study of

language evolution there are two views related to natural selection. In one view, some consider

language as the product of a biological organ, the Language Acquisition Device (LAD); they

think it is the LAD that must have evolved through natural selection [14]. Alternatively,

Dawkins introduced the term “meme” to refer to a new type of replicating unit in cultural

evolution as a counterpart of the gene in biological evolution [15]. In this sense, a language is

an elaborate complex of memes, which evolve based on linguistic or cultural selection. It is

the languages themselves that adapt for their own “survival” in the transmission from speaker

to speaker. It is impossible to evaluate which language, e.g. English, Chinese or any other,

is better, in terms of its overall ﬁtness by taking into account learnability, expressiveness,

complexity and so on. However, we argue that when considering some speciﬁc aspects,

for example the possibility of ambiguity in terms of the number of homophones, different

language subsystems can indeed be compared with respect to their fitness. Subsystems with

higher fitness are more likely to diffuse into the next generation. In this

study we assume that vocabulary can be transmitted from generation to generation as a result

of children’s learning from parents and that the communicative consistency of the vocabulary

determines the probability with which it is transmitted to the next generation. In this way, we

study the eﬀect of cultural selection in this vertical transmission process and examine how

selection couples with self-organization in catalyzing the formation of a consistent vocabulary

in the population.

A few related studies have modeled vocabulary emergence.

Among them, Steels [12], as mentioned above, focuses on the self-organization mechanism,

while Hurford [16] and Nowak & Krakauer [17] mainly emphasize the selection process. In this

study, we propose that both mechanisms are indispensable for structure emergence in evolution.

Using the “tinkerer” metaphor introduced by Jacob [18], we highlight the combined

function of these two “tinkerers” in language evolution.

The rest of the paper is organized as follows. First, we report an imitation model in

Section 2, which simulates how a common vocabulary is formed by agents imitating each other

either merely randomly or by following the majority. We use Markov chain theory to analyze

the model. A detailed proof of the convergence is given in the appendix. In Section 3 we

present an interaction model which uses a probabilistic representation of vocabulary. Diﬀerent

parameters in the model are investigated by simulation and a few interesting observations

are reported. Section 4 introduces a hybrid model which combines the self-organization and

cultural selection mechanisms. Conclusions and discussion are given in the last section.

The imitation model

The strong human ability to imitate, even from early infancy, has been extensively

documented by many investigators, e.g. Meltzoff [19]. While other

social animals, particularly the primates, also imitate, it appears that the tendency is by far

the strongest and most general in our species. We assume that imitation may serve as the

most explanatory mechanism for the formation of a common vocabulary. Before establishing

a consistent way of naming things, early humans very likely made use of their propensity for

imitation; the younger ones imitating their elders, the followers imitating the leaders or, just

by chance, their neighbors. Such imitation between agents can be seen as self-organization

in the population.

Footnotes:

In this study, however, we still use the term “natural selection” in a broad sense to refer to the selection

mechanism in language evolution.

Two or more words are homophones if they are pronounced the same but differ in meaning, such as the

words ‘too’ and ‘two’.

Jacob states that “natural selection . . . works like a tinkerer, who does not know exactly what he is going

to produce, but uses whatever he finds around him, whether it be pieces of string, fragments of wood, or old

cardboards, . . . to produce some kind of workable object.”

In this study, we set up a model to simulate the process of word imitation by agents in a

population. Assume that in a population of agents there are classes of objects that

the agents must name in their daily communication, and different utterances that the

agents can make. Each agent’s vocabulary consists of a set of associations, or one-to-one

mappings, between meanings and utterances. Each agent can create and change his or her

own vocabulary by imitation, similar to the model proposed by Steels [12]. We assume that,

at ﬁrst, each agent already has his or her own speciﬁc meaning-utterance mappings. For

example, the M-U mappings of two agents (A and B) might be as shown in Table 1.

Table 1 about here

Let us assume that when they interact with each other, the two agents only communicate

a single meaning. When they are not using the same word to represent that meaning, an

imitation event is likely to occur; for example A may imitate B when they communicate

a meaning by replacing his own utterance for it with B’s, or vice versa. We assume that each agent

is equipped with an imitation strategy. For simplicity, we assume that such imitation events

involve no errors. There are many possible imitation strategies that can be conceived. In

this study we report two strategies that may be the most realistic:

• Strategy 1: imitating by random direction. Either A imitates B, or vice versa, with

equal probability.

• Strategy 2: imitating by following the majority. A imitates B if the utterance B uses

is shared by more agents than the utterance A uses.
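
The two imitation strategies can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the random initial vocabularies, the population sizes, and the tie-breaking rule for the majority strategy (fall back to a random direction) are all hypothetical choices made here for the example.

```python
import random
from collections import Counter


def make_population(n_agents, n_meanings, n_utterances, rng):
    """Give each agent a random utterance for every meaning (assumed initialization)."""
    return [[rng.randrange(n_utterances) for _ in range(n_meanings)]
            for _ in range(n_agents)]


def interact(pop, strategy, rng):
    """Two random agents communicate one random meaning; one may imitate the other."""
    a, b = rng.sample(range(len(pop)), 2)
    m = rng.randrange(len(pop[a]))
    if pop[a][m] == pop[b][m]:
        return  # same utterance already, so no imitation event
    if strategy == "majority":
        # Strategy 2: the agent whose utterance is shared by fewer agents imitates.
        counts = Counter(agent[m] for agent in pop)
        count_a, count_b = counts[pop[a][m]], counts[pop[b][m]]
        if count_a > count_b:
            a, b = b, a  # make `a` the imitator
        elif count_a == count_b and rng.random() < 0.5:
            a, b = b, a  # tie: random direction (assumption, not from the paper)
    elif rng.random() < 0.5:
        # Strategy 1: random direction with equal probability.
        a, b = b, a
    pop[a][m] = pop[b][m]  # error-free imitation: a copies b's utterance


def converged(pop):
    """True when every agent holds the same meaning-utterance mappings."""
    return all(agent == pop[0] for agent in pop)


def run(n_agents=10, n_meanings=3, n_utterances=5,
        strategy="majority", seed=0, max_steps=100_000):
    """Run interactions until a common vocabulary forms or max_steps is reached."""
    rng = random.Random(seed)
    pop = make_population(n_agents, n_meanings, n_utterances, rng)
    for step in range(max_steps):
        if converged(pop):
            return step, pop
        interact(pop, strategy, rng)
    return max_steps, pop


if __name__ == "__main__":
    for strategy in ("random", "majority"):
        steps, _ = run(strategy=strategy, seed=1)
        print(strategy, "steps until convergence:", steps)
```

Because imitation only copies utterances that already exist in the population, no new utterance can appear during a run, and a state in which all agents agree is absorbing under both strategies.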

Before studying how imitation aﬀects agents’ vocabulary in a population, we need to

design some criteria to evaluate how well the vocabularies of the members of a population

work for conveying meanings. First, we need to consider the population consistency, denoted

by C, i.e. how many consistent meaning-utterance mappings there are among the agents.

Second, we consider the eﬀect of homophones, which are words that have the same form but

diﬀerent meanings. The existence of homophones is likely to cause confusion. Therefore,

assuming that the fewer the homophones, the better the vocabulary, we select a second

criterion to be the distinctiveness of the vocabulary, denoted by D. C and D are calculated

as follows. For a population of agents,

1. The overall consistency for all meanings is [formula missing in source; only the index bound i = 1 survives]

A class of objects containing a single object is permissible.

In this report “utterance”, “signal” and “word” are used interchangeably for the sake of convenience,

particularly when discussing other studies, though we acknowledge that there are important diﬀerences among

them.

We have considered other strategies such as following the authority and homophone avoidance. The

detailed simulation results are reported in an earlier study by Wang & Ke [20].
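
The two evaluation criteria, consistency and distinctiveness, can be illustrated with one possible pair of measures. The definitions below are assumptions made for this sketch and should not be read as the paper's formulas: consistency is taken here as the average, over meanings, of the fraction of agents using the most common utterance for that meaning, and distinctiveness as the average, over agents, of the fraction of distinct utterances in an agent's vocabulary (so fewer homophones gives a higher value). A population is represented, hypothetically, as a list of agents, each a list of utterance indices with one entry per meaning.

```python
from collections import Counter


def consistency(pop):
    """Average share of agents agreeing on the most common utterance per meaning.

    Assumed measure for illustration; 1.0 means every agent uses the same
    utterance for every meaning.
    """
    n_agents = len(pop)
    n_meanings = len(pop[0])
    total = 0.0
    for m in range(n_meanings):
        counts = Counter(agent[m] for agent in pop)
        total += counts.most_common(1)[0][1] / n_agents
    return total / n_meanings


def distinctiveness(pop):
    """Average share of distinct utterances within each agent's vocabulary.

    Assumed measure for illustration; 1.0 means no agent has homophones.
    """
    total = 0.0
    for agent in pop:
        total += len(set(agent)) / len(agent)
    return total / len(pop)


if __name__ == "__main__":
    # Three agents, two meanings; the third agent disagrees on the second meaning.
    population = [[0, 1], [0, 1], [0, 2]]
    print("consistency:", consistency(population))
    print("distinctiveness:", distinctiveness(population))
```

Under these assumed measures, a population that has converged on a vocabulary full of homophones would score high on consistency but low on distinctiveness, which is why the two criteria are tracked separately.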
