Papers to Download

Overview

Language is at the core of human cognition. Theorists point to language as both proof and explanation of humankind's special abilities. The goal of my research is to understand both aspects of language: how language makes us special, and what makes us special such that we can learn and use language. By combining empirical research with computational modeling, I intend to specify in detail the cognitive processes that give rise to language and how language learning in turn changes these same processes. The hypothesis underlying my work is that language emerges out of general cognitive processes, most notably associative learning, when they are placed in a highly structured environment that includes language itself. Below I summarize my three main lines of research.


Development of symbols - words as pointers to categories

The ability to create, learn and use systems of symbols is a uniquely human characteristic. Language is the first symbol system learned, and nouns -- arbitrary symbols that point to categories -- are the first part of the language system that is learned. The developmental evidence suggests that names for things have a special status as pointers to categories. Interestingly, the evidence also shows that words do not have this special status at the beginning of development. Toddlers will readily map linguistic sounds, non-linguistic sounds or gestures to object categories. However, a few months later, children seem to know that words are the privileged way of pointing to categories and will not map non-linguistic sounds and gestures to object categories. The question is: what mechanism takes children from accepting anything as a label to preferring words?

The broad idea is that events that start with roughly the same importance become "special" (cues) by virtue of the way they correlate within a task. That is, for a given task, we learn to attend to those cues that have historically mattered for performing that particular task. If this is true, then we may be able to find some in-between stage of development in which children accept non-linguistic but highly correlated sounds as labels of categories, while rejecting less-correlated sounds. In a study with 20- to 26-month-olds we found that children at this age will map animal sounds but not motor sounds to animal categories, and motor sounds but not animal sounds to vehicle categories. This result suggests that, at least at the beginning of development, the degree to which a kind of signal correlates with a kind of category affects how likely children are to take that kind of signal as a label for that kind of category.

A second piece of evidence for the importance of children's experience in figuring out what counts as a word is the effect of characteristics of the labels that might seem incidental. What defines a name is the cluster of features that systematically co-occurs with categories. This means that any strongly correlated feature of a name, even beyond what we conventionally think of as a word, will become, at least initially, an integral part of what a name is. Consistent with this idea, children will only associate the word with the category when it comes from the mouth of the experimenter, and not when the same experimenter's voice emanates from a hand-held recorder. In contrast, children will associate the animal sound with the animal category (and the vehicle sound with the vehicle category) whether the sound comes from the experimenter's mouth or from the recorder. That is, for children, an important characteristic of a linguistic label is that it actually emanates from a human mouth. That constraint does not extend to other kinds of labels, perhaps because they are highly correlated with the categories but not with the source of the sound: children hear animal sounds from real animals, toy animals, and parents imitating animals. These results suggest that the ability to use symbols develops with experience using them, and that the details of the way it works are tied to the details of the input.


Development of kinds - words as the glue behind abstract ideas

Language organizes things into kinds. Young children generalize names for novel things differently for different kinds. For example, 30-month-old children generalize names for solid objects by their shape (as opposed to their material or color) and names for non-solid substances by their material (as opposed to their shape or color). This is true for children learning languages as vastly different as English and Japanese. Where does this knowledge come from? One idea is that it comes from regularities present in the language itself. In fact, the early lexicon in both English and Japanese presents regularities that might be enough for children to learn the kind-specific knowledge, or second-order generalizations: solids tend to be named by shape, non-solids tend to be named by material. We have conducted a series of experiments with connectionist networks and 30- to 36-month-old children to test this idea. Our results show the following: 1) the regularities in the early lexicons are enough for a simple connectionist network to acquire rule-like generalized knowledge about the importance of shape for solids and the importance of material for non-solids, and 2) the acquired knowledge, although "rule-like" in that it can be applied to novel instances, is also exquisitely tuned to the details of the regularities present in the input.
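
The notion of a second-order generalization can be made concrete with a small sketch. This is not the connectionist network used in the experiments; it is a simple counting learner swapped in for illustration, and the lexicon, feature values and function names (TOY_LEXICON, second_order_bias, generalize) are all hypothetical. For each solidity it tallies which dimension, shape or material, is shared by exemplars that carry the same name, and then uses that learned bias to extend a novel name.

```python
from itertools import combinations

# Hypothetical toy lexicon (invented for illustration, not real data):
# noun -> exemplars described as (solidity, shape, material).
TOY_LEXICON = {
    "ball":  [("solid", "round", "rubber"), ("solid", "round", "plastic")],
    "cup":   [("solid", "cylinder", "glass"), ("solid", "cylinder", "plastic")],
    "spoon": [("solid", "elongated", "metal"), ("solid", "elongated", "wood")],
    "milk":  [("nonsolid", "puddle", "milk"), ("nonsolid", "drop", "milk")],
    "sand":  [("nonsolid", "pile", "sand"), ("nonsolid", "scatter", "sand")],
}


def second_order_bias(lexicon):
    """For each solidity, return the dimension ("shape" or "material")
    that exemplars sharing a name most often have in common."""
    matches = {}  # solidity -> {"shape": count, "material": count}
    for exemplars in lexicon.values():
        for a, b in combinations(exemplars, 2):
            if a[0] != b[0]:
                continue  # only compare exemplars of the same solidity
            counts = matches.setdefault(a[0], {"shape": 0, "material": 0})
            counts["shape"] += int(a[1] == b[1])
            counts["material"] += int(a[2] == b[2])
    return {solidity: max(c, key=c.get) for solidity, c in matches.items()}


def generalize(exemplar, candidates, bias):
    """Return the candidates that match the named exemplar on the
    dimension the learned bias singles out for the exemplar's solidity."""
    index = {"shape": 1, "material": 2}[bias[exemplar[0]]]
    return [c for c in candidates if c[index] == exemplar[index]]


bias = second_order_bias(TOY_LEXICON)
novel_solid = ("solid", "U", "clay")
choices = [("solid", "U", "wax"), ("solid", "blob", "clay")]
print(bias)
print(generalize(novel_solid, choices, bias))
```

With this toy input the learner extends the novel solid's name to the same-shape candidate, even though it has never seen that shape or material before. That is the sense in which the knowledge is "rule-like" while still being derived entirely from regularities in the input.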

One of the predictions made by the networks is that there should be an early preference for material for non-solids presented in simple shapes. The idea of an early material bias is surprising because it runs contrary to a well-accepted conclusion in the literature: that the preference for material for non-solids emerges developmentally later than the preference for shape for solids. However, the network makes this prediction only for non-solids with simple shapes, and not for non-solids in more constructed shapes. We tested this prediction in children who were about 6-12 months younger than the age at which children usually show any consistent preferences in this task, and found that they have a preference for material for non-solids that are presented in simple (blob-like) rather than more constructed shapes (e.g., a square U).

A second example of how the regularities in the input shape the knowledge about kinds can be seen in our cross-linguistic models of children's novel noun generalizations. Although children learning both English and Japanese show the same overarching pattern of generalizations (shape for solids, material for non-solids), they also show some intriguing differences. Most notably, children learning English generalize names for simply-shaped solids by shape, while children learning Japanese are more likely to do it by material. By studying the similarities and differences between languages and feeding them into the same statistical learner, we gain insight into the ways different parts of language can affect the creation of generalized knowledge. The regularities in the early Japanese lexicon and in the early English lexicon are enough to model the similarities in the way children generalize novel nouns in both languages. However, to explain the different ways in which children learning the two languages generalize novel nouns for simply-shaped solids, we need to extend the input given to the networks. When we add English count/mass syntax, as a mere correlated signal, we model the whole pattern of results, both the universals and the differences.


Constraints on the learner

Several of my projects explore how different manipulations affect learning in an attractor neural network, as a way to figure out something about how people learn. By doing this I expect to find out what it is in the input, representation or architecture that makes us the kind of learner we are. One example of constraints in the architecture is Playpen. Playpen is a connectionist network that has Relation Units, units that are "about" two things and represent "micro-relations". The idea is that relational concepts are formed by correlations of these micro-relations. We have used this framework to model the learning of rules in babies, of relational words, and of spatial relations. Through experiments with adults we have confirmed some predictions of the network: for example, that similarity matters even for rule-like knowledge, and that human performance is more graded or more categorical depending on the structure of the training set.

Another example is manipulating the way networks are trained: supervised vs. "unsupervised" (auto-association). The two ways of training result in different cues being ignored or highlighted depending on the structure of the training set. The differences between English and Spanish count/mass syntax provide an example of input that is structured differently in the relevant ways. We have compared the performance of the two kinds of training with the performance of 2- and 3-year-old Spanish and English speakers and can conclude that learning is somewhat supervised.
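
The difference between the two training regimes comes down to what the network is asked to reproduce. The sketch below only builds the (input, target) pairs and does not train anything. It assumes that in auto-association the label is part of the pattern to be reconstructed; that coding, along with the names one_hot and training_pairs, is an assumption made for illustration rather than a description of the simulations.

```python
def one_hot(index, size):
    """Local code for a label: a single unit on, all others off."""
    return [1 if i == index else 0 for i in range(size)]


def training_pairs(patterns, n_labels, mode):
    """Build (input, target) pairs from (features, label_index) patterns.

    mode="supervised": input is the features, target is the label.
    mode="auto":       input is features plus label, target is that same
                       vector (assumed coding for auto-association).
    """
    pairs = []
    for features, label in patterns:
        label_vec = one_hot(label, n_labels)
        if mode == "supervised":
            pairs.append((list(features), label_vec))
        elif mode == "auto":
            whole = list(features) + label_vec
            pairs.append((whole, list(whole)))
        else:
            raise ValueError("mode must be 'supervised' or 'auto'")
    return pairs


# Hypothetical pattern: three binary features and one of two labels.
patterns = [((1, 0, 1), 0)]
print(training_pairs(patterns, 2, "supervised"))
print(training_pairs(patterns, 2, "auto"))
```

In the supervised case, error is measured only on the label units. In the auto-associative case, every feature contributes to the error, which is one way the two regimes could come to weight cues differently.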

Finally, I have recently explored the effects of distributed versus local representations on the abstraction of second-order generalizations. My results suggest that orthogonality within a set of features that correlate one-to-one with the first-order generalizations (categories) is a crucial element for the learning of second-order correlations. This is potentially important for explaining mechanistically the role of language in the power of human cognition.
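
As a concrete reading of "orthogonality" here, the sketch below checks whether every pair of label codes has a zero dot product. The two sets of binary codes are hypothetical examples of a local (one unit per category) and a distributed (overlapping units) scheme; they are not the representations used in the simulations.

```python
from itertools import combinations


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal_set(codes):
    """True if every pair of distinct codes has a zero dot product."""
    return all(dot(u, v) == 0 for u, v in combinations(codes, 2))


# Hypothetical codes for three category labels.
LOCAL = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]        # one unit per category
DISTRIBUTED = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]  # categories share units

print(is_orthogonal_set(LOCAL))
print(is_orthogonal_set(DISTRIBUTED))
```

Under the local scheme no two labels share an active unit, so each label picks out its category without interference from the others. Under the distributed scheme every pair of labels overlaps on one unit.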


Future work

I plan to continue to work on early systems of symbols. The next step is to train children with novel symbol systems manipulating two components that the simulations specifically predict will have an effect on learning: within-symbol similarity (systematicity) and symbol-referent similarity (iconicity). I also intend to extend the work on learning of second-order generalizations by children and connectionist networks. I am in the process of testing fine-grained predictions about the course of learning of different kinds of words and the graded performance networks exhibit for continuously varying solidities and shape complexities. In short, my work attempts to characterize both the nature of the learner and the nature of the knowledge learned by studying the development of knowledge in children and computational models.


(Very) old research statement (with links to old papers).