Linking Language and Events: Spatiotemporal Cues Drive Children’s Expectations About the Meanings of Novel Transitive Verbs

ABSTRACT How do children map linguistic representations onto the conceptual structures that they encode? In the present studies, we provided 3–4-year-old children with minimal-pair scene contrasts in order to determine the effect of particular event properties on novel verb learning. Specifically, we tested whether spatiotemporal cues to causation also inform children’s interpretation of transitive verbs either with or without the causal/inchoative alternation (She broke the lamp/the lamp broke). In Experiment 1, we examined spatiotemporal continuity. Children saw scenes with puppets that approached a toy in a distinctive manner, and toys that lit up or played a sound. In the causal events, the puppet contacted the object, and activation was immediate. In the noncausal events, the puppet stopped short before reaching the object, and the effect occurred after a short pause (apparently spontaneously). Children expected novel verbs used in the inchoative transitive/intransitive alternation to refer to spatiotemporally intact causal interactions rather than to “gap” control scenes. In Experiment 2, we manipulated the temporal order of sub-events, holding spatial relationships constant, and provided evidence for only one verb frame (either transitive or intransitive). Children mapped transitive verbs to scenes where the agent’s action closely preceded the activation of the toy over scenes in which the timing of the two events was switched, but did not do so when they heard an intransitive construction. These studies reveal that children’s expectations about transitive verbs are at least partly driven by their nonlinguistic understanding of causal events: children expect transitive syntax to refer to scenes where the agent’s action is a plausible cause of the outcome. These findings open a wide avenue for exploration into the relationship between children’s linguistic knowledge and their nonlinguistic understanding of events.


Introduction
During their first years of life, children develop rich and robust cognitive models of the world around them. They represent the differences between people and objects, make predictions about physical and social events, and learn complex patterns of causal information (Baker, Saxe, & Tenenbaum, 2011; Luo, Kaufman, & Baillargeon, 2009; Sobel, Tenenbaum, & Gopnik, 2004; Spelke, 1990; Woodward, 1998). At the same time, they learn the structure and content of at least one language, acquiring representations that allow them to understand the speech around them and

Figure 1. Adults readily perceive (a) as direct launching, a causal event. Sequence (b) is perceived as noncausal: The light gray ball appears to move on its own (Michotte, 1963). Only the spatiotemporal relationship between the two sub-events (dark gray ball moves, light gray ball moves) differs between scenes.

spatiotemporal gap like (b), they are less impressed by changes in the first ball. The fact that very young children use this kind of fine-grained spatiotemporal information to understand causal events indicates that they are sensitive to the sub-event structure of these events. They do not group all two-participant "billiard-ball" scenes together or treat each sub-event as a separate entity. Instead, they recognize that the relationship between the motion of the first and the motion of the second ball carries crucial information about the event as a whole.
Understanding how children map between these causal concepts and language is critical for two reasons. First, causal concepts underlie some of the central generalizations about the form and interpretation of language. Causal information is carried not only by particular words in a language (such as make or because) but also by the form of a sentence. A sequence like "Ben pilked the cup. The cup pilked." suggests to adult participants that pilking involves changing or causally affecting the cup in some way, even though the actual verb is unfamiliar (Kako, 2006). In lexical semantics, this fact is captured by theories of verb meaning which encode causation as a representational primitive, as in: (Jackendoff, 1983) While there are a variety of proposals about lexical semantics that differ in many respects, most of these theories break the meanings of verbs into pieces (subpredicates) and include a subpredicate that encodes cause (Croft, 2012; Fillmore, 1968; Folli & Harley, 2008; Hale & Keyser, 1993; Jackendoff, 1990; Pinker, 1989; Rappaport Hovav & Levin, 1998; Talmy, 1988; van Valin & LaPolla, 1997). If they are available to children, these argument structures could provide a constrained set of hypotheses for learning a new verb. As shown above, these structures dictate the number of arguments expressed about an event and the hierarchical relationships between them, and these relationships are reflected in the syntax of sentences. Thus, when running, chasing, playing, and laughing are all going on in a scene, the sentence surrounding a new verb can provide important information about the number of participants and the nature of their relation, which can allow the child to determine which specific perspective on an event is being referred to.
The second motivation for understanding how causation maps to language is that children acquire much of their causal knowledge about the world through language. Second-hand information provided by adults or peers (Harris, 2002; Harris & Koenig, 2006) can change children's understanding of causal events in a variety of ways. Children categorize objects differently when causal language is used to describe them (Nazzi & Gopnik, 2000), preschoolers explore perceptually identical objects with disparate causal properties more if the objects are given the same name (Schulz, Standing, & Bonawitz, 2008), and 2-year-olds who have learned that one event predicts another try to use the first action to cause the second themselves only if the relationship has been described with causal language like "the block makes the helicopter spin" (Bonawitz et al., 2010). However, little is known about how children use specific grammatical or semantic features of the language they hear to support causal learning.
Transitive sentences like Sarah broke the lamp are thus a critical test case for understanding how children connect specific linguistic information (in this case, argument structure) to their nonlinguistic conceptual representations. Because causal events are just one kind of two-participant event, the number of arguments in the sentence does not in itself provide strong evidence that the sentence describes a causal event. This parallels the effect in causal perception: the number of billiard balls in a display does not provide strong evidence on its own that the event is causal. However, both within and across languages (Haspelmath, 1993; Levin & Rappaport Hovav, 2005), causal events are more likely than noncausal events to be described in a transitive sentence (3) rather than a sentence like (4), where an intransitive verb appears with a prepositional phrase.
English learners face a challenge, though, because the connection between transitivity and cause is in actuality fairly weak. Transitives are also used to describe contact (the lamp touches the table), perception (the girl sees the lion), spatial relations (the wall surrounds the castle), and motion (the tiger enters the room), all of which lack cause predicates in argument structure theories (cf. Levin & Rappaport Hovav, 2005). When we look across languages, however, the picture is clearer: events of direct external causation are consistently described with transitives, whereas the encoding of noncausal events is more variable. For instance, the Russian translations of sentences like "The supervisor manages the department" are not transitive sentences but instead have oblique arguments (roughly, "The supervisor manages over the department"; Levin & Rappaport Hovav, 2005). This connection between cause and transitivity influences the behavior of adult English speakers, despite the looseness with which English uses transitive syntax. When adult English speakers are asked to guess the meanings of novel transitive verbs, they interpret them as causal verbs, inferring that the subject exerted force and caused something to move or change, while the object underwent some change of state (Kako, 2006). This is true even in the absence of the causal/inchoative alternation (The girl breaks the lamp/The lamp breaks), which is more tightly limited to causal meanings.
The fact that the same surface syntax allows both causal verbs like break and noncausal verbs like touch presents an important learning challenge for young children. Children will hear transitive verbs with a variety of meanings: some verbs will encode cause and effect, but others will encode contact, perception, or possession. The more specific causal/inchoative alternation might help, but especially with less frequent verbs children may not reliably hear both versions of an alternation used with a single event. Thus the evidence that children receive from linguistic input may be broadly consistent with (at least) two hypotheses. First, the transitive frame could have a single and fairly general meaning. In particular, transitive syntax could be seen as applicable to any event that involves two participant roles, with any ranked preferences reflecting the cues children use to determine what counts as a "two participant event." This single mapping would cover not only causal transitives but also those encoding contact, perception, motion, and spatial relationships. In fact, Fisher and colleagues have suggested that infants might begin verb learning with precisely this kind of broad mapping and only later develop more specific intuitions (Fisher, 1996; Fisher, Gertner, Scott, & Yuan, 2010; Lidz, Gleitman, & Gleitman, 2003; Yuan, Fisher, & Snedeker, 2012). Second, children could believe (in addition to or instead of a general two-participant preference) that the transitive frame has multiple meaning "clusters," each of which encodes a different specific event structure. On this hypothesis, we might expect a mapping between transitive syntax and causation to be acquired earlier than many of the others, both because it relies on concepts that are available early in infancy and because it is cross-linguistically robust (raising the possibility that it serves a central role in the organization of argument structure).
The existing research on verb learning in young children is compatible with both of these possibilities. Four sets of findings are particularly relevant. First, children interpret transitive verbs as referring to prototypical causal events (e.g., one girl spinning another girl on a chair) rather than events with parallel action (e.g., two girls jogging) or a single participant (Arunachalam, Escovar, Hansen, & Waxman, 2013; Arunachalam & Waxman, 2010; Naigles, 1990; Yuan & Fisher, 2009; Yuan et al., 2012). This finding has been extensively replicated across a variety of ages and discourse conditions (cf. Fisher et al., 2010 for a review). Second, children also interpret transitive verbs as referring to prototypical contact events (e.g., a girl patting another girl on the head) rather than events with parallel actions (Naigles & Kako, 1993). Third, under some circumstances, children prefer to interpret transitive verbs as referring to prototypical causal events rather than prototypical contact events (Naigles, 1996; Scott & Fisher, 2009). Finally, children prefer transitive verbs to refer to telic events (roughly, events that are "finished" to a natural boundary point) rather than atelic ones, even in the absence of causality (Hohenstein, Naigles, & Eisenberg, 2004; Wagner, 2010). This research is consistent with both the possibility that children have only a single broad semantic mapping ("two participants") for the transitive and the possibility that they make additional narrow, specific semantic mappings. Under the "two participant" account, it suggests that causal, contact, and telic events qualify as a single event with two participants, but that parallel-action and atelic motion scenes do not. The only preference established within two-participant scenes is cause>contact, and this might be because the participant roles in a causal event are somewhat more asymmetric than the participants in a contact event, making them a somewhat better candidate for the transitive frame.
Likewise, in a telic motion event (a girl enters a room), the two arguments (girl, room) might be more naturally understood as two participants in a single event than in an atelic version (a girl runs around inside a room). The main challenge for this kind of theory is to explain how children classify an event as having two participants of the right kind. Under the second "clustering" hypothesis, the findings suggest that young children have mapped the transitive construction to both cause and contact meanings, but may prefer the causal mapping to the contact one; additionally, they may map the transitive to a broader but overlapping class of telic events. The equivalent challenge for this theory is then to explain how children form (and rank) these more specific mapping expectations.
How can we determine when and whether children form specific mappings between transitive syntax and specific event classes such as causation? To do this, it is necessary to test the boundaries of an event class of interest, in addition to prototypical exemplars. Critically, because the studies described above have used prototypical events of cause, contact, and so on, they do not demonstrate that it is the causal properties of an event per se that lead children to the transitive mapping. These studies were largely designed to explore verb learning and the nature of early syntactic representations, and consequently the stimuli were often not designed to control for and target small conceptual distinctions.
This method of pitting two event prototypes against each other originates with Naigles (1990). In this study, children saw two test scenes: one canonically causal scene in which a duck pushes on a bunny's shoulders to make it bend over, and one "parallel action" scene in which the duck and bunny simultaneously and separately wave their arms. Two-year-olds mapped transitive, but not intransitive, novel verbs to the causal scene (Naigles, 1990). What features of these events guided toddlers' syntax-specific expectations? Children might be tuning in to many different features of the scene, such as a particular forceful action, the timing of the duck's shoulder-pushing and the bunny's bending, or the simple fact that the actions performed by the duck and bunny are different from one another. In other words, the causal scene is characterized both by particular sub-events (shoulder-pushing), and by a particular relationship between sub-events (the difference, characteristic timing, and/or physical proximity of the duck and bunny's sub-events). Thus, the existing studies of children's early syntactic awareness do not tell us whether children selectively expect transitive verbs to refer to causal events, qua causal events. In the present studies, we therefore investigate whether preschoolers' expectations about novel transitive sentences are sensitive to spatiotemporal cues to causality. To do this, we provide children with minimal-pair scene contrasts to determine the effect of specific event properties on novel verb learning. We test these questions with 3- and 4-year-olds, who have rich representations of causality in nonlinguistic domains and who are actively building their verb lexicons. In Experiment 1, we provided children with novel verbs in the more specific causal/inchoative alternation (The toy pilked, she pilked the toy) and examined spatiotemporal continuity.
All stimuli consisted of a puppet who approached a toy in a distinctive manner (sub-event A), and a toy that lit up or played a sound (sub-event B). In the causal versions of each event, the puppet contacted the object, and activation was immediate. The noncausal event versions were identical except that the puppet stopped short before reaching the object, and the effect occurred after a short pause (paralleling the noncausal interaction shown in Figure 1(b)). Because the resulting percepts can be difficult to visualize from still pictures, examples of events from Experiments 1 and 2 are available in the Online Supplemental Materials, and all stimuli from the studies can be accessed at the Open Science Framework, https://osf.io/u6s79/. Children viewed both events, heard a sentence with a novel verb, and had to choose one of the two alternate scenes. Because this kind of "gap" contrast necessarily adds more time to the noncausal control versions, we are sensitive to the fact that this could also be interpreted as a manipulation that made those movies seem less like a single two-participant event. In Experiment 2, we therefore made two changes, giving children evidence for only one syntactic frame (i.e., transitive without the intransitive alternation) and examining a different spatiotemporal cue, namely temporal order. In these stimuli, an actor approached or gestured at an object (sub-event A); the object immediately lit up or moved (sub-event B). In the noncausal version of each stimulus, the events were identical except that the actor initiated her movement immediately after the toy activated.
In both experiments, each novel event pair was identical except for the critical causal perception cue. Thus they were equated for overall levels of activity, participant asymmetry, telicity, and most other factors which might reasonably identify a scene as a member of a broad "two participant" category. The two hypotheses about children's initial expectations about event-to-verb mappings thus make very different predictions. If children map transitive sentences specifically to causal scenes (perhaps alongside other mappings), then after hearing novel transitive verbs, they should choose the event with spatiotemporal cues consistent with causal relationships. Under the theory that children use more holistic information about the number of participants in the target event, we might expect children to show no preferences between the two event variants.

Experiment 1
Experiment 1 is the first study to investigate how children's expectations about the meaning of a transitive verb are affected when only a single aspect of the event structure, spatiotemporal contiguity, is varied. All events in this study consisted of scenes containing two sub-events: an action performed by an animate entity (a female puppet), and an outcome effect in a novel toy (e.g., lighting up, spinning around). Children were presented with minimal pair contrasts, identical to each other save for the spatiotemporal relationship between the sub-events. In the "continuous" (causal) scenes (see Figure 2, the Online Supplemental Materials, and http://osf.io/u6s79/), the puppet approached and made contact with the toy, and the effect took place immediately. In the "gap" contrasts, the puppet approached the toy but stopped several inches away from it; after a short pause the toy activated, apparently spontaneously. Both movies were initially described with an intransitive frame ("Look, it's wugging!"), and then with a transitive frame during the critical test questions ("Can you find where she wugged the round thing?"), thus providing the causal/inchoative alternation for the novel verb. To control for the possibility that children might select the causal events simply because they were more interesting than the noncausal events, all children were asked in a counterbalanced order to identify scenes in which the puppet "wugged the round thing" and "didn't wug the round thing."

Figure 2. Schematic of the event contrasts used in Experiment 1. Each novel event was filmed in a causal version and a "gap" control introducing a spatial gap and short pause between the agent's action and the toy's outcome.
Previous research has already established that infants use spatial contact and contingent timing between an action and an outcome to detect causal and noncausal events. What is at issue here is whether this distinction is also relevant to how children interpret the meaning of transitive constructions. If children expect transitive sentences to refer to causal scenes in particular, then when they hear transitive sentences like Sarah wugged the round thing, they should choose the continuous (causal) scenes over the (noncausal) gap variants. On the other hand, if children are sensitive only to coarser scene features such as the active presence of two participants, then children might choose between the events randomly. In particular, since the content of each of the two individual sub-events (puppet wiggling, globe spinning) is identical between event versions, children cannot depend on particular features of the entities or the sub-events (e.g., a particular intentional motion by the puppet, or a particular kind of toy effect) to guide verb preferences. Any preferences that children show in this task must therefore be due to the differing spatiotemporal relationship (i.e., contact and timing) between the sub-events.

Participants
Preschoolers were recruited from a local children's museum (n = 24; mean age 3;11, range 3;0-4;9; 12 girls). Participants were replaced if they were unable to reach criterion on the pretest training (n = 3). Five additional children were replaced due to refusal to point at the movies or parental interference. All children received a sticker and award certificate for their participation at the end of the session.

Materials
Gap and continuous video versions were created for six novel events. All events were initiated by a female puppet held by the experimenter. In each continuous event, the puppet contacted a novel apparatus that immediately moved, lit up, or made noise. The "gap" version of each event differed only in the spatiotemporal relationship between action and outcome sub-events, with a roughly 10-15 cm gap and 1-s pause between the puppet's final position and the activating toy.
In all videos, the event was played through three times, ending on a still shot showing the result and the final position of the puppet. Videos varied between 4.5 and 8 s in total length, with no more than a 1 s length difference between the causal and noncausal version of the same event. Descriptions of the events are shown in Table 1.
Video stimuli were presented on a 17-inch laptop, using the Psychtoolbox extensions of Matlab (Brainard, 1997). An additional apparatus was used during the introduction, consisting of an open-backed box with a toy helicopter on top that could be covertly activated.

Table 1. Novel events used for training (events 1-2) and verb-learning trials in Experiment 1. Causal and non-causal versions were created by varying the final position of the puppet and the relative timing of the two sub-events.

Event | Agent's action | Outcome effect
(1) | Puppet hops over to land on green squeaky toy | Toy squeaks
(2) | Puppet places ball on ramp, which rolls down to plastic donut | Plastic donut "boings"
(3) | Puppet bends over and places her head on box | Wand on top of box lights up and blinks
(4) | Puppet pushes balanced pendulum | Pendulum tips over and swings
(5) | Puppet wiggles down to globe | Globe lights up and spins around
(6) | Puppet slides over to window shade | Window shade pops up

Procedure
Each session consisted of an introduction, a pretest, and the main novel verb test. During the introduction, children were introduced to "my friend Sarah," a puppet who liked to say silly words. The experimenter showed them the helicopter apparatus, demonstrating that "Sometimes, Sarah puts her hand here [on top of the box] and makes it go... But sometimes, it just happens on its own, because there's a battery inside." Children were then prompted to activate the toy, and shown again that it could activate spontaneously. Then the experimenter prepared children for the rest of the session by explaining that in the movies they would see, "Sometimes, Sarah makes something happen. Like this, when they're touching. And sometimes they don't touch and it just happens on its own, because there's a battery inside." This pre-training sequence was included because pilot testing indicated that without this training, children were unable to answer questions about the causality of the movie stimuli described below. This may have been due to the fact that, unlike infant causal studies (cf. Figure 1), our displays were complex, featuring multiple interesting objects and novel actions including toys that could activate both spontaneously and in response to an agent's action.
Understanding the structure of the scenes is a critical precondition for interpreting children's responses to novel verb sentences. That is, if children are unable to find the causal movie quickly enough, they cannot possibly use that causal information to determine the meaning of a transitive verb. This training may thus have focused children's overall attention on causal events over other types of events, but it did not provide the critical link to linguistic transitivity. To further ensure this, during training children never heard causal (or noncausal) events described with transitive sentences such as "Sarah's touching the box." Following this introduction, children moved to the video presentation, beginning with two pretest trials using the first two events (see Table 1). The pretest was designed to train children on the forced-choice task and determine whether they understood the events they were seeing. During the pretest trials, children viewed both the gap and continuous version of one of the novel events, with version and side presentation counterbalanced between children. During these videos, children heard neutral language directing their attention to the video ("Look over here! Whoa, look at that!"). For each video, the experimenter pointed at the patient object, and asked the children, "Are Sarah and that thing touching? So, did Sarah make that happen or did it just happen on its own?" After seeing both versions, children made two forced-choice decisions, identifying: (1) where Sarah and the object were touching and (2) where Sarah made the event happen. Positive and negative versions (i.e., where Sarah didn't touch the toy) of these questions were counterbalanced. The pretest procedure was then repeated with the second novel event. Children who could not provide correct answers to 3 out of the 4 total forced-choice questions in the pretest were not included in the analysis. Three children were replaced for this reason.
For the critical novel verb test, children saw a trial for each of the four remaining novel events. In each trial, children saw the continuous (causal) version of the event on one side of the screen, and the gap (noncausal) version on the other. The trial order, as well as version and side presentation for each trial, was randomized for each child.
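The per-child randomization described above can be sketched as follows. This is only an illustrative Python sketch, not the authors' Psychtoolbox/Matlab presentation code: the function name `build_test_trials`, the dictionary keys, and the short event labels are hypothetical, and reading "version presentation" as "which version is shown first" is an assumption.

```python
import random

def build_test_trials(events, seed=None):
    """Return a randomized list of test trials for one child.

    Each trial records the event, the screen side showing the continuous
    (causal) version, and which version plays first. All field names are
    hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)          # one generator per child
    order = list(events)
    rng.shuffle(order)                 # randomize trial order
    trials = []
    for event in order:
        trials.append({
            "event": event,
            # side of the screen for the continuous (causal) movie
            "continuous_side": rng.choice(["left", "right"]),
            # assumed meaning of "version presentation": which movie plays first
            "first_version_shown": rng.choice(["continuous", "gap"]),
        })
    return trials

# Shorthand labels for the four test events (events 3-6 in Table 1).
test_events = ["wand", "pendulum", "globe", "shade"]
for trial in build_test_trials(test_events, seed=7):
    print(trial)
```

Seeding one generator per child makes a session's trial list reproducible, which is convenient when coding responses afterwards.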
On each trial children saw the two contrasting movies in sequence, with the same voice-over for both. The voice-over used the target novel verb in intransitive sentences: "Look over here! The tall thing is wugging, it's wugging. Whoa! Watch one more time, it's gonna wug... Wow!" Children were then reminded that "In one movie Sarah made it happen, and in one movie she didn't." They saw each event a final time, and then the final freeze-frames for both movies were presented. Children heard two test prompts (Positive: "Can you find the movie where she wugged the round thing?"; Negative: "Can you find the movie where she didn't wug the round thing?") with order randomized across trials. The experimenter waited until the child pointed to a movie; no additional verb prompts were given, but children were invited to "pick just one movie" if they hesitated to choose. Children very occasionally pointed to both movies, sometimes appearing to change their mind and sometimes appearing to indicate both movies; in all cases, children's first clear point to a movie was coded as their answer. As a manipulation check, children were then asked to identify "the movie where they're touching."

Results
All results and analyses for Experiment 1 are available on the Open Science Framework at http://osf.io/u6s79/.
The dependent measure of interest was how often children chose gap or continuous versions (i.e. spatiotemporal continuity disrupted or preserved) of each event following different prompts. We predicted that children would choose the continuous version of events when asked to find the movie where the puppet and toy were touching, and when given positive transitive prompts (e.g., Where she wugged the round thing). Note that this sentence suggests a causal referent only if children already expect transitive sentences to refer to this kind of scene.
Children's performance was converted to a score between 0 and 4, reflecting the number of trials on which they chose the continuous (causal) scene. Figure 3 plots children's responses to the three prompt types in Experiment 1. The manipulation check confirmed that children were successfully identifying the scenes where the puppet and the object were touching; children identified the correct movie at a rate significantly above chance (Wilcoxon signed-rank test, p < 0.002; 3.04/4 mean correct choices). For the positive transitive prompts, the distribution of these scores was also significantly above chance (Wilcoxon signed-rank test, p < 0.001; 3.08/4 mean causal choices); no children chose fewer than two causal scenes in response to a positive prompt. To show that these choices did not result simply from a global preference for the causal movies, children's responses to negative transitive prompts ("Can you find the movie where she didn't wug the round thing?") were also analyzed. For these prompts, children's scores were significantly below chance (Wilcoxon signed-rank test, p < 0.001; 0.88/4 mean causal choices); no children chose more than two causal scenes in response to a negative prompt. Patterns were qualitatively similar and there were no significant differences between the performance of 3- and 4-year-olds (Wilcoxon signed-rank tests, Control prompt: p = 0.31; Positive prompt: p = 0.11; Negative prompt: p = 0.21).
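To make the comparison against chance concrete, the following Python sketch runs an exact two-sided signed-rank test of per-child scores (0 to 4 causal choices) against the chance value of 2. It is an illustration under stated assumptions, not the authors' analysis script (their analyses are on the Open Science Framework): the function name `signed_rank_exact` is hypothetical, zero differences are dropped, ties receive midranks, and the example scores are invented purely for demonstration and are not the study's data.

```python
from collections import Counter

def signed_rank_exact(scores, chance):
    """Exact two-sided signed-rank test of `scores` against `chance`.

    Returns (w_plus, p_value), where w_plus is the sum of the ranks of the
    positive differences. Ranks are stored doubled so midranks stay integers.
    """
    diffs = [s - chance for s in scores if s != chance]  # drop zero differences
    if not diffs:
        return 0.0, 1.0
    abs_sorted = sorted(abs(d) for d in diffs)
    doubled_rank = {}
    i = 0
    while i < len(abs_sorted):
        j = i
        while j < len(abs_sorted) and abs_sorted[j] == abs_sorted[i]:
            j += 1
        # tied values occupy 1-based positions i+1..j; doubled midrank = i+1+j
        doubled_rank[abs_sorted[i]] = i + 1 + j
        i = j
    ranks = [doubled_rank[abs(d)] for d in diffs]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Null distribution: every assignment of signs to the ranks is equally likely.
    counts = Counter({0: 1})
    for r in ranks:
        updated = Counter()
        for total, c in counts.items():
            updated[total] += c        # this rank gets a negative sign
            updated[total + r] += c    # this rank gets a positive sign
        counts = updated
    rank_sum = sum(ranks)
    observed_dev = abs(2 * w_plus - rank_sum)
    extreme = sum(c for t, c in counts.items()
                  if abs(2 * t - rank_sum) >= observed_dev)
    return w_plus / 2, extreme / 2 ** len(ranks)

# Invented scores for illustration only (NOT the study's data).
hypothetical_scores = [4, 3, 3, 2, 4, 3, 2, 4, 3, 3, 4, 2]
print(signed_rank_exact(hypothetical_scores, chance=2))
```

Because scores can only take five values, ties are frequent; enumerating the sign-flip distribution directly avoids relying on a normal approximation in that situation.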
Figure 3. Children's choices of Causal ("Continuous") vs. Noncausal ("Gap") scene variants following the three prompts in Experiment 1. Children heard a manipulation check measuring their basic understanding of the scene ("Find where they're touching"), and (in counterbalanced order) a positive transitive question ("Find where she VERBED the toy") and a negative transitive question (". . .where she didn't VERB the toy"). Error bars represent bootstrapped 95% confidence intervals.
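For readers unfamiliar with bootstrapped error bars, the sketch below computes a percentile-bootstrap 95% confidence interval for a mean score. The paper does not state which bootstrap variant or how many resamples were used, so the percentile method, the 2,000 resamples, and the name `bootstrap_mean_ci` are all assumptions; the scores are invented for illustration and are not the study's data.

```python
import random

def bootstrap_mean_ci(scores, n_resamples=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of `scores`."""
    rng = random.Random(seed)
    n = len(scores)
    # Resample children with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(scores, k=n)) / n for _ in range(n_resamples)
    )
    alpha = (1 - level) / 2
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha) * n_resamples))]
    return lower, upper

# Invented scores for illustration only (NOT the study's data).
hypothetical_scores = [4, 3, 3, 2, 4, 3, 2, 4, 3, 3, 4, 2]
low, high = bootstrap_mean_ci(hypothetical_scores)
print(f"mean = {sum(hypothetical_scores) / len(hypothetical_scores):.2f}, "
      f"95% CI = [{low:.2f}, {high:.2f}]")
```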
Because this study was designed as a forced-choice task, it cannot determine the total range of scenes that children might be willing to accept as targets of novel transitive sentences; it indicates that children prefer continuous over gap scenes, but does not directly measure the acceptability of the gap (noncausal) scenes. However, some evidence can be drawn from those trials on which children heard the negative question ("...where she didn't VERB the toy") first. If children viewed either scene as acceptable, they might have been confused by this question and therefore answered randomly. In fact this was not the case: on trials where the negative question was asked first, children were somewhat more likely to select the gap scenes (62%) than on trials where the positive question was asked first (45%; these rates were not significantly different by a Wilcoxon signed-rank test).

Discussion
Experiment 1 indicates that children's interpretation of transitive verbs draws on the cognitive capacities they use for detecting causation in nonlinguistic contexts. Specifically, their verb learning is sensitive to the causal structure of events, and not only to coarser contrasts between event types such as the number of active participants. 3- and 4-year-olds used spatiotemporal cues to causation to determine the meaning of a novel transitive verb: if The round thing wugged was followed by Sarah wugged the round thing, then they chose scenes with the spatiotemporal continuity characteristic of causation. In contrast, if it was followed by Sarah didn't wug the round thing, then they chose the "gap" variants in which the timing and contact cues were disrupted. Unlike previous novel verb studies, all other properties of the causal and noncausal videos were matched: the participants, the actions performed by the agent, and the physical outcomes were identical in both versions. Thus, rather than expecting the mere presence of particular kinds of sub-events, children expected transitive verbs to refer to scenes where those sub-events stood in a particular spatial and temporal relation to each other.
Because we had found in pilot testing that children were not able to immediately zero in on the spatiotemporal relationships in these more complex videos, we provided our participants with initial training that helped them focus on this relevant distinction, by identifying some movies as causal and some as noncausal. For this reason, children were likely particularly attentive to the causal dimension of the events they saw during the novel verb phase. However, nothing about this training involved transitive syntax; thus, at minimum, children had to generalize from the causal questions they were asked in the training to the novel transitive sentences they heard at test. If the children had not expected a transitive-cause mapping they might have made a different inference: when the question changed (from "find where she made it happen" to "find where she VERBED the toy"), they might have made a pragmatic inference that the experimenter was now hoping for a different kind of answer. To ensure that it is transitivity specifically that children are using to guide this inference about the test sentences (rather than the more specific causal/inchoative alternation), Experiment 2 used a between-subjects design that allows us to contrast the inferences that children make about the critical verbs after hearing either transitive or intransitive syntactic frames. By giving just one frame, we ensure that children hear the same amount and diversity of information about the novel verbs and, as an added benefit, we test for causal expectations resulting from transitive syntax alone.
The results of Experiment 1 are compatible with the proposal that children have a specific mapping between transitivity and causation. However, there are other interpretations of this finding. First, the presence of the spatiotemporal gap could have led children to interpret the "gap" movies as two separate, sequential events (the puppet's action, then the object's action) rather than a single event involving both entities. Event segmentation (dividing events up at their natural boundaries) is an important and difficult problem (Zacks, 2010). Some degree of event segmentation would seem to be a prerequisite for event conceptualization: how could you consider possible construals of an event if you hadn't identified some chunk of experience to interpret? We know that spatiotemporal contiguity is likely to play a role in children's event segmentation: young children look longer at videos where pauses are inserted in the middle of actions than at ones where the pauses coincide with event boundaries (Baldwin, Baird, Saylor, & Clark, 2001). The process of segmenting events may play a role in discovering their causal structure: sub-events which are close in time and space are probably more likely to be lumped together as a single event and more likely to be causally linked. However, to understand the role of causality, independent of segmentation, we must also consider cases where the difference in spatial distance and onset timing of the two sub-events is matched, such that the causal and noncausal alternatives are likely to be segmented in the same way.
The new stimulus set created to address the segmentation issue in Experiment 2 also pulls apart causation and contact. In Experiment 1, causation was manipulated by the presence (or absence) of physical contact between the agent and patient. Consequently, it is possible that children succeeded in Experiment 1 by mapping transitivity to contact rather than to cause. This would not be an unreasonable strategy: in English, many verbs of physical contact appear in transitive sentences (e.g., touch, pat, rub) and, as we noted earlier, prior studies have found that children prefer to map transitive verbs to canonical (noncausal) contact scenes rather than parallel actions. In fact, infants' initial nonlinguistic understanding of causation may be closely linked to physical contact (see also Muentener & Carey, 2010). Thus it is important to understand whether children's novel-verb preferences track causal relations independent of physical contact.
To address these questions, we return to the literature on causal perception for an additional early index of causation. In addition to spatiotemporal continuity, young infants are sensitive to spatiotemporal order: if the second billiard ball's motion starts before the first one arrives to hit it, the causal illusion is broken (Leslie & Keeble, 1987; Michotte, 1963). In Experiment 2, we test children's preferences for novel transitive verbs with stimuli that hold contact constant and match the timing structure of the sub-events: the delay in the onset of the sub-events is matched, but reversed. In addition, by using only a single syntactic frame with each novel verb, we can see whether the disruption of causal structure affects expectations about transitive verbs in the absence of any evidence that they participate in the causal/inchoative alternation.

Experiment 2
As in Experiment 1, each event has a causal version and noncausal version that have the same participants and the same sub-events. In Experiment 2, the spatial relationships in the two versions are also matched, and only the order of the two sub-events differs. In the agent-first versions, an agent makes a gesture, and 1 s later (at the endpoint of the actor's gesture), a toy activates (e.g., lighting up, spinning around). In the agent-last variants, the outcome effect begins (apparently spontaneously), and 1 s later the agent makes the same gesture. The timing of the causal and noncausal event variations used in Experiment 2 is illustrated in Figure 4.
In order to dissociate causal interpretations from contact interpretations, all events used in Experiment 2 involve "action at a distance," events with no physical contact between the agent and patient. These scenes can be understood as instances of a causal illusion, like the one that you might experience if you drop a book on a table at the exact moment that someone else turns the lights off. Despite the fact that the agent never touches the object that they change, adults who were informally queried interpreted the agent-first videos as causal, presumably because the outcome followed immediately on the agent's action. To ensure that the agent-last versions appeared sufficiently natural, all of the actor's gestures in Experiment 2 were plausible responses to an interesting event, such as pointing or clapping. Stimuli can be viewed in the Online Supplemental Materials and at https://osf.io/u6s79/.
By removing the spatial connection between sub-events, Experiment 2 goes beyond previous novel verb studies and allows us to ask whether children associate transitive syntax with events that involve causation but not contact. This new manipulation required an additional change in our experimental design. In Experiment 1, there was a clear difference in the two scenes that was visible at the moment when children were asked to select the correct event (i.e., the presence or absence of the physical gap). This would not have been the case for the timing contrast in Experiment 2: at the end of both the causal and the noncausal events, the actor has executed her gesture and the novel effect has taken place (see Figure 4). To address this issue, children were given stimuli in contrasting pairs (e.g., She's meeking it could refer to either the causal version of Event 7 or the noncausal version of Event 8, with event versions randomized between children). The event randomization scheme is illustrated in Figure 5; across children, any remaining differences in preference would have to be due to the spatiotemporal changes. In presenting the novel verbs (in a between-subjects design) we compared scene choices after transitive sentences (She meeked the toy) to two control conditions: intransitive sentences (The toy meeked), and sentences with no novel verbs which directly probed causal knowledge (She made something happen). The dependent measure of interest was how often children chose the agent-first or agent-last scenes depending on the prompt type that they heard. Under the specific-mapping hypothesis, we should expect that children who hear a transitive prompt (Find where she meeked something) will be more likely to choose the causal (agent-first) version of the test events. In contrast, children who hear intransitive prompts (Find where something meeked) should have no such preference. But if children attend mainly to holistic information about the number of unique participants in an event, both agent-first and agent-last versions would qualify as two-participant events, and thus children should also have no preference in the transitive condition. Finally, the performance of children who were given the causal-knowledge prompt (Find where she made something happen) will allow us to validate our manipulation (by showing whether children view the agent-first scenes as more causal) and to determine the degree to which performance in the transitive condition matches or departs from children's ability to simply report the causal relationships in the study paradigm.
Figure 4. Schematic of an event contrast used in Experiment 2. Each novel event was filmed in a causal version and an "agent-last" control which varied the relative timing of the agent's action and the beginning of the outcome effect.
Figure 5. Example of the randomization scheme used in Experiment 2. One participant might see the causal (agent-first) version of the globe event and the non-causal (agent-last) version of the spinner event, while another child would see the reverse. In the Transitive condition, both would then be asked to identify "where she's meeking something".

Participants
Preschoolers were recruited from a local children's museum (n = 74; mean age: 4;0, range 3;0-4;11, 37 girls), and tested in one of three conditions: Transitive (mean age 4;0, 12 girls), Intransitive (mean age 4;0, 12 girls), and a Causal Knowledge manipulation check control (mean age 4;0, 13 girls). Participants were replaced if they were unable to reach criterion on the pretest training (n = 20). Nine additional children were replaced due to refusal to participate or parental interference. All children received a sticker and award certificate for their participation at the end of the session.

Materials
Agent-first and agent-last versions were created for four new novel events. All events involved a female actor interacting with a novel object. In contrast to the events used in Experiment 1, the agent did not touch the novel objects, but made gestures toward them. The sub-events of each novel event are shown in Table 2. In the agent-first version of each movie, the initiation of the apparatus' effect was closely timed to follow the actor's gesture. The agent-last version of each movie was created by filming a version in which only the timing was altered: the actor began her gesture 1 s after the apparatus activated. Because all the actor's gestures were chosen to be plausible "social responses" to an interesting effect, adults who were informally queried found these movies natural, but did not view the actor as playing a causal role.
In all videos, the event was played through three times, ending on a still shot showing the result and the final position of the experimenter. Videos varied between 4 and 5 s in total length and the agent-first and agent-last versions of each event were equal in length. In addition to these stimuli, two additional causal movies and two additional "social response" movies were used during the pretest trials.
Video stimuli were presented in the same manner as Experiment 1, and the helicopter toy was also used for the warm-up phase of Experiment 2.

Procedure
As in Experiment 1, each session consisted of an introduction, a pretest, and the main novel verb test. During the introduction, children were introduced to "my friend Sarah," a puppet who liked to say silly words. After this the experimenter showed them the helicopter apparatus, demonstrating that Sometimes, Sarah makes it happen, like this... But sometimes, she just watches it happen, it happens on its own because there's a battery inside. Then the Sarah puppet either approached the toy without touching it, and then the helicopter activated, or she approached it after activation. Note that the demonstrations for Experiment 2 were different from the one for Experiment 1 in two critical ways: Sarah never touched the toy, but in the causal version the toy activated immediately when Sarah approached. Thus, the perception of cause was driven by temporal contingency rather than contact. After this demonstration, children were prompted to activate the toy (again without contact), and shown again that it could activate spontaneously. Then the experimenter prepared children for the rest of the session by explaining that in the movies they would see, Sometimes, my friend Hannah makes something happen. And sometimes she just watches it happen; it happens on its own because there's a battery inside. As in Experiment 1, children never heard any events described with transitive sentences during the warm-up. Children then moved to the video presentation, beginning with two pretest trials which used the training movies described above. Each pretest trial consisted of one agent-first and one agent-last movie. The child viewed the two videos, one at a time, and was asked after each video whether the actor made it happen or watched it happen. Finally, the children were asked to Find where she made it happen and Find where she watched it happen.
Table 2. Novel events used for verb-learning trials in Experiment 2. Causal and non-causal versions were created by varying which of the two sub-events began first. Note that while outcome effects are repeated from the novel events (1)-(6) used in Experiment 1, the spatiotemporal relationships between agent actions and outcomes differ (see Figure 4).
Event | Agent's action | Outcome effect
(7) | Actor claps while looking at globe box | Globe lights up and spins around
(8) | Actor points at windmill box | Windmill spins
(9) | Actor slaps the table | Balanced pendulum tips over and swings
(10) | Actor raises both hands toward herself | Window shade pops up
Children who could not provide correct answers to 3 out of the 4 total forced-choice questions in the pretest were not included in the analysis. Twenty children were replaced for this reason, a higher rate than in Experiment 1, indicating that children might have had more difficulty passing the first precondition of recognizing the critical movies as causal. Thus, in both experiments we analyze data from children who did understand the causal structure of the movies; the difference in inclusion rates might mean that we are analyzing slightly different subpopulations of preschoolers in the two studies.
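The inclusion rule above can be written down as a small check. The sketch below is only an illustration, not the analysis code used in the study: the function name, the child identifiers, and the answer patterns are hypothetical, and it assumes that "3 out of the 4" means at least three correct answers.

```python
# Illustrative sketch of the pretest inclusion rule (hypothetical names and data).
PASS_THRESHOLD = 3  # assumed: at least 3 of the 4 forced-choice questions correct


def passes_pretest(answers_correct):
    """Return True if a child meets the pretest criterion.

    answers_correct: list of four booleans, one per forced-choice question.
    """
    if len(answers_correct) != 4:
        raise ValueError("expected 4 forced-choice answers")
    return sum(answers_correct) >= PASS_THRESHOLD


# Hypothetical children; these are not data from the study.
children = {
    "child_a": [True, True, False, True],   # 3 correct: included
    "child_b": [True, False, False, True],  # 2 correct: replaced
}
included = [cid for cid, answers in children.items() if passes_pretest(answers)]
```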
During the novel verb test, children saw two trials. The trial order, as well as version and side presentation for each trial, was randomized for each child. In each trial, children saw the agent-first version of an event (e.g., Event 7) on one side of the screen, and the agent-last version of a different event (e.g., Event 8) on the other. On each trial, children saw the contrasting movies presented sequentially, accompanied by identical, neutral voiceovers (Look over here, look at that, wow!). Children were then reminded that In one movie Sarah made it happen, and in one movie she didn't. Finally, children watched both movies playing simultaneously a final time, accompanied by a voiceover appropriate to the between-subjects condition:
Transitive: She's gonna meek something. She meeked it! Wow, she meeked it!
Intransitive: Something's gonna meek. It meeked! Wow, it meeked!
Causal Knowledge: She's gonna make something happen. She made it happen! Wow, she made it happen!
The final freeze-frames for both movies persisted, and children were asked to select the movie where she meeked it/where it meeked/where she made something happen. The experimenter waited until the child selected a video before continuing, providing only general prompts (Go ahead and pick!) if the child did not immediately point.
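The trial structure described above can be sketched as a short randomization routine. This is a hedged illustration rather than the software used to run the study: the function and field names are hypothetical, and the pairing of Events 9 and 10 is an assumption (the text gives only Events 7 and 8 as an example pair).

```python
import random

# Assumed pairing of the four novel events into two test trials. The text
# gives (7, 8) as an example pair; the (9, 10) pairing is a guess.
EVENT_PAIRS = [(7, 8), (9, 10)]


def make_session(rng):
    """Build one child's two test trials, randomizing order, version, and side."""
    pairs = EVENT_PAIRS[:]
    rng.shuffle(pairs)  # randomize trial order
    trials = []
    for pair in pairs:
        # Randomize which event of the pair is shown in its agent-first version.
        first_event, last_event = rng.sample(pair, 2)
        agent_first = {"event": first_event, "version": "agent_first"}
        agent_last = {"event": last_event, "version": "agent_last"}
        # Randomize which side of the screen shows the agent-first movie.
        if rng.random() < 0.5:
            left, right = agent_first, agent_last
        else:
            left, right = agent_last, agent_first
        trials.append({"left": left, "right": right})
    return trials


session = make_session(random.Random(0))  # seeded only for reproducibility
```

Because each trial pairs the agent-first version of one event with the agent-last version of a different event, any preference that survives averaging across children cannot be attributed to a particular event.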

Results
All results and analyses for Experiment 2 are available on the Open Science Framework at http://osf.io/u6s79/.
Our analyses focused on how often children chose agent-first or agent-last versions of the events and whether this depended on the type of prompt that they heard. The dependent measure was the number of test trials (out of two) on which the child selected the agent-first scene. The results of Experiment 2 are summarized in Figure 6. Children who heard the causal manipulation check questions selected the agent-first scenes at a rate above chance but below ceiling (Wilcoxon signed rank test, p = 0.048; 1.32/2 mean causal choices), indicating that the contingency was used to infer a causal link between the two events but that either the inference was difficult for children to make or the task itself was demanding. Children who heard transitive prompts also showed a significant preference for the agent-first scene (Wilcoxon signed rank test, p < 0.002; 1.5/2 mean causal choices). This was not the case for children who heard intransitive prompts (Wilcoxon signed rank test, p = 0.35; 1.12/2 mean causal choices).
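To make the dependent measure and the comparison against chance concrete, the following minimal sketch scores each child out of two trials and tests the scores against the chance value of 1. It is not the authors' analysis code, and the scores are hypothetical. It relies on one observation: because every nonzero difference from chance has magnitude 1, all ranks are tied, so an exact signed rank test (under the usual convention of dropping zero differences) coincides with an exact two-sided sign test. Software that uses a normal approximation may report somewhat different p-values.

```python
from math import comb


def agent_first_score(choices):
    """Number of test trials (out of two) on which the agent-first scene was chosen."""
    return sum(1 for choice in choices if choice == "agent_first")


def exact_sign_test(scores, chance=1):
    """Two-sided exact sign test of scores against chance.

    Scores equal to chance are dropped, as in the standard signed rank procedure.
    """
    above = sum(1 for s in scores if s > chance)
    below = sum(1 for s in scores if s < chance)
    n = above + below
    if n == 0:
        return 1.0
    k = max(above, below)
    # Probability of a split at least this extreme under a fair coin.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical scores for eight children; these are not data from the study.
hypothetical_scores = [2, 2, 1, 2, 0, 2, 1, 1]
mean_score = sum(hypothetical_scores) / len(hypothetical_scores)
p_value = exact_sign_test(hypothetical_scores)
```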
In addition to differences from chance, we also performed planned comparisons of the Transitive vs. Intransitive conditions, and of the Transitive vs. Causal Knowledge conditions. Children in the Transitive condition were more likely to choose agent-first scenes than the children in the Intransitive condition (Wilcoxon signed rank test, p = 0.03). Performance in the Transitive and Causal Knowledge conditions was not statistically different (Wilcoxon signed rank test, p = 0.46).
There were no significant differences between the performance of 3- and 4-year-olds, and patterns were qualitatively similar across the three conditions (Wilcoxon signed rank tests, Transitive prompt: p = 0.09; Intransitive prompt: p = 0.75; Control prompt: p = 0.26). Three-year-olds in the Transitive condition chose causal scenes somewhat less often than 4-year-olds (1.31/2 vs. 1.73/2 mean causal choices); however, this was also true for the Causal Knowledge control condition (1.14/2 vs. 1.55/2 mean causal choices), indicating that this difference had to do with relative success at identifying the causal relation or proficiency on the task in general.

Discussion
In Experiment 2, children showed a clear preference to map new transitive verbs to causal scenes, rather than closely matched noncausal foils. Specifically, children who were asked to Find where she meeked something usually chose events where the agent moved before the onset of the outcome action, rather than where she moved after its onset. This preference was specific to transitive verbs: Children who heard Find where something meeked did not show this pattern. In fact, children learning transitive verbs performed just like children who were asked to find Where she made something happen, suggesting that their errors in the transitive condition were due to uncertainty about the causal structure or lapses in attention, rather than uncertainty about the mapping between transitive verbs and causal scenes in general.
These results indicate that children are able to take fine-grained spatiotemporal information into account when determining the meaning of a novel verb. The causal and noncausal stimuli in this experiment were constructed to match timing delays (as well as participant and sub-event information): in both types of movies, one sub-event was initiated one second after the other. Knowing that an intentional gesture by an agent can have causal power is not sufficient: only when the agent's action preceded the physical outcome (rather than vice versa) did children interpret the scene as causal and map novel transitive verbs to that scene. This expectation for the transitive frame occurred in the absence of any evidence that the verb appears in the causal/inchoative alternation (which is more diagnostic of causal transitives in English). In this respect, children are similar to English-speaking adults who also assign causal meanings to novel verbs presented in the transitive alone (Kako, 2006).
Interestingly, the noncausal foils used in Experiment 2 (where the agent acts after the toy moves) can be plausibly described by another class of English transitive verbs, e.g., She applauded the orchestra. This is because we chose gestures, such as clapping, that could be either goal-directed actions or reactions to a surprising or interesting event. In the real world, causal actions and event reactions are likely distinguishable on many other grounds, but in Experiment 2 only the timing information differentiated these classes. This study thus shows that children's preferences for transitive verb meanings are sensitive to the same kind of fine-grained spatiotemporal information that guides infants' early nonlinguistic understanding of causal events.

General discussion
Across two experiments, 3- and 4-year-old children had a bias to map transitive syntax to causal scenes. Unlike all previous studies of this kind, we manipulated specific spatiotemporal cues to causation, rather than using contrasting prototypes from two different event categories. This advance reveals that children's expectations about transitive verbs are influenced by their nonlinguistic understanding of causal events: when two alternatives have identical sub-events, and differ only in the spatial and temporal relations between those sub-events, children expect transitive syntax to refer to scenes where the agent's action is a plausible cause of the outcome. In Experiment 1, this was implemented by contrasting intact causal interactions with scenes with a spatial gap and temporal pause between the agent's action and the outcome event. In Experiment 2, it was implemented by reversing the order of the agent's action and the activation of the toy. In both cases, 3-4-year-old children linked the verbs that appeared in the transitive frame with the causal variant. Critically, the scenes used in Experiment 2 involved no physical contact between the agents and the toys (in either variant). Both adults and children interpreted these scenes as causal when the agent's action came first, though they are certainly not prototypical examples of a causal event. Children's success with these events indicates that by 3 years of age their syntactic mappings reflect a broad and robust notion of causation.
The two experiments also differed in the range of syntactic frames that the verb appeared in. In Experiment 1, the transitive verbs appeared in the causal/inchoative alternation, which is used solely with causal verbs in English. In Experiment 2, the transitive verbs never appeared in the intransitive frame and thus there was no evidence that the verbs could be used in the inchoative alternation. In English, transitive verbs may be causal (break) or they may not (applaud, touch). The tendency to link transitive verbs to causal scenes even in the absence of this distinctive syntactic alternation is consistent with the cross-linguistic connection between transitivity and causation, and the tendency for adult English speakers to interpret novel transitive verbs as causal (Kako, 2006; Kako & Wagner, 2001; Levin & Rappaport Hovav, 2005).
This work represents an important advance in our understanding of how young children map between syntactic structures and semantic event representations. Our results strongly suggest that children have a mapping between a fairly abstract nonlinguistic notion of causation and the transitive construction, ruling out a number of alternatives. 3-4-year-olds' preferences do not rest solely on properties of the agent's action, the nature of the outcome, or the mere presence of two active participants, although these additional cues may be important parts of naturalistic verb learning. In distinguishing between these possible referents of a new transitive verb, children attend to exactly those event cues that drive their awareness of causal events from infancy. This suggests that, by the preschool years, children possess (in addition to other possible constraints) a relatively specific mapping between transitive syntax and causal events. This conclusion makes predictions about a range of other manipulations that should affect transitive verb learning. For example, Gopnik and colleagues (2004) demonstrated that children can use patterns of covariation to determine which of several possible causes is responsible for an effect (using the "blicket detector" paradigm). If children interpret transitive verbs as encoding their nonlinguistic conception of cause, then they should assume that a novel transitive verb picks out the event with the statistically probable cause even if no spatiotemporal cues to causation are present in the events that are currently being labeled.

Learning noncausal verbs
On the other hand, an understanding of causality alone will be insufficient for children to learn the meanings associated with transitive syntax in English: many common transitive verbs like enter, touch, love, and see do not have obvious causal components to their meaning. These additional transitive verbs are not random: they are organized into clusters of verbs that participate in the same alternations and share aspects of meaning (cf. Levin, 1993). Children learning languages like English (i.e., those that have several different meanings for the transitive) know and use many noncausal transitives in a broadly adult-like way by the time they are three years old (Fenson et al., 2000; Wagner, 2010). However, we do know that English-learning preschoolers struggle to learn some kinds of noncausal transitives. Even though (noncausal) subject-experiencer verbs like fear are more common by token frequency, children appear to have more difficulty interpreting these noncausal relationships correctly (Hartshorne, Pogue, & Snedeker, 2015). Similarly, Naigles and Kako (1993) suggest that children may fail to map new transitive verbs to contact events when a causal alternative is available.
One way children might learn noncausal transitives is syntactic bootstrapping across multiple contexts. Specifically, children could note the range of syntactic frames that different verbs appear in, and expect commonalities in the meaning of verbs that are used in similar ways (Fisher, Gleitman, & Gleitman, 1991; Gleitman, 1990; Landau & Gleitman, 1985). Several novel-verb studies have shown that children prefer different kinds of event prototypes following different syntactic alternations (see, e.g., Fernandes, Marcus, Di Nubila, & Vouloumanos, 2006; Naigles, 1996, 1998; Scott & Fisher, 2009). The results of the present studies show both the availability of and the limits to this strategy for the broader verb-learning problem. If children understand the information carried by the causal/inchoative alternation, as supported by Experiment 1, then the failure of a particular verb to participate in it could serve as a (probabilistic) source of evidence for some other noncausal meaning. But Experiment 2 shows that children's causal biases go beyond transitive verbs that participate in the inchoative alternation. Thus, while a specific mapping between causation and transitive syntax gives children an advantage for learning some verbs, it could also be a disadvantage for learning other transitive verbs, requiring more evidence to override, just as a bias to interpret novel nouns as labels for whole objects can make learning other kinds of nouns more challenging (Markman, 1992). As far as we know, it is currently an open empirical question whether children in fact require more evidence (either linguistic or nonlinguistic) to learn a novel noncausal transitive verb.
In addition to event structure, transitivity is also connected in a complex way with how those events are measured out, a feature known as telicity, which describes whether an event is completed or ongoing from a particular perspective. For many verbs, the direct object defines the end point of an event (I ate the apple, called an incremental theme by Dowty, 1991). Some causal verbs are also telic, but the class of transitive verbs which are telic also extends to other events like motion (i.e., reaching an endpoint in space). Like the relationship between cause and transitivity, this telicity linkage is imperfect. The end point of an event can also be defined by a prepositional phrase (John walked to the store) and transitive verbs do not necessarily have defined end points (Mary drank juice). Nevertheless, adding a direct object often results in adding an endpoint (I dance/I dance a waltz), leading to a telic interpretation of the event in the right grammatical contexts. Children as young as 2 years old expect novel transitive verbs to refer to completed (telic) event perspectives, even when there is no causal component to the activity (such as a character entering a room; Hohenstein et al., 2004; Wagner, 2010). While both telicity and causality involve an encoding of the result of an event (a newly built house, a newly broken lamp), the concepts do not reduce to one another, and they can have independent grammatical representation (e.g., separate morphological markers) within the same language (cf. Dryer et al., 2013). Even so, by the time they are preschoolers, children have put together a rich theory of how event perspectives map to syntax, despite cross-linguistic variation and the intersecting properties of transitive sentences.

Models of event representation for language
In these studies, we created events that differed in their spatiotemporal properties and assessed which event the participants thought a sentence with a novel verb referred to. Our goal was to answer questions about the meaning that children attribute to transitive sentences. But since internal representations of meaning cannot be directly manipulated, our study, like all other mapping studies, is open to a number of interpretations. How are these meanings represented in the mind? One possibility is that the process merely involves tracking individual features of events and sentences to determine the correct weighting of each cue, based on the properties of the language in question. This approach faces an immediate problem: the same event can be described many ways. In the causal scenes presented in Experiment 2, a person might choose to focus on and mention the causal relationship (She's making it go!), the outcome effect (It's spinning!), or the agent's action alone (Look, she's pointing!), among others.
It thus seems clear that children use some kind of structured representation of events (rather than purely holistic "snapshots") to put an event perspective into words. How do children represent event structures at the right level for doing this? Are those representations shared with other cognitive processes, or are they part of a language-specific semantic system? Many theories of argument structure implicitly or explicitly try to define what might be necessary for a meaning-representation system to be able to produce the patterns of syntax that we see. One such possibility is found in Dowty's (1991) theory of Agent and Patient proto-roles, which proposes that the syntactic position of arguments (as subject and object) is determined by a set of features or criteria that define prototypical roles. Prototypical agents are sentient, they are volitionally involved in the event, and they are causers. Prototypical patients undergo changes of states, are affected by the event, and tend to be stationary or inanimate. The argument with the most agent-like features will become the subject of the sentence, even if both arguments have some features from both categories. Although this is a theory about event participants, it implicitly defines a theory of what event representations are like: they are an asymmetrical relation between two (or more) elements, roughly of the form (Proto-Agent ACTS-ON Proto-Patient). This approach can explain many patterns of argument realization in language, but it leaves some intriguing questions open. In particular, it misses an important level of organization: it is probably not an accident that ideal proto-Agents are both sentient and volitional, or that the "causer" feature of proto-Agents is mirrored by the "caus-ee" feature of proto-Patients (see Levin & Rappaport Hovav, 2005 for discussion).
Another class of approaches focuses not on the agent-hood of participants, but on the decomposition of verb meanings into structured representations. These representations are intended to factor apart what is shared across verbs from what is not: their idiosyncratic root meanings (like the difference between heating and cooling). In these theories, causal verbs are semantically represented with an internal structure that references the dimensions of meaning that matter for syntax (e.g., [X CAUSE [BECOME [Y open]]]; cf. Croft, 2012; Jackendoff, 1983; Levin & Rappaport Hovav, 2005; Van Valin & LaPolla, 1997). These theories make explicit some aspects of why event properties are related in syntactic realization: many transitive verbs have both a "causer" and "caus-ee" because there is an associated event structure with two distinct positions (X, Y) corresponding to these roles, containing a predicate CAUSE that refers to a representation of what causal events are like. These structures also implicitly explain other generalizations about verb and sentence meaning: if (as is the case) humans make good causers, then many agents of causal sentences will also be human (as is the case). Of course, we are left with a question similar to the one before: why do we have structures like [X CAUSE [BECOME [Y <state>]]] and [Z BECOME-AT W <location>] instead of some other set, and why is it that both X and Z (rather than Y and W) are reliably the subjects of sentences?
Both of these event-perspective representation options leave us with the problem of explaining why the system is set up the way it is. It is possible that semantic mapping patterns occur for reasons related to the underlying nature of syntax. For instance, Levin and Rappaport Hovav suggest that the correlation between telicity and syntax is a side effect of the hierarchical nature of semantic structures (2005, Ch. 4.2). Arguments that measure out events are typically arguments of an embedded sub-event (in their framework); consequently, they generally surface as direct objects or prepositional phrases (build a house, walk to the store). However, when the embedded event does not define the end point (roll the ball), the affected entity still surfaces as the direct object because the broad general mapping principles make no reference to telicity per se.
Another possibility is that the nature of syntax-to-meaning mappings is related to how we nonlinguistically represent events. This might happen alongside language-specific phenomena. Levin and Rappaport Hovav argue that the connection between transitivity and cause (unlike the connection to telicity) is a principled one because each sub-event in a semantic structure must introduce a new argument, and causal events necessarily have two sub-events. If structural patterns in language arise from mappings like these, we need to understand how these structures are created: how do our general cognitive capacities for understanding events (in terms of cause and other dimensions) get connected in just the right way to an abstract representational system leading to external human language?
Understanding these nonlinguistic representations may be key to fully explaining why human language is the way it is. In the case of transitivity, it is not surprising that animacy of the subject, physical contact, causal relationships, and goals are all features of many transitive verbs, because these properties are closely related in early cognition (and subsequently in our adult common-sense expectations about how the world works). Even babies expect that causing changes usually requires contact, and that people have a special status both as causers and "havers of goals" (cf. Muentener & Carey, 2010; Saxe, Tenenbaum, & Carey, 2005; Woodward, 1998). Yet the exact details of these early representations are still debated; it is not clear which, if any, of these ways of seeing events are primary for babies learning to understand and organize the world around them.
Over the past two decades, there has been an increasing appreciation for the role that richly structured models of the world play in the cognitive processes of both adults and children (cf. Gopnik & Meltzoff, 1997; Xu & Tenenbaum, 2007). Rather than representing important categories with prototypical examples or critical features, cognitive representations are embedded in a system of meaningfully related concepts that can support flexible kinds of inference. For instance, people may have a theory of physics that incorporates ideas about weight, motion, and causality and that allows them to make arbitrary new predictions, such as whether a particular block tower will fall over (Battaglia, Hamrick, & Tenenbaum, 2013). Perhaps children use these same kinds of cognitive models to determine which types of relationships can be expressed by a particular linguistic structure (see Hartshorne et al., 2015, for related ideas). In this view, the distinct-but-related roles of causality, telicity, and other features of meaning associated with transitive syntax may result from the flexibility and richness of the underlying cognitive models: just as they can support new on-the-fly predictions, these cognitive models could support the learning of multiple subtle, related, and yet very abstract generalizations that are reflected in the grammatical structure of human languages.

Developmental trajectory
Returning to the specific dimension of meaning addressed in these studies, our findings also raise the question of how mappings between cause and transitivity (whatever representational form they take) emerge during development. We see three possible accounts of how the observed mapping that is established by the preschool years could develop. First, as proposed by Fisher and colleagues (Yuan et al., 2012), younger toddlers may begin with a global bias to match the number of linguistic arguments to the number of event participants, leading to the broad 'two participant' preference for transitive verbs discussed in the introduction. As they learn their native language and as their nonlinguistic cognition develops, additional, more specific biases for event construals (e.g., expectations about cause, contact, or telicity) might arise if and when they are supported by evidence from the particular language the child is learning. At a minimum, we know that there must be some change in this mapping over time, because languages differ in the range of events described by transitive syntax (Haspelmath, 1993; Levin & Rappaport Hovav, 2005).
Alternatively, like preschoolers, infants might have additional biases about the links between events and the basic structures of language. These biases could result from language-specific expectations, from more general expectations about communication and social interaction (e.g., Tomasello, 1992), or from an expectation that language will mirror nonlinguistic cognitive representations (e.g., Pinker, 1989). To the best of our knowledge, there is no existing work that clearly demonstrates that the specific cause-transitivity bias we show exists in younger children. While prior work suggests that children as young as 19-21 months prefer to map transitive verbs to prototypical causative scenes over other scene categories (Arunachalam et al., 2013; Yuan et al., 2012), the scene contrasts used in these studies vary along multiple dimensions, leaving the conceptual basis of the preference unclear. There are, however, good reasons for supposing that a causal-transitive bias might be present early in life. As we noted earlier, while causal knowledge develops throughout life, some of the guiding principles of causal reasoning are in place by 6 months of age (Leslie & Keeble, 1987). Furthermore, the cause/transitive mapping is cross-linguistically robust, raising the possibility that it has origins in conceptual and learning biases that young children bring to the problem of language acquisition (though see Christiansen & Chater, 2009, for other explanations of cross-linguistic patterns). This hypothesis would be particularly compatible with theories where causation plays a central role in argument structure (e.g., Croft, 2012).
Complicating this question is the fact that children are not born with fully adult-like causal knowledge. While spatiotemporal cues to causation affect infants' attention from the first year of life, children initially have very different expectations than adults about the causal powers of animate and inanimate entities (Leslie & Keeble, 1987; Muentener & Carey, 2010). Any theory involving early connections between argument structure and causation will have to account for how children associate linguistic representations with cognitive representations that are themselves still developing (e.g., "agent" and "causation"). Understanding the development of early syntax-to-meaning representations will therefore necessarily involve progress on many fronts: we will need to understand the development of these nonlinguistic representations, the development of the syntactic representations that express them, and the nature of the mappings that relate them. The methods presented in this study provide an important avenue for beginning to answer these questions.

Conclusion
These experiments are the first novel-verb comprehension studies to examine the mappings between children's argument structure knowledge and their nonlinguistic causal models of the world. We show that children use syntactic information to guide inferences about transitive verb meaning that are closely related to their nonlinguistic concepts of causation. This finding shows the importance of exploration into the relationship between children's linguistic and nonlinguistic knowledge. If we can clarify how children's inferences about word and sentence meaning make contact with their nonlinguistic representations, then we will be better equipped to understand how children learn about the world from second-hand testimony, updating their beliefs about the world from the sentences they hear (Bonawitz et al., 2010; Harris, 2002; Harris & Koenig, 2006; Nazzi & Gopnik, 2001; Schulz et al., 2008).
Critically, this study shows that any examination of the semantics of early language must be considered a question of cognitive development as well as linguistic development. Understanding early representations of verb argument structure will require understanding how children in the first few years of life are representing the scenes they see. Even as infants, young language-learners have rich, but not necessarily adult-like, representations of what constitutes an agent, a cause, or an event (Baker et al., 2011; Luo et al., 2009; Sobel et al., 2004; Spelke, 1990; Woodward, 1998). Any detailed understanding of what children encode in their early verb meanings must reckon with the kinds of meaning that a young toddler might have available to encode in language. As has been frequently noted before (cf. Pinker, 1989), it does not seem likely to be a coincidence that many of the proposed central primitives in argument structure theories are also available to the young child, but we do not yet know what makes an early cognitive representation a candidate for becoming part of the grammar (on either evolutionary or developmental timescales).
Exploration into children's early linguistic representations is a critical part of the effort to understand how humans of all ages represent events (Baldwin, Andersson, Saffran, & Meyer, 2008; Baldwin et al., 2001; Wolff, 2008; Zacks, 2010). By bringing together the linguistic tests for novel verb comprehension with stimulus manipulations from research on prelinguistic cognition, we can make detailed, testable predictions about how children make inferences about language from the events they see, and how language in turn reflects the structure of event representations.