Research Overview
My research focuses primarily on the linking theory between syntax and computational psycholinguistics. I am particularly interested in combining insights and methods from these fields in pursuit of an integrated theory of language. The questions that guide my research include: What is the computational nature of the core structure-building operations in language? How, and with what learning algorithms, do language learners acquire these operations? How is knowledge of these operations used in online sentence processing, in interaction with other cognitive systems such as memory? And can we build computational models to test answers to these questions?
Ongoing Projects:
1. Computational modeling of language acquisition of parameters
How children acquire their languages is an important and thorny question. Although the linguistic system to be learned is notoriously complicated, children typically master it within three to six years, seemingly effortlessly and without formal instruction. What mechanism makes language acquisition possible? Previous accounts either have been shown not to converge on the target grammar in computational implementations or have not been tested against real language data. To address this question, we propose our main hypothesis: statistical learning underlies children’s learning of linguistic knowledge in sentence parsing. To test this hypothesis, the project integrates methodologies and tools from linguistics, statistical learning, and computational linguistics. It develops a novel computational language learner to test critical questions about what linguistic knowledge children acquire during their early years and in what order the pieces of knowledge are acquired. This computational learner implements a statistical learning algorithm based on clustering, and a learning parser is built to evaluate the model’s predictions. The parser will be equipped with linguistic knowledge that is assumed to be acquired gradually, piece by piece, from parsing the primary linguistic data that children receive. Psycholinguistic experiments on children’s production and comprehension of these pieces of linguistic knowledge are then designed to further evaluate the adequacy of the computational model and the learning parser.
2. The syntax of coordination structures
In joint work with Andrew McInnerney and Yushi Sugimoto, we argue for the lack of c-command relations in coordinate structures, with evidence from a new observation that we call ramification effects. Ramification effects are attested in quantifier binding and logophoric binding in Chinese, English, and Japanese: if the first conjunct binds into a pronoun or logophoric reflexive in the nth conjunct, it must also (recursively) bind into the (n-1)th conjunct (n >= 3). We argue that the analysis of ramification effects singles out Chomsky’s (2021) Form Sequence, which yields a construction without internal hierarchical structure, as a plausible apparatus for generating coordination structures among all competing approaches.
3. The syntax of null objects
This study contributes to the current debate on the analysis of null objects across languages and systematically evaluates important empirical motivations against the widely accepted argument ellipsis analysis of null objects. An investigation of the interaction between topics and null objects in Chinese reveals new data that provide strong evidence for the topic-bound null object analysis: null objects are licensed only if they are bound by the topic.
Recent Project Summaries
1. Cue-based memory retrieval in sentence processing
My dissertation, entitled “Feature retrieval in the syntactic processing of grammatical illusions”, seeks to understand the interaction between linguistic representations and memory retrieval in sentence processing. I address two problems confronting the current cue-based memory retrieval model: encoding c-command and locality constraints as local features on lexical items, and correctly modeling the processing differences between subject-verb agreement and reflexive binding. My work builds on computational modeling of the cue-based memory retrieval model, which was originally proposed by Professor Richard Lewis (one of my dissertation committee members) and colleagues (Lewis & Vasishth, 2005; Lewis, Vasishth, & Van Dyke, 2006).
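The general idea of cue-based retrieval can be illustrated with a toy sketch. This is not the model developed in the dissertation, and it omits activation, decay, and noise. The feature names, the example words (taken from a textbook-style agreement attraction sentence such as "The key to the cabinets are ..."), and the simple match-counting score are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A memory chunk is a bundle of features; the feature names are hypothetical.
Chunk = Dict[str, str]


def match_score(chunk: Chunk, cues: Dict[str, str]) -> int:
    """Count how many retrieval cues the chunk matches."""
    return sum(1 for feature, value in cues.items() if chunk.get(feature) == value)


def rank_candidates(memory: List[Chunk], cues: Dict[str, str]) -> List[Tuple[str, int]]:
    """Rank chunks by cue match, best first (ties keep their memory order)."""
    scored = [(chunk["word"], match_score(chunk, cues)) for chunk in memory]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Two nouns held in memory when the plural verb "are" is reached.
memory = [
    {"word": "key", "role": "subject", "number": "singular"},
    {"word": "cabinets", "role": "oblique", "number": "plural"},
]

# The verb looks for a subject that agrees with it in number.
agreement_cues = {"role": "subject", "number": "plural"}

print(rank_candidates(memory, agreement_cues))
```

In this sketch the true subject and the distractor each match one of the two cues, so neither is clearly preferred. A partial match of this kind is the sort of situation in which a distractor can interfere with retrieval of the target.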
2. Universal quantification and its acquisition
This project explores what knowledge of universal quantification Chinese-speaking adults and children have, and how children acquire that knowledge. Acquisition researchers are intrigued by an acquisition problem called “quantifier spreading”. Children are presented with a picture depicting three boys, each of whom is riding a horse, together with an extra fourth horse that no boy is riding, and the experimenter states the following sentence, which is in fact true of the picture:
Every boy is riding a horse.
Children between three and ten years old give “no” answers 30%-70% of the time when asked to judge whether the statement describes the picture properly. These children justify their “no” answers with reasons such as “No, not this horse” while pointing to the riderless horse. It seems that children allow the quantifier every to spread to the noun horse, to which it does not attach directly, yielding a semantic interpretation of the form “every boy is riding every horse”. My study suggests that the quantifier spreading problem is caused by a combination of linguistic operations: universal quantification and domain selection. Children may have no problem with a single operation, but when multiple operations need to be computed simultaneously, the quantifier spreading problem emerges.
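The difference between the adult reading and a spreading reading can be checked against a small model of the picture. The sketch below is only an illustration: the individual names are made up, and the two "spreading" formalizations (a literal "every boy is riding every horse" and a symmetric version in which every horse must also have a rider) are assumptions about how the non-adult interpretation might be stated, not the analysis proposed in this project.

```python
# Toy model of the picture: three boys, four horses, one horse without a rider.
boys = {"boy1", "boy2", "boy3"}
horses = {"horse1", "horse2", "horse3", "horse4"}
rides = {("boy1", "horse1"), ("boy2", "horse2"), ("boy3", "horse3")}


def adult_reading() -> bool:
    """For every boy there is some horse that he is riding."""
    return all(any((b, h) in rides for h in horses) for b in boys)


def literal_spreading_reading() -> bool:
    """Every boy is riding every horse."""
    return all((b, h) in rides for b in boys for h in horses)


def symmetric_spreading_reading() -> bool:
    """Every boy rides some horse AND every horse is ridden by some boy."""
    every_horse_ridden = all(any((b, h) in rides for b in boys) for h in horses)
    return adult_reading() and every_horse_ridden


def riderless_horses() -> list:
    """Horses that no boy is riding, i.e. what a child might point to."""
    return sorted(h for h in horses if not any((b, h) in rides for b in boys))


print(adult_reading())                # True: the sentence is true of the picture
print(literal_spreading_reading())    # False
print(symmetric_spreading_reading())  # False
print(riderless_horses())             # ['horse4']
```

Under either assumed spreading formalization the sentence comes out false because of the extra horse, which matches the justification “No, not this horse”.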
3. Relational nouns
In this project, co-authored with Professor Acrisio Pires (University of Michigan) and with Professor Liqun Gao and other colleagues (Beijing Language and Culture University), we examined the syntactic representation of bare relational nouns in Mandarin (e.g. fuqin ‘father’) and their implicit anaphoric argument. Mandarin differs from English in that it can use bare relational nouns without an overt possessor, whereas English requires a possessor in standard cases. For instance, for the English sentence John called *(his) father this morning to be grammatical, the possessor his is needed, whereas in the corresponding Chinese sentence the possessor can be left out. In Mandarin, when the possessor of the relational noun is absent, its interpretation is subject to certain syntactic constraints. We further found that these constraints are the same as those on the interpretation of the simple reflexive in Mandarin, i.e. ziji ‘self’, but differ from those on the interpretation of the pronoun ta ‘s/he’. We conducted two Truth Value Judgment Task experiments to ascertain that these syntactic constraints are consistently present across speakers. Based on the experimental results, we concluded that relational nouns carry an implicit simple reflexive argument. Different syntactic structures for relational nouns are thus proposed for Mandarin and English to account for the cross-linguistic differences.
In an extension of this project, we compared the syntactic properties of two typical kinds of relational nouns in Mandarin: body-part nouns (bizi ‘nose’) and kinship nouns (erzi ‘son’). We found that, surprisingly, they are subject to very different syntactic constraints. The constraints on body-part nouns are similar to those on English reflexives and Chinese complex reflexives such as ta-ziji ‘him-self’, which must be bound in their local clause. Kinship nouns, by contrast, are similar to Chinese simple reflexives, which allow long-distance binding. We developed various testing methods to tease these two types of relational nouns apart. We further pointed out that the same difference also exists in Norwegian.
4. Full Phase Transfer in Minimalist Syntax
In this project, I propose an alternative to the Chomskyan theory of phase transfer and argue that transfer of the full phase, rather than the phase head complement, is theoretically and empirically preferable and more consistent with the Strongest Minimalist Thesis (SMT). The full phase transfer theory solves various problems that the standard transfer theory faces and makes correct predictions concerning edge effects, long-distance wh-movement, and head movement, with two core assumptions that are deducible from the SMT: Escaping Movement and the Two-phase Workspace Hypothesis.
Previous Projects
During my master's studies, I mainly worked on the child acquisition of the syntax and semantics of quantifiers and NP/DP structures in Mandarin Chinese. My master's thesis, its English summary, and a paper based on part of the thesis are available at the following links:
Master's thesis: 2012. Restrictors of the distributive operator "dou" in adult and children Mandarin (in Chinese: 分配算子“都”的限制域及其习得). Beijing Language and Culture University. Link
English summary of my master's thesis. Link
A paper manuscript based on part of my master's thesis. Link