Morphological generalization, the task of mapping an unknown word (such as the novel noun Raun) to an inflected form (such as the plural Rauns), has historically been a contested topic within computational linguistics and cognitive science, e.g. in the past tense debate (Rumelhart and McClelland, 1986; Pinker and Prince, 1988; Seidenberg and Plaut, 2014). Marcus et al. (1995) identified German plural inflection as a key challenge domain to evaluate two competing accounts of morphological generalization: a rule generation view focused on linguistic features of input words, and a type frequency view focused on the distribution of output inflected forms, thought to reflect more domain-general cognitive processes. More recent developments in behavioral and computational research support a new view based on predictability, which integrates both input and output distributions. My research uses these methodological innovations to revisit a core dispute of the past tense debate: how do German speakers generalize plural inflection, and can computational learners generalize similarly? This dissertation evaluates the rule generation, type frequency, and predictability accounts of morphological generalization in a series of behavioral and computational experiments using the stimuli developed by Marcus et al. (1995). I assess predictions for three aspects of German plural generalization: the distribution of infrequent plural classes, the influence of grammatical gender, and within-item variability. Overall, I find that speaker behavior is best characterized as frequency-matching to a phonologically-conditioned lexical distribution. This result does not support the rule generation view, and it qualifies the predictability view: speakers use some, but not all, available information to reduce uncertainty in morphological generalization. Neural and symbolic model predictions are typically overconfident relative to speakers, while simple Bayesian models show somewhat more speaker-like variability and higher accuracy. All computational models are outperformed by a static phonologically-conditioned lexical baseline, suggesting that these models have not learned the selective feature preferences that inform speaker generalization.
2024
Lossy Context Surprisal Predicts Task-Dependent Patterns in Relative Clause Processing
Kate McCurdy, and Michael Hahn
In Proceedings of the 28th Conference on Computational Natural Language Learning, 2024
English relative clauses are a critical test case for theories of syntactic processing. Expectation- and memory-based accounts make opposing predictions, and behavioral experiments have found mixed results. We present a technical extension of Lossy Context Surprisal (LCS) and use it to model relative clause processing in three behavioral experiments. LCS predicts key results at distinct retention rates, showing that task-dependent memory demands can account for discrepant behavioral patterns in the literature.
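To make the core quantity concrete, here is a sketch of lossy context surprisal in my own notation (the paper's technical extension may differ in its details): standard surprisal is replaced by an expected surprisal over noisy memory representations of the context. Assuming each context word is retained independently with probability r (the retention rate),

\mathrm{LCS}(w_t) \;=\; \mathbb{E}_{\tilde{c}\,\sim\, q_r(\cdot \mid w_1 \ldots w_{t-1})}\bigl[ -\log P(w_t \mid \tilde{c}) \bigr]

where q_r is the lossy retention distribution over contexts. As r approaches 1, LCS reduces to standard surprisal, so fitting distinct retention rates is one way to model distinct memory demands across tasks.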
Toward Compositional Behavior in Neural Models: A Survey of Current Views
Kate McCurdy, Paul Soulos, Paul Smolensky, and 2 more authors
In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Compositionality is a core property of natural language, and compositional behavior (CB) is a crucial goal for modern NLP systems. The research literature, however, includes conflicting perspectives on how CB should be defined, evaluated, and achieved. We propose a conceptual framework to address these questions and survey researchers active in this area. We find consensus on several key points. Researchers broadly accept our proposed definition of CB, agree that it is not solved by current models, and doubt that scale alone will achieve the target behavior. In other areas, we find the field is split on how to move forward, identifying diverse opportunities for future research.
2023
Differentiable Tree Operations Promote Compositional Generalization
Paul Soulos, Edward Hu, Kate McCurdy, and 4 more authors
In Proceedings of the 40th International Conference on Machine Learning, 2023
Artificial language learning research has shown that, under some conditions, adult speakers tend to probability-match to inconsistent variation in their input, while in others, they regularize by reducing that variation. We demonstrate that this framework can characterize speaker behavior in a natural-language morphological inflection task: the lexicon can be used to estimate variation in speaker productions. In the task of German plural inflection, we find that speakers probability-match a lexical distribution conditioned on phonology, and largely disregard an alternative possible strategy of conditional regularization based on grammatical gender.
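To illustrate the contrast between the two strategies, the sketch below simulates a probability-matching speaker against a conditionally regularizing one. The distribution is a hypothetical toy example, not the paper's estimates:

import random
from collections import Counter

# Hypothetical phonologically-conditioned lexical distribution over German
# plural classes for one novel noun's phonological neighborhood.
# (Illustrative numbers only -- not the paper's estimates.)
lexical_dist = {"-en": 0.45, "-e": 0.30, "-s": 0.15, "-er": 0.10}

def regularize(dist):
    """A regularizing speaker always produces the modal class
    given the conditioning cue."""
    return max(dist, key=dist.get)

def probability_match(dist):
    """A probability-matching speaker samples classes in proportion
    to their conditional lexical frequency."""
    classes, weights = zip(*dist.items())
    return random.choices(classes, weights=weights)[0]

# Matching reproduces the input variation; regularizing collapses it.
print(Counter(probability_match(lexical_dist) for _ in range(1000)))
print(regularize(lexical_dist))  # always '-en'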
2021
Adaptor Grammars for Unsupervised Paradigm Clustering
Kate McCurdy, Sharon Goldwater, and Adam Lopez
In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, 2021
This work describes the Edinburgh submission to the SIGMORPHON 2021 Shared Task 2 on unsupervised morphological paradigm clustering. Given raw text input, the task was to assign each token to a cluster with other tokens from the same paradigm. We use Adaptor Grammar segmentations combined with frequency-based heuristics to predict paradigm clusters. Our system achieved the highest average F1 score across 9 test languages, placing first out of 15 submissions.
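The submitted system pairs Adaptor Grammar segmentations with frequency-based heuristics; the toy sketch below illustrates only the clustering step, under the strong simplifying assumption that segmentations into (stem, suffix) pairs are already available:

from collections import defaultdict

# Hypothetical (stem, suffix) segmentations, standing in for the
# output of an Adaptor Grammar segmentation model.
segmentations = {
    "walks": ("walk", "s"),
    "walked": ("walk", "ed"),
    "walking": ("walk", "ing"),
    "talks": ("talk", "s"),
    "talked": ("talk", "ed"),
}

def cluster_by_stem(segs):
    """Assign tokens sharing a segmented stem to the same paradigm cluster."""
    clusters = defaultdict(list)
    for token, (stem, _suffix) in segs.items():
        clusters[stem].append(token)
    return dict(clusters)

print(cluster_by_stem(segmentations))
# {'walk': ['walks', 'walked', 'walking'], 'talk': ['talks', 'talked']}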
Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network
Verna Dankers, Anna Langedijk, Kate McCurdy, and 2 more authors
In Proceedings of the 25th Conference on Computational Natural Language Learning, 2021
Inflectional morphology has long been a useful testing ground for broader questions about generalisation in language and the viability of neural network models as cognitive models of language. Here, in line with that tradition, we explore how recurrent neural networks acquire the complex German plural system and reflect upon how their strategy compares to human generalisation and rule-based models of this system. Our analyses, including behavioural experiments, diagnostic classification, representation analysis, and causal interventions, suggest that the models rely on features that are also key predictors in rule-based models of German plurals. However, the models also display shortcut learning, which is crucial to overcome in the search for more cognitively plausible generalisation behaviour.
2020
Inflecting When There’s No Majority: Limitations of Encoder-Decoder Neural Networks as Cognitive Models for German Plurals
Kate McCurdy, Sharon Goldwater, and Adam Lopez
In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
Can artificial neural networks learn to represent inflectional morphology and generalize to new words as human speakers do? Kirov and Cotterell (2018) argue that the answer is yes: modern Encoder-Decoder (ED) architectures learn human-like behavior when inflecting English verbs, such as extending the regular past tense form /-(e)d/ to novel words. However, their work does not address the criticism raised by Marcus et al. (1995): that neural models may learn to extend not the regular, but the most frequent class — and thus fail on tasks like German number inflection, where infrequent suffixes like /-s/ can still be productively generalized. To investigate this question, we first collect a new dataset from German speakers (production and ratings of plural forms for novel nouns) that is designed to avoid sources of information unavailable to the ED model. The speaker data show high variability, and two suffixes evince ‘regular’ behavior, appearing more often with phonologically atypical inputs. Encoder-decoder models do generalize the most frequently produced plural class, but do not show human-like variability or ‘regular’ extension of these other plural markers. We conclude that modern neural models may still struggle with minority-class generalization.
Modeling grammatical gender and plural inflection in German
Kate McCurdy, Adam Lopez, and Sharon Goldwater
In Proceedings of the 26th Architectures and Mechanisms for Language Processing Conference (AMLaP), 2020
Grammatical gender is a consistent and informative cue to the plural class of German nouns. We find that neural encoder-decoder models learn to rely on this cue to predict plural class, but adult speakers are relatively insensitive to it. This suggests that the neural models are not an effective cognitive model of German plural formation.
2019
Tutorbot Corpus: Evidence of Human-Agent Verbal Alignment in Second Language Learner Dialogues
Arabella Sinclair, Kate McCurdy, Christopher G Lucas, and 2 more authors
In Proceedings of the 12th International Conference on Educational Data Mining, 2019
Prior research has shown that, under certain conditions, Human-Agent (H-A) alignment exists to a stronger degree than that found in Human-Human (H-H) communication. In an H-H Second Language (L2) setting, evidence of alignment has been linked to learning and teaching strategy. We present a novel analysis of H-A and H-H L2 learner dialogues using automated metrics of alignment. Our contributions are twofold: firstly, we replicated the reported H-A alignment within an educational context, finding that L2 students align to an automated tutor; secondly, we performed an exploratory comparison of the alignment present in comparable H-A and H-H L2 learner corpora using Bayesian Gaussian Mixture Models (GMMs), finding preliminary evidence that students in H-A L2 dialogues showed greater variability in engagement.
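A minimal sketch of the exploratory comparison, assuming one alignment score per dialogue and using scikit-learn's BayesianGaussianMixture as a stand-in for the paper's Bayesian GMMs (the actual feature set and priors may differ):

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Hypothetical per-dialogue lexical alignment scores for the two corpora.
human_agent = rng.normal(loc=0.30, scale=0.12, size=(200, 1))
human_human = rng.normal(loc=0.28, scale=0.05, size=(200, 1))

def fit_bgmm(scores, max_components=5):
    """Fit a Dirichlet-process-style mixture; unused components get
    near-zero weight, so the effective number of modes is inferred."""
    return BayesianGaussianMixture(
        n_components=max_components,
        weight_concentration_prior_type="dirichlet_process",
        random_state=0,
    ).fit(scores)

for name, scores in [("H-A", human_agent), ("H-H", human_human)]:
    bgmm = fit_bgmm(scores)
    print(name, np.round(bgmm.weights_, 2), np.round(bgmm.means_.ravel(), 2))

Comparing the inferred component weights and spreads across the two fits is one simple way to quantify whether one corpus shows greater variability.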
2017
Grammatical gender associations outweigh topical gender bias in crosslinguistic word embeddings
Katherine McCurdy, and Oğuz Serbetçi
In Presented at WiNLP (Women in Natural Language Processing), 2017
Recent research has demonstrated that vector space models of semantics can reflect undesirable biases in human culture. Our investigation of crosslinguistic word embeddings reveals that topical gender bias interacts with, and is surpassed in magnitude by, the effect of grammatical gender associations, and both may be attenuated by corpus lemmatization. This finding has implications for downstream applications such as machine translation.
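A minimal sketch of the kind of measurement involved, with toy random vectors standing in for trained crosslinguistic embeddings (the word choices and scoring function are illustrative assumptions, not the paper's method):

import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy stand-in vectors; in practice these would come from trained
# embeddings for a language with grammatical gender, such as German.
rng = np.random.default_rng(1)
emb = {w: rng.normal(size=50) for w in ["er", "sie", "Brücke", "Löffel"]}

def gender_association(word, male="er", female="sie"):
    """Positive = closer to the male anchor, negative = closer to female.
    With real German vectors, grammatically feminine nouns like 'Brücke'
    would tend to score negative regardless of topical gender bias."""
    return cosine(emb[word], emb[male]) - cosine(emb[word], emb[female])

for noun in ["Brücke", "Löffel"]:
    print(noun, round(gender_association(noun), 3))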
Linked Data for Language-Learning Applications
Robyn Loughnane, Kate McCurdy, Peter Kolb, and 1 more author
In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications, 2017
The use of linked data within language-learning applications is an open research question. A research prototype is presented that applies linked-data principles to store linguistic annotation generated from language-learning content using a variety of NLP tools. The result is a database that links learning content, linguistic annotation and open-source resources, on top of which a diverse range of tools for language-learning applications can be built.
2013
Implicit prosody and contextual bias in silent reading
Kate McCurdy, Gerrit Kentner, and Shravan Vasishth
In Journal of Eye Movement Research, 2013
Eye-movement research on implicit prosody has found effects of lexical stress on syntactic ambiguity resolution, suggesting that metrical well-formedness constraints interact with syntactic category assignment. Building on these findings, the present eyetracking study investigates whether contextual bias can modulate the effects of metrical structure on syntactic ambiguity resolution in silent reading. Contextual bias and potential stress clash in the ambiguous region were crossed in a 2 × 2 design. Participants read biased context sentences followed by temporarily ambiguous test sentences. In the three-word ambiguous region, main effects of lexical stress were dominant, while early effects of context were absent. Potential stress clash yielded a significant increase in first-pass regressions and re-reading probability across the three words. In the disambiguating region, the disambiguating word itself showed increased processing difficulty (lower skipping and increased re-reading probability) when the disambiguation engendered a stress clash configuration, while the word immediately following showed main effects of context in those same measures. Taken together, effects of lexical stress upon eye movements were swift and pervasive across first-pass and second-pass measures, while effects of context were relatively delayed. These results indicate a strong role for implicit meter in guiding parsing, one that appears insensitive to higher-level constraints. Our findings are problematic for two classes of models, the two-stage garden-path model and the constraint-based competition-integration model, but can be explained by a variation on the two-stage model, the unrestricted race model.
2010
Poetic rhyme reflects cross-linguistic differences in information structure
Michael Wagner, and Katherine McCurdy
In Cognition, 2010
Identical rhymes (right/write, attire/retire) are considered satisfactory and even artistic in French poetry but are considered unsatisfactory in English. This has been a consistent generalization over the course of centuries, a surprising fact given that other aspects of poetic form in French were happily applied in English. This paper puts forward the hypothesis that this difference is not merely one of poetic tradition, but is grounded in the distinct ways in which information-structure affects prosody in the two languages. A study of rhyme usage in poetry and a perception experiment confirm that native speakers’ intuitions about rhyming in the two languages indeed differ, and a further perception experiment supports the hypothesis that this fact is due to a constraint on prosody that is active in English but not in French. The findings suggest that certain forms of artistic expression in poetry are influenced, and even constrained, by more general properties of a language.