Project
Big Question: Putting a conscience into the language-learning algorithm
Principal Investigator: John Goldsmith, Linguistics & Computer Science
Funding Type: Seed
Focus Area: Information
Big Idea: Every language is hard to learn, but every language is learnable. The central task of linguistic theory is to make the learning algorithm explicit, providing an account of how words, word-internal structure, and grammar can be induced from a finite amount of data. We propose to use parallel learning algorithms for these three aspects of language learning, allowing each component to share its hypotheses with the other two so that all three can take advantage of epistemologically low-hanging fruit. We will integrate computational methods that have been developed for each of these problems and develop a "Conscience" that will oversee the sharing of tentative conclusions across the three separate components. The result of the project will be open-source Python code for analyzing natural language transcriptions, allowing us to turn our theoretical ideas into code and applications that can be tested by a wider community of researchers.