One finding that has emerged strongly from our research is that there are large differences in how different people read (interpret) and write (produce) language; in other words, in people's "idiolects". I believe that a good understanding of idiolect would enable us to generate much better texts than we currently do and, more generally, to develop a better understanding of how communication between humans and computers can fail. This could have major benefits, both commercially and for society. For example, if we could do a better job of explaining basic health information to people with limited literacy, perhaps by explaining medical concepts in language the reader actually uses, this could have a major impact on health.
From the perspective of this research agenda, I'm excited about Memories for Life because it's a way of building up a large amount of data about how individuals use language. Current linguistic corpora tend to be based on things like newspaper articles; in other words, texts written by professional writers (who in many cases are not known), with little or no non-linguistic context. What I need for my research is sizable collections of language produced (spoken or written) by individuals, preferably individuals from a variety of backgrounds (e.g., single mothers from council estates as well as professional journalists). Ideally this would be accompanied by information about the context in which the language is used (e.g., said to a small child at 10 PM, in the child's bedroom), and also by information about the language that the subject hears or reads (because this also influences idiolect). Building such a resource on my own is a daunting task; but if other researchers would also value such a resource, then perhaps the Memories for Life community as a whole can build it. Furthermore, if my ideas work and we can generate better texts for individuals by using data about how they use language, then this would provide a concrete benefit to people ("put your memories into our system and in return you'll get easier-to-understand health information").
I'm also very interested in how language relates to the world, in particular in what words mean in terms of non-linguistic data. For example, what RGB colours does "pink" refer to? What clock times does "evening" refer to? What spatiotemporal trajectories can be described as "meandering"? I think a good Memories for Life corpus could again be very helpful in investigating this.
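As a rough illustration of the kind of question such a corpus might help answer, the sketch below estimates, per speaker, the range of clock times that a word such as "evening" refers to. Everything in it is an assumption made for illustration: the record layout (speaker, word, referent hour), the function name `word_time_ranges`, and the sample values, which are invented and not drawn from any real corpus. It also ignores wrap-around at midnight.

```python
from collections import defaultdict

# Hypothetical records: (speaker_id, word, hour the word refers to, 0-24).
# These values are invented for illustration only.
OBSERVATIONS = [
    ("A", "evening", 17.0),
    ("A", "evening", 18.5),
    ("A", "evening", 20.0),
    ("A", "morning", 8.0),
    ("B", "evening", 19.0),
    ("B", "evening", 21.5),
    ("B", "evening", 23.0),
]


def word_time_ranges(observations, word):
    """Return {speaker: (earliest_hour, latest_hour)} for one word.

    A crude min/max summary; a real study would likely want a
    distribution rather than a hard range.
    """
    hours_by_speaker = defaultdict(list)
    for speaker, observed_word, hour in observations:
        if observed_word == word:
            hours_by_speaker[speaker].append(hour)
    return {
        speaker: (min(hours), max(hours))
        for speaker, hours in hours_by_speaker.items()
    }


if __name__ == "__main__":
    for speaker, (start, end) in sorted(
        word_time_ranges(OBSERVATIONS, "evening").items()
    ):
        print(f"Speaker {speaker}: 'evening' spans {start:.1f}h to {end:.1f}h")
```

With the invented data above, the two speakers end up with different ranges for "evening", which is the sort of idiolect difference that contextual data could expose.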
