We’ve made remarkable advances over the past 22 years, thanks to our progress in some of the most challenging areas of AI, including translation, images and voice. These advances have powered improvements across Google products, making it possible to talk to someone in another language using Assistant’s interpreter mode, view cherished memories on Photos or use Google Lens to solve a tricky math problem.
We’ve also used AI to improve the core Search experience for billions of people by taking a huge leap forward in a computer’s ability to process natural language. Yet, there are still moments when computers just don’t understand us. That’s because language is endlessly complex: We use it to tell stories, crack jokes and share ideas — weaving in concepts we’ve learned over the course of our lives. The richness and flexibility of language make it one of humanity’s greatest tools and one of computer science’s greatest challenges.
Today I am excited to share our latest research in natural language understanding: LaMDA. LaMDA is a language model for dialogue applications. It’s open domain, which means it is designed to converse on any topic. For example, LaMDA understands quite a bit about the planet Pluto. So if a student wanted to discover more about space, they could ask about Pluto and the model would give sensible responses, making learning even more fun and engaging. If that student then wanted to switch over to a different topic — say, how to make a good paper airplane — LaMDA could continue the conversation without any retraining.
This is one of the ways we believe LaMDA can make information and computing radically more accessible and easier to use (and you can learn more about that here).
We have been researching and developing language models for many years. We’re focused on ensuring LaMDA meets our incredibly high standards on fairness, accuracy, safety and privacy, and that it is developed consistently with our AI Principles. And we look forward to incorporating conversation features into products like Google Assistant, Search and Workspace, as well as exploring how to give capabilities to developers and enterprise customers.
LaMDA is a huge step forward in natural conversation, but it’s still only trained on text. When people communicate with each other they do it across images, text, audio and video. So we need to build multimodal models (MUM) to allow people to naturally ask questions across different types of information. With MUM you could one day plan a road trip by asking Google to “find a route with beautiful mountain views.” This is one example of how we’re making progress towards more natural and intuitive ways of interacting with Search.