My Research
My research interests include machine perception, multimodal models, and machine understanding.

My Open Questions

I like to organize my research along questions with specific themes. They're mostly AI-related (for now, at least)!

What is the nature of intelligence?

There are several competing definitions of intelligence, so this question is an open-ended exploration of why it's so hard to choose one and what we can do once we know what intelligence is.

How can knowledge be shared between AI systems?

Right now, foundation models like GPT-3 are very useful for generating new data, but the knowledge within these systems is embedded in weights that do not have semantic meaning. Techniques ranging from transfer learning to newer methods like RLHF, which relies on human judgement, can "transfer" knowledge between successive versions of models, but we don't have easy ways of, for example, sharing specific facts from one LLM to another one trained on a wholly different architecture.

I believe the field will see a boom in activity from hobbyists and researchers with limited resources after we solve this problem. Once AI R&D becomes less about architecture and more about the learned knowledge, I believe it will become much more valuable to the average person.

How does higher-order reasoning arise from lower-order functions?

Some in the field of AI debate the extent to which symbols are necessary to perform reasoning, but answering this question requires an understanding of the underlying mechanisms. Researchers like Chris Olah [1] are doing great work in interpretability, but we still don't entirely know how emergent behaviors arise from neural networks. I believe understanding this will help us solve the engineering problems in AI.

In what ways is the architecture of the human brain applicable in designing intelligent systems?

Because the human brain is an organ that evolved to help our ancestors become better hunters and gatherers, I do not believe it is necessary to replicate all of its functionality to create intelligent systems. However, some of its modules (like vision) seem to be useful inspiration for AI components (like convolutional neural networks).

How do these pieces fit together? Who knows!