Hi, I'm delighted to have with me here today Kathy McKeown, who is the Henry and Gertrude Rothschild Professor of Computer Science at Columbia University, where she is also the founding director of the Institute for Data Sciences and Engineering. She is also an Amazon Scholar and is well known for the work that she's done over many years on text summarization and many other topics in NLP. So welcome, Kathy, and thanks for joining me.

Thanks, Andrew, and thanks for having me.

So today you lead a large group doing NLP research, but your journey to becoming an AI researcher and an NLP researcher has been an unusual one. If I'm correct, you actually majored in comparative literature when you were an undergrad, even though you were also very mathematically oriented at the time. So tell us the story of how you became an NLP researcher.

Yeah, so when I started out at Brown, I didn't know what I wanted to major in, so I took courses both in math and comparative literature. And as I went on, I became more interested in comparative literature, probably in part because of the teachers I had who really influenced me. It was only as I came near the end of my time at Brown that things changed. When I graduated, I took a job as a programmer, which I found actually very boring. And I thought if I was going to have to be working 40 hours a week, I wanted to be doing something that I enjoyed. And it was then that a friend of mine who was a linguistics major at Brown told me about computational linguistics. And so I spent a lot of the year in the library reading about AI and natural language processing. And when I applied for graduate school the following year, I knew that was what I wanted to do, because it gave me a way to bring together my interests in language and in math.

So that's fascinating. So as a comparative literature major, you spent a lot of time in the Brown University library reading about computational linguistics and NLP. Today we have a lot of learners, maybe some watching this video, who may not yet be NLP researchers or AI engineers but want to break into the field. So I'd love to hear more about what your experience was like, you know, reading so much. Were you doing it by yourself? Did you have a study group? What was that like?

I was doing it entirely by myself. I really had no guidance in terms of what to look at. I guess this friend of mine made a few suggestions, and then I traced references. You know, when I first began reading, I would follow up on references to go further. And when I first entered graduate school, when, you know, I had essentially switched fields, I found it very frightening. I was sure that, you know, I was an impostor, that I didn't know enough, and that before long they would find out, you know, that I really shouldn't be there after all. But that's something, you know, you overcome with time, and you learn that it's not the case and that people value your input.

Yeah, that's really inspiring. Thank you for sharing that. Would you have any advice for someone today who is trying to do this themselves and wondering if they know enough, or are good enough, or should be in the field? It sounds like you got through that, and you've been incredibly successful. But what would you say to someone today, maybe looking to follow in your footsteps and wondering if, you know, this is right for them or not?

So, I guess I have a couple of pieces of advice. I do think reaching out to people and talking to people is useful.
Until I got to graduate school, I just wasn't in an environment where I had people to talk to. So I do think it's really helpful, especially, to talk to your peers about what they're doing and what they're interested in. And when you pick problems to work on, especially in today's world of deep learning and neural nets, I would advise choosing problems that are different from what everybody else works on. Sort of strike out in a different direction, choose something new, a new task, and take off from there.

I would love to come back to that topic of picking different problems in a bit. And when I speak with learners from around the world, I do hear from some that they feel lonely or isolated. They're kind of out somewhere, maybe not living, you know, in a major tech hub, and they sometimes feel like they're doing this by themselves. So I find it actually really inspiring that you were able to do that by yourself in the library at Brown University. I don't know if you have any other thoughts to offer learners who may feel like, you know, they're somewhere in a company or in a city just trying to do this by themselves.

I'm not sure I have a lot more to say about it. I guess, you know, read about what you enjoy. If you can be part of a reading group, an online reading group, that would be helpful. There are a lot of reading groups now, and that's a good way to get some insight. There are online videos and course experiences like yours. And I think that's a way to find out what's going on and get in touch with what people are doing. So I think today the online environment can help people get connected and hear, you know, what's going on. I was lucky. I mean, I was really lucky. I applied to Penn, and I didn't know at the time that it was the best place in natural language processing. That was totally luck. So I don't know that I would recommend, you know, doing it blind again today. I think getting advice is great.

Anyone who's been successful has had many elements of luck, but the preparation makes you ready to take advantage of good luck when it falls into your lap. Thank you for sharing that; it was really inspiring to hear about your early days as a developing researcher. And today you lead a large group at Columbia University doing very interdisciplinary work, including a lot of work on summarization and other topics. So tell us a bit more about your current work and what you're excited about.

So, summarization has really been the bulk of my work over the most recent years, where we've done work on summarization of all kinds of different genres, from personal narratives to emails. One thread of research in summarization that I'm particularly excited about is work that I've done with researchers at Amazon, which was published at ACL. This is work on summarization of novel chapters. It's very new; no one has been working on this task. It's very challenging, very deep summarization. The chapters are much longer than the news articles on which most current work in summarization is done, and that is a challenge for current neural models. And a big problem is that there is an extreme amount of paraphrasing between the input, which is 19th-century novels, and the output, which is a summary written in today's language. None of the current models can handle the kind of paraphrasing that we see there.
And that, in general, is a topic that I'm really interested in: this sort of very abstractive summarization, where the summary sentences use different words than the input document and where the syntactic structure is different. That is very different from the vast majority of work today, which is done on summarization of news. And it's done on summarization of news because that's where the data is. So some of the other areas that I'm looking at are summarization of personal narratives that you find online, where the personal narratives are in very informal language and the summary is more formal, and summarization of debates. In past work, we've also done summarization of email, which has some of those same characteristics.

Why did you choose to work on the novel summarization task?

Well, we had done work on novels even earlier, I would say around 2010, when one of my students was very interested in creative writing, and I really thought he should do a PhD. And so to convince him to stay, you know, we came to a topic that he would be happy with, which was analysis and generation of creative language. And I felt then that my work had come full circle: we collaborated with a professor in comparative literature, so I came back to my roots in comparative literature. And that was a lot of fun. So when I first went to Amazon, you know, I knew that because of Kindle and the online store, they have a lot of novels. And I thought, what would be more fun than being able to summarize novels?

Sounds like a fun project. I read a lot on my Kindle, so maybe your work will be a feature in the Kindle someday. One aspect of your work that stood out as well is that you're known for doing highly interdisciplinary work. So rather than, you know, focusing narrowly on NLP research, your work spans AI and the media, where I know, you know, Columbia University has a great journalism school, so wonderful journalists to work with there. Or the application of NLP to social networks. I think you work on medical problems too. So tell us a bit about how you think about interdisciplinary work, because you've done more of it, I think, than most NLP researchers have.

Yeah, I really enjoy interdisciplinary work. I think it's my favorite kind of research to do. In part, you know, we get a really different perspective on research and the world when we talk to people in other fields; it takes us out of our sort of narrow technical field.

And so earlier you alluded to picking research topics that are novel. And I think your research portfolio has certainly touched on a lot of problems that very few others are working on. So can you say more about that? How do you pick research projects to work on? And how do you advise others to pick topics to work on?

I think it's important to pick a task that matters. That, for me, is, you know, one thing to look at. For example, most of the work in text summarization today is done on what's called single-document summarization of news: take one news article in and generate a summary of that news article. And the reason for that is that that's where the data is. There's a huge amount of data that has been pulled together, first the CNN/Daily Mail corpus, and later the New York Times, and there are a number of other corpora as well. The problem is, that's not really a task that we need. We've known for a long time that the lead of the news article can serve pretty well as a summary of the document.
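For readers unfamiliar with it, the "lead" baseline Kathy mentions simply takes the first few sentences of the article as the summary. Here is a minimal sketch in Python; the regex-based sentence splitting is a naive stand-in for a real sentence tokenizer, and the function name is ours, not from any particular system:

```python
import re

def lead_baseline(article: str, num_sentences: int = 3) -> str:
    """Return the first few sentences of an article as its 'summary'."""
    # Naive split after sentence-ending punctuation; illustration only.
    sentences = re.split(r"(?<=[.!?])\s+", article.strip())
    return " ".join(sentences[:num_sentences])

print(lead_baseline("Markets rose today. Tech led the gains. Oil fell. Bonds were flat."))
# -> Markets rose today. Tech led the gains. Oil fell.
```

Despite its simplicity, this baseline was long notoriously hard to beat on news data, which is part of Kathy's point below.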
And in fact, for years it was hard to beat the lead, the first couple of sentences in the news article. That wasn't a problem that people worked on in the early years of summarization. So, yeah, I mean, people work on a problem like that because that's where the data is: we have leaderboards, people are competitive, they like to beat the leaderboard. But I would question whether one point, or even half a point, in ROUGE, the automated metric used to score the summaries, really makes a difference. If you look at the output, you can see that actually the summaries are quite similar, and either one of them might be fine. So I prefer to go in directions that people haven't gone in before and to choose a task where, if you solve it, it's going to help people; it's going to be a useful application that you've developed. This is why I have done things like summarization of personal narratives, which we did in the context of disasters: think of having a browsing view of summaries of what people have experienced after they've lived through a disaster. Or the current work on summarization of novels, where it would be helpful to have a summary of an input chapter. I like to go in a different direction in part because I want to solve a task that matters. But I also like to go in a different direction because in this day and age of deep learning, where results come so fast, everybody works on the same problem trying to beat the previous state of the art, and it can be hard to be the first one to get there. If you go in a different direction that nobody else is working on, you are going to be the first one to come to a solution. And that's what I've liked to do in my research over time. I like to be first on a problem.

I feel, for myself, that I have a lot of respect for people who can push that extra half point of performance on the leaderboard, because hopefully that advances the whole field and lifts all ships. I also have a lot of respect for people with the creativity and the insight to chart the new problem that no one else has thought of and advance the whole field in a different direction. I think the field of AI and NLP is broad enough that it's actually not a bad thing if we have lots of people working on lots of different things, including standardized benchmarks.

I feel that there are not as many people who want to go in that new direction. And it does take some guts to do it, because the first thing that happens when you submit a paper is that there is no benchmark, there is no baseline of prior work, and reviewers have a very hard time dealing with that. How can they judge whether it's really a good step forward? Whereas if you can show on a leaderboard that you've improved by a certain amount and you stay within the traditional trajectory, it's easier to judge.

Yeah, I'll go with you on that. Actually, I was recently chatting with one of my friends, Sharon Zhou, who mentioned that sometimes the way benchmarks and metrics get established is that some researcher publishes a paper using some metric, maybe a good one, maybe an okay one. But to make sure that subsequent papers can compare to earlier work, more and more people end up using the same metric, more for historical reasons, because it makes things comparable, than because it is actually the most useful metric. It's funny how metrics get established in academia.
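Since ROUGE comes up repeatedly in this conversation, a minimal sketch of what it measures may help. The simplified ROUGE-1 recall below counts how many of a human reference's unigrams also appear in a candidate summary; the official metric adds stemming options, longer n-grams, multiple references, and precision/F-measure variants, so this is an illustration rather than a faithful implementation:

```python
from collections import Counter

def rouge1_recall(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 recall: fraction of reference unigrams
    that also appear in the candidate summary (with clipped counts)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[word], count) for word, count in ref.items())
    return overlap / max(sum(ref.values()), 1)

print(rouge1_recall("stocks rose sharply today", "stocks rose today"))
# -> 1.0 (every reference word appears in the candidate)
```

Because two quite different summaries can score within a fraction of a point of each other, a small ROUGE gap says little about which output a reader would actually prefer, which is the concern Kathy and Andrew raise here.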
Yeah, I mean, that has happened in the summarization field, and I think also in machine translation, where you want an automated metric because it makes it easier to develop a system and to train over and over again. And yet everybody knows that the automated metrics we currently have are really flawed, but everyone keeps using them because that's what we've always done.

You know, one of my fondest memories is of going to SIGIR, the information retrieval conference, and attending a workshop on text summarization. I remember being struck, fascinated, that about half of that workshop was on text summarization algorithms, and the other half was on how to develop metrics to evaluate the first half of the work. In text summarization especially, the development of automated metrics has been challenging. In terms of choosing new topics to work on, one of the pieces of work that you've been doing that I thought was fascinating is that you were taking texts from the Black community in Harlem, near where you teach at Columbia University, I guess, and analyzing those as well. Tell us a bit about that.

This is where I'm moving with my work with a researcher from social work. We're also beginning to involve a linguist who works on African American Vernacular English. And what we're doing is looking at what people say and what kinds of emotions they express in reaction to major events that are going on today: for example, in reaction to Black Lives Matter and in reaction to COVID-19. This is work that we're just beginning. We've begun by developing an interface where people can post about their experiences with these events and, you know, how they're feeling. And we have two directions we hope to go with that. One, on the natural language side, is to be able to understand how people express different kinds of emotion in African American Vernacular English and how that differs from how people express them in Standard American English. And, you know, look at the difference in language and probably even the difference in content in terms of what's expressed. This can help us in developing algorithms that, you know, are not biased as we move forward. Most of the work in natural language processing, all of the systems, has been trained on language that comes from news, like the Wall Street Journal.

Yeah, that's great. If this type of work can help fight bias or build bridges between communities or just play some role in understanding and helping to advance the Black Lives Matter movement, that seems like a wonderful thing to do.

Yeah, yeah. I mean, we also want, in that work, to look at the impact of trauma. It's a different kind of trauma; sometimes it's not your personal trauma but the trauma of seeing what has happened to other people who are like you. So, yeah, we want to look at how that is expressed, the different kinds of emotions, the intensity of emotion, and so forth.

And I find it really wonderful that NLP researchers, AI researchers, you know, can play an active role in some of the most important societal questions and issues of our time. It feels like the work we do as AI researchers can matter on these really important topics.

Yeah, I mean, I think so. And I've sort of been trying to do that for a while. I think it really attracts students to work with you, and often it attracts different kinds of students into the field.
Our first work analyzing social media posts of gang-involved youth we did without funding; we did that entirely with undergraduates, who were just totally amazing. In earlier work, we were looking at being able to automatically generate updates about a disaster as it unfolded. We did that after Hurricane Sandy hit New York. And again, it was something where students came to me: they had seen this happen, they had seen, you know, their neighborhoods hurt, or they had lived through the uncertainty of it, and they wanted to help. They wanted to know, what can we do? At that point in time, this was all pre-neural-net, we began developing systems that could automatically generate updates as an event unfolded.

Yeah, it's actually great to think that, you know, you don't need a PhD or a long publication record; an undergrad spotting an opportunity, with a desire to help, can step in and start to work on systems that can make a difference.

Yes, they're really passionate about it, and they're really good. The work that came out of that was really excellent.

Awesome. Thank you. Hey, so Kathy, this is great stuff. Switching tracks a bit: you've been working in, you know, NLP and associated areas for a long time. In fact, I saw that even way back in 1985 you'd written an early book on text generation, before the modern, you know, neural text generation techniques were around. So you've been a leader in the field for a long time and seen a lot of things change. I'd love to hear your thoughts on how the field of NLP has evolved over these many years.

Sure. So when I started, I got my PhD in '82, so I spent those early years at Penn, there were some characteristics of the field that were, you know, salient. One of them is that there was a lot of interdisciplinary work. In developing NLP systems, we drew a lot on work from linguistics, from philosophy, from psychology, from cognitive science. And so when I was at Penn, I interacted a lot with faculty from linguistics, Ellen Prince was one of those people, and with faculty from philosophy. We spent time, you know, in these interdisciplinary meetings. And, you know, I can remember walking across campus with my advisor to go from the computer science department to the psychology department, for example. And I have to mention this, although it's not exactly what you asked: I was influenced a lot by senior women at the time. If I look back at who was most influential in, you know, how I progressed in my early research, there was my advisor, of course, Aravind Joshi, who was a man, but then also Bonnie Webber, who was there in computer science; Eva Hajičová from Charles University in Prague, who was a linguist; and Barbara Grosz, who was at Stanford at that time, at the CSLI Institute. And Karen Spärck Jones was very influential for me. She was from the field of information retrieval, and she and I spent a lot of time talking about summarization. So being interdisciplinary was one main feature of that time. A second was drawing on theories from these other areas. So we drew on theories from linguistics. One main kind of theory that we looked at was focus of attention: how it changed over the course of the discourse and how that influenced the choices you made and how you realized text and language. So, for example, did you use a pronoun or did you use a full noun phrase? What kind of syntactic structure did you use?
You might use different syntactic structures to make a concept more prominent in the discourse. We also drew on work from philosophy: theories from Searle about intention and work from Grice about conversational implicature. And so we looked at these theories and at how we could embody them in our natural language approaches.

It's great to hear about some of your early sources of inspiration, much as I think today you will be a source of inspiration to many others. So you've seen a lot, and you see a lot, in NLP, which continues to be a rapidly evolving field. So I'm actually curious, Kathy, what do you find most exciting in terms of emerging NLP technologies?

For me personally, it's some of the work that I've already talked about today: truly abstractive summarization that uses extreme paraphrasing, and work on analyzing the language of diverse communities. We've been looking at the Black community, but I think there are other communities we could look at as well. I'm interested in looking at how you deal with bias in data. Another very important topic, I think, is being able to arrive at what I would call paralinguistic meaning: the pragmatic information about emotion and about intention would be another important direction to go. And I also think more work on events, being able to understand what events have happened and to be able to follow them.

Is it okay if I look back now? If I look back on my favorite technologies and papers, I can think of papers from three points in time. The first would be older; this was very early work in language generation on how we pick the words in our sentences. We saw then that it was a hard problem, that constraints came from many different sources. And we wrote a paper called "Floating Constraints in Lexical Choice," where we looked at how information from different parts of language, from the discourse, from the lexicon, from syntax, from semantics, would influence what we chose. We worked in two different domains: one was basketball and one was stock markets. I give the example of a floating constraint where we want to express both the time at which something happened and the manner. In the first example, we express time in the verb and manner in the adverb: "Wall Street indexes opened strongly," with "opened" carrying the time. And in the second, we express manner in the verb and time in the prepositional phrase: "Stock indexes surged at the start of the trading day." And so we wanted to look at how we could control that choice. I think control is something that's missing in language generation and summarization today using deep learning methods: how do we control what the output is and make sure it's true to what our intention is?

In more recent work, my favorite is the work on Newsblaster, which was still about 15 years ago, but it feels recent to me. That was where we took a real-world problem, we did some collaboration with journalists, and we developed a test bed where we could identify the events that happened during the day and produce a summary of each event. Then we also looked at how we could track that over time. And this platform gave us a common application in which my students could address really hard research questions.
And so that was where we did some of our first work on abstractive summarization, looking at how we might compress sentences, how we could fuse phrases together, and how we might edit references so that the summary was more coherent. And we also did work on multilingual summarization.

Cool, thank you. Lots of exciting, very distinct projects over the years. So maybe to wrap up: do you have any final thoughts?

Well, I guess I would just say that natural language is a really exciting field today. There's been a huge amount of progress with deep learning, and we've seen dramatic increases in accuracy, but we still have a lot of directions to go. And I guess I would like to see, you know, more of the interdisciplinary work being brought back in. I'd like to see people looking at the data more, and at their output more, rather than just at numbers. But I think there are many exciting directions for people to work in, and I hope we'll see many people join the field.

Thank you. That was great. And I certainly hope that we'll have a lot more people join NLP and contribute to all of this exciting work. So thanks, thanks.

Well, thank you. Thanks so much for asking me. It was fun.