Authored by AI

3. AI THAT WRITES 101

Episode Summary

Think of this as the impossible lecture on AI. Recorded across time, but presented as one flow, a series of AI experts line up to contribute to the conversation.

Episode Notes

Think of this as the impossible lecture on AI. Recorded across time, but presented as one flow, a series of AI experts line up to contribute to the conversation. https://authoredby.ai 

DISCLAIMER: The sound quality (and only sound quality!) from calls in this episode has been enhanced using AI. The otherwise identical unenhanced version can be found here: https://www.authoredby.ai/03unenhanced/ 

Hosted by Stephen Follows and Eliel Camargo-Molina 

Guests (in order of appearance): Mike Kanaan, Bob Fisher, Angelina McMillan-Major, Sameer Singh, Magnus Sahlgren, Christoph Molnar, Amandalynne Paullada, Dan Rockmore, and Ken Stanley

Edited by Eliel Camargo-Molina and Jess Yung

Music by Eliel Camargo-Molina and GPT-3

Mastering by Adanze "Lady Ze" Unaegbu

Episode Transcription

Episode 3 - AI THAT WRITES 101

Eliel Camargo-Molina:

So you know how we have interviews with a lot of AI experts and researchers talking about AI, where we essentially asked them the questions that we ourselves had about all of this stuff.

Stephen Follows:

We basically got free advice on how to write our script. We weren't doing it cynically to get free advice, I should add, but because we were interviewing them; a couple of people a week for months while we were doing development of the AI script, we were able to bring our latest understanding and latest challenges and successes to each interview. And so it doubled up. We were actually doing the stuff for the podcast and trying to increase our understanding, but it also doubled up as free advice from the world's greatest experts in a problem.

Eliel Camargo-Molina:

So we have all this material, and a lot of it going through the basics of what AI is or how they would explain AI to somebody that doesn't know a lot about it yet, but also the intricacies of the technical stuff. What are all those buzzwords that we hear about training or weights or networks or some other times, honestly, they would just say things that blew my mind. So anyway, we had all this material and I just started listening to it and I thought it would be cool to craft a pseudo-lecture, putting all of these experts together, short and sweet. I was aiming for 20 minutes, but it ended up being a bit more than that. But cherry-picking all of these incredible expositions of these complicated ideas by these experts in a way that it sounded like a nice lecture that you would listen to.

Stephen Follows:

All right, let's play it.

Mike Kanaan:

There's a fundamental issue at jump street with explaining artificial intelligence to people, definitionally speaking. So I'm Mike Kanaan. In my personal capacity, I'm the author of the book T-Minus AI. I'm currently a fellow at the Harvard Kennedy School doing a master's in national security. And then prior to arriving here, I served as the director of operations for the United States Air Force's partnership with MIT on artificial intelligence, and then spent four years in the Pentagon before that chairing artificial intelligence for the Air Force. That's a little bit of my background.

So back in 1956, when all these smart characters got together at Dartmouth, they posed the question "can a machine think?", and Turing asked that question and they wanted to get together and say, well, how would a machine think? And they defined artificial intelligence as a machine doing something that's deemed in the human domain.

And you can very quickly unpack why that definition is an issue. It's because it's a rolling definition. At a certain point in time, an abacus was artificial intelligence, a TI-83 calculator was artificial intelligence, Excel was, and now we have Tableau and Google Search - we say that's artificial intelligence. And it perpetuated this feeling every time we see it and experience it: "that's not real AI". And unfortunately, we kick the can down the road and it has forced us in the past into what we call "AI winters" where we're very disillusioned with the science and everything else.

Bob Fisher:

In real terms, Turing talked about it before, in the 40s and 50s, and other people probably as well, too. Well, hello everybody. I'm Bob Fisher. Technically speaking, I'm a professor of computer vision. I think that's my full title, in the School of Informatics at the University of Edinburgh. I guess in the early days, the 60s and 70s, there were a lot of one-off type applications and handcrafting that went into it, trying to link partly what we understood, for example, in the area of natural language processing: understanding grammar and syntax and semantics together. And so you built a lot of rules and a lot of special case programs for that kind of thing.

Angelina McMillan-Major:

Back in those days, it was all rule-based. So you would just have a defined rule of this word means this word in English, for example, and you would write a bunch of those rules and it would do not great. And so since that time we have had different kinds of algorithms. So a popular one was N-grams for a really long time, sort of more simplistic. But you would basically just take frequencies of words and count how many times that word would appear next to another word. And so you would get a bunch of these frequencies of how many times you would have two words appear together and then you would predict the next word based on these frequencies.

My name is Angelina McMillan-Major. I am a graduate student at the University of Washington. My background is in linguistics primarily, and I've moved into this space of working with computers in order to do that. And then I've gotten into broader questions of what does it mean to sort of encode language as data. And so it's been this back and forth of, well, we need more data. Well, we need better hardware in order to be able to handle that more data. And we've developed different algorithms that could handle and learn from more data and sort of grow over time to this extent that we can now see how natural algorithms like GPT-3 and others are.
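To make the n-gram idea McMillan-Major describes concrete, here is a minimal sketch in Python (not from the episode, with a toy corpus standing in for real training text): count how often each word follows another, then predict the most frequent follower.

from collections import Counter, defaultdict

# A minimal bigram language model: count how often each word follows
# another, then predict the most frequent follower. Illustrative only.
corpus = "the cat sat on the mat the cat ate the fish".split()

follower_counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follower_counts[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent word seen after `word`, or None."""
    followers = follower_counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(predict_next("the"))   # 'cat', because it followed 'the' most often

Modern language models replace these raw counts with billions of learned weights, but the prediction target, the next word, is the same.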

Mike Kanaan:

What's the difference now with artificial intelligence? And no matter what we see, here's the bottom line. It is data that writes software, that is the easiest explanation of it. In our past, if you wanted to be a programmer and you wanted to create a chair, for instance, we're all pretty familiar with chairs and you wanted to do that in a sense where a machine could create many chairs for you, what would end up happening is you would have to take the essence in your mind of what is a chair? Does a chair have four legs? Sometimes it has two legs, sometimes it has three, sometimes could a table be a chair, perhaps if I sit on it, it all of a sudden is. So in our past, we had to program literally these logical outputs and we were largely presupposing the notion of what it could be.

But now, what we could do if I wanted to create many chairs, for instance, or ask a machine to identify a chair, what I can provide it is a lot of what we call training quality data that's structured in a very certain format so it can be ingested onto this very particular software, particular compute capabilities and the algorithms themselves. And what it will do is be able to search for the underlying patterns of what makes a chair at the end of the day. So in the sense that we have now greater generality, we can do a lot more than we used to do. We have less tasks that we need to accomplish and it sorts through latent patterns we wouldn't have otherwise discovered.
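As an illustration of the shift Kanaan describes, from hand-coded rules to patterns learned from examples, here is a hedged sketch in Python; the features and training examples are invented for the illustration, and scikit-learn's decision tree stands in for whatever learning algorithm one might actually use.

# Old approach: the programmer presupposes what makes a chair.
def is_chair_by_rule(legs, has_backrest):
    return legs == 4 and has_backrest

# Newer approach: show labelled examples and let a model find the pattern.
# Features: [number_of_legs, has_backrest]; labels: 1 = chair, 0 = not a chair.
from sklearn.tree import DecisionTreeClassifier

examples = [[4, 1], [3, 1], [2, 1], [4, 0], [0, 0]]
labels   = [1,      1,      1,      0,      0]

model = DecisionTreeClassifier().fit(examples, labels)
print(model.predict([[3, 1]]))  # decided by the learned pattern, not a hand-written rule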

Sameer Singh:

I think some of the exciting stuff has been how little information the new sort of state-of-the-art AI needs to be able to do a task that you want to do, right? I'm Sameer, I'm an associate professor here at the University of California, Irvine. I also work with the Allen Institute for AI. And essentially, my research is in looking at natural language processing, so really large language models and essentially how they can make machines smarter and smarter and do things. So suppose, and we will get into this later, but suppose I want an AI that has to write like me, for example. You always assume that, oh, this would need so much more data. It would need to look at all of the writing you have done and things like that. But it's surprising how, given just an instruction or given very little data, AI is able to do a lot. So that aspect of things, I feel, is one way to think about intelligence: how much can you do with little data? That part isn't quite highlighted as much in the media, for example.
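One way to see the "very little data" point in practice is few-shot prompting: a couple of worked examples in the prompt and no retraining at all. A minimal sketch using the small, openly available GPT-2 through the Hugging Face transformers library; GPT-2 is only a stand-in for the far larger models discussed here and may well get the answer wrong, which is part of Singh's point about scale.

from transformers import pipeline

# A few-shot prompt: two worked examples, then the case we want completed.
prompt = (
    "Review: I loved this film. Sentiment: positive\n"
    "Review: A total waste of time. Sentiment: negative\n"
    "Review: The acting was wonderful. Sentiment:"
)

generator = pipeline("text-generation", model="gpt2")
output = generator(prompt, max_new_tokens=3, do_sample=False)
print(output[0]["generated_text"])  # prompt plus the model's continuation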

Magnus Sahlgren:

So when it comes to large language models, I mean, there are several variations of language models, and you can say that there are three main types. There are encoder models, which learn representations that you can feed into other programs like classifiers. And so you have models like BERT, which is an encoder model.

And BERT was the first language model that came out from Google that sort of revolutionized the field. But then you have decoder models, which are models that take an input sequence and try to continue that sequence. And those are models like GPT. So my name is Magnus Sahlgren. I am currently head of research for natural language understanding here at AI Sweden. I have a PhD in computational linguistics and have been doing research on computational models of meaning, you can say. So my research interest is: what is it that we understand when we understand language? I've been doing research in this area for more than 20 years by now. And currently, this takes the form of large language models. GPT stands for generative pre-trained transformer. So transformer is the architecture of the neural network; it's a pre-trained model, which means that someone has already trained it; and it's a generative model, which means that it's used to generate language. So GPT is one sort of flavor of language model.
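The encoder/decoder split Sahlgren describes can be seen directly with the Hugging Face transformers library: an encoder like BERT is typically used to fill in or represent text, while a decoder like GPT-2 continues a sequence. A minimal sketch using standard public checkpoints, not anything specific to the episode.

from transformers import pipeline

# Encoder model (BERT): produce representations / fill in a masked token.
encoder = pipeline("fill-mask", model="bert-base-uncased")
print(encoder("Language models learn from [MASK] data.")[0]["token_str"])

# Decoder model (GPT-2): take an input sequence and continue it.
decoder = pipeline("text-generation", model="gpt2")
print(decoder("Language models learn from", max_new_tokens=5)[0]["generated_text"])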

Christoph Molnar:

The GPT-3 model is huge, I mean, it has billions of parameters. My name is Christoph. So my background is in machine learning. And before that I studied statistics, did a bit of work as a data scientist, and then did my PhD in machine learning with a focus on interpretability: finding out what are the reasons that the machine made certain decisions. The GPT-3 model, what it does is actually something very simple, like predict the next word, or a probability distribution over the next words. So it's a really dumb thing to do, just spitting out some words.

Sameer Singh:

Let me think a little bit about that. So essentially, when we are training models like GPT-3 and any of these language models, we are giving it some of the input and asking it to do the rest. So that includes giving it very little and asking it to come up with the next word at the beginning of the sentence. It also means giving it part of the sentence where you stop at a very crucial junction and you have to decide which word should go in here and which word should not. And although predicting the next word seems like a simple task to do, you can play this game with yourself or with someone else where you can start a sentence and see how likely they are to finish it. And it's actually not so simple to sort of think of what the next word would be.

I was able to do it with my daughter when she was around four, and that would freak her out a lot because I knew exactly what she was going to do. But she's seven now, and she's gone way past my ability to predict what she's going to say. And the reason is - the difference is - some level of complexity. So more notions about the world, more understanding of how the world works, and being able to imagine scenarios that go beyond what she was doing at four. And so when it comes to language models and text, you have to decide which pronoun to use somewhere.

And that requires thinking about, oh, what is the identified gender of the character that is being talked about or picking a verb and realizing, okay, this is a table we are talking about what are the verbs that could potentially be applied to it? And so it's this guessing game at a word level, but in order to get good at it, you need to be able to do more than... in some sense it's predicting the next token, but the complexity can be pretty high, right? Yeah. So I think that's sort of where a lot of the complexity of GPT-3 comes in. In order to do predicting the next token, you end up having a bunch of complicated mechanisms to sort of get there.

Magnus Sahlgren:

Yeah, but I think it is that simple, right? I mean, that is the only thing that the generative models do. They take a sequence and they use that sequence to build the probability distribution over the vocabulary for the next likely token. And the more sequence, more context you can put into the model, the more specific the probability distribution will be for the next token. And so you just add on to that and then you have all these hyper-parameters. So you can say, okay, so now I want more creativity in the next prediction to make the model generate something a little bit less sort of obvious, perhaps. But these models are really so simple, and I mean, from a linguistic perspective, it's a little bit annoying because it's the most naive form of training you could actually do to learn a language model. And it is actually the same training that Jeff Elman did back in 1990 with his simple recurrent networks. Nothing has changed since 1990 in the training regime, basically.
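The mechanics Sahlgren describes, a probability distribution over the vocabulary for the next token plus a "creativity" hyper-parameter, come down to a softmax over scores scaled by a temperature. A minimal sketch with a toy vocabulary and made-up scores:

import numpy as np

# Toy vocabulary and made-up model scores (logits) for the next token.
vocabulary = ["cat", "dog", "table", "surrealism"]
logits = np.array([3.0, 2.5, 1.0, -1.0])

def next_token_distribution(logits, temperature=1.0):
    """Softmax over the vocabulary; a higher temperature flattens the
    distribution, which is what asking for 'more creativity' amounts to."""
    scaled = logits / temperature
    exps = np.exp(scaled - scaled.max())
    return exps / exps.sum()

for t in (0.5, 1.0, 2.0):
    probs = next_token_distribution(logits, temperature=t)
    print(t, dict(zip(vocabulary, probs.round(3))))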

Amandalynne Paullada:

So in order to make a computer operate on human language, we have to create some digital representation of it. And commonly these days, what that means is you take a bunch of written text and produce vector space mappings of the text into, let's say, a matrix of lots and lots of numbers that represents information. The idea is, yeah, you just want to create a computationally tractable representation of text. And that's done with lots of linear algebra. My name's Amandalynne. I am currently a post-doctoral research fellow at the University of Washington, and my background is primarily in linguistics and computational linguistics. So, I guess depending on what text you've trained on and what training objective you've used, there are different properties within the space that in a sort of fuzzy way represent semantic or syntactic properties, or other sorts of qualities that maybe don't make sense at first glance.

But if you consider the training data, maybe these words co-occurred a lot, or appeared in similar contexts, but aren't necessarily semantically related. So if you're training just a language model, let's say that I have a bunch of cards with words on them and I'm showing them to you in a sequence, and then I ask you to guess: what's the next word, given the first five words that I just showed you? And you get rewarded if you get it right. And that just keeps happening for hundreds and thousands of instances of words and combinations. And so what you have as the result is a bunch of probabilities for, given some sequence of inputs, what's the most likely thing to come afterwards? So I think a lot of the conversation around the harmful biases that are embedded in the models, that's pretty much because the models are just reflecting what they've been trained on.
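Paullada's "vector space mappings" and "lots of linear algebra" can be made concrete with cosine similarity between word vectors; the vectors below are invented stand-ins rather than learned embeddings.

import numpy as np

# Invented 4-dimensional stand-ins for learned word embeddings.
vectors = {
    "king":  np.array([0.90, 0.10, 0.80, 0.20]),
    "queen": np.array([0.85, 0.15, 0.75, 0.30]),
    "chair": np.array([0.10, 0.90, 0.20, 0.70]),
}

def cosine_similarity(a, b):
    """Close to 1.0 means the vectors point the same way; near 0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors["king"], vectors["queen"]))  # high
print(cosine_similarity(vectors["king"], vectors["chair"]))  # lower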

Mike Kanaan:

When I think about artificial intelligence, I think of it as a mirror, a flashlight or a canary in the coal mine. It just illuminates otherwise undiscovered or latent patterns that exist in the world around us.

Angelina McMillan-Major:

So bias in an algorithmic sense is a very different thing from bias out in the world. So there's data bias, where you have just more instances of a kind of training data or a particular word, for example, and that just leads to more predictions because it's seen that word more often. But then there's also instances of algorithmic fairness. And so there are different kinds of ways of setting up the algorithm that will lead to particular kinds of predictions. So one example is, when you have an algorithm that reads left to right, it favors languages that are also produced left to right as opposed to other languages that are produced right to left, things like that that seem very simple and that we should be aware of, but sometimes we just weren't.

Sameer Singh:

I think it's difficult to point to what level of understanding, but I would say it's a very, very low level of understanding as to how that happens. We know the mechanics of it, we know exactly that some weights will get updated, and there are just so many weights in there. There are so many pathways, so to say, that it would be, yeah, it's just impossible to understand exactly what's going on in there. A weight, you should just think of it as a single number. So there are billions of weights in these models, which essentially means that when we say GPT-3, that's literally just 175 billion numbers. That's what GPT-3 is, connected in a specific way. You just have this really dense, complicated network where each line there has a weight, and there are 175 billion lines.

Some weights will get updated, some weights will go higher, some will go down every time it makes a mistake, and it still makes a mistake a lot of the time even during training, because you can't perfectly guess the next word. But every mistake leads to some change in the weights, and we can figure out, for a specific mistake, which weights will go up and by how much. But that doesn't tell us how a complex behavior sort of emerged from all of these weight changes.
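Singh's picture of weights as plain numbers that go up or down a little every time the model makes a mistake is, at its core, gradient descent. A single-weight sketch with made-up numbers, showing one update step:

# One weight, one training example, one gradient-descent step.
# Everything here is a toy stand-in for the 175 billion weights Singh mentions.
weight = 0.5
x, target = 2.0, 3.0          # input and the "right answer"
learning_rate = 0.1

prediction = weight * x        # the model's guess
error = prediction - target   # how wrong it was
gradient = error * x          # how the squared error changes with the weight
weight = weight - learning_rate * gradient

print(prediction, weight)      # guess was 1.0; the weight is nudged up to 0.9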

Christoph Molnar:

I think the difficulty is the word understanding. What does it mean that you understand the model? I mean, you can try to emulate or have a mental model of how it will react so that you get a feel for it, but then you treat it as a black box. You're just like, okay, this is some machine, I can push some buttons and then I have some expectation what will happen.

Dan Rockmore:

So first of all, it's its own cottage industry in computer science now to try to figure out what the model's doing, because the model has as many parameters, more parameters, than there are neurons in the brain. And so a lot of my friends don't find this interesting, but I find it really fascinating that we've created this model that's too complicated to understand. So I'm a professor of mathematics and computer science here at Dartmouth College, where I also direct something called the Neukom Institute for Computational Science. My interests are, generally speaking, in network science and lately in machine learning, especially around text analysis and text generation. I'm super interested in the problem. Now, we're basically doing functional imaging on this mathematical model to try to understand which parts of the parameter space are producing, or affecting, the various ways in which language is produced. So roughly, we're imaging an artificial brain. We're also trying to image our own brains. And I will say that I think most people would agree that it's still pretty opaque as to how it's doing what it's doing.

Bob Fisher:

Yeah, exactly. And it's a bit hard to tell. And at the moment, there's a kind of technology that pretty much everybody doing mainstream AI is using, which is called deep learning. And this, in my mind, is a little bit of black magic. It works, but people don't have a very good idea how it works. It's partly a matter of memorizing a lot of data that it's seen and making a little bit of generalization. A certain amount of it is unpredictable. You have no idea what it's learned and what it hasn't learned. But on the other hand, it works, and just about everywhere that there's an application of artificial intelligence where they've substituted from traditional methods to this deep learning type of technology, performance has gone up. And we see it everywhere: automatic speech understanding, automatic translation, autonomous vehicles, image recognition, face recognition. Everywhere that there is this type of technology, things are now approaching usable levels of performance.

And so it may be black magic, but it works okay. I don't mean black magic in the sense of evil, of course. Just magic in the sense that it's unintelligible, and there is a lot of research at the moment trying, partly, to understand better what it is that this technology is doing and how it works. And once you understand that, then it becomes engineering, and then you can make it better. You have toolboxes. And also going along with that, it would be good to have guarantees on performance associated with this technology. I don't think anybody accepts that the computer is perfect, and we should also accept that the human isn't perfect, although that's a little bit harder to swallow. But putting the two of them together, I think, gives better performance. And I think that's what we all want.

Sameer Singh:

I think there is a temptation to start using these things and come up with, oh, I think it'll be able to do well on this and not be able to do well on this. And you play around with it and maybe you change your picture and update your picture. But I don't think anybody has a really good idea on what the strengths and the weaknesses actually are.

Magnus Sahlgren:

Right now, there is, I would say, a specific interest in the field when it comes to decoder models. And the reason is that when you scale these models up, which means that you add parameters to the model, you add layers to the neural network and you increase the number of weights in the neural networks, they become much more powerful in general. And you can use them to solve lots of different types of language processing tasks by simply feeding instructions to the models.

Sameer Singh:

A lot of these things, GPT-3 for example, don't give us access to what they're trained on, or don't give access to the actual weights of the model, which makes it, I would say, nearly impossible to figure out exactly what the strengths and the weaknesses are. But for the models that have shared them, they're still incredibly huge models, and you can do a lot more, but there's still a computational bottleneck in being able to understand the complexities of these language models. So yeah, we are scratching at that and making our way in and trying to get a better understanding. We were looking at, okay, this GPT-3, it can do numerical stuff, it can add numbers, it can multiply numbers and things like that. And we were looking at those numbers, and it can do that, but it's only 80% or 90% good at it, which is fine.

But then we started looking at, okay, what specific numbers is it actually good at, and what specific numbers is it bad at? And it turns out the numbers that appear a lot in the training data, it is much, much better at than numbers that appear a little less. Not that they don't appear in the training data; we are still talking about two-digit numbers, so they all appear in the training data millions of times. But models like GPT-3 are much worse at numbers that are slightly rarer in the training data. So that gives us some insight that the data really matters, and we need to get access to data as well. So that's a good sort of step to start understanding them. The question is, are they memorizing or are they learning the general rules? And it turns out that that's a simplification that we would like to attribute to what's going on, but what's actually going on is probably some combination in between.

So maybe for the first number, if you say two plus seven, maybe for two it's 80% memorizing and 20% rules. But for seven, maybe it's an odder number, so it's 20% memorizing and 80% rules, and things like that. And it's very difficult to figure out exactly and put it in a bucket, but at least what we would like to do is get to a stage where, when we are asking it, hey, what's two plus seven, let's test it on instances that we know it hasn't seen before, for example. It hasn't seen two plus seven millions of times, because if it has, then maybe it's not so impressive that it's able to say nine.
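A hedged sketch of the kind of probe Singh describes: ask a model some two-digit additions and compare accuracy across number pairs. GPT-2 via the Hugging Face text-generation pipeline is a freely available stand-in for GPT-3 here, and the "common" versus "rarer" buckets are assumptions made up for the illustration.

from transformers import pipeline

# GPT-2 is a small, public stand-in for GPT-3 here; it will do much worse.
generator = pipeline("text-generation", model="gpt2")

def model_answers_correctly(a, b):
    prompt = f"Q: What is {a} plus {b}? A:"
    completion = generator(prompt, max_new_tokens=5, do_sample=False)
    continuation = completion[0]["generated_text"][len(prompt):]
    return str(a + b) in continuation

# Hypothetical buckets: sums built from presumably common vs rarer two-digit numbers.
common = [(10, 20), (25, 25), (50, 50)]
rarer  = [(37, 48), (53, 29), (61, 77)]

for name, pairs in [("common", common), ("rarer", rarer)]:
    accuracy = sum(model_answers_correctly(a, b) for a, b in pairs) / len(pairs)
    print(name, accuracy)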

Ken Stanley:

In terms of the capacity for genuine discovery within a system like that, it depends on how it was built, of course. And so these systems, like the ones that we're talking about right now, the ones from today, they were driven by massive amounts of data. And so that means that ultimately what you're seeing is a function of the input language, which is what they call the prompt, and also the billions of data points that it was trained on. So I'm Ken Stanley. I was a professor for more than 10 years in artificial intelligence. I started a company that was an artificial intelligence research company; it's called Geometric Intelligence. And so there's a huge amount of space, what we might call a search space. There's a space in there of possibilities which is vast beyond the imagination. So we have no way of conceiving all the possibilities in that space.

And nevertheless, it's still a finite space and it's still constrained by what was fed into it. So there are limits to what you can get within that space. It may be that we eventually will come to sense those limits, but even so, while that's true theoretically, I mean, in practice it would take a long time to reach the outer ranges. And so you start feeling like you know it. And when that happens, you understand the space, you kind of get it, and that starts to make it feel like it's no longer a frontier. And so this is certainly a possibility that maybe we haven't hit yet. Right now we're in the grace period, where everything is new and it feels beautiful and it's an unlimited frontier, but we may start to feel limits, and then we won't have those same feelings that it can do literally anything, because it can't. And when it comes to what it can't do, it's this genuine novelty, which is a very hard thing to pin down or define, but it's something like surrealism.

I think of a genre of art or something. It comes out, it's like the wild west, you're just like, anything could happen here, and it's a really, really exhilarating feeling. But then, over time, you'll come to understand the system. That's not even the same as knowing its limits, it's just that you get what it can do, in a way. And humans just have an amazing capacity for this, just to sort of understand what are the possibilities within a space. The question is, could a model like this have come out with that at that time? Let's say it was trained on everything that preceded it. The jury is a little bit out: maybe, if there had never been surrealism and we trained on everything that preceded it, we'd get it anyway because the prompts are just really clever. But there's also an argument that that's going to be way too hard, actually, like a dripping clock or something like that; the substrate for thinking about that just doesn't exist sufficiently in the preceding data.

Magnus Sahlgren:

One of the things that will happen soon and is already happening in a sense is that we'll see multi-modality entering. I mean, in the sense those models are already multimodal. You can feed in text and you get an image, but I think you'll also get the interplay of the modalities.

Ken Stanley:

If you want a picture of, say, people in an office, right now you could go and get it from a stock photography site, but now, with one of these large models, you could go and generate it with a prompt, in theory. But the thing is, 50 years from now, if it's the same model, that office will look like a 50-year-old office. If these models had come out in the 1970s, every time you wanted to get a picture of modern office workers, you'd get a 1970s-style office with people dressed like they're in the 70s. Something doesn't make sense there. So it seems almost like you still need the stock photographers anyway, because who's going to keep feeding it so it's on the modern cutting edge? And this is true in art and in sort of just stock, run-of-the-mill stuff. And so these questions don't have obvious answers right now.

Angelina McMillan-Major:

I will give some credit and say that there are some active learning algorithms out there, but they require maintenance. And so they're expensive, in that you have to keep feeding them and keep training them. It's expensive in terms of resources, and it's expensive in terms of the people doing that maintenance work. But I think we're still learning about that. I mean, GPT itself, the first one, is only a couple of years old, but really, we just don't use it. And so I think some amount of these older models will just end up on the digital wayside as we continue to upgrade.

Ken Stanley:

Over the last decade or so, the commercial industrial world has gotten more and more interested in AI. And I think there's some kind of inflection point where AI crossed a line where it was doing things that are actually useful, and people suddenly took interest from the industrial side. And it makes sense, because the implications of AI are just completely unlimited. I mean, when you think about it, we don't know how far it can go, but the farther it goes, the more absolutely profound its implications for the entire economy. And so to be a tech company or a large tech company and miss that boat is a huge risk.

Magnus Sahlgren:

It's not universities at all anymore, it's these private companies. It's the big tech players that sort of own and drive the field right now. And I think, I mean, in my field, computational linguistics or language technology, that has shifted the entire academic sort of playing field right now. The universities have a hard time keeping up. They don't have access to the same amount of compute, the same amount of data that these big tech players have.

Sameer Singh:

A lot of the researchers in industry are outputs of academia. Many of them were professors, but if not, they went to grad school, did their PhD and things like that. So at least many of the places seem to be academic in their approach, and they're not just thinking of it as a commercial opportunity. They're releasing models, releasing datasets and things like that. And that's been quite exciting. But yes, I think the emergence of these groups of people who are just able to get the compute together and do something interesting has been amazing. Even the Stable Diffusion model, which came out of a startup, not really a random group of people, but even that, the way it has disrupted DALL-E in its ability to do... sometimes much better than DALL-E, has been quite fascinating.

Ken Stanley:

And it's very interesting because what's actually going to happen is completely unclear, but yet the upside is just so gigantic that it's too dangerous not to be a part of it. And this has caused this huge surge in interest and investment, and that has, as you said, basically overshadowed academia, and created a situation which I think is very unusual compared to most fields, scientific or non-scientific, that you get in academia.

Sameer Singh:

So yeah, I think it's been amazing to see how progress has been. I think universities are a strange place, in the sense that the incentives there are not quite aligned to be really fast and build these things up. I don't want to say it's not creative, but a lot of the work is engineering, trying a bunch of things out and being able to do it at scale on really large datasets, and universities aren't quite set up to do that. A good engineer would've probably graduated and they're working for a company; I have students who I'm training to be good engineers. And so there's that aspect. Then there is funding as well, where it's not so easy to channel funding to a university. It's much easier to write a check to some group of people rather than go through the whole bureaucracy of the university system. And finally, in some weird way, universities don't incentivize you to work in large groups and build something big.

Ken Stanley:

I mean, I think it's a very complex issue whether this is bad overall for society. There's just pros and cons and I think you really have to disentangle them. And it's no simple answer. I mean, clearly some of the pros are like, well, if you want to see advances in AI, the fact that billions of dollars are being poured into it is going to lead to advances in AI. So we'll get to see them. The other fields could only dream of this happening within those fields. But on the other hand, you have an incentive system that's less pure on the commercial side. I mean, it's driven by a profit motive, usually. And I think a lot of the best labs try to protect themselves, even in the corporate environment, from the profit motive so that they can do basic research and make those kind of advances.

Sameer Singh:

What I will say is that the universities have been fairly good about being the voice of reason in this whole conversation. And just because one company's excited about it and thinks GPT-3 does X, Y, Z, universities are a little bit more like, "nuh, uh, uh, you actually haven't considered this", and they've been a good check, which I think is a good role for universities to have, right? In the longer term, what will end up being the case? It's difficult to say. The other way I think universities will fit in is being able to use these things in ways that may not be commercial, at least in the short term, that may just have a lot of social impact but aren't easy to put advertisements on, or whatever it is. And those applications are equally important, and universities can be pretty creative in those aspects as well.

Ken Stanley:

I think the best perspective is just to have the humility to understand that nobody knows what it's going to be like over the next few years or even over the long term. And that includes when various revolutionary predictions are actually going to take place. They may not happen as fast as everybody's saying. And so I tend to be just agnostic about it and say, I don't really know. And it's true, I'm sort of an expert. I have enough knowledge to understand what's going on. But even so, I think it's just very unpredictable. And what I'm pointing to when I talk about the profound changes is just the potential for profound change. We don't know the order it's going to happen and we don't know the rate it's going to happen.

So we don't know a lot of these things. A lot of the time you hear predictions that are just completely off the wall and they're actually way out there, especially, I think, in orthogonal fields where they're just like, imagine what this field is going to be like. They're not AI experts and just, look, I mean, yeah, if you project science fiction 10 years into the future, every field including the film industry, is going to be completely revolutionized. But the truth is, some of these things are probably not realistic in our lifetime. But on the other hand, we don't know. So the rate of change and the implications of change are extremely dynamic in AI. And so big things could be on the horizon, and clearly are in certain niches already.

Stephen Follows:

When you were cutting all of that stuff, how much did you have to make an editorial choice, saying "I'm going to use this person's interpretation, and it's different to that person's", or was there a general consensus, because you were talking about the basics of what AI is?

Eliel Camargo-Molina:

This is fascinating because being a very active field, it does open the door to a lot of opinions that live all around the scientific consensus. So you maybe pick somebody out in the street and you ask them, what is science or what is a scientist? They would probably give you a version of it. Somebody that knows a lot about how something is or works, they have the knowledge. But the most interesting part of science is that they are the ones trying to figure out the things that we don't know yet. And they're living in that space where we just have a lot of hypotheses, and we are trying to figure out which one is right. And some people maybe have an opinion or another, and we have to have the opinions at the same time that we are trying not to let them influence whatever research we're doing.

So scientific consensus, which is what you put in textbooks, is what everybody in the community has agreed on, what we, through the scientific method and any other kind of scientific tool that allows us to be sure, have accepted as our best understanding, as what works. But then there is that frontier where we don't know yet, and it's very interesting because it's uncertain. There, you find that scientists, even though they all agree on the scientific consensus, might have completely different opinions. I've tried really hard to leave out that space of opinions, the space that is not scientific consensus, even though it's incredibly interesting.

Stephen Follows:

So I like your discussion about what the person on the street thinks a scientist is. For me, it is that we perhaps have an image of them as having the most answers, but actually they probably have the most questions.

Eliel Camargo-Molina:

Yeah. There are a lot of open questions, there are a lot of super interesting discussions and agreements and disagreements. And maybe what's interesting is that the discussion is happening somewhat publicly. You have a lot of AI experts going on Twitter and having long discussions about, is this thing sentient? And then all the experts saying no, that makes no sense. It's a discussion that lends itself to a very public display of that kind of lack of consensus at that frontier. There are some other questions that touch on the ideas of what is consciousness, what is creativity, what is thinking, what is self-awareness? These are questions that touch on philosophy or neuroscience or psychology. And some people might even think there's not even a point in trying to define consciousness. So in that sense, I think the discussion might feel a bit more theoretical when it comes to the opinions. And again, I tried, and I might not have succeeded, but I hope I did. I tried to keep those controversies away from this episode.

Stephen Follows:

One of the things I got from this episode was how you had correctly focused on the things that actually matter as a foundational series of information. And then later on we can talk about some of the more controversial ideas or things that people might disagree with, whereas I think this was a good reading, and I also think you kept a sense of all the different voices.

Eliel Camargo-Molina:

So this has been a lot of theory about AI and very little practice. But I did one last thing. I took the text, the transcript that we got for this episode, and I gave it to GPT-3, and I asked it: can you give us some closing remarks? So let's see what it said, or what it predicted, or whatever the output was.

Stephen Follows:

Great. Let's see it.

GPT-3:

I think that what we're seeing with these large language models is a very early stage of artificial intelligence, and it's a stage where the technology is not yet well understood. We don't really know how it works and we don't really know what its strengths and weaknesses are, but it's a stage where the technology is starting to be used and it's starting to have an impact. And so I think it's important to keep an open mind about what these models can do and to be careful about over-hyping them. But at the same time, I think it's important to be excited about the potential of these models and to be curious about how they work and to try to understand them better.
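For reference, a minimal sketch of how a transcript could be handed to GPT-3 for closing remarks, assuming the OpenAI completion API that was current at the time; the model name, file name, and prompt wording are assumptions, not the episode's actual setup.

import openai  # assumes the legacy openai Python package and an API key

openai.api_key = "YOUR_API_KEY"  # placeholder

# Hypothetical transcript file; in practice it would need to fit the model's context window.
transcript = open("episode3_transcript.txt").read()

response = openai.Completion.create(
    model="text-davinci-002",           # one GPT-3 model available at the time
    prompt=transcript + "\n\nClosing remarks:",
    max_tokens=200,
    temperature=0.7,
)
print(response["choices"][0]["text"])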

Authored by AI is brought to you by Stephen Follows, Eliel Camargo-Molina, Isadora Campregher Paiva, Bob Schultz, Jess Yung, and GPT-3. Audio post-production from Adanze "Lady Ze" Unaegbu, and with thanks to Rob Cave and Ina Del Rosario. Find out more at authoredby.ai.