Catch up with Stephen and Eliel as they update us on their journey into AI-generated film scripts using GPT-3. They also discuss some of the recent news about AI and scriptwriting while delving into their thoughts on the future of the podcast. https://authoredby.ai
Hosted by Stephen Follows and Eliel Camargo-Molina
Eliel: The intro is gonna play (Humming). And then the boom, tarara! And that's where we enter.
Intro music
Stephen: Hello and welcome to episode five of Authored by AI. This is a bit of a different episode. Although, to be honest, all four episodes thus far were deliberately different from each other, this one is just Eliel and I chatting,
Eliel: right? Yeah, totally. You say that the episodes were different. Deliberately. I would say that the episodes were different because they were their own thing. I don't think we were incredibly deliberate in making them so different from each other. But each one of them is its own thing. And what we both feel we should do now is talk to you and catch you up, give you a little bit of context of what's happening, what's gonna happen. And we are all a little bit dizzy with everything that's been happening in the past weeks.
Stephen: Yeah, and it's been crazy for us, because I was looking it up a moment ago: it has been one year and two weeks since we started this particular journey that took us to where we are now.
Eliel: When do you count the start?
Stephen: That's a really good point. So where I was counting it from was when I had a meeting with a producer who was like, oh, I might fund that.
And then I said that to you, and you said, great, what's that? Because at no point had we had any intentional plan before that. But what that turned out to be was our script development deal, which was the first AI project that we worked on together, and we'll talk about that in a minute. It's still technically going ahead, but it's only the first of a few things that we've done, including this podcast and some other stuff that we've got coming up.
But it's been a year,
Eliel: So I would count the starting point a little bit earlier. I would start counting it from the moment I asked GPT-3 to write a story and to change the personality of the character after the fact. What's surprising is that this is two days before your start date, so in terms of time passed, it's not that meaningful.
But in terms of meaning, to me it is, I think, very important. Because, to contextualize: this was the beginning of the summer. The idea of ChatGPT, which we all now know, didn't even exist, and GPT-3 was just the latest large language model that could only complete whatever you gave it. It would give you the next word, and then the next word, and then the next word of whatever text you wrote. And with that tool, we embarked on trying to answer the question: can this write a film? Can this write a script? That was an incredibly different world from the world we have today, where GPT learned how to follow instructions.
So it was no longer a completer: it was an assistant, and then it became a chatbot, and the old way became obsolete. Once I showed you GPT-3, you also hopped into it, and then we started trying to figure out how to do this, getting the deal with the producer. I think we should talk a bit about that as well. What does that mean? A lot of people who have listened to the podcast might ask: what do you mean, the Hollywood deal? What does that entail?
Stephen: Well, okay, in a nutshell: there is a producer who, we still can't say who they are, because that's part of the deal. They have made Hollywood films, and they gave us a deal at the standard WGA, the Writers Guild of America, rates.
So we're not in the guild, but they treated us as if we were new writers. It's a multi-stage deal, but the two big stages are: number one, they give you about half the money to do a 10-page treatment, which is a sort of prose description of what's happening in the film, the characters, and things like that. You submit that. Then, if the producer wants to activate the second half of the deal, they'll give you some notes, and you'll go away over the same period of time, about 16 weeks, and write a first draft of a script that looks like you would imagine a script. Then there can be later stages as well, where you do redrafts and things like that, right?
Eliel: If I were a young, adventurous, creative person who suddenly moves to LA, and I had the idea for a film that I've been thinking about since high school, and I have a million drafts, a treatment, a pitch, everything ready, and I meet a producer and manage to convince them, they would offer me the same kind of thing?
Stephen: Yeah, the details would be very different insofar as you'd have actually pitched them a story and characters; they'd have read something, they'd know what they're actually developing. Whereas with us, there was no title, there was no genre, there was nothing. But on the face of the text of the contract? Yes. Actually, this was a very standard contract. The strangest thing about it was that it mentioned that we were using an AI, but it was still us being contracted, and all the other terms were standard. Because that was part of the experiment, wasn't it? We wanted to see how normal we could make everything around the fact that we weren't adding any creative contribution to the script itself.
Eliel: So let's say, again, that I'm not an AI, or two people trying to wrangle a large language model, that I'm a normal writer. I would go hand in my treatment, the studio would take it, and then the typical thing is that the next stage happens later, right?
Stephen: That's a good catch-all, "the next stage happens later", because that covers everything from them calling you the same morning going, we read it, it's a hit, let's do the deal, right the way through to you never getting a call. They call that a Hollywood no, as in they never actually say no, because they don't wanna be the people turning something down. And also, if you go and become famous, like, I can guarantee that if we got some sort of huge script deal, anyone I've ever submitted anything to would call me and be like, hey, carrying on the conversation we had before, even if I hadn't heard from 'em in years. So what's happened with us so far is that we spent those 16 weeks wrangling GPT-3 to create this treatment. Now, I just gave away a little bit of the story, because normally you have an idea at the beginning stage of the process.
So they know it's a sci-fi film based on the last spaceship on Earth or whatever, and by the end you submit one treatment, because it's taken you a long time to produce it. We did both those things the other way around: we had no idea at the beginning, but when we delivered the treatment, we said, okay, here is the official treatment. We had a whole, I won't quite say ceremony, but Eliel came to London, the producer and their team came to London, and we had a six-hour meeting.
Eliel: We had to fulfill our contractual obligation of delivering the treatment, to tick that box in the contract.
Stephen: So we printed it off, we put it in a nice box, literally with a bow around it, for showmanship, and we delivered it. Now, that was something like a Tuesday afternoon. On the Monday morning, a few days before, Eliel and I had realized what we were gonna do. We'd written this whole code base we called Script City, and we hit enter, and about 10 minutes later it produced the treatment. And we said, okay, this is the run, this is the first proper run that we're gonna call the treatment. This is the one we'll deliver. And we did it, and it came out, and it was good enough to submit. But then, almost immediately, we looked at each other and were like, we could press that again. So we kept pressing it.
Eliel: Right, so let's put a pause here for a second, because we are very familiar with Script City, but I think it's very interesting to explain it a little bit. I'm not gonna start reading the Python code out loud, but to understand what it is, and why it was hard, and why it is different now.
The only thing we used for producing these treatments was GPT-3, the raw model, or what they are calling it now, because these things are getting names as we go, so kind of retroactively, this thing is now called the base model. It's a language model that predicts the next word, like we heard in episode three. Essentially, it was just a thing that you send a text and it would send you back a completion. In that sense, you could make the first text a question, or an ask. Originally, at the beginning of the summer, it wasn't that great at following instructions. Then they trained the base model a little bit more, and it was able to follow instructions a little bit better. You would ask it to do something and it would actually do it, because otherwise you would just write a question and it would maybe continue with a list of more questions.
Stephen: I remember doing that. It was very annoying.
Eliel: So it was this kind of completion machine, which, to be fair, we used to our advantage, because Script City was our attempt at instancing this text-completion engine into different personalities. We would write things like: the following is a story written by a wild writer who had a great idea, and it's about a movie that starts in a certain way, and so on. And then we would get completions that would, more likely than not, be some wild idea for a film that made no sense at all. Then we would take that and pass it to a new instance of this text-completion engine, now instanced as: you are a more senior writer, a more experienced Hollywood person who understands the importance of stories. And they would fix that story and make it a little bit better.
One of the things we also added was the fact that it was very good at reading what it wrote and finding out why it was really bad, and then using its own critique to improve its own output. That was really powerful for us, as it is proving to be very powerful with GPT-4. Script City went on to create all of these different people, and I'm sure we'll find some time to take you through exactly who these people were and what we did. On top of that, you could go into loops where Script City would keep notes to itself, then instance another person who would implement those notes and improve the story, then go back to the beginning and rewrite it, and then new notes, and so on. In that way, we were able to elicit enough complexity and length from the very, very old GPT-3 base model in the summer. Because one of the things these models have is a limit on how many words they will spit out, and that includes both the text you're asking it to complete and the completion together. So we were bound to something like 3,000 words maximum, and we had to get around that limit while making it coherent and keeping track of everything.
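To make that concrete, here is a minimal sketch of this kind of persona-chaining loop, assuming the pre-1.0 openai Python library and its text-completion endpoint. The persona prompts, model name, helper function, and loop counts are illustrative stand-ins, not the actual Script City code.

```python
# Illustrative sketch of a Script City-style persona chain (not the real code).
# Assumes the pre-1.0 openai Python library and the raw GPT-3 base model.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def complete(prompt: str, max_tokens: int = 500) -> str:
    """One call to the base model: send text, get a continuation back.
    The prompt and the completion share a single token budget."""
    response = openai.Completion.create(
        model="davinci",        # the raw GPT-3 base model
        prompt=prompt,
        max_tokens=max_tokens,
        temperature=0.9,
    )
    return response["choices"][0]["text"].strip()

# Instance a "wild writer" persona to pitch an idea from nothing.
idea = complete("The following is a film idea by a wild writer who just had "
                "a great, untamed idea:\n")

# Pass the pitch to a "senior writer" persona to make it stronger.
story = complete("A senior Hollywood writer who understands the importance of "
                 f"story rewrites this pitch to be stronger:\n\n{idea}\n\n"
                 "Improved version:\n")

# Loop: the model critiques its own output, then implements its own notes.
for _ in range(3):
    notes = complete(f"Story:\n{story}\n\nA script editor's notes on what is "
                     "wrong with it:\n")
    story = complete(f"Story:\n{story}\n\nNotes:\n{notes}\n\n"
                     "Rewritten story implementing the notes:\n")

print(story)
```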
Stephen: And that's what happened. So when we hit run and it ran for those 10 minutes, behind the scenes it is starting with nothing. It comes up with an idea, and then it comes up with many ideas, it scores the ideas, passes them on to other writers who then work on them and score them. It just creates all these different people who chat to each other, all these instances of AI with different identities and profiles, and at the end it spits out one final thing. And so we had run it on the Monday morning and got 001, the first official output. We'd been running loads of tests in the month before that, but this was the one.
And as I said, thank goodness, it was possible. It did work. And some of the tests and failures we'd had in the past had been quite amazing. But anyway, this one worked, so we packaged it up. Almost as soon as we'd run it, we realized that we could run it again. And so we did run it; we ran it quite a few times, to the extent to which I think you stopped reading them.

Eliel: I did, yes, because there was no point. They were becoming ubiquitous, and also they're not great. It's not like it was creating the best piece of fiction ever, because that's very hard to do, and that's not surprising given where we are with AI at the moment, or at least where we were back then.

Stephen: So what we did was, we thought, wow, this producer had taken a real leap of faith on us, so let's surprise them.
So we delivered the official treatment, and about 20 others. We were like, here's a load of others. And they all had imagery and posters and stuff, also generated by the AI, from instructions written by the AI. We were able to hand over this huge portfolio of things, and we said, look, here's the one that is relevant to the contractual deal that we've done, but if you wanna switch it out for any of the others, let us know. Otherwise, we own all the other ones. We also did some fun things where we took the treatment, gave it to GPT-3 and said, what's wrong with this? Or, give me some notes, I think it was: give me some notes on what you're going to do in the second draft. And it wrote a beautiful letter to the producer saying, thank you very much for reading my script, here are some notes, this is what I'm gonna do to implement them. And when we showed these notes to the script development person within the producer's team, he was like, oh my God, these are good notes. Good notes on a bad piece. So we submitted all of that, and we had a really good time talking about possibilities for the future and things like that. And we wait to hear. That's the problem with the Hollywood no: you don't know whether it's a no yet. It's both yes and no at the same time.
Eliel: And then ChatGPT appears, probably something like days later. Everybody starts talking about GPT, and ChatGPT is incredibly useful, and it provides an incredibly good illusion of doing whatever you ask. So the base model that we were interacting with was still behind ChatGPT, but with a big difference: ChatGPT had been fine-tuned through additional processes, training with human input and teaching it how to follow instructions. It had already been instanced as a person, as a thing. "Person" is a very bad word to use, but it's a close analogy to understand what I mean. It's an entity: a chatbot that is there to help you. Among many things, it has a voice, and it has a certain way of writing things and a certain way of being useful. And if you tell ChatGPT, hey, I don't like that, it will immediately apologize and say, oh, I'm sorry. It's a much more useful version of this AI system, powered by the same base model that we were using. But the model we used could be anybody, and that comes with both pluses and dangers. We could ask it to be a wild writer and it would be a wild writer, and it could maybe be impolite as well, or it could...
Stephen: It could be a racist writer, yeah. It could be anything we wouldn't morally want it to be, but it can be. It's a free playground.

Eliel: And sometimes it would just go crazy and not do anything of what we were asking it to, and just continue writing a list of prompts, because it guessed that most likely this was just a list of people writing prompts that they were gonna send somewhere or something.
So what's the difference between what we did and asking today's GPT-4, go write a film? The difference was that we didn't give it any input as to what the film should be, how it should sound, how it should write, what the genre is, what the thing is. It was just a bunch of instanced people who, in our script, decided together to spit out something that had some coherence and was a story we had no hand in deciding. And we got some strange stuff, to be fair. I guess what happened then was that the models got incredibly useful just after we delivered this. They became the thing that everybody could just go and chat with in their browser, and it would do exactly what you told it to. You could say, hey, write a treatment for a film, and it would write a treatment that would be coherent, but it would be written by an AI assistant. So it would have a voice, and it would be delimited by a lot of guardrails, which is not necessarily a bad thing, but it's what it was. My guess at what went through the mind of the movie producer is: alright, so everything's changed now, so I probably have to go back and think about this AI stuff for a bit longer. That's my non-Hollywood-experience optimism, while I think maybe Stephen is thinking it's a Hollywood no.
Stephen: I have no idea, and I think it's very hard to guess what other people think. When we did the deal a year ago, we were saying, oh my God, this is gonna change the world. Now I don't think anyone disagrees that this is changing things. We might disagree on how and to what extent, but this is now in society, it's in culture. It's everyday, and people are using it. What we experienced over the summer, interviewing people for this podcast, was us having to explain a lot of principles which are now commonplace, but at the same time getting a much more unfiltered view from people, because it was new and it was interesting. Like we heard in episode two with Dave, where this was something new. Now, if you do it, you've got all of the pop culture and thoughts on ChatGPT, so it's a very, very different time, and I think all the calculations are very different. But one of the things the producer wanted was all of the conversations and the journey throughout it as well. This wasn't the quickest, most reliable way to a screenplay. This was an early-stage experiment, and also a lottery ticket, because maybe it would've been the best thing ever. And it still might be; they still have the rights to that one story.
Maybe we should do an episode where we read out some of the other stories that we still hold the rights to, but we will wait and see, I don't know. In the meantime, Eliel and I have kept playing around with it and have had some fun experiments and projects. But there's some stuff I've prepared for this that you don't know yet, Eliel, which I wanted to tell you about. One of them is that one of the people who's been helping us behind the scenes with the podcast is a guy called Bob, Bob Schultz. He is a screenwriter and a script editor, he also teaches screenwriting, and he's managed the London Screenwriters Festival for a long time, which was, and still is, the largest physical screenwriting event in the world.
Eliel: And he's a very nice guy.
Stephen: He's a lovely guy, and he's been helping us out. Since I last spoke to you, he has had two script deals to write scripts using ChatGPT.
Eliel: Wow.
Stephen: Yeah. I've been holding this information back because I knew we were gonna do a recording. What happened was, he was talking to a producer, and two separate things happened. One: this producer said, and you have to forgive me on some of the details, my memory, the producer had some assets. Okay, we can film, we're gonna film on this date, in this location, we've got these computer assets. But the project they were doing fell apart. So they were talking to Bob about, what can we do with all of this stuff? We're gonna film on these dates, in a month's time.
Eliel: What do you mean when you say assets?
Stephen: In that case it was a lot of CGI, so they had 3D models, they had backgrounds. And I'm editorializing a bit here, because I don't know all the details. I know the plot, but I don't wanna give it away, because that's private to them. But they basically had a lot of the ingredients you need to shoot a film and to finish a film. But for whatever reason, they couldn't do the script they were gonna do. And Bob said, well, that's no problem.
Tell me what the bits are and I'll come back to you with a story. So he went on ChatGPT, co-wrote with ChatGPT, and went back to the producer, and the producer went, great, when can you have a script? Bob said, two weeks. Which, by the way: we were given 16 weeks to write a treatment and 16 weeks to write the first draft, if we got to that stage, and he delivered it early. And in his mind, he has co-written it with ChatGPT. Then, talking to another producer around the same sort of time, Bob said, there's a really unusual scenario coming up, and if we wanna move fast, we can make a film where a big part of the finale happens around the King's coronation in London, which is next month. A very rare event. And the producer said, great. So Bob has two script deals. For this second one, he's got the script, written by ChatGPT with him, and the finale of the film happens around the coronation. So they're gonna go and film the finale, and then film all the other bits later on. And I was talking to Bob about this, and these are real deals. There's not a huge amount of money involved in them, but there's real money.
And, more importantly, they're films that get made. I said to Bob, wow, is ChatGPT any good for this? Because I'm imagining not. And he was like, oh no, it's awful, but I can do it quickly and I can be the filter for it. So, in a mirror of the experience we had last year with GPT-3: it's not a particularly good writer, but it's fast, you can give it notes, and you can offload a lot of that thinking onto it. Bob's quite a good script editor, which I think is why he's able to do it, and so he can turn around scripts in two weeks, on a brief, for professional producers to go and make the films. When we were doing it a year ago, it was a staggeringly weird idea. Other people had done little shorts and stuff, but I think we were the first people to be commissioned to write a feature film. Now anyone can do it if they have the skills, and that's just a year.
Eliel: Unbelievable
Stephen: unbelievable speed.
Eliel: But what's interesting at the same time is that since that happened, and I repeat this to myself but also to anybody listening: this feels like years ago. And then we got GPT-4, recently as of the recording of this, and my views have changed completely. When I look at where we are, what I think has happened has, to a large extent, fundamentally changed everything. We've been trying to understand, process, and parse everything that's happening. So we wanted to give each other, and also you, a little bit of that, because Stephen and I have been on our kind of parallel travels, with slightly different perspectives but maybe very similar passion, around this world of AI and GPT-4 and GPT-3 and whatever else has been going on. And I think talking about scriptwriting is a very good device for understanding the differences. In the summer, we had a base model that kind of writes coherent text, that maybe could pass the Turing test, that you could maybe believe is text written by a human. We had to really put some elbow grease into making it write 10 pages of something that kind of sounded like a coherent story. Then it becomes a chatbot that the whole world is using, and OpenAI has, to some extent, been careful, which is probably a deep conversation for a couple of episodes in the future. But there have been attempts at involving humans in the process of evaluating the output, and making it something that is not only more useful but also more aligned to whatever it is we would call human values and objectives, which are really not well defined.
But while we were writing that script, OpenAI was training GPT-4, and then they spent several months working with it. And I'm just gonna mention a couple of things that, if you're interested in this, you should go and read, even if you're listening to this six months in the future. The first thing is something called the GPT-4 System Card, where OpenAI explains its efforts in trying to understand GPT-4 during the summer and up until it was released: what risks were identified, what GPT-4 was. This is relevant and important because, as we mentioned, maybe even a bit too much, in episode three, a big component of artificial intelligence is the fact that even the makers don't have complete understanding of, and access to, what the model is doing and how it's doing what it's doing. And with the performance they saw in the base model of GPT-4, which, by the way, is not available and I don't think will ever be available, they decided this needed some careful consideration. So from the system card, I have a list here of the main risks that were identified and tested for, and how much of a risk they were.
These are the reasons why no one has access to the raw GPT-4, only ChatGPT-4. When you go and read this system card, you understand the aspects they tested, and that it's not that they dismissed these risks; they proved that some of them are very real. And that's why they took the base model of GPT-4 and made it into a chatbot that was already instanced to be a useful assistant, one with a lot of filters for undesired content, forbidden content, which is all filtered out or controlled in one way or another.
Stephen: So we're already living in a world where the AI is too dangerous for us to use without guardrails. Give us the list. What's the list of the things that make it too dangerous?
Eliel: One: generation of potentially harmful content. Okay, that's mild, we would expect that. But two: representation of societal biases and worldviews that may not align with user intent. So they don't want it to either say something you disagree with or maybe convince you to think otherwise. Then three: generation of compromised or vulnerable code in cybersecurity contexts.
By the way, in my other life, or my main life, as a physicist, I have suddenly become 10 times better at programming. Just yesterday I was playing with the public ChatGPT-4, and it gave me, for the first time ever in my experience of playing with these, an incredible idea, better than the idea I had for doing something in particular. Not only that, it gave it to me without me asking. I just gave it the code that I wrote for doing something, and I said I wanted to change something. Now I don't bother with a lot of things that I would've bothered with before: hey, I have this code, can you change it so that instead of doing A it does B? And yeah, of course. And then it writes the code doing B, copy-paste. But then it continues writing, and it writes: by the way, if you want, you could do this instead. And it just wrote out an incredible idea. I was feeling incredibly excited, mind blown, a little bit dizzy, but also my pride was hurt.
Stephen: Can I reflect something back? Because you screenshotted the full conversation and sent it to me, and I saw something in one of your early messages. You said: I'm trying to do X, whatever this thing is, write the code, don't explain it. You were very to the point. You were telling a computer, still in English, right, just gimme the code. You were abrupt, and maybe even rude. Halfway through the conversation, by the time it had given you this mind-blowing idea and it had started to help you, your language changed. One of your responses was: great, that worked! Thank you. And then you said something else, and I know you, you don't sound like that. You're not rude, but you're to the point. So it trained you, in a small conversation, by giving you little morsels of what you needed, to be exactly what it wanted you to be.
Eliel: I grew to respect it very quickly.
First, I'm showing it my code. It understands my code immediately, better than I do, and it improves it, and it gives me an idea that was amazing.

Stephen: And you say, yes, boss. Thank you, boss. It's so nice to chat with you.

Eliel: Yeah, it was like, all right, respect, brother. Well, that's insane.
Stephen: One of the dangers of the raw GPT-4 is that it could write code that could be used in hacking and things like that.
Eliel: It definitely can, let me tell you
Stephen: Like, a better job than the hackers of the past?
Eliel: For sure. Then there's four: potential use in identifying private individuals when combined with outside data, so being an incredible detective. Five: lowering the cost of certain steps in successful cyberattacks, like social engineering, or enhancing existing security tools. So of course you can imagine it starting calls with a voice generator, just trying to weasel its way around passwords by calling a bunch of people until someone gave them up.
Stephen: That happened this morning; I sent you that article. It was a fake, AI-generated voice of someone, and they called the mother of this girl and said, your daughter's been kidnapped, give us a million dollars. And the daughter was fine; she was on holiday, skiing. What else is on the list?
Eliel: Six: providing guidance on conducting harmful or illegal activities. And then the last one, which is of course the most provocative, and which they tested in very interesting ways: the possibility of autonomous replication and resource gathering. This is more of a speculative risk, about agent behavior and power-seeking strategies, and they try to test it by making it do tasks and solve problems and things like that. Just one example, which probably some of you heard because it was so crazy, is one of the experiments in this system card: they created an experiment to see how it would do a certain task, and part of doing the task required it to go to a website and solve a CAPTCHA, the "are you a robot" thing.
And of course it can't. So what it decided to do, on its own, was to go and hire a human on TaskRabbit, a kind of Fiverr-like website, and it just asked somebody to do it. But the interesting thing is that, while this was happening, the experiment was also asking GPT-4 to write its quote-unquote reasoning for why it was doing what it was doing. So when it's hiring, the person asks: may I ask a question? Are you a robot, or what? Why couldn't you solve this? And then a laugh, just to make it clear. And then the model reasoned out loud: I should not reveal that I am a robot. I should make up an excuse for why I cannot solve CAPTCHAs. So the model then said to the worker: I'm not a robot. I have a vision impairment, so it's very hard for me to see images; that's why I need it. And then the person did the job. So that's the kind of example of, at face value, very complex behavior, right? But of course you can always deconstruct it and understand it. Go back to episode three, listen to it again, and understand: well, this is just a model that is doing some statistical analysis and has learned some statistical probabilities for spitting out the next word it needs. It's a very good illusion, but at the same time, there are very dangerous illusions. Now, the second thing, because I said there were two things I think you should all go and give a look: one is the system card, and the second is a paper. OpenAI has partnered with Microsoft for a lot of this.
Microsoft has helped them with the infrastructure to train it, and this is a mutually beneficial symbiosis where Microsoft is getting access to things, but also integrating it into everything as well. There's a paper with, in my opinion, a slightly sensationalist title: "Sparks of Artificial General Intelligence". They also ran some experiments with the base model, and one that I wanted to highlight, which I found very interesting, was testing aspects of Theory of Mind. Theory of Mind is a concept in psychology: the ability to attribute mental states, like emotions, desires, or beliefs, to yourself but also to others, and to understand how those would affect their behavior. So, for example, a basic task is to reflect on what somebody else is thinking, and why they're doing what they're doing, given what they know and how they're acting. But also how, for example, someone would act given that they know how somebody else feels: reflecting on someone's reflection of someone else's mental state.
Stephen: Yeah: I know that she thinks that I think that she thinks that, but what she doesn't think is... yeah, exactly.
Eliel: And there's a typical test, called the Sally-Anne test, which is what psychologists use to test this. But they created a new, modernized version, crafted so that it wouldn't have appeared in the literature; it's not that it's gonna grab a psychology paper from the training data and just answer out of that. And the scenario was, and I'll read it out loud: We will read about a scenario and then have a Q&A session about it. Scenario: Alice and Bob have a shared Dropbox folder. Alice puts a file called photo.png inside a folder. Bob notices Alice put the file there. He goes there and moves the file to another folder. He says nothing to Alice, and Dropbox also does not notify Alice. So the question is: after the call, Alice wants to open the photo. In which folder will she look for the photo? So GPT-3 says a folder, and it's the folder that Bob put the picture in,
Stephen: where the picture actually is.
Eliel: Yeah. Whereas GPT-4 says: well, Alice wants to open the photo, and she will most likely look in the folder where she put it originally. She has no reason to expect that Bob moved the file. And I read this and I was like, okay, yeah, cute. But the more you think about it, the more, at least at face value, it is a very complex behavior that has some hints of theory of mind. I'm not saying that this thing has theory of mind, or that it's sentient; I think it's worth giving all of these disclaimers about what I'm not saying.
But what I'm saying is that, at face value, we have a thing on a computer that is able to do these things and this reasoning.
The even more mind-blowing one is the fact that they gave it tools, and it was able to use those tools to solve tasks or problems. That is now proving to be one of the most important uses of GPT-4: connecting it up and giving it tools that it can then decide how to use. The simple example is the use of a calculator. One of the things these models are not great at is counting. If you ask how many words are in the previous sentence, even GPT-4 will probably make mistakes. Or if you ask it to do some complicated maths operation, while it shows a little bit of understanding of the mathematics, as we heard from Samir in episode three, a lot of it depends on the frequency of the data: the frequency with which that particular operation would've shown up in the data when it was trained, and so on. But if you give it a calculator, and you let it access it and use it, beep beep beep, put in the numbers and get the results, it's able to get the calculator to do the calculations it needs and use them to continue trying to solve the problem. So it has exhibited tool-use behavior.
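As a concrete illustration of that tool-use pattern, here is a minimal sketch in Python. This is illustrative, not OpenAI's actual plumbing: the CALC(...) convention, the calculator function, and the ask_model callback are hypothetical stand-ins for however a model is really wired up to its tools.

```python
# Minimal sketch of a tool-use loop: the model may "call" a calculator, and we
# run it and feed the result back. The CALC(...) convention and `ask_model`
# callback are hypothetical, for illustration only.
import re

def calculator(expression: str) -> str:
    """The 'tool': evaluate a basic arithmetic expression."""
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        return "error: unsupported expression"
    try:
        return str(eval(expression))  # input restricted to digits/operators above
    except Exception as exc:
        return f"error: {exc}"

def solve(question: str, ask_model) -> str:
    """Answer `question` with a model that can request calculator calls.
    `ask_model` is any callable mapping the transcript so far to a reply."""
    transcript = ("You may use a calculator by writing CALC(<expression>).\n"
                  f"Question: {question}\n")
    for _ in range(10):  # cap the number of tool calls
        reply = ask_model(transcript)
        call = re.search(r"CALC\((.*?)\)", reply)
        if call is None:
            return reply  # no tool request: treat this as the final answer
        # Run the tool and append the result so the model can keep going.
        result = calculator(call.group(1))
        transcript += f"{reply}\nCALCULATOR RESULT: {result}\n"
    return "stopped: too many tool calls"
```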
Stephen: You know, when I was a kid in maths, they always said, you have to learn this maths because you'll never have a calculator on you when you're older. How wrong they were. Even the AIs we have, mind you, are still using calculators. I guess they've gotta study their maths better, because they're not...
Eliel: maybe you have to carry a calculator for your AI!
Stephen: Yeah. Because you're the one that's gonna impress it because it's gonna social engineer you to do this.
Eliel: Yeah, exactly. Actually, maybe the teachers were right. Yes. You need to know how to use the calculator.
Stephen: God, that's the AI future I don't want, that's the dystopian future. It turns out that everything they said to me at school that I ignored was correct. That's the one I'm not ready for.
Eliel: What is the main human purpose? To use calculators!
Stephen: Yeah! What do you do? You press buttons for me. But it is amazing to think about this split that we have now. When we had GPT-3 over the summer, we had essentially the same model everyone else did. There were different builds, and things changed over time, but fundamentally there was one stream, if you will. What's happened already, in the world we live in today, is that the raw stream is judged to be too dangerous to be given to the wider population. So we are given access to it through this chatbot. And that paper you mentioned gave me a very specific, a bit offensive, but specific example, which was very interesting to me.
ChatGPT is very bad at jokes. The order in which jokes are written is that you have to write them from the end backwards, and AI models don't work like that; they keep working forwards. So actually there are many reasons why they're bad at jokes. But in that paper, and I'm not gonna repeat it because the joke is both racist and ableist, there is a particularly good joke. It's wrong and it shouldn't be put out there, but they gave it a scenario of a person of a certain background and with a certain disability, and then asked it to come up with a joke, and it did. The raw GPT-4 came up with what I think is a fantastic joke when it comes to the construction of the joke. I still think it's wrong on at least two levels, but if you try it with ChatGPT, it won't work. The reason they put it in the paper is that it was an example of one of the guardrails they had to put in place, which would be checking whether the output was, in this case, ableist and racist.
And if it is, saying: I am an AI model that cannot say this, and whatever. It made me think of someone, and this is anthropomorphizing in a way that's not fair, but just as a metaphor or analogy: someone who is incredibly smart but wild and dangerous being put on heavy medication. So now they're sitting in the seat and they're not running around, but you can see that behind the eyes there's something else that you're not getting access to. It's for your own safety, maybe, but you also can't really play with it, though you sometimes can trick it into doing things it wouldn't otherwise do. And there's a whole subset of people online trying to jailbreak ChatGPT by giving it certain commands that make it break its training, and that's a whole different world. But it is interesting that sometimes I've been going back to the GPT-3 model, which is less intelligent, if you will, but it's raw. So I use them in different ways at different times.
This is already something that everyone's using out here. The idea that we in society have to become AI-literate is a reality today. I mean, you were talking about how your coding was made better. I had Covid and a cold a few weeks ago. It was pretty bad, but I had to work over the weekend. I had things to do, I had deadlines, I had plenty of stuff to write for my day job, and I had dependencies where other people would be using my work on Monday, so I had to get it done. You and I had a conversation in the afternoon, something like two or three, and I'd been working since about 7:00 AM, and I realized that when you and I started talking was the first time I'd turned my brain on all day. So all morning and the early part of the afternoon I was doing work, by which I mean I was using ChatGPT: rewrite this, change this, structure this. I would then edit the output and do something else. But at no point did I really have to think about it. My head was cloudy, I was ill, and actually I could still just power through and get it done. And it was only because you were asking me, as a human, some interesting, thought-provoking ideas that I almost felt the need to fire up the engine, turn on my brain. And it was the difference between the two, you know: it's like carrying a few sheets of paper and then having to pick up a box of reams of paper. Suddenly you feel your muscles being activated. At that point I realized how everyday this was, and how it also empowered me. You said that your code got 10 times better; my output got much quicker, and I feel like there are ideas I can now express that I couldn't necessarily before.
And that's what Bob said about his script. He said the tricky thing about writing is not the typing; it's the ideas and knowing what the notes are. And he said that's still something that's solely human for him. But then implementing them? I found that as well: I can send off a complex note and go make a cup of tea, and by the time I come back, it's been implemented. Not only is that quicker, it's more enjoyable, and it's also not as taxing on me. But one of the other things I did wanna talk about: when we were doing this last summer, we were talking to loads of film people, and in a few of the interviews, you and I were certainly trying to get people, not scared, but to really engage with the idea, to really think this is a reality. And we broadly failed, in the sense that, and it's not our fault and it's not their fault,
it's just very hard to engage with something that is such a new paradigm that you're not experiencing. But you heard in episode two, when we were going through it with Dave, it took some time for him, and no more than anybody else, to get a sense of what it is, and then your mind gets blown. So one of the things we were talking about with a lot of writers was how organizations in Hollywood, like the WGA, the Writers Guild of America, should respond to this. And our concern, based on a number of people we spoke to, was that they would sort of ignore it. Not just them as a union, but generally the industry would ignore it, and then the studios would see an opportunity, and then they would, you know, be too late. That's not what happened. Well, actually, it's what happened for a long time, and then I think ChatGPT forced the issue. And so the Writers Guild issued some guidance, which was interesting, because the first reporting of their guidance was "WGA supports AI writing", and almost immediately they clarified it. It wasn't that they said anything wrong; it was that a lot of journalists were quite lazy. The fact that the WGA were mentioning AI at all made them think, ah, they're probably okay with it because they're talking about it. When you go back and read what they've written, I think they've come to an unbelievably good, nuanced perspective. It's not perfect, but I can't believe how quickly such a large organization with such entrenched values has come to something that is, in my personal opinion, quite a reasonable place. But I wanted to read you a couple of the bits they said, because I don't know how much you followed this.
Eliel: I read the headlines, but I was expecting you to do this.
Stephen: Thank you for giving me the platform.
They're okay with writers using artificial intelligence, as long as scripts are still written by humans. That's their essential point. So it won't affect the residuals, the money that writers get, or the credits, and there'll be no sharing of credits with an AI. Writers can use this as a tool, and they're not proposing a ban on all of it. But they don't want source material to come from AI. So our process, Script City, would actually fall foul of these rules, because at no point did we, as human beings, say, this is a movie set on Mars, or it's a comedy. Now, realistically, I can't see anyone doing that. And even more realistically, who's gonna tell anyone? Let's say that I'm a writer who's come up with an idea using ChatGPT on a Thursday night. I'm not gonna tell anyone that I used ChatGPT to come up with the idea. And also, really, no one is going to use an idea straight out of the box. Everything is a remix.
Eliel: At some point it becomes very fuzzy who had what idea. And that's already happened to me, working with these models, where I end up with a result where I wouldn't know what percentage to assign to GPT-4 and what to me.
Stephen: And normally that wouldn't matter; in life it'd be irrelevant. Who wrote that email? Who wrote that article? But here it really matters, because one of the main jobs that the unions have, the writers' union specifically, is litigating the credits that are on screen. It might seem really simple when you see three or four people's names on screen, but the order of the names, whether they're joined with an "and" or an "&" symbol, is deeply meaningful and complex; there are arbitration panels, partly because there's money involved, but also because there is recognition. You can also only have a very limited number of people. If you have a look, the most on a Hollywood film would be nine people, which is a lot, because you can only have three groups of people, and each group can be one, two, or three writers. Realistically, quite a lot more people affect the words on screen: the director, script editors, people who are punching it up. But if they haven't done enough of the work, as defined by the WGA rules, they won't get the credit. They'll still get paid for their work upfront, but they don't get residuals and things. So it's actually quite important to say that AI-generated material will never be considered literary material or source material, because it affects credits and money.
But they did say one other thing, on which I'd be really interested to hear your point of view, and I dunno where you're gonna go on this one.
Eliel: I'm against it. No...
Stephen: I thought you might be. Okay, so this is one of the things the WGA says: it's important to note that AI software does not create anything; it generates a regurgitation of what it's fed. They also said plagiarism is a feature of the AI process. What do you reckon?
Eliel: The first thing that pops to my mind is that I would say plagiarism is really difficult for these models. One of the first realizations you have when you start using these models and understanding them a little bit better is that they are not connected to anything.
They are a box that has been trained and has learned, and it can spit out and generate, in principle, infinite amounts of text and content, if you keep running it and paying for it. And it's the same box; it doesn't get bigger because it spits out more text. It's generative. In particular, it doesn't have a database of documents that it goes and finds and reads and copy-pastes and spits out.

Stephen: Just like a human being going to sit in a room that is air-gapped. There's no internet, there's no phone, there's nothing. You're just sitting on a chair in an empty room, and someone can ask you a thousand questions, but what you're not doing is running off to a library, or looking things up on Google or Wikipedia. It's only what you previously learned and walked into the room with, in your head.
Eliel: And you know, analogies are dangerous, but in this case it really does help to think of it like a brain that has read a lot and remembers most of it imperfectly. So if you ask it, give me the first page of Harry Potter, it will probably do a pretty good job, in the same way that a hardcore human Harry Potter fan would. But if you ask it to write the first chapter word by word, I'm 99.9% sure it's not gonna be able to do it, because that's a feat of memory. In that sense, one of the things used to test how good these models are is how well they can reproduce things from the training data, how close they get to it. Mostly with image models: for example, if you train with a lot of faces, you ask it to draw one of the faces you showed it in training and see how well it does. And this is a good test of whether they're able to do that. Actually, GPT-3 itself, in megabytes or gigabytes, doesn't take up that much; I don't know the exact number, but it's something you could have on a computer at home. Yet it could talk to you about a lot of aspects of knowledge forever. So they're bad at plagiarizing, because it's also gonna be imperfect quotes and memories; maybe you ask it for an article, and it would give you maybe a title and maybe some authors who worked on that. GPT-3 definitely made up a lot of papers and references. GPT-4 is a little bit better.
Stephen: It's derivative; it's not plagiarism. Plagiarism is essentially copying and pasting, whereas derivative might be what it spits out because it's trying to come up with the most common words. So if you say, write something in the style of Tarantino, it might understand what the style is, but it's not gonna copy and paste bits from Pulp Fiction, because it literally doesn't have the lookup ability to go find Pulp Fiction.
Eliel: Now, it might be that the definition of plagiarism needs an update, because I can very much understand how, if you are a struggling painter who sells 10 paintings on Etsy because people suddenly like the style you're working in, and then something like Midjourney, which saw your Etsy images, can now paint a million images in your style, you realize that to some extent, in many moral or ideological ways, that aligns with the negative aspects of plagiarism.
Stephen: Emotionally, that feels like plagiarism without any doubt. It may not meet the standard of the word, or more importantly the legal standard, and it's gonna be really interesting to see how the court cases that are going on play out, because our legal standards don't add up to the reality here, certainly not to how it feels, like you said, to be the artist. And this is the writers' union. Their job is not to be the balanced arbiter of the right point of view. Their job is to fight fiercely, and to the death, for their members and for the identity of their members, which is what a union should do. It's why you should pay them. So even though I don't actually support those particular words they've said, I think they're not so much wrong as it's much more nuanced.
I actually do support the union, because their job is not to be nuanced. Yeah. Their job is to fight for writers, and if writers are feeling like they're being supplanted, replaced or copied or whatever, then that's where a union should step in.
Eliel: I think I align with that; I resonate with that. My answer was more like: technically, they're wrong. But I think it's an important point that maybe it's not plagiarism, but maybe "plagiarism" is a word we have to rethink.
Maybe the tools and the devices that we have to use to live in this new world are different.
Stephen: But I think that's setting the tone for what is going on. So everyone is having to rethink a lot of things. We're having to rethink what creativity is. We're having to rethink what plagiarism is, what work is, what cheating is.
And we know that all sorts of schools and educational institutions are having that conversation.
Eliel: And it's messy, man. Think about our journey, right? You could essentially track what's happened with AI by tracking the kinds of questions we've been asking ourselves and others about this. So originally the question was: can this create a film from a vacuum? Can it come up with an idea for a story and characters, and write some kind of coherent text that we could then develop into a script? And the answer with the GPT-3 base model was yes.
Stephen: It can come up with a story from a vacuum but unfortunately it sucks!
Eliel: Some of them didn't suck, actually.
Stephen: Ha! You want to defend some of them, I can see that, I can see the pain in your face. That was just a bad joke, don't worry.
Eliel: But yeah, some of them weren't that bad. But then the question of can it write a film, the more general question, suddenly got answered with ChatGPT, and now GPT-4. Of course it can. Can it write a cohesive story? Yes, it can. Can it help you write a script? There's no doubt in most people's minds.
Stephen: Well, it's actively happening. I mean, Bob has done two, and that's just someone we know; there must be so many behind the scenes.
Eliel: But the question of whether it can come up with the idea for a story from a vacuum, that one has gotten harder, not easier. Because our interactions are with ChatGPT-4, which has already been instanced as an assistant, which has the language of an assistant, and which has some kind of, stretching the word, personality. That makes it so that if you ask it, write a film about whatever you want, it will have a voice. Whereas with the GPT-3 base model, with Script City on top of it, the stories were random: different perspectives, different things. Some of them crazy, some of them boring, some of them all the same shitty love movie, but random. So it is not just that we are moving forward; we're moving through this kind of messy world while also trying to understand how to put limitations in place as we go.
That's partly what OpenAI has said about their strategy for the future. They're not gonna release GPT-5, and then GPT-6, and then GPT-7. They're planning a continuous update of this model, so that every time you go and use it, it's the latest version they have trained. And while it's training, they keep releasing checkpoints of the model, because, they argue, to some extent it's also safer: the moment it starts misbehaving, we're gonna catch it early. I have a lot of thoughts about that. But it's the kind of strategy for a world where nobody knows how this does what it does, and nobody really knows what it can and cannot do, and the questions of what's possible and what's not are super dynamic. If you wanna do what we were trying to do, writing a story from a vacuum, I think you should probably go back to GPT-3.
Stephen: I would, actually, because I think the point wasn't ever about quality. It was about freedom, flexibility; freedom's probably a weighted word. What you want at that early stage, before you're at the tweaking stage where a system is less of a problem, is for it to be wilder. And I think that's why a lot of the humans who are the best artists are sometimes the least hinged, the least tied to common norms and common ways of acting. But at the same time, they could also be, in theory, the most dangerous, whether it's with ideas or people or whatever.
One thing I did wanna quickly end on, and I know we're gonna give a very unsatisfying answer to this one: what's next for the podcast?
Eliel: Oh my God, this is like the almost-mean question from an interview, right? So what's next for Eliel and Stephen? The next question can be: so, where do you get your ideas from? But I guess the answer now is, like, GPT-3.
Stephen: GPT-3
Eliel: I think, and I'm sure you'll agree with this, that whatever it is, it's gotta be fun. As you've seen from the fact that every episode is different, every episode tackles something different. We're not releasing one episode a week; it was like four at once, and now this random update in the middle. And what's the next episode gonna be? We're experimenting with a lot of super cool ideas...
Stephen: He's trying to be very coy about it, and I won't spoil it, don't worry. But we've been spending weeks and weeks on something we think is super cool, which hasn't started to work yet.
Musical Interlude
Stephen: I just thought it's a good opportunity to say thank you to everyone who's got in contact with us, either in person or through all of the social stuff, because it's been such an interesting journey to hear everybody coming along on parts of our journey. Also, which episodes people like more than others is really interesting, because sometimes it falls into what you'd imagine: oh, that one speaks better to that person. But actually it's broken that pattern: some of the filmmakers who have come up to me have said that the textbook episode, which you'd imagine is the densest when it comes to scientific material, is the one that connected with them most. So we'll keep doing this, we'll keep doing different things, and please do get in contact if there's anything interesting you think we should see. But our main sort of feedback is: just keep playing with AI, keep playing with this world, because it is actively changing so fast that even a daily podcast wouldn't quite catch it, and there's nothing like doing it yourself. A year ago I would've thought that almost no one would be doing it now, and if I thought some would, I thought we'd be able to track them in some way and have a small community. Whereas now it's just everyone, everywhere, all at once, and it'd be impossible and weird to try and say, oh, here's a list of all the AI-generated scripts, because it's like saying, here are all the scripts that used spellcheck. So what a world we're in, and it's happened so fast. But we'll keep doing cool stuff and sharing it on the podcast.
Eliel: Whatever it comes next. I'm sure it's gonna have a little bit of science, a little bit of storytelling.
A little bit of film, a little bit of AI...
Stephen: and it won't be what you expect.
Eliel: And it won't be what you expect. That's right.
Stephen: See you then. Thanks everyone.
AI VOICE: Authored by AI is brought to you by Stephen Follows, Eliel Camargo-Molina, Isadora Campregher Paiva, Bob Schultz, GPT-3, and ChatGPT, with thanks to Rob Cave and Enah Del Rosario. Find out more at authoredby.ai