Originally Broadcast: June 19, 2023
I spoke with Adam B. Levine, cofounder of Blockade Labs, the company behind Skybox AI, a generative tool that creates 3D skyboxes from text prompts or sketches. One of the best applications of this technology is creating environments for use in game engines like Unity and Unreal. We explored some demos of what it can do, and discussed the future of games, creativity, and technology.
Beyond the technology and its applications, we discussed the merits of centralized vs. decentralized approaches to AI (i.e., API-centric models like OpenAI's compared to open-source models like Stable Diffusion).
For the complete Show Notes (with links to what we discussed), please visit: https://meditations.metavert.io/p/skybox-ai-with-adam-b-levine-blockade?sd=pf
00:00 Introduction & Demo
03:30 How Skybox AI works
06:45 Defensibility of AI tech
09:25 Accelerating Game Development
15:20 Exponential Speed of AI
17:40 Depth Maps
22:22 96,000 Skyboxes per Day!
26:15 Search vs. Generation
31:15 Skybox API
33:33 Building in Public
37:12 Disruption from AI
44:40 Centralized vs. Decentralized AI
53:31 Regulatory Capture?
Game Development: The thing that people should be rightly concerned about is process-oriented jobs. Process-oriented jobs are going to go away in the way that we understand them today. But that's because we're opening up a world where process is not the barrier. And right now process is the barrier in almost anything that you try to do, right? Execution is the challenge. And so it subjugates the idea behind the ability to actually execute. And I don't think that that's actually a good thing.
Jon Radoff: I'm with Adam B. Levine, who's the chief innovation officer of Blockade Labs and a co-founder of the company. So today we're going to be talking about some pretty amazing stuff. I have been playing with Blockade myself. I'm going to show you some of what I've done. But Adam, thanks for being here. Looking forward to this.
Game Development: Jon, thanks for having me.
Game Development: I'm really looking forward to it too.
Jon Radoff: So just to kick things off, I'm going to throw a video onto the screen here. If you're listening to the podcast version of this, by the way, this is probably one of the times you want to actually go check out the video itself, because there are some really cool things to look at. So I'm going to explain what I did here. I took a photo that I had taken when I was out in Nepal near Mount Everest, and I traced over it. So in Blockade, you can trace over the image and then you can feed that tracing into a prompt that then paints it in all these different ways. I tried different styles like fantasy and painterly and all that stuff. So let's just start there, because that's just super cool. What made you want to do this? What was the inspiration behind it, and how are people making use of this technology?
Game Development: When we're looking at kind of disruptive technologies, one of the kind of factors about them that's always interesting and always challenging is to the extent that they're powerful, you can do so many different things, right? That's so many different things means that a lot of people wind up doing a lot of things with it. I've fallen into that sort of path myself many, many times. But really what we wanted to do is we wanted to pick a particular use case where the technology today could really start to have a meaningful impact on the world that would then help us define what the world of tomorrow would then look like. AI is disruptive on a level that is unlike anything I've ever seen before and again I spent 10 years in blockchain immediately before this. It is also disruptive but the disruption here is so much broader whereas it's very specific there. Very powerful again about ownership on the internet and stuff like that. But in the world of kind of AI just everything is going to change and it's hard to fathom that. So we picked a game asset or what we thought was a game asset. We've had digital filmmakers come at us. We've had VR experience creators come at us kind of all kinds of people have found uses for this. But we picked something that was more compatible than the vast majority of things out there, right? Because if you're looking at kind of a game development pipeline or really any type of
Game Development: pipeline, there's a lot of really specific kind of wonky formatting stuff that goes into
Game Development: making it work. The differences between unity and unreal engine for instance are quite broad. That's not really true when it comes to sky boxes, right? Sky boxes are pretty standardized. You can use them in all kinds of different mediums, all kinds of different things. There's really just like two or three formats. We picked that as kind of the first place to go. And also it was just so cool. It was just so cool to be able to, you know, like we cracked the ability late last year to generate images with coherency that were effectively any size. And this was our second idea for how to use it. But it's the one we went with and I'm very happy that we did because it's turned out
Jon Radoff: great. Can we talk a little bit about what's actually going on with the technology? So the sky boxes is like this inside of a sphere and we're painting it with an image that's been prompted into existence. So let's just even start there. What happens when I enter the prompt? What's the technology that it's using for the image generator and then how does it apply
Game Development: that? I started 330 AI innovation, started work on the precursor to this project and started building infrastructure about two years ago. And that infrastructure that we built, I'm going to go ahead and brag a little and say it's pretty much world class. And it has allowed us to be both faster and higher resolution by a significant margin relative to anything else out there. So the one thing I didn't want to do was create base level models, right? Like, you know, I want to create Dolly, I didn't want to create stable diffusion, I want to create any of those things because options exist out there to use those to then build off of. So we kind of did a ship a theses thing where like we took a very early version of stable diffusion, ripped it apart and then put it into the box that we'd been building for about two years. And that then allowed us to break through a lot of barriers that many other companies, you know, have run into, which is kind of held back everybody else, but not us. So that's how we're able to do kind of the six case stuff. At its core, what we're really doing is we've created 2D images, very large 2D images that have 3D characteristics about them. And the interesting part about artificial intelligence is that it's all essentially fancy versions of autocomplete, right? So that means that each additional piece of information you sort of add to the equation allows you to then infer as you would with a math problem, you know, the missing pieces of data there. And that's the trick that we're really relying on here. It's not so obvious today. Right now we're just offering, for example, depth maps, which start to provide 3D characteristics. And it's important to note that so we spend 10 seconds of GPU time on creating the, you know, 6K, the 6K, you know, 2D image. And then we spend one 10th of one second generating the depth map. And so typically you would expect that the 2D part would be the, you know, the easiest part of a hard process. But in practice, what you find is that if you can do the easiest part at a high level, then you can actually infer the harder parts and the more complex parts increasingly fast. And so it winds up just completely disrupting the entire way that everything is done. And even the way we think about these processes. I'm so here for that.
Game Development: It's not even funny.
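To make the depth-map idea concrete: monocular depth estimators can infer a plausible depth map from a single 2D image very quickly. Below is a minimal sketch using Intel ISL's open-source MiDaS model via torch.hub. It illustrates the general technique, not Blockade's actual pipeline, and the image path is a placeholder.

```python
# Infer a depth map from a single 2D image with MiDaS (Intel ISL), via torch.hub.
# Assumes: pip install torch torchvision timm pillow numpy
import numpy as np
import torch
from PIL import Image

model = torch.hub.load("intel-isl/MiDaS", "DPT_Large")       # pretrained depth model
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.dpt_transform
model.eval()

img = np.array(Image.open("skybox.png").convert("RGB"))      # placeholder input image
batch = transform(img)                                       # resize + normalize for the model

with torch.no_grad():
    pred = model(batch)                                      # relative (inverse) depth
    depth = torch.nn.functional.interpolate(
        pred.unsqueeze(1), size=img.shape[:2],
        mode="bicubic", align_corners=False,
    ).squeeze().cpu().numpy()

# Normalize to 0-255 and save as a grayscale depth image.
rng = depth.max() - depth.min() + 1e-8
depth_img = (255 * (depth - depth.min()) / rng).astype(np.uint8)
Image.fromarray(depth_img).save("skybox_depth.png")
```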
Jon Radoff: One of these things that I've seen just floating around, practically a meme at this point amongst some venture capitalists, is that applications taking Stable Diffusion or DALL-E or any of these off-the-shelf models and building around them, like, maybe they're not adding enough value to the equation. Which I don't agree with, because it seems like a lot of the problem here is how do you cover the last mile through user interface? So it seems like you tapped into something deeper in what people want to actually get out of these systems when you created this really simple web-based interface to create the skyboxes. Can you just sort of elaborate on the thinking behind that?
Game Development: Let's talk about the concern, which I think is totally reasonable. There are two sides to it, right? There's product-market fit and then there's defensibility. And defensibility tends to be where VCs go with that question. And they're not wrong. Again, anybody who right now is in the business of building, for example, the ability to ingest data, process it through GPT-4, and then give you back something, like, that's not a defensible business at all. That's effectively an application integration. If you can run fast enough to win a dominant place in that market, then maybe you can establish yourself in a defensible position. But it's not because of the technology, it's because of how fast you move and ultimately if you win the race. We wanted to do something that would allow us to accomplish both of those things effectively. So again, from a technology standpoint, we just didn't want to train the big model. There's liability stuff that's going to get sorted out over the next couple of years as far as how those big models scraped the internet and were trained. I just didn't want to spend my time working on that. I wanted to focus on the applications. But in order to build the application that we built, we had to solve fundamental problems that have not been solved out there. And so again, I'm very proud of what we've been able to accomplish on those lines. It's such that we've even considered going into the business of just helping AI companies optimize their own things, because we could cut their costs by somewhere between 50 and 75%. And that's a really great business to be in by itself. But it's not the business that I want to be in. I want to be in the business of helping creatives to really break out of the constraints that they have, and to take the professional creative class and say, hey, the barrier to entry to really doing this at a high level was up here, and now it's almost nothing. It's more about the idea than the execution. And for me, as an idea guy, that's always been the fantasy: that the idea itself could be powerful without execution, because execution is so, so hard, and it gets in the way of good ideas being produced. I'm so bored of games, and have been for the vast majority of the last 10 years. I got into games journalism first, and that's been my biggest complaint: just how hard it is to create games means that we get worse games, even if technically they are good games. From an idea perspective, you leave so much on the table just to accomplish execution that it's a real problem. So that was kind of the thinking behind this: hey, let's just attack the most general thing that we know we can do the best, that's super impressive, and that we think will capture the imagination of the world. And so far, pretty much all of those have been correct assumptions, I'm happy to say.
Jon Radoff: There's just so much to unpack in what you just said. On the whole game design side, it's been my contention for a while that something standing in the way of innovation in the game market is just how resource-intensive, how capital-intensive it is to just make stuff. And if we could get back to, frankly, the way things were back in the 80s and 90s of game making, where two or three people can make an amazing game and they've got the ability to produce the kind of content that would be market-competitive today, then you'll start seeing all kinds of experimentation. You won't need a hundred million dollars to build the next story-based game. And don't get me wrong, I love those games. I love playing The Last of Us and Mass Effect and all these really big-budget, story-driven games, but I'd love to see individual people be able to run with that. Seems like that's a little bit of what you're getting at in terms of the potential for more innovation and disruption in the market.
Game Development: You can have problems of having too much money and you can have problems of having not enough money, and it's really hard to hit the sweet spot of having exactly the right amount of money, where you aren't going to go overboard and where the commitment isn't such that you need the thing to win. You know, that's the thing about those hundred-million-dollar titles: they can't take risks in the way that a small indie can. I think that we're moving towards a world where that's not true any longer, where, again, the barrier to entry comes down, both for the low end in terms of folks who are just indies... a lot of my favorite games are one-man, two-man, three-man teams. Right now, Shadows of Doubt is a fantastic game that's out there. Again, that's like a two- or three-man team, and they accomplished it because they use a lot of procedural generation and, even though it's a hugely complicated game, because they use voxel-based graphics, right? So super basic graphics. I think that that level of creativity is only possible at the indie end of things, and I think that the quality is very quickly going to get up into the AAA level, where you're not really worried about voxel graphics, because you can do a lot of the stuff from a generative standpoint without having to build the textures in advance, which is kind of a big problem around that. But I think it helps the big guys too, because it means that they can create much lower-cost or much larger games. Again, you can look at the AI innovation stuff from two sides: either it's going to make it so that game sizes stay the same and teams get smaller, or game teams stay the same size but the games themselves become 10 times, 100 times larger than we have today. And that's pretty exciting to me. So I think we're at the precipice of a really, really cool time to both be making games and to be playing games, frankly. And let me just real quickly go into some of our deeper thesis here, because I'm pretty excited about this. We increasingly talk about text-to-game, and what we're doing right now is a very, very small portion of that. But imagine if you wanted to prototype a game, right? And you're like, hey, I had this idea, I had a dream last night: I want to create a Victorian zombie shooter game that only has rodents for characters, and they all sing all the lines because it's also a musical, right? If you pitched that to somebody in the real world, they'd be like, okay, first off, you're crazy, and second off, no, right? In a future where all of these different modules exist, for physics engines, for environmental engines like what we're building, for NPCs, for speech, for economy, for game rules, et cetera, imagine that all of those things have been created in modular form. And at the top of this stack, you have the prompt. And it's what I just said to you. And then that prompt feeds down into an engine, and it feeds from there into a different engine, into a different engine, different engine. And at the end of it, you have a prototype. It might not be great, but you have a prototype in hours or days from your prompt, and from configuring all of these different modules to create it in the way that you want, to the extent that you even want to set those, because if you don't, then it'll figure it out itself. 
That's a super, super cool world where suddenly, you know, a kid in high school who has a really amazing idea, but no money or connections can be like, Hey, here's my idea. Here's my game. Here's this thing that I created. And simultaneously on the, like, the triple A side, well, if they want to now create a thousand prototypes over the course of a six month sort of experimental run, like they can do that. And you can have it so that all of those can be getting played. And you can figure out what really is the best possible version of this thing before we invest into the maybe, you know, a year of development time at that point to take it from kind of this prototype concept to, Hey, this is something that scalable
Game Development: can actually work, you know, over a medium term.
Game Development: So that's very high level about text to game, but it's, it's where we think we're going in the next two to three years.
Jon Radoff: It parallels, or is practically the same thing as, what I talk about sometimes, which is this direct-from-imagination experience: if you can think about it, get it onto the screen. And you're describing this very hierarchical system of all these components that have to fit together, and invoking them through some kind of prompting. It seems like that's a big part of where user interface comes in as well, because it's funny to talk about this whole thin-wrapper argument that people critique. But if you look at ChatGPT, just the example you were sort of touching on earlier, ChatGPT was essentially an interface on top of something that had existed for over a year, right? And it was just the fact that people added some reinforcement learning and made it a nice UI, and then people could actually make use of it. I'm really compelled by how you're starting to put some of the pieces together in your product at Blockade. If I look at just even the last few weeks, the rate at which you're adding things is pretty amazing. So you started with the skyboxes, and it's really cool, I can build up these environments, and then you added this sketch mode. I'm not sure exactly what magic is happening there. Maybe I'm thinking of ControlNet or something like that. It is ControlNet, yeah, it's an implementation of ControlNet. Tell me a little bit more about how that works and what people are building with it.
Game Development: One of the interesting, terrible, amazing parts about working in AI is just how fast everything happens, and how it's like, oh, hey, well, all this stuff over here is impossible today. But if you stretch the time frame out to a month from now,
Game Development: then maybe it's not impossible, right?
Game Development: And again, cryptocurrency is a fast space; AI is incomparable to anything I've ever seen as far as disruptive technologies are concerned. So what's really happening there is, effectively, we can control the latent space behind the image itself. And then, again, like I said, with our process, we used elements of Stable Diffusion, but we ripped the thing apart and put it back together in a way where, if we handed the model to somebody without our infrastructure, it would not make any sense at all. And so again, we're able to basically use these types of tools to produce the types of results that you're seeing there. And it's very, very early. I just want to mention that. With sketch, as it stands, it was like, should we release this? Should we not release this? It's not really ready because of the results it generates. What I like to do... I'm a terrible artist. I like to just sketch mountains with simple lines and stuff like that. And that will work sometimes. But a lot of times it's like, well, this line must be a suspension bridge or some wire that's stretching across the thing. And so it's not really kind of where it needs to be yet, but it's getting there very quickly.
Game Development: And these elements of control are super important, you know, to it actually being a really useful tool.
Game Development: So I appreciate that I'm not getting too deeply into the technicals on it, because it's just hard for me to explain. But yeah, we are using ControlNet. We use it in a couple of different ways, actually. And we'll be expanding that, along with a whole bunch of other technologies that are baked into this, to make it all work.
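For anyone curious what sketch-conditioned generation looks like in code, here is a minimal sketch using the open-source diffusers library with a publicly released scribble ControlNet. This is a generic illustration of the technique Adam describes, not Blockade's modified pipeline; the checkpoint names, prompt, and file paths are just examples.

```python
# Sketch-conditioned image generation with a scribble ControlNet (diffusers).
# Assumes: pip install diffusers transformers accelerate torch pillow
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Example public checkpoints; Blockade's own models/infrastructure are not public.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# A rough line sketch (placeholder path), e.g. traced mountains.
sketch = Image.open("mountain_sketch.png").convert("RGB")

image = pipe(
    prompt="snow-capped Himalayan peaks at sunrise, painterly fantasy style",
    image=sketch,                 # the sketch conditions the layout of the output
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("skybox_from_sketch.png")
```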
Jon Radoff: Yeah, my own experience with it is that the results can vary. And when it really captures what you were trying to do with the sketch, then it's magic. It's mind-blowing. And then other times it's like, that's not at all what I had in mind, and you try again. So yeah, as the frequency of the magic goes up, I guess people will fall in love with it more and more. But it's clear from just those little experiments you can do with it, where it kind of runs with the geometry and figures it out, that that's really cool. But speaking of geometry, I think I saw a paper come out on essentially 2D-to-depth-map, and it was like the next day you had depth maps in Blockade. First of all, how the heck did you pull off that pace? And talk a little bit about that. We'll show a little video running in the background here, by the way. Again, if you're listening to this, these are things you have to look at. They're pretty amazing to see.
Game Development: Yeah. So, are you talking about the paper that Scotty wrote with Intel?
Jon Radoff: Yeah.
Game Development: So that paper was in the works for a while, is really what it comes down to. Scotty Fox innovated a lot of these technologies in proofs of concept he did back in October of last year, and we had been talking to him at that point. But we wound up bringing him in when we decided to move on this, and he became our vice president of engineering. So that paper was in the works for about five or six months before it finally was published. We actually had been holding back depth maps, because we were like, well, people aren't really going to know how to use these, and it's going to add complexity to the user interface. And a big part of what we're doing is trying to make a powerful tool that remains accessible. But Marguerite sent me a message at 3 AM the day we released depth and was like, we're releasing depth today, I'll tell you why later. And it was after some conversations with some VCs who were looking at some of the stuff people were doing with our API. They were like, well, who's leading this? Are you leading it, or are these people? And we're like, no, no, no, you don't understand, they're using our API to do all of this. But we hadn't exposed it on the front end yet. So it was a relatively simple matter to do that. We have a lot of stuff baking. The sketch interface was really, really heavy for us to build. It's why we didn't do a release for about six weeks: we thought it was going to take four weeks and it wound up taking six. It's really, really hard to invent new interfaces for stuff that people haven't done before, and then try to figure out how to make that work. So, just all kinds of really interesting challenges arose during that process. But now we are onto a cadence where it's going very fast. We're prepared to roll out accounts in about two weeks. And then about two weeks after that, we're going to follow with what we call "add to this," which will allow you to take the current skybox process, which right now is like throwing a dart, right? You might be great at throwing darts, but every time you throw one, it's a new attempt. And with a skybox, a lot of times what you'll find is that the first 80% of it you love, but there's this error over here, and it's hard to fix that right now. So "add to this" is a new type of in-painting that we've invented that allows us to go beyond the size limitations that typically characterize in-painting: 512x512 is typically what you can in-paint.
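"Add to this" is Blockade's own approach and hasn't been published, but generic in-painting, regenerating only a masked region of an otherwise-good image, is easy to demonstrate with off-the-shelf tools. A minimal sketch with diffusers follows; the checkpoint, paths, and prompt are placeholders, and this is not Blockade's method.

```python
# Generic diffusion in-painting: regenerate only the masked region of an image.
# (Illustrative only -- not Blockade's "add to this" technique.)
# Assumes: pip install diffusers transformers accelerate torch pillow
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

skybox = Image.open("skybox.png").convert("RGB").resize((512, 512))
# White pixels in the mask mark the region to repaint (e.g. the one bad corner).
mask = Image.open("fix_this_region_mask.png").convert("RGB").resize((512, 512))

fixed = pipe(
    prompt="rolling green hills under a clear blue sky",
    image=skybox,
    mask_image=mask,
).images[0]
fixed.save("skybox_fixed.png")
```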
Jon Radoff: Well, I'm very interested in the roadmap, but maybe let's zoom out the camera lens a little bit. There are sort of two sides to the generative AI equation when you think about games. There's making a game, and the way that one can use these technologies to produce art and content and material that is going to populate the world of a game; then you make the game, you ship it, people play it, and they have fun. There's another aspect of it too, though, which is actually incorporating generative technologies into the experience of the game. So while I'm playing, maybe a skybox is generated based on some kind of to-be-dreamt-up game mechanic that people are going to start being inspired by. Could you talk a little bit about the way you're thinking about those two sides of the coin of generative technologies in games?
Game Development: As it relates to skyboxes, the potential really is there to do dynamic generation. I don't know if it makes sense, though. I think that one of the nice things about these technologies is that you can create such a corpus of material, you can create such a large amount of material, curate it down to what you really want, and then you can just use that. So there are, I think, really good examples of where purely generative in-game stuff is going to work. But keep in mind, the problem we're solving is: hey, I have a project, I need a custom skybox, and either it's expensive or I have to pick something that's totally generic. Those are my two options, basically. So we're not really solving the problem of, hey, people need to generate skyboxes on demand. That's how we solve the problem. But the problem is that people need custom skyboxes at the exact instant that they need them. So we've already created 4 million of these things (I know, right, it's ridiculous), 2 million just last month. That's a lot of GPU time. It is. We would actually be producing more, except we are literally at capacity on all of our GPUs, and I don't want to spend any more money until we turn on monetization. So we do about 96,000 skyboxes per day right now, and that's pretty much our peak capacity at the moment. So anyways, that's the important part: just because we can do something, like generating skyboxes in real time, that doesn't mean it's the right choice. And I think that for a lot of things, it's not going to be the right choice. What is the right choice is having access to an API where there's 50 million skyboxes that have been created in every possible variation, and whenever you need one, it searches there, pulls one that's aesthetically ranked as absolutely beautiful, semantically mapped to whatever it is that you're trying to do, and just solves the problem of providing them. And then there's also creation if it's needed, but if it's not needed, then why would we do extra work? So that's one side of it. I have more, though. The other side is that when we started working on this project, when I started working with Marguerite, who is the CEO of Blockade, our original idea actually was to build all of this into the Neon District Web3 game, and to do it in such a way that we would prove to people how, in the future, generative technologies would be used as actual gameplay elements, right? So not just the creation of the world: we imagined a loot mechanic where what you were essentially doing was finding objects that look like they were procedurally generated, but which could be broken down into core words. So you might have, you know, a small laser pistol or something like that. It breaks down into three different words: small, laser, and pistol. And then you could take other words that you would find, combine them together, and create new weapons, new gear, stuff like that, and be able to really remix them again. And then imagine... so this is an obscure book, but if you've read Reamde by Neal Stephenson, which is "read me" with a transposition in it. Not a great book from Neal Stephenson, but a really great world that he built. 
And one of the dynamics in there is this concept of aesthetic wars, an aesthetic sort of affiliation into factions, where there's a dynamic in this MMO that they create, which is like an MMO except it has real economics in it, which is a fascinating topic unto itself. And what happens in the game increasingly is that you've got sort of the good guys and the good races and the evil races, but actually they start banding together based on their color preferences, where all the people who love the Tolkien-style drab grays and greens and things like that wind up working together against the people who love the bright primary colors, right? I forget what the factions are called, but it was just an idea that stuck with me. And I was like, oh, you know what? Actually, this is something that could even be used to have different factions battling over how the world in the game actually looks, and different areas of the world. And you could effectively remix all of these scenes such that, oh my god, the unicorn faction took over,
Game Development: and now everything is rainbows and unicorns, you know, and then you go three blocks over and it's the
Game Development: zombie gang, right? And they like everything like it looks like it's a zombie apocalypse. And again, like these ideas of giving players the agency to really kind of transform and personalize the worlds that they live in, I think is a super powerful concept that will become prevalent as time goes on, but it's a little bit non-obvious if you're not so far down the rabbit hole.
Jon Radoff: What I heard you say a little while ago, though, is that we were kind of playing with this idea of dynamic generation in the course of playing the game, and you sounded a little concerned about actually generating at high volume, I guess for obvious reasons, like that's a lot of GPU use right now. But it sounds like through your API, you almost have a search mechanism that can layer above that and find cases that already fit the prompt or fit the material that you're interested in. So is that a way to potentially have the illusion of things being generated at all times, where you just optimize for search or generation according to what's required?
Game Development: Yeah, that is my solution to this problem, because again, we're talking to some of the biggest companies out there that have tens of millions of monthly actives, or in some cases substantially more than that in terms of users. And the reality of hardware right now is just that there aren't enough GPUs out there to service that. If the top 10 MMOs or the top 10 multiplayer games came online and wanted to do simultaneous generation, it wouldn't work; there literally isn't enough hardware out there to accomplish that. The other side, though, is that when I started working on the precursors for this project, which was just about two years ago in the spring of 2021, it took us five minutes to generate a 512 by 512 image that was terrible. Today we can do that same image, except at 10 to 50 times the quality in terms of coherency, and the same size, in two tenths of a second. That gives you an idea of how fast this space is optimizing, and it's both software optimizations, which are substantial, and there's still a lot of low-hanging fruit there, and it's also the increased hardware capabilities. We're not even using the newest hardware, or the hardware that would give us the best performance. We're probably using hardware that's 50% worse than that, and that's just because we don't need to use the best stuff, and again, commercially it's a little hard to get right now. So I think that we will very much get to a point within the next year or two where faster-than-real-time generation for things like video and animation is going to be a thing, and that will then drag forward many of these other areas that don't really need it, but if they had it, it unlocks another 50 to 100 use cases that don't really make sense right now but will make sense at that time. And then, we do not have search and on-demand stuff in our API right now. That's going to be a separate API we'll be launching towards the end of the year. Right now, we're really narrowly focused on improving the quality, accessibility, and control for the core application, which is the feeder that then provides all of the opportunity to do the kind of broader stuff later on.
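The "search first, generate only if needed" idea can be sketched with off-the-shelf text embeddings: embed the incoming request, look it up against a library of already-generated skyboxes, and fall back to generation only when nothing is close enough. This is a hypothetical illustration; the library, threshold, and generate_skybox function are stand-ins, not Blockade's API.

```python
# Retrieve-or-generate: prefer a pre-generated skybox whose prompt is semantically
# close to the request; only generate a new one when nothing matches well enough.
# Assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical pre-generated library: prompt text -> stored asset id.
LIBRARY = {
    "misty pine forest at dawn, volumetric light": "skybox_001",
    "neon cyberpunk city at night, rain": "skybox_002",
    "snow-capped himalayan peaks, fantasy painterly": "skybox_003",
}
library_prompts = list(LIBRARY.keys())
library_vecs = encoder.encode(library_prompts, normalize_embeddings=True)

def generate_skybox(prompt: str) -> str:
    # Placeholder for an actual (expensive) generation call.
    return f"newly_generated({prompt})"

def get_skybox(prompt: str, threshold: float = 0.75) -> str:
    query = encoder.encode([prompt], normalize_embeddings=True)[0]
    scores = library_vecs @ query              # cosine similarity (unit-norm vectors)
    best = int(np.argmax(scores))
    if scores[best] >= threshold:
        return LIBRARY[library_prompts[best]]  # reuse an existing asset
    return generate_skybox(prompt)             # fall back to fresh generation

print(get_skybox("foggy evergreen forest in early morning light"))
```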
Jon Radoff: So hardware keeps improving, but it's not improving by like 100x, 1000x, at least not right away. It must be through algorithmic improvements and just better approaches to training. It's the software level that's really driving this truly exponential improvement in quality and time.
Game Development: Yes, and again, nobody knows best practices, right? It's like the world of crypto when people started doing token sales: they're like, oh, that's the way you do it. It's like, no, that's not the way you do it. That's the way they did it, and it worked for them, but that doesn't mean that's a best practice. That's a first practice. I think that's what we're seeing today. And it's why, again, we've been able to make such fast progress in the areas we've chosen to focus on. It's because there is just a ton of low-hanging fruit that isn't properly understood, and most people aren't taking the time to really dig into that. They're very focused on what they're doing, because they didn't start two years ago. They started six months ago, right? They started, you know, summer of last year, when Stability came out with Stable Diffusion. And suddenly it was like, oh, this is a thing. And that was the moment when a thousand companies that aren't going to make it got started, because they thought that was the starting gun, when actually the starting gun had come a couple of years before that.
Jon Radoff: Yeah, when suddenly people saw things and it actually was good, instead of these bizarre, surrealistic, you know, dog dreams that we were seeing two or three years ago.
Game Development: Exactly. Exactly. Yeah, I mean, the first technologies that I worked with were actually... so these on the wall behind me, those skulls, those were done using VQGAN+CLIP with my platform PixelMind. They were all an anatomical skull that people would then use words, about a year and a half ago, to modify into something else. So that middle one there is, like, Cthulhu tentacles, right, if you look at it closely. Anyway, I love those early versions of the technology, but they were very weird. They were very weird; they were not coherent. The thing that convinced me that I should stop working for CoinDesk (I built their podcast network over the course of about two years and I was very well paid) was that I started doing pop art of Vitalik Buterin. I'm a big fan of Vitalik going back to the very early days, like the Bitcoin Magazine type thing. And the images that you would produce would be like, hey, here's a mess of colors, and then here's Vitalik's forehead that you can kind of see peeking through the thing. And that was enough that I got addicted to early versions of the technology and wound up hiring the guy who built the first Discord bot that did it. And then we were just kind of off to the races.
Jon Radoff: So we've talked a little bit about your API, but I'd like to double-click back to that and talk about it a little bit more. There's the aspect of your product where I can go to the website and play around with it and generate images. How does the API fit into that? How are people using the API in a different way than they would as a direct end user through the web browser?
Game Development: The thinking about the API is a little bit sophisticated, but for anybody who's gone down this path before, it's not at all going to be foreign. The reality of it is that the API is likely the place you want to be, because rather than having to find every single client and user yourself, you can just say, hey, build something with this, or integrate this into the thing that you've already built and then roll it out. And we've solved your problem. It's a really great deal for you, because you didn't
Game Development: have to invent this in the cost is low. And you can add these capabilities that you never thought
Game Development: were going to be possible, but now they are very easily. So that's the thinking about the API is that over the medium term, the API becomes a very significant driver of utilization, but it's relatively low margin for us. Again, the margins are great because it's AI, but compared to the front-end experience is relatively low margin. But the thing you can't do with an API is you can't demonstrate to people really effectively, you know, why this thing is interesting or important. And you also have to rely on other on these companies that are doing the integrations to make good choices about how they go about building kind of the first examples. So we didn't want to do that. We again, like API has always been important, but we wanted to very clearly show people and let
Game Development: people experience for themselves without having to build anything why this was so impressive and
would be so important. And, you know, we got lucky. We got lucky in that essentially our first attempt at creating this type of interface worked really well and it immediately took off, and our trajectory has been unlike anything I've ever built. Which is really amazing, because it sucks to build products that are important, in the sense that they show where the path is going, but have them ultimately be meaningless because you're six years too early. And I did lots and lots of that in 2013, 2014, 2015, 2016 in the token space. This time I wasn't six years too early. This time I was two months or two years too early, and that was just enough time to build out the infrastructure to position ourselves for the race. So I don't think I was too early at all.
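To give a feel for what an integration might look like, here is a hypothetical REST-style call sketched with Python's requests library. The endpoint, parameters, and response fields are invented for illustration; consult Blockade Labs' actual API documentation for the real interface.

```python
# Hypothetical skybox-generation API call (endpoint and fields are invented
# for illustration -- see the provider's real API docs for actual details).
# Assumes: pip install requests
import time
import requests

API_URL = "https://api.example.com/v1/skybox"      # placeholder endpoint
API_KEY = "YOUR_API_KEY"                           # placeholder credential

def request_skybox(prompt: str, style: str = "fantasy") -> str:
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": prompt, "style": style},
        timeout=30,
    )
    resp.raise_for_status()
    job_id = resp.json()["id"]

    # Poll until the render is finished, then return the image URL.
    while True:
        status = requests.get(f"{API_URL}/{job_id}",
                              headers={"Authorization": f"Bearer {API_KEY}"},
                              timeout=30).json()
        if status["status"] == "complete":
            return status["image_url"]
        time.sleep(2)

print(request_skybox("a ruined temple floating among pink clouds"))
```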
Jon Radoff: What are some of the things people are building on top of the Skybox technology that you're excited about, that you really like? I want to check this out, and we're going to include links in the show notes too.
Game Development: Sure. Yeah. So again, the best place to go... we're doing another thing that I don't typically do, which is we're largely building in public. We've really taken to Twitter as a way to share with people what we're doing and to capture the imagination of where these types of technologies can go. That's been really successful, and I'm happy we're doing it. That's all Marguerite's doing. I like to just stick myself in a box with engineers and then build something until I think it's perfect and then get it out there, and we're not doing that. And it was so the right choice. Things that people are building with it already: the very first project that came at us for API access was a company called Skyglass, which is a brand-new application on iOS, and it is effectively a virtual green screen for content creation. So you use it on your phone (it's a phone app), and you create whatever background you want. It automatically screens you out so that you're there, but you're against the background, and then it connects to the gyroscopes in your phone. So as you move your phone around, you can see the scene moving behind you, and it really allows you to create whatever you want. That's, I think, a big initial use from where we are right now, because it doesn't require geometry, and it doesn't require a lot of things that are coming but not ready yet. Another application of that type is one called MovieBot, which I believe is rolling out their integration with us in the next couple of weeks. It's an entirely AI-generated animation sequence tool where you type in the one-sentence thing about what you want the scene to be about. So it's a conversation about herring. And then you pick your characters: you pick this penguin, and then you pick some dude whose name is Bob. Then you assign voices to them, and you pick, is it a drama? Is it a comedy? Is it stand-up? What is it? And then you pick the scene from us. Right now they just have baked-in stuff that's static, but with ours, basically, they allow you to generate a background, the scene, whatever you want. And then you can actually pan around, again using the gyroscope, and figure out exactly what part you want to center the scene on. Then you lock that in as your background for the scene, you press the button, and like 30 seconds later you've got something kind of like Totally Accurate Battle Simulator. You remember that one? That game where everything is kind of spaghetti characters. So in this case, the early version is like, here's your spaghetti characters talking to each other with AI-generated voices and an AI-generated script, and you had no part except for picking these variables or choosing random on them. So that's something else. There's also a big project that we're working on. We're not under NDA on it yet, but without saying anything: Meta just rolled out the ability to use basically exactly our file format and our production format for skyboxes in their Horizon tool, which is basically their home environment, and the ability to have custom skyboxes. 
So we were a little concerned initially that maybe somebody would be able to replicate what we were doing if they had, you know, like a sufficient amount of unlimited money to throw at it. And in practice, what we've seen is that nobody's interested in that. They just want a solution. And that's exactly
Jon Radoff: Let's dream for a minute here. Let's go into the future. We talked about the near-term future of what it might mean for game development in the next year or two. But let's go a little bit further out, like five, ten years. What does all this generative technology mean? What does it mean for economies, jobs, creativity, products we're going to get to use, including the stuff that people tend to get a little bit scared about when they look at some of these things. What are your thoughts on all of this?
Game Development: Disruptive technology is fundamentally empowering, in my opinion, because it disintermediates and breaks the monopoly of people who have effective dominance in a given area, right? That can be because they have so much money that they can afford a thing that nobody else can afford. It can be because they have so much expertise, and there's so much expertise required, that they have an advantage that really insulates them from any competition. And I think that all of that is in the process of falling away, and I think that that is fundamentally empowering for normal people who exist out there. It's even fundamentally empowering for people who have incredible skill in an area that is going to be obsoleted, because they will be able to do whatever they want in any of these other areas. So I think the thing that people should be rightly concerned about is process-oriented jobs. Process-oriented jobs are going to go away in the way that we understand them today. But that's because we're opening up a world where process is not the barrier. And right now, process is the barrier in almost anything that you try to do, right? Execution is the challenge. And so it subjugates the idea behind the ability to actually execute. And I don't think that that's actually a good thing. I think a lot of the concern about this comes from two directions. On the one side, you have people whose lives are being disrupted or are about to be disrupted, and nobody likes change, nobody likes to be disrupted. I'm very sympathetic to that perspective, and empathetic to it as well. And I think that that's totally right. But it's also a natural process, right? We all go through a grieving process when things change, and this is that. We're still in the part where people are raging against it. A lot of people are; some people are depressed; they're at various parts of the process.
Game Development: But it's a process that will eventually end. I think the area that really gets disrupted,
though, and the area that's really important, is if you look at how our world is organized today, if you look at how governments work, if you look at how powerful companies work: the amount of money that they have, whether it's because they control monetary policy or just because they have wildly profitable businesses, gives them so much power, and so much ability to not care about the inconveniences and the challenges that exist in everyday life, because they can just spend the money on the experts to do it. I think that we are coming towards a world where money matters a lot less in the ability to execute, not just because it gets easier, but because expertise is always on tap. I'll give you an example. I've been going through a merger process. And one of the big challenges about a merger is that effectively it's like a 150-page-long piece of computer code written in the stupidest possible language, which is of course English. And you're trying to, at the end of this program, have a list of: in this eventuality, whose fault is it? And what are the exceptions to that? And what are the exceptions to the exceptions? And so you have massive lists of even just definitions of what words mean, so that you have a common agreement on them. Right now, that means you're paying lawyers hundreds of thousands of dollars in order to help you understand what is effectively a very stupid contract, right? Or a very stupid computer program. And if you, even today, start to use some of the longer-form context windows in GPT-4, you can have a level of service that is completely impossible for any normal person to achieve when you are looking at a complex legal document. When, for example, you want to draft a binding letter of intent, or you have a deal that you've negotiated over email, right? And there's like 15 emails back and forth. You can literally take the entire email thread, just copy it into one of these LLMs, and then say, hey, write articles of incorporation, write a binding MOU, write one of these things that captures all the points in this, and then summarize what it did and didn't include at the end. And it'll do it over the course of maybe two minutes. And then at the end of it, you're like, well, that was terrible, do it again, right? Or change this, or change that. And again, that's a process that today, if you're not super rich, you would never get that level of service in terms of the speed and the responsiveness of it. Even access to it as a resource is really challenging. So if you look at the legal system, why is it so tilted? It's because if you have a sufficient amount of money, you can keep things in court forever, and you can effectively bankrupt the other guy. So just taking the legal profession as an example, I'm not saying lawyers aren't going to have business. I'm saying that not having money isn't going to mean you're going to have insufficient representation. And that is kind of an indication of where this is going, where it's good for everybody, but it's best for people who lack resources in the current environment to really have the expertise that they need. That to me is the essence of disruption.
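As a concrete version of the email-thread-to-MOU example: with a long-context model this is only a few lines of code. A minimal sketch assuming the OpenAI Python client; the model name, file path, and prompt are examples, and (as Adam implies) the output is a first draft to revise with a professional, not legal advice.

```python
# Draft a memorandum of understanding from a negotiation email thread with an LLM.
# Assumes: pip install openai, and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

with open("negotiation_thread.txt") as f:       # placeholder: the pasted email thread
    thread = f.read()

response = client.chat.completions.create(
    model="gpt-4-turbo",                        # example long-context model name
    messages=[
        {"role": "system",
         "content": "You are a careful paralegal assistant. Your output is a first "
                    "draft for review, not legal advice."},
        {"role": "user",
         "content": "Here is an email negotiation thread:\n\n" + thread +
                    "\n\nDraft a binding memorandum of understanding that captures "
                    "every agreed point, then list anything left ambiguous."},
    ],
)
print(response.choices[0].message.content)
```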
Jon Radoff: It's actually taking something and making it much more broadly distributed through society, when you had a capability that was previously limited to only a very few. So that's one axis that we're on. The other, though, is that when you can increase the rate at which you can participate and get fast responses, there's also an increase in quality: the innovation cycle speeds up, the ability to iterate. So there's a quality improvement as well as a decrease in cost for access to services.
Game Development: Yeah, hugely. And again, the important point to me is that it's true for everybody, but people who already have dominant positions in the world today, they don't really benefit much. They might save costs, but the capability that they had monopolized
Game Development: previously because of the advantages that they have evaporates. And I mean, you can draw this
Game Development: back even more, and I don't know how much you want to get into this. But again, why do we have representative democracy in the way that we do? If you look back 250 years, here in the US, the founders created a system that is maximally decentralized, impractically decentralized, with how the powers spread out. But at the time, they were very concerned about those centralizations
Game Development: of power in a single individual, i.e., a king. And so they built a system that was impractical for the time, but which was pretty good. If you looked at the capabilities that we have today,
I mean, how many people does the average representative in Congress represent? At the low end it's tens of thousands, and the reality for many people is that many millions are being represented, you know, in populous states and cities and stuff like that. So that's the situation today. But if we all have AI agents running locally on our phones in the next five years, then why can't we have a system where I tell my phone the things that I care about, and then I have a personal representative whose entire job it is to argue with everybody else's personal representative, and you bring the granularity of representation down to the single-person level? So there's no ability to abstract away the needs, right? Because that's what happens: as you add complexity, you add obfuscation, and it becomes easier to hide nonsense and garbage in something that's complicated than in something that's very simple. And I just see a world where all of that complexity either isn't actually complex anymore, or you're just literally asking your local AI to tell you what it means in a way that's going to be consistently honest. That's the direction we're going in. And again, it's not just about creation, but everything.
Jon Radoff: So I totally love that you went there. Just a few weeks ago, I gave a talk at MIT about this subject, and one of the topics was this idea that we're going to be able to project our will out into the future with these agents acting on our behalf. Part of my premise, though, is that we're about to see a battle; we're already starting to see the early signs of it. And I'm really curious how you think about it: between the more decentralized approach, which would be open-source models, where everyone gets access to the weights and biases and you can put it on your own machine, versus API-gated, where it's up in the cloud, and only the people who know how it was constructed actually know what went into it, and they can monetize it, control how it's used, and define their own form of alignment. So, open versus closed AI. What are your thoughts?
Game Development: Yeah, I think that this battle is already lost by the closed-source folks. I think it'll take a while for that to be recognized, but with the core technologies that exist out there... there was this assumption, I think, going into a lot of this, that large AI models that require data centers to run are ultimately the only way to get to true intelligence, or not true intelligence, but to really useful LLMs. And I think that's already recognized as wrong, and it increasingly will be recognized as wrong. As it stands right now, I have a 3080 in this computer, so not a great graphics card, eight gigs of RAM, and I'm able to get a 30-billion-parameter model running that, if I ask it to write me a paper, will write it very, very coherently and do so at a high level. But it will take 45 seconds per token that it generates. So that's the current state of the art as far as locally running models; all of it's built off of, like, LLaMA and stuff like that. But again, I just don't see a world where closed source retains that lead, because there's so much focus on the open-source side and there's so much passion, and the tools already exist for them to be pretty good. And the fact that the tools already exist for them to be pretty good means that we're going to continue to see breakthroughs, we're going to continue to see developments. So, if you want to think about this in Star Wars terms, right? The closed-source guys, Google, you know, OpenAI, they're the Empire, right? Super well-resourced, very buttoned-down, know what they're doing, high-quality product, but they're super limited, right? They're terrified that somebody is going to do something that's going to then come
Game Development: back on them and they're public companies, so it's reasonable. You know, if you're an open source
Game Development: dude out there who's just like, hey, I'm doing this thing, right? That's not the concern that you have. The concern that you have is: is it possible? Can I make this work, right? Here's an idea I had; what happens when I do this? And rather than being like, okay, cool, it's going to be in the lab for six months, you're like, oh yeah, this is open source from day one, everybody come here. I've loved seeing all the open agent stuff. There are like five or six projects that are basically using LLMs to instruct other LLMs to then do certain tasks and then check their work. Stuff like that, none of it is happening inside Google, at least as far as they've told us; none of it is happening inside OpenAI. It's all happening at the lower level, where people are using those APIs a lot of the time to bootstrap the idea. But ultimately, you can swap the engine out, and it's great. One more thing: Alex Atallah from OpenSea has been working on a project called Window AI. Are you familiar with it? It's really cool. It's a browser plugin, a browser extension, and what it does is it allows web-based applications to offload the ML stuff to a plugin that then connects locally based on the person's own key. So you're not putting keys into the web app. You can connect local LLMs that you're running to the extension and then use it to power web apps. And
Game Development: again, it speaks to this future where not only is this going to be self-directed and you're going to
have locally running models, but you will actually be in control of what model is being used, and will be able to make choices about that, again, locally. And the person developing the application doesn't have to worry about using their API; they don't have to worry about what that model is. They can have recommendations, and those recommendations can auto-populate, but ultimately it's putting the control in the hands of the user. So that's a fully open-source project as well. I think it's windowai.com or window.ai or something like that. I don't know.
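For the locally-running-model side of this, here is a minimal sketch using the open-source llama-cpp-python bindings to run a quantized model entirely on your own machine. The model file path is a placeholder; any GGUF-format checkpoint you have the rights to use will do, and the thread and context settings are just starting points.

```python
# Run a quantized LLM entirely locally with llama-cpp-python (no cloud API).
# Assumes: pip install llama-cpp-python, and a GGUF model file on disk.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/local-model.Q4_K_M.gguf",  # placeholder path to any GGUF model
    n_ctx=4096,                                     # context window
    n_threads=8,                                    # CPU threads; tune for your machine
)

out = llm(
    "Write a short essay on why open-source AI models matter.",
    max_tokens=400,
    temperature=0.7,
)
print(out["choices"][0]["text"])
```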
Jon Radoff: We'll link to that in the show notes as well. That's super interesting. So, I'm rarely a cynical person, I think the best of everybody most of the time, but the cynical part of me, when I hear these calls for regulation coming from the people who have these leadership positions already, makes me think that this is just an attempt at regulatory capture to shut down some of these open-source people. So I'm curious if you have thoughts on that. And then the other part of this that still seems like it's going to be a challenge for a bit, and that favors the really big guys: inference is one thing. We can do inference on your 3080, like you said; we can run Stable Diffusion on it and get good results. But if I want to train that model, I'm still back to spending hundreds of thousands of dollars, at least, on a stack of H100s somewhere. Is there hope for that to change, or is there a business model for open source to pursue those kinds of training regimes?
Game Development: Yeah. Speaking to the last point first, I think there's a question about large core base models versus specialized versions of them. Specialized versions are already incredibly economical to make, whether you're talking about textual inversion, LoRAs, or fine-tunes in the more classical sense. That's already stupidly affordable; an individual can do it by themselves, with essentially no extra resources, on their own device, and I think that's going to continue to amplify, notably because of LoRAs. We don't use LoRAs ourselves, but we've been paying attention. LoRAs are really cool because they let you stack multiple layers on top of a base model, and there are a lot of advantages to that versus some of the other approaches: a fine-tune is kind of mutually exclusive, but LoRAs you can really stack and continue to improve, and improve, and improve. I suspect that's where open source is going to make a lot of headway. On the core models themselves, I do think it's going to get a lot easier, just because of the hardware that's coming out. It's not really accelerating from an inference perspective; you're not going to generate faster with something like an H100, which was the new hotness back in December, though not the latest now that they've announced their supercomputer. It's much more efficient than prior generations, but for training, right now, they're very, very expensive. Still, if you wanted to train a core model, you probably could. The other thing is that a lot of what's going on out there started from the assumption that you needed an incredibly large amount of data, and really it's not so much the training as the data set prep that's going to kill you as far as cost is concerned. We're finding that better and better results come from smaller and smaller data sets that are more carefully curated. Think about what Stability did with scraping the internet; to be technically correct, they didn't scrape it, LAION scraped the internet and Stability paid for the training. It was a great shortcut to get to something where quality went up. But the reason something like prompt engineering exists as a viable need is that the data is so, so dirty. If you're scraping 450 million image-text pairs, that's going to be hard to clean, so they made an assumption: here's the image, here's the text that accompanies it, and we assume all the words in that text, excluding the ones that were censored out, apply in some way to the image. The problem, of course, is that that's wrong a lot of the time. So quality went up hugely, but the ability to ask for what you want, and actually get it without engineering your prompt to talk like a computer, was the problem they had. As we continue, we're finding that smaller data sets with much cleaner characteristics generate results that are just as good on the quality side, so long as the data set is diverse, and they give you much better control over the outputs.
Game Development: What was the first part of your question?
Jon Radoff: Oh, I brought up regulatory capture.
Game Development: No, no, I have thoughts on this. So, okay, the regulatory capture thing is interesting. I don't actually think it's regulatory capture; I think it's virtue signaling. The reality is that this is so disruptive that governments, at some point, even though they're super slow and very dumb, are going to say, hey, we should probably do something about this. And if you look at the OpenAIs and the Googles out there, all the players who signed this stuff, for the most part they're companies that already get a lot of regulatory attention and are already under the microscope, so they kind of have to signal, hey, we're the good guys. Even if they're not good guys, they still have to signal that. And the letter that Musk and everybody else signed on to: I looked at that and thought, okay, this is pure game theory. They're saying, all right, we're going to signal this, but actually we're going to keep working, and the hope is that everybody else believes we've stopped working and stops working too, which lets us get further ahead. Because if you play it out, assume for a second there's a 1% defection rate: 1% of the people training these large models say, you know what, I'm not going to sign on to that, I'm going to keep going. Or 100% sign on, but 1% actually keep training in secret. All you're doing is empowering the people who care least about the rules and who are most likely to create the actually problematic systems. So to me, the priority should be this: if we know that terrible things can happen when these things are controlled, then the whole point is that they shouldn't be controlled. Because if they can't be controlled, sure, terrible things will happen, but at least we'll have agents that can protect us, agents that can act on our own behalf, as opposed to saying nobody's allowed to have this except the people who don't care about laws and are willing to break the rules. Then those people, who are exactly the problem, have it, and none of us get the advantages that would come with it, because it very much is, and is going to be, an arms race.
Jon Radoff: So Adam, this has been an awesome conversation. We've covered a lot of ground, but I'd like to close on this: we're sitting here halfway through 2023. What are you most excited about in the next six to twelve months that you think everybody watching or listening to this should be excited about as well?
Game Development: I mean, I'm really excited for long-form content creation. Personally, I've been doing experiments for about six years with using AI to write for me, in my spare, spare, spare time. I really like radio dramas and old-school stuff like that. I'm good at writing nonfiction, position papers, looking at technology and the future, but I really get stuck with fiction, because there are so many possibilities and there's no right answer. I found early on that even the very first versions of the GPT technology let me break through that barrier, and I fell in love with it at that point. I've just been constantly testing and testing, and we're finally at a place where, over the weekend, again in my spare time, I wrote the first six scenes of a screenplay for a concept I've been kicking around since the very early days of Bitcoin. On the Bitcointalk forums there was this project called Bitdrop, from the guy who did the Bitcoin black market stock exchange before the concept of tokens made that a little bit more trustless; he eventually wound up getting arrested. It was this friendly robotic delivery service, and I love these types of concepts, so I tried to help them spec it out, and then I wound up turning it into fiction after the whole project fell apart. I've tried this experiment over and over and over again, and I'm finally at the place where I could actually send you the first couple of scenes and you'd say, oh yeah, that's not terrible, and it's coherent. There's still a lot more work to do, but I think that over the course of the next year, as context windows continue to extend, and as wrapper layers around the models eliminate the need to hold everything in memory at a given time, it gets there. So that's the specific thing.

More broadly, the thing I'm most excited about, and that I think people should really be thinking about, is this: there have now been three different examples, which I'm not going to describe in detail, I'll just mention them, where I've seen teams that have been working on projects for years create a version that uses AI that is worse than what a single developer created in about 40 hours by throwing out all the sacred cows and figuring out what the thing really is. What if you didn't have the existing process and you weren't optimizing the existing process: what would the new process look like? We've done that with the Skybox thing, and I'm very happy we did, but I'm seeing other people do it too, with far fewer resources than us, and it shocks me every time I see it. They shouldn't be winning; they shouldn't be better than all of these tools that have been in development for so long. But it's a lot less about execution and a lot more about how you think about it.

I'll leave you with one thought. Most of the tools that exist today think about the purpose of AI in the context of helping the human: it's an AI assistant, there so we can do something better because the AI is helping us. I think that's a fundamentally backwards positioning for the world we're actually in. What's actually happening is that the AI is the expert in execution, so the AI should be doing the work and the human should be helping the AI, because the human intuitively understands whether what the AI is doing is correct or incorrect, and has the ability to judge its work and guide it. This concept, the human guiding the AI rather than the AI helping the human, is a fundamental reversal of the way we think things work today. If you can come to grips with the implications of that, metaphysically, what does it mean, what's the meaning of life, all of that, then it sets you up to build applications that make a lot more sense: applications that are 100x or 1,000x improvements in productivity rather than 5x or 10x, and in ways that are a lot less frustrating than the AI trying to teach you how to do a thing you fundamentally don't know how to do. That's where I want to go with our product. I want an interface that, at the end of the day, says: here's the power-user side, and if that's you, go ahead and use it. But if you just want to talk to the integrated AI in the product and say, hey, give me a forest scene with a castle with three turrets a mile in the distance, then it should know how to draw that in our sketch mode, it should know exactly what prompts to use, and it should be able to change things iteratively. Again, you're not doing it; you're guiding the AI to accomplish your objective. And if you can do that, the whole world is different.
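The "wrapper layers" Adam mentions are left unspecified in the conversation. As one hedged illustration of the general idea, the sketch below keeps a rolling summary instead of the full draft in the prompt, so the context stays bounded however long the piece gets; the `generate` callable is a stand-in for whatever text-generation call you use, not any particular product's API.

```python
# Illustrative sketch of a long-form "wrapper layer": carry a rolling summary
# forward instead of the full draft, so the prompt stays small as the work grows.
# `generate` is a placeholder for any prompt-in, text-out LLM call.
from typing import Callable

def write_scenes(generate: Callable[[str], str], premise: str, n_scenes: int) -> list[str]:
    scenes: list[str] = []
    summary = premise  # running memory of everything written so far
    for i in range(n_scenes):
        scene = generate(
            f"Story so far (summary): {summary}\n\n"
            f"Write scene {i + 1} of the screenplay, continuing coherently."
        )
        scenes.append(scene)
        # Compress old context so it never outgrows the model's context window.
        summary = generate(
            f"Summarize the story so far in under 200 words.\n\n"
            f"Previous summary: {summary}\n\nNew scene:\n{scene}"
        )
    return scenes
```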
Jon Radoff: So you're kind of talking about curation in addition to creation, and taste. Taste becomes really important, hugely so. But it's going to be an exponential increase in creative production, with AIs and humans working together, whether that's art, writing, music, all kinds of things we didn't get to talk about. So Adam, thanks so much for coming on and talking with me about all of this. This is amazing, awesome work. Everybody, check out Blockade and see the tools that are available to you right now, and keep an eye on them too, because I have a suspicion that every few weeks we're going to see something super cool out of this team.
Game Development: It's going to be a weird year. It's going to be a very, very weird year. Jon, really appreciate you having me on. This was a lot of fun.