Artificial Intelligence Livestream - Tejas Kulkarni - Common Sense Machines - @CSM_ai

Originally Broadcast: 2025-04-02

Jon Radoff hosts Tejas Kulkarni for a mind-blowing conversation about the future of AI-driven creativity, the path to AGI, and how 3D generative models are revolutionizing digital content creation. From the intersection of neuroscience and machine learning to the cutting edge of synthetic imagination, this is a discussion you won’t want to miss!

🎙️ New to streaming or looking to level up? Check out StreamYard and get a $10 discount! 😍 https://streamyard.com/pal/d/4855526410551296


Jon Radoff: All right, welcome everybody to the premiere episode of the Artificial Intelligence livestream. It is April 1st. This is not an April Fools' joke; we are really doing this. We've had lots of conversations in both the Web3 and the non-Web3 game development livestreams, and it was sort of a running joke from the beginning, honestly, that we wouldn't talk about AI all the time. But then, Oscar, what's our ratio? Have we been tracking metrics on this? What's the percentage of the time that we don't talk about AI? It might be 0%. I think we've basically always mentioned AI.

Oscar: It's been successful about 2% of the time, but AI seems to be coming up over and over and over in the conversation. I'm not sure why.

Jon Radoff: So it keeps coming up. It's as if AI is a real thing: it's changing the world in significant ways, and it keeps affecting the conversation. In fact, on Thursday, when we do our normal game development livestream at 1 p.m. Eastern, it's going to be two AI gaming startups: one that's doing an AI Dungeon Master, called Wanderheart, and another that's doing a vibe-coding platform for kids, called Nilo. So we're just talking about AI all the time, and what it led to is that we should actually do a series that's just on AI. It's called Artificial Intelligence, and it's also not constrained to gaming. So that's what we're doing, and you've joined us here on April 1st. Nope, not an April Fools' joke: we are really going to talk about AI every week, at 3 p.m. Eastern on Tuesdays.

Tejas Kulkarni: And let's get started.

Jon Radoff: So my name is Jon Radoff. I'm the CEO of a company called Beamable. We build gaming infrastructure, and I just love to talk about AI and game development and blockchain and all these big trends that are changing the world. And I'm so pleased to be joined today by my friend Tejas Kulkarni. Tejas, why don't you give yourself a brief introduction here?

Tejas Kulkarni: Yeah. Well, thanks for having me. Wonderful to meet everyone here, and thank you, Jon and Oscar; Jon I've known for quite some time, so it's really fun to be here. My background is in AI and neural nets; I did my schooling at MIT, and I was a computer software developer back in the day. Almost a decade ago, I started thinking about what it would mean to have computer software basically recreate the world, both fantasy worlds and the real world. And gaming: I've been playing text-based MUDs for more than a decade, and I was very, very interested in that. That's what eventually led us to build CSM. What CSM is doing, we're building a lot of things, but the goal is really to create a 3D assistant, in the same way that AI coding is helping people be very good coders, and that has gone really far just within the last year. Our hope is to help people who are creating 3D assets and worlds, definitely for gaming applications, but also real-world products and simulations, which are slightly more of a challenge. And a lot of the work we've been doing over the last year and a half is the longer-term work of making people more productive and making it easy to create those worlds.

Jon Radoff: How do you categorize yourself, Tejas? Is it vibe coding? Is it world building? What is this?

Tejas Kulkarni: Yeah. One of the earliest bets, I guess, when we were starting, was that there were two different threads. AI coding is probably the most successful application of generative models, which is kind of upside down, right? You would have expected the cups and the chairs to come first, but it happened the other way around. So you have AI coding on one side, like vibe-coding games, and you have the asset side on the other, and they're all meeting together now. Coding, you could argue, is much further ahead, not in terms of creating a full world, but in making a huge impact; nobody knows how far it can go, really, even with the current models if they're chained together properly. But the content piece, the actual goodness of the real world, the dragon moving around in some way, or the chair looking the way it should with all the PBR materials, that's really the art aspect of things. If you can have both of these things, then your regular artist can become a superhuman artist, and someone who doesn't know anything about art can actually start dabbling in it. So the way I see what we're doing: AI coding is happening independent of us, and we're building the 3D generative models that are going to interface with that and truly open up the vibe revolution, in many ways, on both sides.

Jon Radoff: So Tejas, let's maybe develop that idea a little bit more. This is the first AI livestream I've done, and there's awesome attendance already: a couple hundred people in the live audience listening to this, and I imagine, like other streams I've done recently, a few hundred thousand views in replay. Not everyone may be familiar with the difference between training a model, creating a foundation model, versus a lot of the other things they see in AI that are using somebody else's model. My understanding is that at CSM, you guys are training new foundation models for 3D.

Tejas Kulkarni: Yeah. That's right.

Jon Radoff: Why do you need to do that?

Tejas Kulkarni: It was more out of a need, a necessity, than something we wanted to do. And there are a lot more open-source components now you can mix and match, train further, and things like that. The problem is, if you look at image, video, text, code, those are way more mature because the amount of training data on the internet is many orders of magnitude larger. If you tried to buy the entire Shutterstock library of 3D assets, you'd get less than a million assets, much less than that, whereas there are hundreds of millions, billions of images, right? So the big challenge for 3D is that you can't really just scrape the internet for data. It's not possible to do it at scale. So it's data.

Jon Radoff: Unlike 2D, we're not riding on the coattails of millions and millions of photographers and 2D artists who upload content.

Tejas Kulkarni: No, you can't. And I think it also reflects how humans actually learn the task, right? By the time you're 18 or 20 years old and you start doing some of the design work, you have probably seen billions of images from the real world. But how many 3D shapes has a 3D artist actually seen? Barely any, compared to that. And that's true of what machines are seeing, too. So fundamentally, you have to have a data-efficient way to learn these systems. And some of the architectures are data efficient. They had to be.

Tejas Kulkarni: So the models underwent a two-phase step change. Pretty much in 2023 and the first half of 2024, the models were taking image diffusion models and doing this method called distillation, score distillation, which basically meant: you take a large text-to-image model, you ask it to render many different views, and then you backpropagate that information into some sort of 3D representation, a Gaussian splat or a NeRF or a 3D mesh, whatever. That gave you pretty good textures, but it didn't really give you good geometry. So those fell out of favor, in some sense, although there are good ideas there that might come back in some form or another. But then came 3D-native models, and that's really the point: making sure the 3D models are actually 3D native. A lot of the current models actually reflect the workflows that people already have, like geometry, polygons, PBRs, and things like that, so there are multiple models stacked on top of each other to get to the final representation. That's what a 3D foundation model is at the asset level. And then once you have an asset comes the piece of animation, right? You could directly animate it with skeletal-based methods; that's one thing you can do. You can also do things that are getting easier literally as of last week. I think GPT-4o has been a breakthrough, and it will get unraveled in the next few weeks: even if I have a random concept-art image of a new character or a prop, we now have image tools, both Gemini's native image generation and GPT-4o, that can reimagine it in T-poses and separate out the different parts, which is what you actually need to get to usable assets. So you have this combination approach to get a foundation model at the asset level, and then you have the foundation models for code, which interface with the asset layer. So that's what a foundation model looks like there.
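
To make the score-distillation idea concrete, here is a minimal toy sketch in Python. Everything in it is a stand-in: `render` fakes a differentiable renderer, and `stub_noise_pred` plays the role of the frozen text-to-image diffusion model that, in a real DreamFusion-style loop, would steer a NeRF, splat, or mesh from many sampled views.

```python
# Toy sketch of score distillation sampling (SDS). All pieces are stand-ins:
# a real pipeline renders a NeRF/splat/mesh from random cameras and asks a
# frozen text-to-image diffusion model which way to nudge each rendering.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=(8, 8, 3)) * 0.1   # the "3D" parameters we optimize

def render(theta, view):
    # Stand-in differentiable renderer: a "view" is just a brightness shift.
    return theta + view

def stub_noise_pred(x_noisy, t):
    # Stand-in for the frozen diffusion model's noise prediction. We pretend
    # the text prompt means "mid gray", so predicted noise points away from
    # 0.5; a real model encodes far richer image priors.
    return x_noisy - 0.5

for step in range(500):
    view = rng.uniform(-0.1, 0.1)                  # sample a random camera
    x = render(theta, view)
    t = rng.uniform(0.1, 0.9)                      # random diffusion timestep
    eps = rng.normal(size=x.shape)
    x_noisy = np.sqrt(1 - t) * x + np.sqrt(t) * eps
    # SDS gradient: (predicted noise - true noise), pushed back through the
    # renderer (identity here); the usual timestep weighting is omitted.
    grad = (stub_noise_pred(x_noisy, t) - eps) * np.sqrt(1 - t)
    theta -= 0.05 * grad

print(round(float(theta.mean()), 2))  # drifts toward the prior's preferred gray
```

As Tejas notes, this recipe tends to produce good textures but weak geometry, which is part of why the field moved on to 3D-native models.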

Jon Radoff: OK, so there's a lot to unpack there, and I'm going to try to unpack it. Before we do that: we're already up to about 300 people in the live audience here. I want to remind everyone there's a reason we do this as a live show as opposed to a recorded show. I actually did a recorded show on this subject in the past, but it was two or three years ago. We do everything live now so that you, the audience, have an opportunity to come in, ask questions, and actually participate in this conversation. So if you're here and listening, you have a really unique opportunity to engage live with Tejas, who's a leader in this field, the CEO of Common Sense Machines, which is building foundation models and a whole toolchain around 3D graphics. And Tejas is the real deal: he has a Ph.D. from MIT. He's the guy you want to talk to and ask all your questions. So please take advantage of this opportunity to ask your questions about graphics and AI in general; there's a lot of ground we can potentially cover. And if you even want to join the show live, that can happen as well.

Jon Radoff: Oscar's watching all the channels we're transmitting to: Facebook, LinkedIn, YouTube, Twitch, X. We're on all these places; wherever you are, post comments and we'll share them. But if you want to join live, let Oscar know, we'll bring you in, and you can ask your questions live and on the air too. That's a possibility if you have something about AI that you just want to talk about; these shows are here to be dynamic conversations. So Tejas, going back to unpacking a little bit. We ran right through some of the things that came before what you were describing as 3D-native models, and I want to spend a little time on that, because I think a lot of people want to understand what's happening better. You ran right through Gaussian splats. My understanding of the way Gaussian splats work is that you can use multiple photographs of a scene, and it's sort of like reverse ray tracing: you look at those images and figure out something about the point cloud of light that could have generated them. You take those photographs and back into the ray trace, essentially. A nice thing about Gaussian splats is that they can depend on photographs, and we've got lots of videos with multiple views and things like that, so you can start to get this pretty immersive, 3D-ish experience from them. But I think what you're saying is it's almost like pretend 3D. It's not really 3D in terms of how we think about 3D immersive experiences today. Can we spend a little bit more time on that? What are the shortcomings of something like a Gaussian splat, even though the results are pretty cool looking for certain things?

Tejas Kulkarni: Yeah, I mean, what I was showing on screen: that bench, for example, looks very realistic, right? And it's hard to get that level of realism in polygon space. Like you're saying, it's really the intensity of light, luminance shining from a particular point in 3D space. That way you can represent a lot of thin structures and different materials. But doing physics with this is hard, right? It's hard to get interactions right. We still haven't figured out how to get full relighting, or how to mix splats with existing assets in a dynamic way, although you could imagine Unreal or something, with Nanite and some of the other engines, having some pretty seamless integrations with these in the future. The problem right now is that it's very much a visualization technology. If you're doing product visualization, it's really useful, but it still hasn't gone into production, like on your Amazon website; you don't yet see Gaussian splats of products. I think the reason, and we have talked to many of those people, because early on we were trying to see if this was a use case to go after, is that if you can't control every single pixel of your Louis Vuitton bag or whatever, people get scared. So editability is really hard with this. One of the big challenges for these approaches is going to be: how do you make it very editable? Artists have a whole pipeline of polygons and PBR maps and shaders and all of these things, whereas in splat land, it's an implicit representation, a Gaussian with some parameters that are very hard to manipulate. But I believe that, just like you can control images now, there will be tools to control these types of representations. It'll be a new medium, basically a new game engine someday in the future. And it might get combined; there are also ways to mix and match splats with meshes in principle.

Jon Radoff: Can I ask a question? Is Gaussian splatting the same thing as photogrammetry, where someone takes a camera around an object? I've seen those as two different things, but they seem to be fundamentally different on some level.

Tejas Kulkarni: Yeah, photogrammetry is more like mesh land, right? And the reason that doesn't look realistic is that it's very hard to get meshes to look like the real world, because the real world is not a mesh; the real world is a reflection of light. It's really cool where Gaussian splats come from: if you trace the history all the way back to the first paper, it comes out of Kajiya's work in computer graphics, and the work Kajiya himself was citing. So there's a long lineage behind both of these methods.

Jon Radoff: The other difference, too, is that with photogrammetry you're really scanning the entire environment and building the whole 3D scene from all of those data points, whereas with a Gaussian splat you can use a sparse number of inputs. You don't actually need photos showing every possible view of the object; you're building the point cloud of light essentially from the few photos that you have, and you can still walk around it and see it from all these points of view. So it's neat. You can always correct me if I'm wrong about anything, Tejas, but that's my understanding of the way Gaussian splats work. You wouldn't have to scan every aspect of the object, although I think more is better; I've done some, and when I had a lot of photos, it very much improved the result.

Tejas Kulkarni: No, just one more cool thing about Gaussian splats: you can actually have a neural network predict a splat from a single image, or from text.

Jon Radoff: So sparse could be one input, and it'll do its best. Interesting. So with Gaussian splats, I guess we can start with where they're useful. You're pointing out that if you want a 3D-ish representation of an object, like product catalog views, look at the Louis Vuitton bag, as you were saying, but see it from any point of view to improve the chance that someone buys it, that's a use case. There are probably a lot of interesting use cases in static video as well, because you can do fly-throughs of scenes and various renderings that are not interactive. But if we get into the world of interactive 3D, that's where it gets harder, because that's where physics becomes the issue you were pointing at earlier. With real 3D scenes, where we've built 3D worlds with 3D meshes and various objects present in them, we've got not only the presentation of light but also things like: what happens when these objects collide? How do they respond to each other? If you're a video game, how do you handle collision detection and bullets and all the things that come up? So there are a lot of issues with an immersive, interactive 3D environment that would be maybe not impossible, but hard, to solve using Gaussian splats as the rendering system. Is that a way to think of it?

Tejas Kulkarni: Yeah, I think I agree. It's just that they look very realistic, and if we can bring that realism to the issues you just highlighted, then we have the best of both worlds. There is a pathway, I think. And the cool thing is that in a lot of the work on predicting meshes in these 3D foundation models, splats are actually used in the intermediate parts of the pipeline, as loss functions and as training targets. So even if you're not going to ship the splat itself, the splat is incredibly useful. It's very helpful, basically.
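
As a rough picture of the "point cloud of light" idea, here is a tiny sketch (scalar colors, one ray) of what a splat renderer does: each splat is a semi-transparent Gaussian blob, and a pixel is the front-to-back alpha-composite of the blobs the camera ray passes near. Real systems do this differentiably over millions of anisotropic 3D Gaussians and fit their parameters to the input photos.

```python
# Tiny illustration of the Gaussian splatting idea: a scene is a cloud of
# semi-transparent Gaussian blobs of light, and a pixel is rendered by
# alpha-compositing the blobs a camera ray passes near, front to back.
from dataclasses import dataclass
import math

@dataclass
class Splat:
    depth: float     # distance of the blob along the camera ray
    sigma: float     # spatial spread of the Gaussian
    offset: float    # how far the ray passes from the blob's center
    color: float     # scalar stand-in for an RGB color
    opacity: float   # peak opacity of the blob

def shade_ray(splats):
    color, transmittance = 0.0, 1.0
    for s in sorted(splats, key=lambda s: s.depth):       # front to back
        alpha = s.opacity * math.exp(-0.5 * (s.offset / s.sigma) ** 2)
        color += transmittance * alpha * s.color           # light added here
        transmittance *= 1.0 - alpha                       # light blocked here
    return color

# A faint blob in front, a bright one behind it.
print(shade_ray([Splat(1.0, 0.3, 0.4, 0.2, 0.5),
                 Splat(2.0, 0.5, 0.1, 1.0, 0.8)]))
```

Fitting a scene is then gradient descent on these parameters until renders match the photographs, which is also why splats make convenient loss functions and training targets for the mesh-predicting models Tejas describes.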

Jon Radoff: We've got actually a great question, unfortunately from an anonymous LinkedIn user, probably someone who's a couple of degrees separated from me, so I don't know their name. But thank you for the question about how you build an AR world. I think they're asking: how would you use AI to generate objects that would be useful in AR? Are there different considerations for something in AR versus a completely virtualized world like an MMO?

Tejas Kulkarni: Yeah, I think splats are really useful for AR, right? Because say I want to take real-world machinery, shiny things, machine parts, for education, for cars, for jet engines, whatever: things that are quite complex, and people use AR for a lot of those education and training applications. Or maybe it's for Pokemon-style games that are out there in the real world. For AR, actually, it's closer to visualization, because you're not doing deep interaction; a lot of the interaction is with the real world itself. And in the AR applications that people seem to use the most, appearance matters a lot. So it depends on what you're trying to build. If you're building a game, then you really have to think of it as a game engine, a mesh-based representation. But AR and splats can work pretty well with each other, I feel.

Jon Radoff: Right. Because you're saying in AR it's not the same level of interactivity that we'd have in a game, where you've got objects bouncing into each other and collision detection issues and physics and gravitational forces and all these things you have to deal with in an immersive game environment, or a simulation for that matter; you touched on simulation earlier. In AR, a lot of the time you're projecting images into the real world, and in the educational application you're referencing, we might just want to walk around the jet engine and see what it looks like.

Tejas Kulkarni: Yeah. One of the most magical AR experiences I've had... I mean, unfortunately the Apple device isn't being used so much, but I do have one. It's the experience where the dinosaurs are fighting and they come outside, right? There are splats that are also four-dimensional splats, so they're like movies. To generate things like that, you want to be in splat land. So I think that's going to come back at some point, when we have good enough glasses and things like that.

Jon Radoff: So earlier we were talking about the challenges of 3D; let's bring it back to 3D. We've talked a little bit about the advantages of having something 3D native, and by 3D native, you mean built from geometry, describing objects in the environment in a way where you can apply physics rules to them. You talked about the absence of enough training data to do this at scale. What else is hard about this? Or is that it? Could we solve it if we had a billion 3D models we could just download and train a model on?

Tejas Kulkarni: So, yeah, it turns out I think you don't need that much. I've been surprised how data efficient these models are. When I first started, many years ago, it just didn't work; we were fiddling around, more or less trying to get these models to do anything. But on the same data sets, the models keep getting better and better. What I mean by that is, if you send in an input image and inspect the normal maps and the geometry maps, some of the more recent architectures, without changing the data set, are astronomically better. That was not something I expected. So I don't think we're going to be data bound, which is good news, because there isn't enough data at all, and I don't think we need it. And there's an existence proof: the human brain has figured this out. So it all adds up from those principles; that's the basic core premise. The second thing, from a more AI standpoint: there was still a lot of debate, but that has died down now with vibe coding, which I think is going to be a big revolution. There was a whole feeling, for the last two years, that meshes are just a bad idea, that everything is going to be a video model. Everything's going to be like Sora: you input keyboard actions, mouse actions, and you're in a Sora world. I think that will happen; it's going to be a new form of interactive media. You have papers from DeepMind like Genie 2 that are incredible. That's a new kind of world model. The problem with that world model is that there is no code. A real world model actually also needs program synthesis, because that's where you get persistent state and all the things we have from programming. If God were building a simulator, he would definitely be a programmer, to express reality to the fullest extent. But two years ago, meshes seemed like a bad idea to the deep learning community. Nobody in my friends group was interested in meshes; that's not the end-to-end deep learning path. You should just predict pixels and videos, put in actions, and you get a game engine out. That was the feeling, because I don't think most people expected program synthesis to even work; there was no belief. Even I, a staunch believer in neuro-symbolic systems, didn't predict it would happen so quickly. So there was this surprise at how fast it moved. So: we don't need that much data to learn 3D, and program synthesis has worked just by training on massive corpuses of text and code. I don't think we need a miracle now; we have these two miracles colliding, in some sense. So now I think there's a straight shot from here.

Jon Radoff: I'm trying to get a super clear understanding of something you just said, though. Are you saying that you think we'll eventually move past meshes, and everything will just be in something like a video model?

Tejas Kulkarni: I don't think so; I think they will all mix with each other. But there's something really powerful about code as a representation, right?

Jon Radoff: Okay.

Tejas Kulkarni: In a world model, because you can do anything with it, in principle. You get massive compositionality. And it just wasn't clear that that would even be possible. So for a while the mesh-based route for building open worlds with AI seemed like a bad idea, but from everything going on right now with vibe coding, I think there are going to be creative ideas happening in the next few years that nobody predicted. At the same time, there are interactions that are hard to capture with code. A monkey taking a basketball, and there's a giraffe standing there, and they're both playing ball: how do you simulate that with code? It's hard to do that. For that, you just need good visual priors, which will come from these image models and video models.

Jon Radoff: OK, so you are saying, though, that you see a future where we kind of skip these intermediate representations in 3D and just skip to video output, based on ingesting parts of the scene and then telling you what the result is from some kind of prompt?

Tejas Kulkarni: I think the eventual thing is, in AGI land, ultimately, I don't believe that would be true, because I think AGI will figure out that it's actually efficient to recompile the game into the kinds of representations humans have figured out, but maybe even better. There are some good ideas that the community has figured out over the last two, three decades.

Jon Radoff: Because a mesh is basically code. If you think about it, a mesh is code: it's a way of representing an object in probably one of the most minimalist forms possible. You've got the geometry defined, and then you can apply a texture to it and do all these things with it. As opposed to trying to video render it, which seems like a very data-intensive process: going from a neural net, with all of the possibility space that represents, and inferring video outputs out of it.
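
Jon's minimalism point is easy to quantify with a back-of-the-envelope sketch: the hand-written unit cube below is 8 vertices and 12 triangles, roughly 60 numbers, while a single rendered 1080p frame is over six million.

```python
# A mesh is a tiny, program-like description of shape: a unit cube is just
# 8 vertices and 12 triangles, versus millions of numbers per rendered frame.
import itertools

vertices = list(itertools.product((0.0, 1.0), repeat=3))   # 8 cube corners
faces = [  # 12 triangles, two per cube face (indices into `vertices`)
    (0, 1, 3), (0, 3, 2), (4, 6, 7), (4, 7, 5),   # x = 0 and x = 1 faces
    (0, 4, 5), (0, 5, 1), (2, 3, 7), (2, 7, 6),   # y = 0 and y = 1 faces
    (0, 2, 6), (0, 6, 4), (1, 5, 7), (1, 7, 3),   # z = 0 and z = 1 faces
]
mesh_numbers = 3 * len(vertices) + 3 * len(faces)   # 60 numbers in total
frame_numbers = 1920 * 1080 * 3                     # one HD RGB frame
print(mesh_numbers, frame_numbers, frame_numbers // mesh_numbers)
```

That ratio, over 100,000 to 1 for a single frame, is the Kolmogorov-style intuition behind recompiling a world model into code-like representations rather than raw pixels.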

Tejas Kulkarni: No, it's true, I agree with you. What you're asking is probably the biggest open question in AI. This is the AGI, ASI problem: if you want to build a world model, how should you build it? That is, I think, the only interesting and biggest open problem left in AI at this point. And there are two opinions. One opinion says, no, no, it'll just be the predictions of a neural network, which will be pixels, and actions will come in. I think that will happen; I don't know if there's a limit to how far it can go. But if that happens, then I also believe the neural net is going to predict code that recompiles the predictions of that video model into something with a shorter program length. So ultimately it's going to do what it's doing right now, but better. We'll have to see, I think.

Jon Radoff: Les is commenting here from YouTube that he's a mesh artist and this sounds depressing to him. Should he be depressed?

Tejas Kulkarni: I actually was.

Tejas Kulkarni: So no, you shouldn't be.

Tejas Kulkarni: I think. And I'll give you a little bit of a hint. And I'll give you give you like a clear reason for this. Like the video model, people will do the video things. I think, you know, that will be more like VFX movie making. I think it will be like an exploration tool. Maybe even large movies will be created by that. But I think there's something fundamental about the kind of things you're doing, which is connected to coding, which is what John was also saying. I think I think you're kind of you'll be fine from the principles of Kolmogorov complexity. I think, you know, there are things. I think you're kind of. There are things that will just kind of remain in the in the computational substrates of what the kind of things you're looking at. I was I mean, last March or April, I actually felt like how you're feeling now. Right. Which is AI coding was taking off. And, you know, I'm definitely a programmer, I think at heart. And I just started feeling a bit depressed initially because I was like, you know, programming is not going to be the same anymore. You know, it's it's kind of very different now. What do I do with my life? I like company building, but I actually like programming more like as an art form. Yeah. For no reason. And I think what I've realized is is now it works so well and I don't program the same way that I do. But now I'm out of code. I've written just the last two months. It's like more than the last few years. And I think, you know, I feel much better. I actually I'm not scared at all. I think the scope of the projects that you're going to do, every project will look like GTA six. Right. Plus plus whatever. So I think I think the level of our quality of our work with this. We'll just be shifted up. And I think we would all be doing like massive projects, you know, and I think people who don't know anything, they will be creating today's games and today's experiences. But I think the people who are professionals who have spent a lot of their life, blood, sweat and tears on this, I think I think they would create monumental projects. I really do believe this.

Jon Radoff: You think people will scale out the kinds of things that an individual person is able to do, as opposed to a net decrease in human labor? I think that's what you just said.

Tejas Kulkarni: I think so. Otherwise, it will just start becoming uninteresting and fizzle out from human culture anyway.

Jon Radoff: Like you mentioned Kolmogorov complexity, and I love the fact that we're throwing out all the big terms here; I think that's what we want this AI stream to be known for. So we're talking about the basic question: what is the minimal level of complexity for a computer program or object? And to apply that term to some of the things we've talked about: the Kolmogorov complexity for a Gaussian splat system seems much higher than for many basic 3D-mesh-driven things, right? So you're saying that even if it starts out as these video systems that are very complex, able to do all the inference so that you get this perfectly, aesthetically consistent image that interacts with gravity and physics and all these other things we care about in the scene, the problem is that it won't be compressed enough. And the AGI, should we have one, would be like, okay, now let's figure out the simplification model for that, and you end up with something that's probably not too far off from a mesh system, based on 3D primitives and the physics rules we use today. We've learned those from the real world and taught computers how to interact with things that are relatively similar to the real world. Although, truth be told, a lot of graphics in games, for example, have no similarity to the real world; they're a collection of mathematical hacks, shader graph programs and such, which we've come up with because they're a lot faster than doing ray tracing. So that's an interesting thing to think about at the same time as well.

Tejas Kulkarni: kind of uh uh aligns with what you're saying and i think the another kind of way to uh think about what you're saying from a brain tool tooling and brain brain perspective is if you think of splats or video models that that's kind of the content of your dreams and your brain right if yeah you know if you're a creator if you close your eyes like the brain actually there's a cool phenomena where you can actually measure people you can give give someone uh um a cube or whatever right and ask someone to do mental rotations uh and you know if you actually and you know you can and it turns out that you can't do like from a to b you actually the brain does it in a just like a video basically internal video and you can measure this with fmri right so your brain is kind of rendering these you know pixely things in your head but if you had to as a designer go and use tools uh and blender and other things then you would have to kind of translate that imagination so a lot of i think you know it's kind of these two systems right one is the goey imagination which is not symbolic and then you have to use uh to to compress that down even further into something that is kind of manageable right uh so we're probably in the first phase of ai i think for the most part and then the second phase would come with with the computer tool use mcp servers and code and you know all these type of things uh uh kind of distilling all of that knowledge in a way that's not really accessible to the user and then you have to go back and do something uh

Tejas Kulkarni: reusable reprogrammable all of that stuff

Jon Radoff: i'm caught on this thing you were just saying about our brain and dreams because before you said that it um i think what i heard you just say is we basically have a little bit of a video buffer with multiple frames in our brain that can be that our brain can render against for doing things like imagining a 3d rotation before that i was thinking about my dreams and i was thinking i'm not sure my brain actually does that kind of stuff but i think it's going to be a little bit different than what you're doing in the future so i think that's a good point i think that's a good point i think that's a good point i think that's a good point actually does video i'm not even sure really what's happening in dreams it yeah i had assumed

Tejas Kulkarni: it was sort of like a symbolic representation that feels like video because it's the best my brain

Jon Radoff: can kind of come up with but is it really it it seems like it's a lot simpler than that but maybe i'm wrong about that because you're saying we have some we actually have a bit of a rendering

Tejas Kulkarni: I think you're right. I mean, if I have to plan a trip to Japan, I can definitely sometimes visualize it quite vividly, but it's a choice whether to do that. So I think the brain can render pixels, but it doesn't do that unless it really has to. It does have the capability: when you've had a super vivid dream, it's very realistic. So it can do it if it has to. But most of the time your brain would fry; it takes a lot more computation.

Jon Radoff: So let's dig into a little bit of what you're doing at Common Sense Machines. We've been talking about all the problems, and we gave only the barest thumbnail of what you're doing. So bring us current: what's the problem you're solving right now? If I go to your website, what can I do? And tell us anything that's coming next.

Tejas Kulkarni: Yeah. So if you go on our X channel, a cool video to start with is the one at the top, just a few days old. Here, yeah, this is a cool one to start with. We have various different workflows, but one that makes sense for a lot of people is creating characters and props for your game. You can input sketches, you can input images. I think GPT-4o, since it got released, is phenomenal at taking single pictures and generating individual parts from them. So you can input a random picture of anything, something that you want in your game, and ask it: regenerate this, keep everything the same, but put all the parts nicely on a sheet, because I want to reassemble them. That was the prompt that I used, and it gave me that image on the bottom left. From there, our tools can take that and generate what you're seeing on the right, which is an unedited output; I just put the pieces together. You can disassemble it, and you get PBR assets and meshes and things like that. So now you can take this asset, rig it, and do your game logic on top of it. What we are building are tools at the part level and the asset level, but also at the scene level, to make these kinds of processes faster and easier.
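
For intuition, the workflow Tejas walks through can be sketched as a three-stage pipeline. To be clear, the function names below are hypothetical stubs, not CSM's actual API: an image model that redraws the concept as a parts sheet, a per-part image-to-3D model, and reassembly.

```python
# Hypothetical sketch of the image -> parts sheet -> 3D assets workflow
# described above. These functions are illustrative stubs, NOT CSM's real API.
from dataclasses import dataclass

@dataclass
class Part:
    name: str
    image: bytes          # crop of one part from the generated parts sheet

def generate_parts_sheet(concept: bytes, prompt: str) -> bytes:
    # Stand-in for an image model (GPT-4o-style) redrawing the character
    # with its parts laid out separately on one sheet.
    return concept

def segment_sheet(sheet: bytes) -> list[Part]:
    # Stand-in for cropping each part off the sheet.
    return [Part("body", sheet), Part("belt", sheet)]

def image_to_3d(part: Part) -> str:
    # Stand-in for a per-part image-to-3D model; returns a mesh file path.
    return f"{part.name}.glb"

def concept_to_asset(concept: bytes) -> list[str]:
    sheet = generate_parts_sheet(
        concept,
        "Regenerate this image, keep everything the same, "
        "but lay all the parts out nicely on a sheet.",
    )
    return [image_to_3d(p) for p in segment_sheet(sheet)]

print(concept_to_asset(b"<concept art bytes>"))  # ['body.glb', 'belt.glb']
```

The prompt string is the one Tejas quotes; everything else is scaffolding to show how the stages compose.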

Jon Radoff: Super impressive that you break it down into components like that and you're able to recombine them.

Tejas Kulkarni: Yeah. And if you scroll down a little bit more, that's one more example, right? There are some manual steps involved in going from the left to the right, and you want to reduce those as much as you can. But you can get all the components in the right T-poses if you follow the right workflows, so you can rig them, get the ground plane to be what it's expected to be, and keep building your world from the asset onwards.

Jon Radoff: And Tejas, just so we're clear, so everyone's clear on what we're looking at here: the left-hand side is an illustration that was the input, and the right-hand side is the 3D model generation that came from it. Now, does it understand that those two characters in the scene are independent characters, or is it building just one big 3D mesh that's all thrown together? What's happening? How much does it understand about the logic of this representation?

Tejas Kulkarni: Yeah, so you have to guide the work; there are some manual workflows right now, it's not fully automated. Because even in the parts case that I told you about, sometimes you only want the belt, and sometimes you want the belt merged into the clothing. In this case they're actually all independent; the three characters are independent. But we have different workflows for that: you could do everything merged, or you could do a parts-based asset workflow. Depending on which workflow route you go down, you follow certain steps and end up at the right level of decomposition. But scene assembly, disassembly, and part assembly is not a fully solved problem. We are starting to go beyond single assets to parts of single assets, and from there toward worlds. But if you scroll down a little bit, you'll see... go down a bit more, a bit more.

Jon Radoff: And everybody, just go to the Common Sense Machines X page and look at this stuff yourself, because their demos are super cool and I think you'll learn a lot.

Tejas Kulkarni: Yeah, keep going down, just a little bit more; I want to get to the scene level. If you keep scrolling, in a second you might find it... keep scrolling, keep scrolling. Oh yeah, this one. So this is also a scene that's created using our tools. It's a very complex scene; this is kind of the grand challenge, right? If you expand it, you'll see why it's complicated. There's a lot going on here: there are hundreds of assets, and all the assets, by the way, are from our product. Even the text, we have tools to get that right, and you can do quick touch-ups: if a logo is messed up, you can fix it, things like that. But the important thing here is that the placement was not automated. We don't have the technology yet to get to this level of scene. Once we get to this level of scene, we basically have a 3D AGI in existence, and then we would be able to do pretty much unimaginable things. So we're trying to work towards that. We are at the small mini-world level right now, a few parts, a few things, but the goal is to get to this level, essentially.

Jon Radoff: Now, I just saw something fall off the shelf there, so we've got some of that. Because before I saw that thing fall off the shelf, I was thinking: what I want is for all of those distinct objects in the 3D scene to have their own representation, so that the car can crash into the shop from behind and everything knocks over and follows the normal real-world physics we're used to. So how close are we to that? Is that the 3D AGI?

Tejas Kulkarni: Yeah, I mean, there will be two steps to that. The hardest step is to have this agentic scene builder that can create this type of representation, where everything is its own component. That's the harder part, because once you have done that, putting mass and gravity and other properties on these assets becomes relatively possible. It's not a solved problem, but at some basic level it will be possible for gaming applications with visual LLMs: you could ask the thing, oh, what's the mass of this bag of chips, and stuff like this, and you can probably get pretty far with that. And then a step after that: if a dragon comes in and flaps its wings, how do you model the interaction of the dragon's wings? That's slightly different from modeling rigid physics, because that's about what's happening inside the dragon's body. That's where these models will have to be tied to video models in a general way, and that hasn't happened. Right now people are doing rig-based systems, but those will evolve into more video-diffusion-based systems.
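
A toy sketch of that "relatively possible" second step: once a scene is decomposed into independent assets, attaching basic dynamics is mostly bookkeeping. The mass values here are placeholders for what a visual LLM might estimate when asked about each object.

```python
# Toy of attaching basic physics to decomposed scene assets. Masses are
# placeholders for estimates you might get by asking a visual LLM per object.
from dataclasses import dataclass

@dataclass
class Asset:
    name: str
    mass_kg: float        # e.g. a visual LLM's guess for this object
    height_m: float
    velocity: float = 0.0

def gravity_step(assets, dt=1 / 60, g=9.81):
    for a in assets:
        a.velocity -= g * dt
        a.height_m = max(0.0, a.height_m + a.velocity * dt)
        if a.height_m == 0.0:
            a.velocity = 0.0          # crude ground contact, no bounce

shelf = [Asset("bag of chips", 0.2, 1.5), Asset("bottle", 1.0, 1.2)]
for _ in range(240):                  # simulate four seconds at 60 fps
    gravity_step(shelf)
print([(a.name, round(a.height_m, 2)) for a in shelf])
```

The dragon-wing case Tejas contrasts with this is exactly what such rigid-body bookkeeping can't express, which is why he expects video-diffusion-based animation to take over there.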

Jon Radoff: Well, ultimately you're even dealing with things like fluid dynamics there, because with the dragon's wings you're moving air. So what is the result of air turbulence and air currents, and how does that affect the scene and the objects in it, according to things like their air resistance and their weight? So, we've got a whole bunch of questions coming in from the online audience; let's take a moment to look at some of these. Les asked an interesting question around rigging. You mentioned a little while ago, when we were looking at the 3D character, that you would rig it yourself. What about auto-rigging as just yet another step that would jump into this?

Tejas Kulkarni: I mean, especially with the way image models are now, automatically placing assets in the right orientations from any input is not a fully solved problem, but just in the last week it's gotten dramatically better. So I would say auto-rigging is very much possible now, especially if you're doing vibe games and stuff. It's a little bit more complicated than that, but someone should start trying it ASAP; if I had a little more time, I would, because the tools are there now. You could create a lot of different embodiments, bodies, and start doing auto-rigging. There's a paper, RigNet, which came out some time ago, and there are some other startups and companies specializing in that type of technology. We have been wanting to add it; it's just that the models were not giving us good orientations all the time. But now some of that is much better.

Tejas Kulkarni: The question is, what do you recommend for people who are not coders, but want to be able to do work like this?

Jon Radoff: I think one answer is the URL is CSM.AI. Like you're not targeting coders who are programming this stuff. It's a user interface with prompting and things like that. And you can supply an image and it works out the 3D model. So I think that's a good way to start. Yeah. I think that's a good way to start.

Tejas Kulkarni: I think that's a good way to start.

Jon Radoff: Yeah. I think that's a good way to start. Yeah. I think that's a good way to start. Yeah. I think that's a good way to start.

Tejas Kulkarni: Yeah.

Jon Radoff: I think that's a good way to start. Yeah. I think that's a good way to start.

Tejas Kulkarni: Yeah.

Jon Radoff: I think that's a good way to start. And then the question is, I think that's a good way to start. Because I think that's kind of like the, there was an earlier user, one of your dialed-in person, who asked the question like, should I be concerned as an artist? I think as a programmer, I should be more concerned, honestly.

Tejas Kulkarni: Because the whole trend of what's happening here, even with the vibe gaming stuff that you're seeing, is that if you're a creator and an artist, like the mesh artist from earlier, you should not be depressed. You should actually be happy, because you can now probably create the content much better than, for example, I could, but you'll also be able to do the programming bit a lot better. So you are now in control of where the future of this interactive medium goes. You don't need people like me, essentially, in the same way that you used to. Programmers are of course going to be needed, but you can now do a lot of the things that you would have asked someone else to do and spent a lot of time on. So I think this is going to be purely additive. We're in this kind of local minimum of subculture feeling: when something new comes, it's scary. But I believe we're going to pass that stage soon.

Jon Radoff: Let's take this question from Liz, who's saying: well, it saves us time, but then we make less income overall; would love to hear your comment. I don't buy that business case. I think here's the other way to look at it, Liz: the value of your time has the potential to go up significantly with all of these tools. I already find that happening for myself, even just using things like ChatGPT's deep research, in how much I'm able to learn about a topic. I don't necessarily consider ChatGPT to be the ultimate teacher, and I have to go look into things myself, but deep research is a button I press now to get a roadmap for learning a whole bunch of stuff, and then I go off and learn the next level down on my own rather than depend on it. It's an accelerant, and it saves me a lot of time. I feel like that paradigm plays out across a whole lot of things. So if the value of your time goes up, then economically speaking, that should translate into even greater income.

Tejas Kulkarni: Yeah, I agree with Jon. And definitely, you can email me at TK at CSM.ai; I would love to hear from many of you. Just to add one small thing to what Jon said: look at YouTube. Many people, me included, can shoot a video, but that doesn't mean we can shoot a good video, and the economy of YouTube has still transformed so many people's lives. You want gaming and 3D to get there; it's such a small economy right now, and you want all of us to get to a point where we're at least at the YouTube economy level. And then taste matters, right? Taste, creative direction. Those are the things that are going to matter a lot, because if everyone can generate anything at some level of realism or engagement, that gets normalized. So if you actually have creativity of any sort, you're going to be able to do a lot with it.

Oscar: Oh, sorry, no, I apologize. Yeah, Jon, we're up to over 1,030 people joining us. But I guess... I run a small, well, it's not small, AI-generated art group; it's about two hundred thousand members now, and I deal with about five hundred to a thousand submissions per day. I have to vet them, because, you know, user-generated content. But an argument that comes up a lot is: is AI-generated art, art? My contention is absolutely. You know, I don't go and grind up flowers to make paint anymore. The creativity still comes from a source and an inspiration. I don't think it's fair to claim that I drew this pixel by pixel, but I can also say that this does not exist without my hand on it in some form or another. You're someone who is neck-deep in this argument, I'm sure, but also exploring it technically as well as creatively. What is your opinion: is AI-generated art, art?

Tejas Kulkarni: Yeah, that's a great question. So I actually play harmonium, which is like an accordion; I really like playing that with a group of people, and it's one of my favorite things to do. So I'm a bit of an artist at heart, right? Not a very good one. And I really understand, when people bring up the Ghibli images, that for a painter it's not about the image itself; it's the process of actually doing it, the physical things and the life experiences that go on underneath all of that. That will never go away. I totally hear what the whole problem is about, but my feeling is that this is inevitable: there's a new medium forming, and it's an inevitable medium. I think it will lift those of us who practice different forms of art to find the next level of niche, whatever that is, and there's a lot to do there. So I do hear the concerns. Being in front of a computer for many, many hours, for many programmers, that's an art form, literally the meditative space of sitting in front of a computer for hours on end and producing a beautiful digital thing. That has been art for many people. For other people, playing an instrument in person is better than a screen, and that's OK. There's just more art now; there's just more of it, essentially. So I don't think the concerns will go away, and there will be some cultural back and forth on this, because they are different things at the end of the day. But they don't conflict with each other; they're just different. And ask whether you're doing it for your ego: if I were playing harmonium to show how good I am, versus because I love doing it, right? It's for the love that you do the art. So why be concerned? If you're learning it for your own art, then love it for your own sake; that's the true art. I feel like there are a lot of egos also at play here.

Jon Radoff: Absolutely. Yeah, I've written about this topic a lot myself. A few years ago, when generative art started getting popular, I wrote an article called The Work of Art in the Age of Generative AI, and I talked about exactly that: what is art, anyway? It's one of those things people just question, and the problem is that everybody has their own definition of art. I like Rick Rubin's take. Rick Rubin wrote this book on creativity that everybody should check out if you're really interested in creativity, and he says: well, what is art? It's really just an agreement; people agree that something gets to be declared art. That's the easiest way to say that something is art. The thing you put up on your wall in a frame, because you got it in an art gallery, and other people visit your house and nod and say, yes, that is art. Now it's art just by social agreement. So anything could fit that, including things that are generated by computers.

Jon Radoff: It also brings up this question of what is creativity. Sometimes people think of it as something mystical, as if inspiration comes from the void into your brain. But I think creativity is simpler than that. I think it's a search through a big solution set, and creativity is just being good at the search. If you're really good at the search, you're able to come up with better solutions faster, but you're really just probing a whole massively multivariable space until you come up with things that are useful. That's how I think of it. I don't think creativity is mystical; I think it's actually mostly mathematical.

Tejas Kulkarni: It's combinatorial. I mean, yeah, there's a major portion of me that agrees with you. But there's a ten percent side of me that says being an artist is like a spiritual practice. It's almost like going to the monastery, shedding some identity that you had and creating a new one, something like that. But what you're saying makes sense. If it's all about shedding your identity to create a new one, and the art form is an expression of that, then the output of the art form is almost like the trace of memories you leave as you walk around in life. And if you're doing it with that intention, the pure art, there's not much for you to worry about, because you wouldn't want to create what the AI creates, right? It would be kind of boring as an artist. You would want to go beyond that in some sense.

Jon Radoff: Yeah. Or take the musical example: I play piano. There are millions and millions of people who play piano at an expert level, way beyond anything I'll do in my life, but I can competently play the things I choose to play, and I do it because I enjoy it. I enjoy the act of channeling that art through myself, reinterpreting it, and making it in real time. So that joy isn't taken away by the fact that a computer might be able to do it in the future. Actually, computers already do: if you just want a computer to play music for you, that's been a solved problem for decades; computers can generate synthesized music quite competently right now. Now we're having them create things as well, or play improvisationally, so people are starting to get a little upset about it. But that doesn't take away my own personal experience of expression, any more than the fact that there are commercial musicians out there making tons and tons of money takes away from the fact that I can also enjoy the act of creativity or expression through music. But I guess it all comes down to the controversial area: there's a whole long list of people out there who earn a living from this. And when people feel like their ability to earn a living is threatened, they get upset. It's not so much about whether it's really human art; it's about whether I get to make a living doing my art in the same way that I always have. And clearly there are big questions remaining there.

Tejas Kulkarni: No, it's true.

Jon Radoff: Tejas, this has been a really fun conversation. I would love to have you back any time, frankly. You have so many ideas and such a deep background in all of this stuff. And this was episode one, on April 1st, not an April Fool's joke. We covered a lot of things. I hope everyone will go back and check out the X feed from Common Sense Machines, or just go to CSM.ai. You'll see a lot of really compelling examples there of how you can generate 3D art from things like a single image. It's very, very cool stuff. So check that out, and check out our next programs. Tomorrow, Wednesday, we've got a conversation on Web3 gaming again. And then on Thursday, we've got the AI-in-gaming discussion with Wanderheart and Nilo. I hope you'll come back for both of those. We're up to three programs a week now, because there's been so much interest. Here at the end of the hour we're hitting a peak of 1,230 people. If we stuck around talking to Tejas for another hour, we'd probably get to 2,000 or something, but we're keeping these shows tight: a one-hour program where we deliver a lot of value, and there were some great questions as well. So Tejas, let me thank you for being here. And until next time, thanks to the audience for taking part of your day to be with us.

Tejas Kulkarni: Thank you so much for listening to me, and for this wonderful opportunity to connect with so many people. Thank you. Thank you.

Jon Radoff: Thank you, Tejas. Thank you, Oscar, for keeping us on track here. All right. Until next time, everybody. Take care. Thank you.