Imagine if you could provide a text, image, or video input and the AI would generate a complete environment based on it. Not just a 3D model, but a living, breathing digital world.
Check out Blockade Labs: they are getting pretty good, generating a 360° panorama that you can also hook up to a VR headset and explore, if you have one.
Others are advancing on the NeRF side of things, which focuses mostly on transforming real-world imagery into a 3D model. I hope the team at Luma AI will consider adding the ability to create 3D models from, say, a set of Stable Diffusion images rotated and translated along the xyz axes, which is the kind of multi-view input a successful NeRF scan requires. Only this time with non-real-world, generated input.
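To make the "rotated and translated along the xyz axes" idea concrete, here is a minimal sketch of the kind of camera-pose coverage a NeRF pipeline expects from its input images: views evenly spaced on a circle, all looking at the subject. This is my own illustrative code, not Luma AI's API; the function name and parameters are assumptions.

```python
import numpy as np

def orbit_poses(n_views: int = 8, radius: float = 3.0, height: float = 0.5):
    """Hypothetical sketch: camera-to-world poses evenly spaced on a
    circle around the origin, the multi-view coverage a NeRF-style
    reconstruction needs from its input images."""
    poses = []
    for i in range(n_views):
        theta = 2 * np.pi * i / n_views
        eye = np.array([radius * np.cos(theta), radius * np.sin(theta), height])
        # Build a look-at frame: the camera points at the origin.
        forward = -eye / np.linalg.norm(eye)
        right = np.cross(forward, np.array([0.0, 0.0, 1.0]))
        right /= np.linalg.norm(right)
        up = np.cross(right, forward)
        c2w = np.eye(4)
        # Columns: right, up, -forward (-z is the viewing direction,
        # the OpenGL convention common in NeRF codebases).
        c2w[:3, 0], c2w[:3, 1], c2w[:3, 2] = right, up, -forward
        c2w[:3, 3] = eye
        poses.append(c2w)
    return np.stack(poses)

poses = orbit_poses(8)
print(poses.shape)  # (8, 4, 4)
```

Feeding one Stable Diffusion render per pose (with consistent lighting and subject) would, in principle, give a NeRF tool the overlap it needs, though keeping generated views consistent across angles is the hard part.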
Let's talk about our goals here:
The user can provide either a written description or input files in the form of images or video. Audio input is not part of our use case, which focuses only on 3D modelling a live, animated environment.
Example: transforming a 10-second movie scene clip into a realistic, explorable, and highly customizable 4K game that you design.
File Input (.jpg, .mp4, …)
Text Input (optional)