Google researchers recreate first-person shooter Doom using AI
The recreated version of Doom runs at 20fps

Aug 29, 2024
12:27 pm

What's the story

Google Research scientists have unveiled GameNGen, a game engine powered entirely by a neural network. Rather than running the game's original code, the model generates the gameplay of the iconic video game Doom frame by frame in response to player input. The team behind the project, which includes Dani Valevski, Yaniv Leviathan, Moab Arar, and Shlomi Fruchter, built GameNGen on Stable Diffusion.

Gameplay mechanics

Unique approach to game generation

GameNGen's version of Doom is not just a visual simulation, but a fully playable game with consistent logic. Players can turn, strafe, and fire weapons, and they take appropriate damage from enemies and environmental hazards. The engine builds the level around the player in real time as they navigate through it and keeps a mostly accurate count of the player's pistol ammunition. According to the study, the game runs at 20 frames per second (fps).

Data acquisition

AI training and data collection

To gather the training data GameNGen needed to recreate Doom levels accurately, Google's team trained an AI agent to play Doom at various difficulty levels, simulating players of different skill. The agent was rewarded for actions like collecting power-ups and completing levels, and penalized when the player took damage or died. Recording the agent's play generated hundreds of hours of visual training data for the GameNGen model to learn from and replicate.
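The reward scheme described above can be sketched as a simple per-step function. This is only an illustration of the idea: the `StepEvents` fields, the function name, and every numeric weight are assumptions made for this example, not values reported by the researchers.

```python
from dataclasses import dataclass


@dataclass
class StepEvents:
    """What happened during one game step (hypothetical fields)."""
    powerups_collected: int = 0
    level_completed: bool = False
    damage_taken: float = 0.0
    player_died: bool = False


# Illustrative weights only; these are NOT taken from the study.
POWERUP_REWARD = 10.0
LEVEL_REWARD = 1000.0
DAMAGE_PENALTY = 1.0
DEATH_PENALTY = 500.0


def step_reward(ev: StepEvents) -> float:
    """Reward progress (power-ups, finished levels); penalize damage and death."""
    reward = POWERUP_REWARD * ev.powerups_collected
    if ev.level_completed:
        reward += LEVEL_REWARD
    reward -= DAMAGE_PENALTY * ev.damage_taken
    if ev.player_died:
        reward -= DEATH_PENALTY
    return reward
```

An agent trained against a signal of this shape would be pushed to explore levels and survive, which is what makes its recorded play useful as training footage.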

Technical challenges

Overcoming Stable Diffusion's limitations in GameNGen

Stable Diffusion, a widely used generative AI model that creates pictures from image or text prompts, was integral to the project. However, when used to generate a sequence of frames it has two main weaknesses: a lack of consistency between frames and a gradual decline in visual quality over time. To overcome these issues, the Google Research team trained the model to generate each new frame from an extended sequence of user inputs and preceding frames rather than from a single prompt image. They also added Gaussian noise to these context frames during training, so that the model learns to correct errors carried over from earlier frames.
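The conditioning step can be illustrated with a small, framework-free sketch of how one training example might be assembled. Frames are represented here as flat lists of pixel values, and the function names, the `context_len` and `max_sigma` parameters, and the uniform sampling of the noise level are all assumptions for the sake of the example rather than details confirmed by the article.

```python
import random


def add_gaussian_noise(frame, sigma, rng):
    """Return a copy of the frame with zero-mean Gaussian noise on every pixel."""
    return [pixel + rng.gauss(0.0, sigma) for pixel in frame]


def build_training_context(frames, actions, context_len, max_sigma, rng):
    """Assemble one training example from a recorded trajectory.

    frames[i] is the image shown at step i and actions[i] is the input the
    player gave at that step (an assumed alignment). The last frame is the
    target; the context_len frames before it, with their actions, are the
    conditioning context. Returns
    (noisy_context_frames, context_actions, sigma, target_frame).
    """
    if len(frames) != len(actions):
        raise ValueError("frames and actions must have the same length")
    if len(frames) <= context_len:
        raise ValueError("need more frames than the context length")

    # Sample one noise level per example (an assumption of this sketch).
    sigma = rng.uniform(0.0, max_sigma)

    past_frames = frames[-context_len - 1:-1]
    past_actions = actions[-context_len - 1:-1]
    target_frame = frames[-1]

    # Corrupt only the context; the target frame stays clean.
    noisy_context = [add_gaussian_noise(f, sigma, rng) for f in past_frames]
    return noisy_context, past_actions, sigma, target_frame
```

Because the model only ever sees slightly corrupted history during training, it has to learn to produce a clean next frame from imperfect inputs, which is the situation it faces at play time when its own earlier outputs become the context.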

Future prospects

GameNGen's visual stability and future potential

A separate but connected neural network was used to correct the model's context frames, keeping the image self-correcting and visually stable over long periods. Despite some imperfections in the current GameNGen examples, such as random blobs and blurs appearing on-screen or enemies turning into blurry mounds after they die, the technology shows promise for future AI game development.