Environments
To start with, 3D environments for movies don't have to be nearly as complete as 3D environments for video games. In movies, animators only have to worry about what will actually appear on-screen, within the camera's field of view; this may require modeling a full "room", or just the side of it that the camera sees. Also, because the result is a non-interactive video image, they don't have to build many separate environmental objects for a viewer to interact with. In 3D video games, however, environments must hold up through a full 360 degrees; very rarely will you play a game where the overall camera view or a character's first-person view can't turn through a full range of motion. Can you imagine pivoting your character only to face blank, black space? It would completely ruin the feeling of being immersed in the game.
In many cases, environments also have to be interconnected (to a certain extent). If you're traveling from room to room in a gameplay environment where you can see from one room into the next, that next room had better be there. While this is true in some ways in movies as well (if an open door is part of an environment, there should be something visible on the other side of the door), there are ways to get around it in a movie environment; a static image can be placed behind the door to create the illusion that there's something beyond it. That won't work in a video game, however, because of the freedom of motion allowed; a flat image wouldn't be believable from every angle, so it makes more sense to just keep building the interconnected environment for as far as is necessary.
Limitations on Available Console Power
Games also have a limitation that movies rarely face: the processing power available to the rendering engine on the game console. You may not realize this, but as you move through a game, the rendering engine is constantly creating output based on the angle of the camera following you, the character data, and the environment factors included in the game. It's almost like rendering digital output to video when creating an animation, but with one crucial difference: the output has to keep up with your input, with each new frame rendered as fast as you change what's happening through the controller. This is why many games use several levels of model detail.
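To make the real-time constraint concrete, here is a minimal sketch in Python of how a game might pick a level of model detail so that every frame fits a fixed time budget. It is an illustration only, not how any particular console or engine works: the 30 frames-per-second target, the per-character costs, and the names pick_detail_level and DETAIL_LEVELS are all assumptions made up for this example.

```python
# Hypothetical sketch: every name and number here is illustrative only.

TARGET_FPS = 30                          # assumed target frame rate
FRAME_BUDGET_MS = 1000.0 / TARGET_FPS    # about 33.3 ms to draw each frame

# (label, assumed cost in ms to draw one character), most detailed first
DETAIL_LEVELS = [
    ("high", 12.0),
    ("medium", 4.0),
    ("low", 1.0),
]


def pick_detail_level(character_count, environment_cost_ms,
                      budget_ms=FRAME_BUDGET_MS):
    """Return the most detailed level whose total cost fits the frame budget."""
    for label, cost_per_character_ms in DETAIL_LEVELS:
        total_ms = environment_cost_ms + character_count * cost_per_character_ms
        if total_ms <= budget_ms:
            return label
    # Nothing fits: fall back to the cheapest level rather than skip the frame.
    return DETAIL_LEVELS[-1][0]


if __name__ == "__main__":
    for count in (1, 3, 10):
        print(count, "characters ->", pick_detail_level(count, 10.0))
```

With these made-up costs, a single character can be drawn at high detail, but a crowd of ten has to drop to low detail before the frame fits the budget. An offline movie render has no such budget, so it never has to make this trade.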
To use the Final Fantasy games (VII and up, for the PSX and PS2) as an example: there are generally three levels of model detail in Final Fantasy games, from the least detailed, highly pixelated "super-deformed" (small, child-sized, over-toonified) models used on world maps, to the more intricate, normal-sized, but still low-quality models used in combat scenes, and finally to the most detailed, smooth models used in the non-interactive movie scenes. The playable models are less detailed because the gaming console's rendering engine just doesn't have the kind of power it takes to render full detail on characters and environments frame by frame while responding to split-second, unpredictable changes and adjustments. This limitation doesn't apply to movies; while at times fully detailed movie models will be "toned down" a little to avoid logging 200 hours of render time for five minutes of animation, on average movie animators are working with a more open time frame and can afford to render one painstaking frame at a time to produce the final result.
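The gap between the two time frames can be worked out with some quick arithmetic. The short Python sketch below takes the figure of 200 hours of rendering for five minutes of animation and compares it with a game's per-frame time. The frame rates (24 frames per second for film, 30 for the game) are assumptions chosen for illustration, and seconds_per_frame is a hypothetical helper name.

```python
# Rough arithmetic only; the frame rates below are assumed, not from any spec.

FILM_FPS = 24    # assumed frame rate for the movie
GAME_FPS = 30    # assumed frame rate for the game


def seconds_per_frame(render_hours, animation_minutes, fps=FILM_FPS):
    """Average render time per frame for an offline (movie) render."""
    frames = animation_minutes * 60 * fps
    return render_hours * 3600 / frames


offline_seconds = seconds_per_frame(200, 5)   # 720000 s spread over 7200 frames
realtime_seconds = 1 / GAME_FPS               # time a game has for one frame
ratio = offline_seconds * GAME_FPS            # how much longer the movie can take

if __name__ == "__main__":
    print("movie:", offline_seconds, "seconds per frame")
    print("game: ", round(realtime_seconds, 4), "seconds per frame")
    print("ratio:", ratio)
```

Under these assumptions the movie render averages 100 seconds per frame, while the game has about a thirtieth of a second, a difference of roughly 3000 times.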
Use of Sound and Sound Quality
The real-time rendering constraints are also why most games before the next-gen consoles avoided adding sound other than musical backdrops in repeating MIDI or WAV format; adding voices to characters, other than generic "beast" sounds, would double or triple the strain on the console and slow the game down even further. Again, this limitation does not apply to movies, where speech and varied sound effects are necessary for the overall effect; because movies aren't being rendered frame by frame as you watch, there's no need to cut corners on the audio.

