First-Person Shooter (FPS)
Weeks 21-22 | The leap into 3D — where the camera becomes the player.
"You don't control a character in an FPS. You ARE the character. The camera isn't showing you the world — it is your eyes."
Prerequisites
| Module | What You Used From It |
|---|---|
| Module 01 - Pong | Game loop, input handling, collision detection fundamentals |
| Module 03 - Top-Down Shooter | Vector math, projectile systems, enemy spawning, aiming mechanics |
Week 1: History & Design Theory
The Origin
John Carmack's Wolfenstein 3D (id Software, 1992) made 3D real-time gameplay possible on consumer hardware through a raycasting trick: the game world was actually a 2D grid, but by casting rays from the player's position and calculating wall distances, it rendered a convincing first-person perspective. The walls were always the same height. There was no looking up or down. It was an illusion — and it was enough to create an entirely new genre.
How the Genre Evolved
Doom (id Software, 1993) shattered Wolfenstein's limitations. Variable-height floors and ceilings, varied light levels, non-orthogonal walls, and a new rendering engine created spaces that felt genuinely three-dimensional. Just as importantly, the WAD file system let anyone create and share custom levels and mods. Doom did not just define the FPS; it kick-started the modding community that grew up around the genre.
Half-Life (Valve, 1998) asked: what if an FPS told a story without ever taking control away from the player? No cutscenes. No text crawls. Every narrative beat happened in real-time while you held the controls. Half-Life proved the first-person camera was not just a combat interface but a storytelling device.
Halo: Combat Evolved (Bungie, 2001) tackled a problem many considered unsolvable: making FPS controls feel good on a gamepad. Its twin-stick layout, combined with generous aim assist and the "30 seconds of fun" design philosophy, helped make the console FPS a mainstream genre.
Modern competitive FPS design (Overwatch, Valorant) layers character-ability systems and team composition strategy on top of the mechanical aiming skill that has been the genre's core since Wolfenstein.
What Makes FPS Games Great
The FPS is the most embodied genre in gaming. Because the camera is the player's eyes, every design decision — field of view, head bob, weapon sway, recoil — directly affects how the player physically feels. The genre's depth comes from the tension between precision and chaos. Aiming is a fine-motor skill, but the game is constantly disrupting your aim with movement, threats from multiple directions, and time pressure.
The Essential Mechanic
Aiming and shooting in 3D space from a first-person perspective — the player IS the camera. Every shot is cast from the center of the player's view into the world, making the act of looking and the act of aiming the same thing.
Week 2: Build the MVP
What You're Building
A first-person 3D environment where you can move, look around, and shoot at targets or enemies. This is a shooting gallery or simple arena — not a full campaign. The goal is to internalize 3D space, camera control, and raycasting for interaction.
Core Concepts (Must Implement)
1. 3D Coordinate Systems and Transforms
In 2D, you worked with (x, y). Now every object has a position (x, y, z), a rotation (pitch, yaw, roll), and a scale — collectively called a transform. Transforms are hierarchical: a gun attached to a hand inherits the hand's position and rotation.
transform:
    position = (x, y, z)
    rotation = (pitch, yaw, roll) // or a quaternion
    scale = (sx, sy, sz)
// Child transforms are relative to their parent (scale omitted for brevity):
gun.worldPosition = hand.worldPosition + hand.worldRotation * gun.localPosition
Why it matters: Every object in a 3D game exists as a transform. This is the atomic unit of 3D game development.
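To make the parent/child rule concrete, here is a minimal Python sketch of a transform hierarchy. It is a simplification under stated assumptions: rotation is yaw-only (about the Y axis) and scale is uniform, so the maths fits in a few lines. The names `Transform`, `rotate_yaw`, and `world_position` are illustrative and not taken from any engine.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def rotate_yaw(v: Vec3, yaw_deg: float) -> Vec3:
    """Rotate v about the Y (up) axis by yaw_deg degrees."""
    a = math.radians(yaw_deg)
    c, s = math.cos(a), math.sin(a)
    x, y, z = v
    return (c * x + s * z, y, -s * x + c * z)


@dataclass
class Transform:
    position: Vec3 = (0.0, 0.0, 0.0)  # local: relative to the parent
    yaw_deg: float = 0.0              # assumption: yaw-only rotation
    scale: float = 1.0                # assumption: uniform scale
    parent: Optional["Transform"] = None

    def world_yaw(self) -> float:
        return self.yaw_deg + (self.parent.world_yaw() if self.parent else 0.0)

    def world_scale(self) -> float:
        return self.scale * (self.parent.world_scale() if self.parent else 1.0)

    def world_position(self) -> Vec3:
        if self.parent is None:
            return self.position
        # Scale, then rotate, then translate by the parent's world transform.
        s = self.parent.world_scale()
        scaled = (self.position[0] * s, self.position[1] * s, self.position[2] * s)
        rx, ry, rz = rotate_yaw(scaled, self.parent.world_yaw())
        px, py, pz = self.parent.world_position()
        return (px + rx, py + ry, pz + rz)


hand = Transform(position=(1.0, 0.0, 0.0), yaw_deg=90.0)
gun = Transform(position=(0.0, 0.0, 1.0), parent=hand)
print(gun.world_position())  # roughly (2, 0, 0), up to floating-point rounding
```

Changing `hand.yaw_deg` moves the gun in world space even though `gun.position` never changes, which is the point of storing child transforms relative to their parent.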
2. First-Person Camera
The camera uses a perspective projection: objects farther away appear smaller, creating depth. The player controls pitch (looking up/down) and yaw (looking left/right) with the mouse. Pitch must be clamped to prevent the camera from flipping upside down.
// Mouse-look each frame (angles in degrees):
yaw += mouseX * sensitivity
pitch += mouseY * sensitivity // negate if up/down feels inverted on your platform
pitch = clamp(pitch, -89, +89) // stop the view flipping over at straight up/down
camera.rotation = quaternion_from_euler(pitch, yaw, 0)
Field of view (FOV) has a dramatic effect on game feel: narrow FOV (60 degrees) feels zoomed-in and claustrophobic; wide FOV (100+ degrees) gives peripheral awareness but distorts edges. Many PC FPS games default to roughly 80-90 degrees.
Why it matters: The first-person camera IS the player's interface with the game. Understanding perspective projection explains why objects scale with distance and why FOV changes how the game feels.
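The mouse-look update can be sketched in Python as two small functions: one accumulates and clamps the angles, the other turns them into a view direction. The axis convention (yaw 0 looks down -Z, +Y is up), the function names, and the default `sensitivity` are assumptions for illustration; engines differ on all three.

```python
import math


def apply_mouse_look(yaw_deg, pitch_deg, mouse_dx, mouse_dy,
                     sensitivity=0.1, pitch_limit=89.0):
    """Return updated (yaw, pitch) in degrees after one frame of mouse input."""
    yaw_deg = (yaw_deg + mouse_dx * sensitivity) % 360.0
    pitch_deg = pitch_deg + mouse_dy * sensitivity
    # Clamp just short of straight up/down so the view never flips over.
    pitch_deg = max(-pitch_limit, min(pitch_limit, pitch_deg))
    return yaw_deg, pitch_deg


def forward_from_angles(yaw_deg, pitch_deg):
    """Unit view direction for the given yaw and pitch.

    Assumed convention: yaw 0 looks down -Z, positive yaw turns right,
    positive pitch looks up, +Y is up.
    """
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (
        math.sin(yaw) * math.cos(pitch),
        math.sin(pitch),
        -math.cos(yaw) * math.cos(pitch),
    )


yaw, pitch = apply_mouse_look(0.0, 0.0, mouse_dx=900.0, mouse_dy=5000.0)
print(yaw, pitch)                       # pitch is held at the 89 degree limit
print(forward_from_angles(yaw, pitch))  # mostly upward, turned to the right
```

Note that mouse deltas are not multiplied by the frame time here: a mouse delta is already "how far the hand moved this frame", so scaling it by `dt` would make sensitivity depend on frame rate.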
3. 3D Character Controller
Movement must be relative to the camera's facing direction, not the world axes.
// Get camera's forward and right vectors, flattened to ground plane:
forward = camera.forward
forward.y = 0
forward = normalize(forward)
right = camera.right
right.y = 0
right = normalize(right)
moveDir = forward * inputVertical + right * inputHorizontal
if length(moveDir) > 0: // no input gives a zero vector, which cannot be normalized
    moveDir = normalize(moveDir) * moveSpeed
    player.position += moveDir * dt
Why it matters: Camera-relative movement is what makes 3D controls feel intuitive. If "forward" always meant world-north regardless of where the player was looking, the controls would feel broken.
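As a rough Python sketch of camera-relative movement, the function below derives the flattened forward and right vectors directly from the camera's yaw, then normalizes the combined input so diagonal movement is no faster than straight movement. The axis convention (yaw 0 faces -Z, +X is right) and the names `move_direction` and `step` are assumptions made for this example.

```python
import math


def move_direction(yaw_deg, input_forward, input_right):
    """Unit ground-plane direction for the given input, or zero if no input."""
    yaw = math.radians(yaw_deg)
    forward = (math.sin(yaw), -math.cos(yaw))  # (x, z), already flat
    right = (math.cos(yaw), math.sin(yaw))     # (x, z)
    x = forward[0] * input_forward + right[0] * input_right
    z = forward[1] * input_forward + right[1] * input_right
    length = math.hypot(x, z)
    if length < 1e-9:
        return (0.0, 0.0, 0.0)  # no input: avoid dividing by zero
    return (x / length, 0.0, z / length)


def step(position, yaw_deg, input_forward, input_right, move_speed, dt):
    """Advance the player's position by one frame."""
    dx, dy, dz = move_direction(yaw_deg, input_forward, input_right)
    return (
        position[0] + dx * move_speed * dt,
        position[1] + dy * move_speed * dt,
        position[2] + dz * move_speed * dt,
    )


# Holding W while facing yaw 90 moves along +X, not along world "north".
print(step((0.0, 0.0, 0.0), 90.0, 1.0, 0.0, move_speed=5.0, dt=1.0))
```

Because pitch never enters the calculation, looking at the floor or ceiling does not slow the player down or push them into the ground.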
4. Raycasting for Shooting
When the player fires, cast an invisible ray from the camera's center point straight forward into the scene. Check what the ray intersects first — that is what gets hit.
// On fire input:
ray.origin = camera.position
ray.direction = camera.forward
hit = raycast(ray.origin, ray.direction, maxDistance)
if hit:
    if hit.object.hasComponent("Health"):
        hit.object.takeDamage(weaponDamage)
    spawn_impact_effect(hit.point, hit.normal)
Why it matters: Raycasting is the fundamental spatial query in 3D games. Beyond shooting, it is used for ground detection, line-of-sight checks, mouse picking, and AI perception.
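Engines provide `raycast` for you, but it helps to see what one does. The Python sketch below tests a ray against sphere-shaped targets and keeps the nearest hit. Sphere colliders, the `Target` class, and the brute-force loop over every target are assumptions chosen for brevity; a real engine would test against many collider shapes and use a spatial structure to skip most of them.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Target:
    center: Tuple[float, float, float]
    radius: float
    health: int = 100


def ray_sphere(origin, direction, center, radius):
    """Distance along the ray to the sphere, or None. direction must be unit length."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None  # the ray's line misses the sphere entirely
    root = math.sqrt(disc)
    t = -b - root  # nearer of the two intersection points
    if t < 0.0:
        t = -b + root  # origin is inside the sphere
    return t if t >= 0.0 else None


def raycast(origin, direction, targets, max_distance):
    """Return (distance, target) for the closest hit, or None."""
    best = None
    for target in targets:
        t = ray_sphere(origin, direction, target.center, target.radius)
        if t is not None and t <= max_distance and (best is None or t < best[0]):
            best = (t, target)
    return best


def fire(origin, direction, targets, damage, max_distance=100.0):
    """Apply damage to whatever the ray hits first; return that target or None."""
    hit = raycast(origin, direction, targets, max_distance)
    if hit is None:
        return None
    hit[1].health -= damage
    return hit[1]


near = Target(center=(0.0, 0.0, -5.0), radius=1.0)
far = Target(center=(0.0, 0.0, -10.0), radius=1.0)
fire((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), [far, near], damage=25)
print(near.health, far.health)  # only the nearer target is damaged
```

Keeping only the smallest distance is what makes the nearer target block the shot, regardless of the order in which targets are tested.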
5. 3D Collision and Physics
In 3D, colliders are volumes: boxes, spheres, or capsules. The player is typically a capsule. Gravity pulls downward each frame, and ground detection determines whether the player is grounded.
// Gravity:
velocity.y -= gravity * dt
player.position += velocity * dt
// Ground detection (ray from the capsule's center straight down):
groundHit = raycast(player.position, DOWN, playerHeight/2 + skinWidth)
if groundHit and velocity.y <= 0:
    isGrounded = true
    velocity.y = 0
    player.position.y = groundHit.point.y + playerHeight/2
else:
    isGrounded = false // airborne, or moving upward during a jump
Why it matters: 3D collision is the same concept as 2D AABB from Module 1, extended by one dimension.
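Here is a small Python sketch of the gravity and ground-detection loop, reduced to the vertical axis so the logic is easy to follow. It assumes a flat floor at a known height instead of a real downward raycast, and the names `Body`, `physics_step`, and `jump` plus the constants are illustrative choices, not engine API.

```python
from dataclasses import dataclass

GRAVITY = 9.81  # metres per second squared (assumed units)


@dataclass
class Body:
    y: float            # height of the capsule's center
    vy: float = 0.0     # vertical velocity
    grounded: bool = False


def physics_step(body, dt, ground_y=0.0, half_height=0.9, skin=0.02):
    """Apply gravity, integrate, then snap to the floor if the feet reach it."""
    body.vy -= GRAVITY * dt
    body.y += body.vy * dt
    feet = body.y - half_height
    # Only land while moving downward, so a jump is not cancelled on its first frame.
    if feet <= ground_y + skin and body.vy <= 0.0:
        body.y = ground_y + half_height
        body.vy = 0.0
        body.grounded = True
    else:
        body.grounded = False


def jump(body, speed=5.0):
    """Start a jump, but only from the ground."""
    if body.grounded:
        body.vy = speed
        body.grounded = False


player = Body(y=5.0)
for _ in range(300):  # five seconds at 60 frames per second
    physics_step(player, dt=1 / 60)
print(player)  # resting on the floor with zero vertical velocity
```

The `skin` margin plays the role of `skinWidth`: it lets the check succeed when the feet are a hair above the floor, which avoids flickering between grounded and airborne due to rounding.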
6. Basic Lighting
Place at least an ambient light and a directional light. Lighting transforms a flat-looking scene into one with depth, mood, and readability.
ambientLight:
    color = (0.2, 0.2, 0.3)
    intensity = 0.3
directionalLight:
    direction = normalize((-1, -1, -0.5))
    color = (1.0, 0.95, 0.8)
    intensity = 0.7
    castsShadows = true
Why it matters: Lighting is one of the biggest differences between "programmer art that looks flat" and "a scene that feels like a place."
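To show what those two lights actually contribute, the Python sketch below shades a single surface point with ambient light plus Lambert diffuse from the directional light, reusing the values from the configuration above. Lambert shading is an assumption here (the text does not name a lighting model), shadows and specular highlights are left out, and `shade` is a hypothetical helper, not an engine function.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def shade(normal, base_color, ambient, directional):
    """Ambient plus Lambert diffuse for one surface point (no shadows)."""
    n = normalize(normal)
    # The light's direction points from the light into the scene, so negate it
    # to get the direction from the surface toward the light.
    to_light = tuple(-c for c in normalize(directional["direction"]))
    n_dot_l = max(0.0, sum(a * b for a, b in zip(n, to_light)))
    return tuple(
        min(1.0, base * (amb * ambient["intensity"]
                         + lit * directional["intensity"] * n_dot_l))
        for base, amb, lit in zip(base_color, ambient["color"], directional["color"])
    )


AMBIENT = {"color": (0.2, 0.2, 0.3), "intensity": 0.3}
SUN = {"direction": (-1.0, -1.0, -0.5), "color": (1.0, 0.95, 0.8), "intensity": 0.7}

white = (1.0, 1.0, 1.0)
floor = shade((0.0, 1.0, 0.0), white, AMBIENT, SUN)     # faces up, toward the sun
ceiling = shade((0.0, -1.0, 0.0), white, AMBIENT, SUN)  # faces away: ambient only
print(floor)
print(ceiling)
```

The ceiling receives only the dim, bluish ambient term while the floor adds the warm directional term. That difference between faces is what lets the eye read the shape of a room.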
7. Level Geometry and BSP Concepts
3D levels are built from meshes, which are collections of triangles. BSP (Binary Space Partitioning), popularized in games by Doom, recursively divides space into regions so the renderer can quickly work out draw order and visibility from any given point.
// Conceptual BSP traversal: visit the side containing the player first
if player is in front of dividing plane:
    render front geometry first, then back
else:
    render back geometry first, then front
// In practice, engines use frustum culling, occlusion culling, and LOD
Why it matters: Understanding how 3D space is organized explains why level design is both an art and a technical discipline.
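The traversal rule can be written as a short recursive function. The Python sketch below walks a hand-built BSP tree and returns geometry in front-to-back order relative to the viewer. The `BSPNode` layout and the string labels standing in for polygons are assumptions for illustration, and the sketch does not cover building the tree, which is the harder half of the problem.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class BSPNode:
    # Splitting plane: all points p where dot(normal, p) == offset.
    normal: Tuple[float, float, float] = (1.0, 0.0, 0.0)
    offset: float = 0.0
    front: Optional["BSPNode"] = None
    back: Optional["BSPNode"] = None
    polygons: List[str] = field(default_factory=list)  # geometry stored at this node


def front_to_back(node, eye, out=None):
    """Collect polygons nearest-first as seen from the eye position."""
    if out is None:
        out = []
    if node is None:
        return out
    side = sum(n * e for n, e in zip(node.normal, eye)) - node.offset
    # The subtree on the viewer's side of the plane is always the nearer one.
    near, far = (node.front, node.back) if side >= 0.0 else (node.back, node.front)
    front_to_back(near, eye, out)
    out.extend(node.polygons)
    front_to_back(far, eye, out)
    return out


level = BSPNode(
    normal=(1.0, 0.0, 0.0), offset=0.0,  # the plane x = 0
    polygons=["dividing wall"],
    front=BSPNode(polygons=["east room"]),
    back=BSPNode(polygons=["west room"]),
)
print(front_to_back(level, eye=(5.0, 0.0, 0.0)))
print(front_to_back(level, eye=(-5.0, 0.0, 0.0)))
```

The same tree yields a correct ordering from any viewpoint without sorting, which is why it was attractive on early-1990s hardware. Reversing the two recursive calls gives back-to-front order instead.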
8. HUD Overlay
Game UI is rendered in screen space — fixed to the camera's output, not positioned in the 3D world.
// Screen-space UI (drawn after 3D scene):
draw_crosshair(screen.center)
draw_text("HP: " + player.health, position=(10, 10))
draw_text("Ammo: " + weapon.ammo, position=(10, 40))
Why it matters: The distinction between screen space and world space is fundamental to 3D game UI.
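The difference between the two spaces shows up clearly in code. In the Python sketch below, a crosshair position depends only on the screen size, while a world-space point (say, a marker above an enemy) has to be pushed through the perspective projection first. The camera-space convention, the vertical-FOV parameter, and both function names are assumptions for this example.

```python
import math


def project_to_screen(point_cam, fov_y_deg, width, height, near=0.1):
    """Map a camera-space point to pixel coordinates, or None if not visible.

    Assumed convention: the camera looks down -Z, +X is right, +Y is up,
    and pixel (0, 0) is the top-left corner of the screen.
    """
    x, y, z = point_cam
    depth = -z
    if depth < near:
        return None  # behind the camera or closer than the near plane
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    aspect = width / height
    ndc_x = (f / aspect) * x / depth  # perspective divide: farther means smaller
    ndc_y = f * y / depth
    if abs(ndc_x) > 1.0 or abs(ndc_y) > 1.0:
        return None  # outside the view frustum
    return ((ndc_x + 1.0) * 0.5 * width, (1.0 - ndc_y) * 0.5 * height)


def crosshair_position(width, height):
    """Screen-space HUD element: depends only on the screen size."""
    return (width / 2, height / 2)


print(crosshair_position(800, 600))
print(project_to_screen((1.0, 0.0, -5.0), 90.0, 800, 600))   # marker 5 units away
print(project_to_screen((1.0, 0.0, -10.0), 90.0, 800, 600))  # same offset, farther
print(project_to_screen((1.0, 0.0, -5.0), 110.0, 800, 600))  # same point, wider FOV
```

The farther marker lands closer to the screen center, and widening the FOV pulls the same point inward as well. Both effects come from the single division by `depth` and the `f` factor, which is also why FOV changes how fast the world appears to move.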
Stretch Goals
- Weapon model and animation — A 3D model visible in the lower-right with a fire animation.
- Enemy AI with navigation — Enemies that move toward the player using simple pathfinding.
- Sound spatialization — 3D audio where sounds have position and attenuate with distance.
- Multiple weapon types — Hitscan (raycast, instant) vs. projectile (moving object with travel time).
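For the last stretch goal, the practical difference between hitscan and projectile weapons is when the hit is resolved. The Python sketch below reduces both to one dimension (distance along the aim line) to make that timing visible. The `Target` class, the function names, and the stationary target are simplifying assumptions; a real projectile would be a moving object with its own collider.

```python
from dataclasses import dataclass


@dataclass
class Target:
    distance: float      # metres along the aim line (1D for brevity)
    radius: float = 0.5
    health: int = 100


def fire_hitscan(target, damage, max_range=100.0):
    """Instant hit: resolved on the same frame the trigger is pulled."""
    if target.distance - target.radius <= max_range:
        target.health -= damage
        return True
    return False


def fire_projectile(target, damage, speed, dt, max_frames=600):
    """Step a projectile forward each frame; return frames until impact, or None."""
    travelled = 0.0
    for frame in range(1, max_frames + 1):
        travelled += speed * dt
        if travelled >= target.distance - target.radius:
            target.health -= damage
            return frame
    return None  # ran out of lifetime before reaching the target


dummy = Target(distance=30.0)
print(fire_hitscan(dummy, damage=10))                           # resolved immediately
print(fire_projectile(dummy, damage=10, speed=60.0, dt=1 / 60)) # frames until impact
```

With a moving target, the delay in the projectile case is what forces the player to lead their shots, which is the main design reason to choose one weapon type over the other.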
MVP Spec
| Feature | Required |
|---|---|
| 3D environment you can walk around in | Yes |
| First-person camera with mouse-look | Yes |
| WASD movement relative to camera direction | Yes |
| Shooting via raycasting (click to fire, hit detection) | Yes |
| Targets or enemies that take damage and react | Yes |
| 3D colliders preventing walking through walls | Yes |
| Basic lighting (ambient + directional) | Yes |
| HUD with crosshair, health, and score/ammo | Yes |
| Weapon viewmodel | Stretch |
| Enemy AI with movement | Stretch |
| 3D spatial audio | Stretch |
| Multiple weapon types | Stretch |
Deliverable
A playable first-person 3D game with movement, shooting, and hit detection. Write-up: What did you learn? What was the hardest part of the transition from 2D to 3D?
Analogies by Background
These analogies map 3D game dev concepts to patterns you already know. Find your background below.
For Backend Developers
| Concept | Analogy |
|---|---|
| 3D Coordinate Systems & Transforms | Like nested namespaces or hierarchical routing — a child's position is relative to its parent |
| First-Person Camera | Like a database view or projection — the same underlying 3D world data, filtered and transformed |
| 3D Character Controller | Like request routing with middleware context — "forward" is resolved relative to the current session state |
| Raycasting for Shooting | Like a database query with a spatial index — "find the first row that intersects this line" |
| 3D Collision & Physics | Like connection validation with constraints — the capsule collider is an acceptance boundary |
| Basic Lighting | Like log levels or monitoring dashboards — ambient is your baseline (INFO), directional highlights specific areas |
| Level Geometry / BSP | Like a B-tree index for spatial data — BSP partitions space for fast lookups |
| HUD Overlay | Like a response wrapper — the 3D scene is the payload, the HUD is the metadata envelope |
For Frontend Developers
| Concept | Analogy |
|---|---|
| 3D Coordinate Systems & Transforms | Like nested CSS transforms — a child element's transform is relative to its parent's coordinate system |
| First-Person Camera | Like a Three.js PerspectiveCamera — FOV, aspect ratio, near/far clipping planes |
| 3D Character Controller | Like making scroll direction relative to the viewport, not the document |
| Raycasting for Shooting | Like document.elementFromPoint(x, y) or Three.js Raycaster |
| 3D Collision & Physics | Like collision detection in drag-and-drop with getBoundingClientRect(), extended to 3D |
| Basic Lighting | Like CSS lighting effects — ambient is a global filter: brightness() |
| Level Geometry / BSP | Like virtual scrolling or windowing (react-window) — only rendering visible DOM nodes |
| HUD Overlay | Like a position: fixed overlay on top of a 3D canvas |
For Data / ML Engineers
| Concept | Analogy |
|---|---|
| 3D Coordinate Systems & Transforms | Like affine transformation matrices — position, rotation, and scale compose into 4x4 matrices |
| First-Person Camera | Like a projection from 3D to 2D: the perspective matrix maps coordinates in the same way as the pinhole camera model |
| 3D Character Controller | Like transforming a velocity vector from a local coordinate frame to world coordinates |
| Raycasting for Shooting | Like a line-intersection query on a KD-tree or BVH |
| 3D Collision & Physics | Like constraint satisfaction with continuous simulation — Euler integration of forces |
| Basic Lighting | Like the Phong reflection model — ambient + diffuse + specular as dot products |
| Level Geometry / BSP | Like spatial partitioning structures (KD-trees, octrees) used for nearest-neighbor search |
| HUD Overlay | Like plotting annotations on top of a 3D visualization |
Discussion Questions
- Wolfenstein 3D faked 3D with raycasting on a 2D map. How does knowing this change how you think about the relationship between a game's internal data representation and what the player sees?
- Why does field of view matter so much in an FPS? Try playing the same scene at 60 FOV vs. 110 FOV — how does it change the feel of movement, aiming, and spatial awareness?
- Half-Life told its story entirely through the first-person camera without ever taking control away. What are the advantages and limitations of this approach compared to cutscenes?
- The transition from 2D to 3D adds one axis, but the complexity increase is not linear. What specific problems did you encounter that do not exist in 2D?