AI Origins · Lesson 12

Deep Blue to AlphaGo: The Go Challenge

Chess fell to brute force. Go forced AI to become selective, probabilistic, and far more human-looking in the way it searches.

Start Here: How to Play Go

Read this first. Keep one big idea in mind: your job is to surround more space than White.

The whole game

Place stones on the line crossings and try to surround more space.

Black moves first. Stones do not slide around after you place them, so each move changes the board for good.

Try the board now

1. Place a stone

Click any empty crossing. That is your move.

2. Capture a group

If an enemy group has no open touching points left, it is captured and removed.

3. Pass when needed

If no move helps, press Pass. Two passes in a row end the game.

Classroom rules
  • A stone or group needs at least one open touching point. This is called a liberty, or a breathing space.
  • You cannot play on a crossing that already has a stone on it.
  • You cannot place a stone that instantly has no breathing space unless that move captures enemy stones.
  • You cannot instantly remake the exact board from the previous turn. This is the ko rule.
  • This lab uses area scoring: stones on the board and surrounded empty space both count. White gets 6.5 komi for moving second.
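The liberty rule above can be sketched in a few lines of code. This is only an illustration, not the lab's actual implementation: the board encoding (a list of strings with `B`, `W`, and `.`) and the function name `group_and_liberties` are assumptions made for this example.

```python
from typing import List, Set, Tuple

Point = Tuple[int, int]


def group_and_liberties(board: List[str], start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group containing `start`; return (stones, liberties)."""
    size = len(board)
    color = board[start[0]][start[1]]
    if color == ".":
        raise ValueError("start must be a stone, not an empty point")
    stones: Set[Point] = set()
    liberties: Set[Point] = set()
    frontier = [start]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in stones:
            continue
        stones.add((r, c))
        # Only the four directly touching points count, never diagonals.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue
            if board[nr][nc] == ".":
                liberties.add((nr, nc))       # open touching point
            elif board[nr][nc] == color:
                frontier.append((nr, nc))     # same color: part of the group
    return stones, liberties


# Hypothetical 5x5 position: the three White stones share one group.
BOARD = [
    ".BW..",
    "BWWB.",
    ".BB..",
    ".....",
    ".....",
]
white_stones, white_liberties = group_and_liberties(BOARD, (1, 1))
print(len(white_stones), "stones,", len(white_liberties), "liberty left")
```

In this position the White group has a single breathing space left, so a Black stone on that point would capture all three stones at once.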

Play 9x9 Go vs. Monte Carlo AI

You are Black and move first. Goal: surround more space than White. If a move is illegal, the status box will tell you why in plain language.

Classroom Lab

Start simple, then notice the strategy

A good first game: start near the center, connect your stones, and watch how local captures can matter more than one flashy move.


Observation prompt: Notice how the bot can stumble into good local shapes but still miss the larger territorial plan that a stronger neural-network system would see.


Estimate = stones on the board + surrounded empty space. White includes 6.5 komi.
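One way to compute an estimate like this is sketched below. It is a simplified guess at the idea, not the lab's own code: the board encoding and the name `area_score` are assumptions, and empty regions that touch both colors are simply left uncounted.

```python
from typing import Dict, List


def area_score(board: List[str], komi: float = 6.5) -> Dict[str, float]:
    """Area scoring: stones on the board plus empty regions bordered by one color."""
    size = len(board)
    score = {"B": 0.0, "W": komi}   # White starts with komi
    seen = set()
    for r in range(size):
        for c in range(size):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1    # every stone counts as one point
            elif (r, c) not in seen:
                # Flood-fill this empty region and record which colors border it.
                region_size, borders = 0, set()
                stack = [(r, c)]
                seen.add((r, c))
                while stack:
                    cr, cc = stack.pop()
                    region_size += 1
                    for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                        if not (0 <= nr < size and 0 <= nc < size):
                            continue
                        neighbor = board[nr][nc]
                        if neighbor == ".":
                            if (nr, nc) not in seen:
                                seen.add((nr, nc))
                                stack.append((nr, nc))
                        else:
                            borders.add(neighbor)
                if len(borders) == 1:   # surrounded by exactly one color
                    score[borders.pop()] += region_size
    return score


# Hypothetical 5x5 position: two walls with a neutral strip between them.
WALLS = [".B.W."] * 5
print(area_score(WALLS))
```

On an empty board no region is surrounded by anyone, so the sketch returns 0 for Black and 6.5 for White, which is the komi alone.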

How this AI works
  1. Generate legal candidate moves.
  2. Test random continuations from each move.
  3. Keep the moves that win more often in simulation.
  4. Play the candidate with the strongest playout score.
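The four steps above can be sketched on a much smaller game than Go. The example below is a hedged illustration only: it swaps Go for a toy take-away game (take 1 to 3 stones from a pile, whoever takes the last stone wins) so the whole thing fits on one screen. The game, the function names, and the playout count are all assumptions, not the lab's code.

```python
import random
from typing import Dict, List, Tuple


def legal_moves(pile: int) -> List[int]:
    """Step 1: a move takes 1, 2, or 3 stones, never more than remain."""
    return [take for take in (1, 2, 3) if take <= pile]


def random_playout(pile: int, rng: random.Random) -> bool:
    """Play random moves until the pile is empty.

    Returns True if the player who moves first in this playout takes the last stone.
    """
    first_player_to_move = True
    while True:
        pile -= rng.choice(legal_moves(pile))
        if pile == 0:
            return first_player_to_move
        first_player_to_move = not first_player_to_move


def choose_move(pile: int, playouts: int = 2000, seed: int = 0) -> Tuple[int, Dict[int, float]]:
    rng = random.Random(seed)
    win_rate: Dict[int, float] = {}
    for take in legal_moves(pile):                    # step 1: candidates
        remaining = pile - take
        wins = 0
        for _ in range(playouts):                     # step 2: random continuations
            if remaining == 0:
                wins += 1                             # we just took the last stone
            elif not random_playout(remaining, rng):  # opponent moves first and loses
                wins += 1
        win_rate[take] = wins / playouts              # step 3: score by simulated wins
    best = max(win_rate, key=win_rate.get)            # step 4: play the best scorer
    return best, win_rate


best, win_rate = choose_move(5)
print("best move:", best, "win rates:", win_rate)
```

Notice that nothing in this loop understands the game. It only counts wins from random play, which is why a bot built this way can find decent local moves yet miss a long-range plan.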

Deep Blue, Go, and AlphaGo in Detail

Once you have tried the board, step back to the larger history: one machine pushed classical search to its limit, and the next had to combine learning with search to survive Go's branching explosion.

01 · IBM Deep Blue (1997)

Brute force plus a hand-built evaluator conquered chess.

  • Result: Defeated Garry Kasparov 3.5-2.5 in a six-game rematch.
  • Engine: Minimax plus alpha-beta pruning, accelerated by custom hardware.
  • Speed: Evaluated up to 200 million chess positions per second.
  • Why it worked: Chess has a manageable branching factor and a strong evaluation function based on material, structure, and king safety.

02 · Why Go Broke Classical AI

Go's board is bigger, murkier, and strategically harder to score.

  • Board size: 19x19 means 361 intersections instead of 64 squares.
  • Branching factor: Roughly 200 legal moves per turn, so minimax explodes before it reaches useful depth.
  • Evaluation problem: You cannot count "piece values" to know who is winning. Territory and influence are global and fluid.
  • Life and death: Reading whether a group will live often requires many moves of tactical foresight inside an enormous search tree.
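A little arithmetic shows how fast the tree explodes. The sketch below uses this lesson's figure of roughly 200 moves per turn for Go and Deep Blue's 200 million positions per second. The chess figure of about 35 moves per turn is a commonly cited average and is an assumption here, as is the simplification that a full-width search visits every position with no pruning.

```python
CHESS_BRANCHING = 35                  # assumed average, commonly cited for chess
GO_BRANCHING = 200                    # rough figure used in this lesson
POSITIONS_PER_SECOND = 200_000_000    # Deep Blue's peak speed from this lesson


def tree_size(branching: int, depth: int) -> int:
    """Positions at the bottom of a full-width search `depth` moves deep."""
    return branching ** depth


def seconds_to_search(branching: int, depth: int,
                      rate: int = POSITIONS_PER_SECOND) -> float:
    """Time to evaluate every position at that depth, ignoring pruning."""
    return tree_size(branching, depth) / rate


for depth in (2, 4, 6, 8):
    chess = seconds_to_search(CHESS_BRANCHING, depth)
    go = seconds_to_search(GO_BRANCHING, depth)
    print(f"depth {depth}: chess {chess:.3g} s, Go {go:.3g} s, Go/chess ratio {go / chess:.3g}")
```

Every two extra moves of depth multiply the Go tree by about 40,000 under these assumptions, so faster hardware alone cannot close the gap.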

03 · AlphaGo (2016)

DeepMind combined learning and search instead of relying on raw depth.

  • Result: Defeated 9-dan professional Lee Sedol 4-1 in March 2016, years earlier than most experts expected.
  • Policy: A neural network learned which moves were plausible, shrinking the search space.
  • Value: A second network estimated win probability from a position without playing the whole game out.
  • Search: Monte Carlo Tree Search explored promising branches more deeply and abandoned weak branches sooner.

The AlphaGo Recipe

AlphaGo's breakthrough was not one trick. It was a stack: learned priors, learned evaluation, search, and then self-play.

1. Policy Network

Trained on expert games to predict strong candidate moves so the engine did not waste time exploring nonsense.

2. Value Network

Estimated who was winning from a board position, replacing the impossible idea of searching all the way to the end.

3. MCTS

Monte Carlo Tree Search balanced exploration and exploitation, directing more compute toward the moves that looked promising.

4. AlphaGo Zero

Later versions learned entirely from self-play, showing that powerful strategic intuition could emerge without human game data.
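To make the MCTS step concrete, here is a hedged sketch of how a search can mix the policy network's prior with results seen so far when choosing which move to explore next. It follows the general shape of the PUCT-style selection rule described for AlphaGo, but it is a simplification: the `Edge` class, the names, the sample numbers, and the constant `c_puct = 1.5` are assumptions for this example.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    prior: float               # policy network's probability for this move
    visits: int = 0            # how many times search has tried this move
    total_value: float = 0.0   # sum of evaluations returned through this move

    @property
    def mean_value(self) -> float:
        """Average result so far (0.0 for a move that has never been tried)."""
        return self.total_value / self.visits if self.visits else 0.0


def puct_score(edge: Edge, parent_visits: int, c_puct: float = 1.5) -> float:
    """Exploitation (mean value) plus a prior-weighted exploration bonus."""
    exploration = c_puct * edge.prior * math.sqrt(parent_visits) / (1 + edge.visits)
    return edge.mean_value + exploration


def select_move(edges: Dict[str, Edge], c_puct: float = 1.5) -> str:
    """Pick the move with the highest combined score."""
    parent_visits = sum(edge.visits for edge in edges.values())
    return max(edges, key=lambda move: puct_score(edges[move], parent_visits, c_puct))


# Hypothetical search state after 12 simulations at one position.
edges = {
    "A": Edge(prior=0.6, visits=10, total_value=5.0),   # popular, average results
    "B": Edge(prior=0.3, visits=2, total_value=1.2),    # less tried, looks good
    "C": Edge(prior=0.1, visits=0),                     # policy thinks it unlikely
}
print("explore next:", select_move(edges))
```

The bonus shrinks as a move gets more visits, so search keeps testing under-explored moves while still spending most of its effort where the learned prior and the results agree.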

What Students Should Be Able to Explain

By the end of the lesson, students should be able to connect the game they just played to the larger shift from Deep Blue to AlphaGo.

Learning Goals

  • Goal 1: Explain why Go's branching factor makes brute-force minimax impractical.
  • Goal 2: Describe how Monte Carlo playouts estimate move quality by sampling random games instead of exhaustively searching every line.
  • Goal 3: Contrast Deep Blue's hand-built search with AlphaGo's learned priors plus search.
  • Goal 4: Explain why AlphaGo Zero mattered even more than AlphaGo itself.

Key Concepts

  • Liberty: An empty intersection directly adjacent to a stone or connected group.
  • Ko rule: You cannot immediately recreate the exact previous board position.
  • MCTS: Monte Carlo Tree Search selects a promising branch, expands it, simulates an outcome, and back-propagates the result up the tree.
  • Komi: White receives points to offset Black's first-move advantage.

Look For These Moments

Use these prompts while students are playing or debriefing.

  • When does the AI make a move that seems locally smart but globally awkward?
  • Which move made it easiest to tell who controlled territory?
  • What kind of position would be hardest for random playouts to judge correctly?

Go Further: Try a Neural-Net Bot

The Monte Carlo AI above has no neural network. To experience the next step in Go AI, compare it with KataGo or another neural-net bot on the Online Go Server (OGS).

Run the Comparison

  • Start small: Play 9x9 on OGS first so the board size matches this classroom lab.
  • Use KataGo: KataGo is one of the strongest open-source descendants of the AlphaGo-style architecture.
  • Watch style: Neural-net bots usually show stronger whole-board judgment, shape, and timing than a pure playout bot.


Debrief Questions

  • How did the Monte Carlo bot's move quality compare to KataGo's move quality?
  • Where did the neural-net bot seem to understand the whole board better?
  • If you could add one improvement to this classroom AI, would you add better search or better evaluation first?

Lesson Outputs

Play at least one full game, then discuss using specific evidence from the board rather than vague impressions.

Discussion

After the game, answer these three questions

  • Why does Monte Carlo search help more in Go than brute-force minimax alone?
  • What would a neural network add that this classroom bot does not have?
  • What does AlphaGo Zero suggest about the relationship between learning, data, and intuition?
Compare

Return to Deep Blue

Compare the Go lesson with the chess lesson and name the exact point where the brute-force strategy breaks.
