In April 2025, Josh threw together a daily film-link puzzle so we could play it on a road trip. People kept playing — small but real traction. In May 2026, we picked it back up with two new partners: Hyperagent as the design and strategy room, and Replit as the build environment. One week of work shipped sixteen things — including a complete gameplay logic reboot, a Movie Identity Card system, a watchlist feature born inside a badge concept, daily auto-posting infrastructure, a full GA4 + GTM analytics layer, a press page, and this case study.
The Movie Game started in the front seat of a car. Emily had carried the mechanic in her head for years — every movie has actors, every actor has been in many movies, so any two movies can be connected by a third. Six Degrees of Kevin Bacon, daily. Ahead of a long drive in April 2025, Josh threw together a version we could play on the road. It was a weekend vibe-code job, deployed to a domain, and shared with a small group of friends.
And then we kept playing it. Friends kept playing it. A small, organic following formed around it. The product worked — it had real, if quiet, traction. We always meant to come back. Life got in the way for about ten months.
In May 2026 we picked it back up — this time with two new partners. Hyperagent for the design and strategy work we'd been postponing — the GTM thinking, the redesigns, the identity system we'd never gotten around to. Replit for everything code-side: the new logic, the integrations, the build, the deploy. A clear division of labor emerged in the first two days, and it held all week.
We brought the gut-feel direction. We knew how the game should be played. We said yes or no to feature ideas. We asked for changes based on the feel of the game and how — and when — to scratch a movie itch. Hyperagent turned those instincts into strategy docs, mockups, copy, and specs. Replit took the resulting plans and built them — including a complete reboot of the link/clue selection logic that we worked out together inside the Replit environment. Most days we ran ideas through Hyperagent in the morning and watched Replit ship them by afternoon.
One week. Sixteen things shipped or specced. That's the headline. The rest of this doc is what we made, why, and which AI did which part.
The constraints were specific. We weren't going to put our face on TikTok — the maker's posture is anonymous brand only. Founder-direct outreach was fine — cold emails, DMs, pitches to journalists — just no public social-as-yourself. The brand color was a specific blue (#3a8eff) and we'd already burned a cycle correcting earlier mocks that used coral. And underneath all of it: v1 had a structural limit we wanted to fix as part of the relaunch. The original architecture used cached-only client-side storage — fine when there were a dozen puzzles, painful once the archive crossed 400. To do what we wanted — a proper SEO archive, real analytics, server-derived stats, daily auto-posting — we needed a real database underneath.
The work needed to do five things at once: find the audience, redesign the moments that mattered, fix the underlying puzzle logic, build a system of identity and progression worth coming back for, and instrument everything so we could actually learn from it — all without breaking the players we already had.
The puzzle is the doorway. The movie page is the room. Practice mode is the rest of the wing. — The framing that came out of this work
The starting product was simple and worked: a daily film puzzle where you guess the movie that shares a cast member with two clue films. Four rounds, three hint reveals across them, a share string at the end. After solving, you saw a stripped-down post-game screen with the answer and a link to the movie's full page (which itself was a quiet rabbit hole of Wikipedia-grounded trivia, soundtrack, cast, awards, where-to-watch — all already shipped). Tap "Play Again" and you dropped into Practice mode with the full puzzle archive — no 24-hour cooldown.
What was missing was everything around it. No identity system. No badges, no streak visualization beyond a number. No watchlist. The Statistics modal was a basic Wordle-clone screen. The post-game share moment wasn't optimized for virality. The "How to Play" was a wall of text. There was no analytics — we couldn't actually tell what players were doing. And there was no GTM plan beyond "tell some friends." We had a great mechanic and a working game; we did not have a product designed to grow.
The single biggest under-the-hood change of the sprint was introducing Postgres as the storage backbone. v1 ran on cached-only client-side storage — it worked for a dozen puzzles but couldn't hold the last 400, and it locked us out of everything we wanted to build next: server-derived stats, a real archive with SEO surfaces, deep analytics, daily auto-posting. v2 introduces a proper database layer (Postgres 16 via Neon serverless, Drizzle ORM, shared schema.ts). The migration is intentional and one-way; the puzzle history that ships with v2 is the canonical archive going forward.
A clean division of labor emerged on day one and held all week: Hyperagent generated ideas, designs, and functional specs; Replit handled building, fine-tuning, testing, and deployment; we steered with gut-feel direction. Most surfaces below were drafted in Hyperagent in the morning and shipped through Replit by the same evening. A few of them — most notably the link/clue logic reboot — were Replit-led from the start, with Hyperagent feeding diagnostic ideas rather than designs.
Each card below names the work, the shipping status, and which AI did which half.
The first deliverable was a complete free go-to-market plan, written before any redesign work. The playbook ranked surfaces by ROI, established a four-week sequencing model, and called out the strategic positioning — most daily games punish enthusiasm with a 24-hour timeout; The Movie Game rewards it with unlimited Practice mode after the daily, plus a rich movie-page rabbit hole instead of a results screen. That single observation reframed every later pitch in the document.
It also established the anti-strategy: what we deliberately weren't doing. No founder TikTok. No public X-as-yourself. No paid social. No app store. No leaderboards. Saying no turned out to be half the value of the document — every later piece of work landed inside those constraints without re-litigating them.
The post-game moment is where players decide whether to share, whether to come back, and whether to tell anyone. The original was functional but under-loved. We rebuilt it as the central social surface of the entire game — share string anchored at the top, brand-blue accent, prominent share affordances by platform, streak ribbon, and a clear path into the rabbit-hole movie page.
Five executions of the concept were generated side-by-side so we could decide by comparison rather than abstraction. The chosen execution made the share-string the hero, with secondary surfaces (badges earned, watchlist quick-add, replay CTA) tucked below the fold.
post_game_modal_open, share with platform parameter) so every iteration is measurable.
Maybe the single highest-leverage design decision of the week. The share string had to look unmistakable in a feed, encode a complete game state in a glance, and trigger the same I-have-to-try-this reflex Wordle's green-yellow grid does. Five rounds of iteration to land on the right emoji vocabulary: 🟦 for a correct guess (brand blue), 💡 for a hint taken, 🎞️ for a wrong guess, 🎭 for a bust.
The bust emoji nearly went out as 💨 (smoke, fire-going-out streak metaphor). Josh proposed 🎭 instead — theater masks, cinematic specificity. It became both the bust signal and the framing for a card design ("the drama got you").
The clue titles are the hook. A non-player sees "Top Gun ↔ Mean Girls" in their feed and thinks how are those connected? That's the game explained in one line.
The four-emoji row is the story. Wrong guess, hint, correct, didn't need round 4. Anyone can read it.
The theater masks finish the row. Cinematic, thematically right, narratively kind. "The drama got you" reads better than "you lost."
The streak break is symbol + number. No English, copy-pastes clean across every platform.
The original Stats modal was a basic 5-tile screen (Played / Win% / Current Streak / Max Streak / Avg Guesses). Rebuilt as a Statistics screen, not a modal — a full-page surface with two views: the canonical "just the numbers" version, and a richer version that surfaces the first earned badges, the player's emerging solver style, and the average-rounds-by-puzzle distribution.
Both versions live behind the same toggle so we can measure which one drives more return engagement. The richer version anchors the eventual Identity System (§06); the simpler version respects the players who just want their numbers.
Five identity-card concepts were generated side-by-side: The Plaque, The Folio, The Cover, The Poster, The Plate. Each carried a different metaphor for who the player is as a film-watcher. After comparison, The Plaque won — inset corner brackets, logo in a framed block, polished-metal gradient — partly because it felt earned (the plaque metaphor) rather than self-selected (the cover metaphor).
The card surfaces solver style, badge seals earned, streak signature, total puzzles, and a personal title. It's designed to be screenshotted and shared — a posting artifact, not just a profile.
A complete shovel-ready spec for the identity system. Twenty badges across four categories (Volume · Mastery · Discovery · Distinction). Eight solver styles derived from playing-pattern signatures (The Director, The Cinephile, The Speedrunner, The Completionist, The Skeptic, The Romantic, The Detective, The Curator). Every badge has unlock criteria, copy, and an icon. Every style has a one-line description and the playing pattern that triggers it.
The spec includes the migration plan (non-destructive), the persistence timing (badges fire on game completion; style derives on render), the Satori-based PNG render pipeline for sharable cards, and explicit Phase 1 vs Phase 2 cuts. It's the only piece of the week that isn't live yet — it's the next thing in the queue.
badge_unlocked GA4 event already specced and wired.
This one wasn't on the original list. While working through the Identity system, we noticed that the "Watchlist Builder" badge implied a watchlist feature that didn't actually exist yet. The badge needed somewhere for the user to build. So we designed and shipped the Watchlist as its own surface — a video-rental-wall of posters the player has marked from past puzzles, opening into the rabbit-hole movie pages already shipped.
The Watchlist Builder badge progress strip lives just above the wall (Scout earned at 5 films · Watchlist Builder unlocks at 25), tying the page back to the identity system. Tapping any poster also surfaces a small new section on the detail page: "Why this is on your watchlist — Connected Whiplash and Mean Girls via Greta Lee in Puzzle #287 · added two days ago." The watchlist becomes a memory of the player's journey, not just an inventory.
/watchlist and /watchlist/:id), the Zustand slice for watchlist state, and the GA4 events (watchlist_add, watchlist_remove, watchlist_view, watchlist_detail_view) with source attribution so we can see which surfaces drive adds.
The old How to Play was a wall of text. The new one is a three-panel walkthrough: what you're solving (the cast-link mechanic with a worked example), how the rounds work (auto-revealing hints, the corner badge vocabulary, wrong-guess rows), and what happens when you finish (rabbit-hole movie page, Practice mode, sharing). Skip-ahead from the first panel for returning players.
Critically, the share-string legend (🟦 · 💡 · 🎞️ · 🎭) lives inside the modal as a small inline section — players who encounter a share string in the wild can decode it without needing to play first.
Settings used to live as a scattered set of preferences across modals. Consolidated into a dedicated screen: Hard Mode toggle, Spoiler-Safe Share, Reduce Motion, High Contrast, Default Share Target, Reminder Time, plus progress import/export and reset. Each setting has a one-line description; defaults are sensible; the screen respects the player who just wants to play and the player who wants to tune.
The push-notification reminder system was wired up alongside Settings — a daily nudge at the player's chosen time, fully opt-in, no email collection, runs on web-push with VAPID keys.
setting_changed) so we can see which knobs players actually touch.
A dedicated press page at /press — origin story, pre-approved quotes, brand assets (logos in 80/160/240px, screenshot library, OG images), the strategic positioning (the rabbit-hole win, Practice mode, schema-safe migration), and a contact email. Built so a journalist can fact-check, quote, and run a story without needing to email first.
Includes a checklist for ourselves of what an inbound journalist might ask, with pre-written answers. The page itself is now the asset linked from every cold pitch in the outreach playbook.
themoviega.me/press. Wired the brand-asset downloads. Routes added without disturbing the puzzle archive or the watchlist routes. Image optimization pipeline applied to every press-page asset.
Every past puzzle now has its own discoverable URL with proper JSON-LD VideoGame + Movie structured data. Clue movies and link movies are marked up; cast members are linked; release years and titles are crawlable. Search engines can index the entire puzzle history, and individual puzzle URLs are shareable as deep links.
Also added an archive index page (/archive) with a list view and a calendar view. Past puzzles are playable via Practice mode directly from the archive — no 24-hour wait, just tap and play.
/puzzle/[number] and /archive), the metadata strategy (per-puzzle OG image generated via Satori with the answer hidden), and the calendar view layout. Wrote the index-page copy.
archive_open and replay_open via GA4.
Maybe the most significant under-the-hood change of the week. The original puzzle generator picked the link movie's actors at random from the full credited cast. The result: clue movies that were technically connected but often felt like a stretch — a 1996 single-scene appearance linking a 2024 lead role, or two unrelated supporting parts. Players solved it eventually but the "ahhh" of recognition was muted.
We rebuilt the actor-selection logic inside Replit to optimize for billing position, filmography depth, and cross-movie recurrence — picking the actors from the link movie whose presence in the clue movies is most likely to feel obvious in hindsight. The result: puzzles where the link, once you see it, is almost embarrassingly clear. Average solve rate climbed; hint rate dropped; the share-string distribution shifted toward earlier-round wins.
The play screen was rebuilt around a clearer game model. Three fixed hint slots — Hint N reveals automatically at the end of round N, no exceptions. The corner badge on each filled slot tells the story of that round: 💡 if the player took the hint, 🎞️ gold if they guessed wrong. Wrong guesses get their own row below the indicators, struck through, only appearing when made.
The hint counter button now uses ❓ instead of 💡 — keeping 💡 reserved for the corner-badge meaning ("I took the hint this round"). The whole surface becomes self-narrating: one glance tells you what round you're in, what you know, what you've tried, and how many reveals are still coming.
❓ vs 💡 separation that finally made the surface unambiguous.
hint_reveal, guess_attempt, and game_completion events via the GA4 layer with attempt-number + correct-boolean per submission.
We went from no instrumentation to a full nine-category GA4 layer in one sprint. Container GTM-WZRC3KHR, property G-V7HQGBVLMV. Categories: Lifecycle & navigation · Game lifecycle · Post-game surfaces · Sharing · Watchlist · Settings & data · Reminders · Install / PWA · Badges. All snake_case, no PII (guess strings never sent, no emails, no tokens). All pushed to window.dataLayer from client/src/lib/analytics.ts.
Every event is documented with when-it-fires, params, and gotchas — including dedupe rules, cardinality warnings, and the difference between our custom page_view (carries surface) and GA4's auto page_view. Looker Studio dashboard plan included: engagement overview, game funnel, difficulty insights, watchlist behavior, sharing, install/PWA funnel, reminders, settings, retention.
GTM-WZRC3KHR · Property G-V7HQGBVLMV · no PII · all snake_caseclient/src/lib/analytics.ts and connected to the dataLayer. Configured the GTM container, registered the custom dimensions in GA4 Admin, verified events in DebugView. Built the Looker Studio data source connection. Module-scoped guards in place for app_open and rabbit_hole_open to prevent React StrictMode double-fires.
The site now auto-posts a daily message to Bluesky and Tumblr at 8:00 AM ET — different content variant per day-of-week (clue-forward Mon/Thu, challenge framing Tue/Fri, yesterday's chain Wed, solver stats Sat, trivia teaser Sun) so the feed reads human, not bot. The Wed-reveal variant is the only one that ever spoils — all others stay spoiler-safe so the post itself is a hook.
X is queued but skipped for now. X's new pay-per-action API pricing makes write actions expensive at our scale, and the manual copy-paste-via-Josh's-phone approach is fine for now. We'll revisit once volume justifies the cost.
puzzle.number, puzzle.clue_a, puzzle.clue_b, yesterday's solver aggregates). Specified the spoiler-safety rules (only Wed reveals). Drafted the Tumblr tag set for algorithm pickup.
@atproto/api for Bluesky and tumblr.js for Tumblr NPF. Scheduled via node-cron with timezone-aware 7 AM ET trigger (8 AM ET adjustable). Failover logic in place: if either platform 4xx's, the other still posts. Logs every send to a daily audit table. X integration is built but feature-flagged off.
After the original GTM playbook landed, we needed it operationalized. v2 of the outreach playbook is a sequence of copy-paste-ready drafts: specific Reddit posts per subreddit (tier-1 permissive, tier-2 organic-only, tier-3 topical hooks); personalized journalist emails for ten priority contacts (Bergeson, Edwards, Webster, Kottke, Cheng, Broderick, Hickey, Polygon Discoveries, Letterboxd editorial, Griffin Newman); ten newsletters; eight podcasts; five founder cross-promo emails; an eight-item aggregator submission checklist.
Plus the updated Letterboxd playbook for 2026 — reflecting that the Letterboxd API is no longer granting write access for our category of project, so the strategy is manual brand-account + lists + the Journal editorial pitch + critic DMs. Three moves, all founder-driven.
For anyone evaluating Hyperagent + Replit as a build partnership for their own product: this is what's actually running in production at themoviega.me. The whole stack is JavaScript, hosted end-to-end on Replit, deployed via Replit autoscale.
Vite 5@hookform/resolvers@fontsource)tsx in dev / esbuild-bundled in proddrizzle-zod against PostgreSQL 16 (Neon serverless driver)express-sessionmemorystore for session backing@resvg/resvg-wasm for generating share imagestailwindcss-animate, @tailwindcss/typographynodejs-20, python-3.11, postgresql-16)cartographer, runtime-error-modal, shadcn-theme-jsonshared/schema.ts — single source of truth for DB tables, Zod insert schemas, and TS types consumed by both server and clientEvery choice in the table above prioritizes one of three things: (1) fast iteration with AI assistance — TypeScript-everywhere means Hyperagent's specs and Replit's generated code stay typed-consistent end-to-end; (2) low-ops production — Replit autoscale + Neon serverless mean no infra babysitting; (3) the v1 → v2 storage shift — Drizzle + the shared schema.ts file replaced v1's cached-only client storage as the structural backbone that enabled the SEO archive, analytics, watchlist persistence, and the auto-posting cron.
Numbers are an end-of-week snapshot. The ones that matter — daily players, retention, share rate — start accumulating now that GA4 is wired up. Looker Studio dashboards are next, then weekly read-outs.
If you'd asked us going into the week, "what's the most important thing we'll ship?" — we'd have given the wrong answer. The work didn't unfold by priority; it unfolded by conversation. Each piece suggested the next. Looking back, there's a clear sequence of lightning strikes where one moment of clarity led to the next one. Here it is in order.
We almost started with design. We're glad we didn't. Writing the free GTM playbook first forced us to articulate who we are and what makes us not-Framed before redesigning a single screen. The rabbit-hole win and Practice mode emerged as the strategic moat — and once those were named, every subsequent design decision had something to optimize toward.
The biggest single creative win of the week. Hyperagent proposed the emoji vocabulary — 🟦 for correct, 💡 for hint, 🎞️ for wrong guess. The initial bust state was 💨 (smoke — the streak fire goes out). Josh proposed 🎭 (theater masks) instead. The cinematic specificity made it click — and then the masks became a card framing too ("the drama got you"). A perfect example of what this partnership produces: AI generates the systematic thinking, the human picks the right metaphor.
Once the share-string was right, the post-game screen had to be rebuilt around it. Five comparative concepts. The chosen execution made the share-string the hero. Then the secondary surfaces (badges earned, watchlist quick-add) followed naturally. Then the post-game screen's structure suggested the Stats screen redesign should mirror it — same surfaces, same hierarchy, different content.
The Stats redesign needed something more than tiles. Hyperagent proposed badges. Then the badges suggested solver styles — there should be a derived identity from playing patterns. Then the badges and styles needed a place to live. That place was The Plaque — chosen from five card concepts because the "earned, not self-selected" metaphor felt right. We didn't set out to build an identity system. We set out to fix a Stats modal. The identity system fell out of the work.
This is the one we love telling. Hyperagent generated a Watchlist Builder badge in the identity spec. We looked at it and went, "wait — there's no watchlist yet." The badge implied a feature that didn't exist. So we built one. The Watchlist page is now one of the most-loved surfaces in the product — a video-rental-wall of films marked from past puzzles, opening into the rabbit-hole movie pages we already had. The badge designed the feature. That's a kind of generative magic we hadn't expected.
By mid-week we had a redesigned game, a new identity system specced out, and a watchlist. The product still felt loose — the How to Play was a wall of text, the Settings were scattered. Cleaning both up in parallel was the moment the redesign stopped being a collection of pieces and started feeling like a coherent product. Small detail with outsized effect: putting the share-string legend inside How to Play, so anyone who encounters a share string in the wild can decode it without playing first.
The single hardest piece of design to get right. The auto-reveal hint model — Hint N reveals after round N, badges show what the player did that round — sounds simple. It took four iterations and seven corrections to land on the version that's unambiguous at a glance. The breakthrough was the corner-badge separation: 💡 means "I took the hint this round" (player's action), so the hint counter button can't also use 💡 — it became ❓. Suddenly the surface stopped fighting itself.
The whole sprint had been Hyperagent-first, Replit-second. This was the inversion. The diagnostic — "the connections feel like a stretch" — went into the Replit conversation and Replit drove the implementation. Eight rounds of heuristic tuning. The result: puzzles where the link, once you see it, is almost embarrassingly clear. This is the work that matters most for retention, and it's the work that's hardest to evaluate in a screenshot. You feel it when you play.
By Friday we had a beautiful product with no way for anyone to find it. The outreach playbook v2 + the daily auto-posting infrastructure shipped together. Hyperagent wrote thirty-plus draft emails and the five auto-post content variants; Replit wired up the Bluesky and Tumblr APIs and put them on a cron. The product can now market itself daily, without us, while we focus on the next phase.
The discipline of refusing to build certain things was as important as the design of what got built. Every "not now" decision is a Phase 2 unlock — the system grows by addition, not re-architecture.
Saying no is half the strategy. — The anti-strategy section of the GTM playbook
One week of two-AI partnership shipped what would have taken a small product team a full quarter — but only because the gut-feel direction-setting was tight from hour one. If you outsource the taste, you outsource the result.