"i vibe coded a thing to play video games for me"
i enjoy playing video games my own self. separately, i enjoy writing code for video games. i don't need ai for either of these things.
Yeah, but can you use your enjoyment of video games as marketing material to justify a $32B valuation?
AI for the sake of AI. Feels like a lot of the internet right now
That's fine. Tool-assisted speedruns long predate LLMs and they're boring as hell: https://youtu.be/W-MrhVPEqRo
It's still a neat perspective on how to optimize for super-specific constraints.
> The only other notable setback was an accidental use of the word "revert" which Codex took literally, and ran git revert on a file where 1-2 hours of progress had been accumulating.
Does Codex not let you set regex command permissions?
Amazing that these tools don't maintain a replayable log of everything they've done.
git revert is not a destructive operation, though, so it's surprising that it caused any loss of data. Maybe they meant git reset --hard or something like that. Wild if Codex would run that.
Claude Code has /rewind. Not sure if it is foolproof, but this has been tried.
Yet another reason to use Jujutsu. And put a `jj status` wrapper in your PS1. ;-)
> Yet another reason to use Jujutsu
And what would that reason be? You can git revert a git revert.
You're correct for an actual git revert, but it seems pretty clear that the original authors have mangled the story and it was actually either a "git checkout" or "git reset". The "file where 1-2 hours of progress had been accumulating" phrasing only makes sense if those were uncommitted changes.
And the reason jj helps in that case is that for jj there is no such thing as an uncommitted change.
Probably it actually ran git checkout or reset. As you say, git revert only operates on committed snapshots, so it would all be in the reflog.
Start with env vars like AGENT_ID for indicating which Merkle hash of which model(s) generated which code with which agent(s), and add those attributes to signed (-S) commit messages. For traceability: to find other faulty code generated by the same model and to determine whether an agent or a human introduced the fault. (Rough sketch below.)
Then, `git notes` is better for signature metadata, because adding signatures there doesn't change the commit hash.
And then, you'd need to run a local Rekor log to use Sigstore attestations on every commit.
Sigstore.dev is SLSA.dev compliant.
Sigstore issues short-lived signing keys that CI builds on a build farm can use to sign release attestations for artifacts.
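A minimal sketch of the commit-stamping idea, as a TypeScript wrapper shelling out to git; the AGENT_ID variable, the "Agent-Id:" trailer, and the commit message are illustrative assumptions, not something any existing agent harness actually does:

```typescript
import { execSync } from "node:child_process";

// Hypothetical: the harness exports AGENT_ID (e.g. a hash of the model +
// agent config) before letting the agent commit.
const agentId = process.env.AGENT_ID ?? "unknown-agent";
const run = (cmd: string) => execSync(cmd, { stdio: "inherit" });

// GPG-signed commit whose message carries the attribution as a trailer.
run(`git commit -S -m "agent-generated change" -m "Agent-Id: ${agentId}"`);

// Mirror the attribution into git notes, so later signatures/attestations
// can be attached without rewriting the commit hash.
run(`git notes add -f -m "Agent-Id: ${agentId}" HEAD`);
```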
So, when jujutsu autocommits agent-generated code, what causes there to be an {{AGENT_ID}} in the commit message or git notes? And what stops a user from forging such attestations?
- "Diffwatch – Watch AI agents touch the FS and see diffs live" (2025) https://news.ycombinator.com/item?id=45786382
> "Where Claude excels:"
Am I reading a Claude generated summary here?
I love the interview at the end of the video. The kubectl-inspired CLI, and the feedback for improvements from Claude, as well as the alerts/segmentation feedback.
You could take those, make the tools better, and repeat the experience, and I'd love to see how much better the run would go.
I keep thinking about that when it comes to things like this - the Pokemon thing as well. The quality of the tooling around the AI is only going to become more and more impactful as time goes on. The more you can deterministically figure out on behalf of the AI to provide it with accurate ways of seeing and doing things, the better.
Ditto for humans, of course; that's the great thing about optimizing for AI. It's really just "if a human were using this, what would they need?" Think about it: the whole thing with the paths not being properly connected, a human would have to sit down and really think about it, draw or sketch the layout to visualize and understand what coordinates to do things in. And if you couldn't do that, you too would probably struggle for a while. But if the tool provided you with enough context to understand that a path wasn't connected properly and why, you'd be fine.
Claude Code in dwarf fortress would be wild
> We don't know any C++ at all, and we vibe-coded the entire project over a few weeks. The core pieces of the build are…
what a world!
Everyone should read that section. It was really interesting reading about their experiences/challenges getting it all working.
I would’ve walked for days to a CompUSA and spent my life savings if there was anything remotely equivalent to this when I was learning C on my Macintosh 4400 in 1997
People don’t appreciate what they have
Did you actually learn C? Be thankful nothing like this existed in 1997.
A machine generating code you don't understand is not the way to learn a programming language. It's a way to create software without programming.
These tools can be used as learning assistants, but the vast majority of people don't use them as such. This will lead to a collective degradation of knowledge and skills, and the proliferation of shoddily built software with more issues than anyone relying on these tools will know how to fix. At least people who can actually program will be in demand to fix this mess for years to come.
It’s worse. They’re proud they don’t know.
It's like ordering a project from Upwork: someone did it for you, you have no idea what is going on, but it kinda works.
Since there are no humans involved, it's more like growing a tree. Sure it's good to know how trees grow, but not knowing about cells didn't stop thousands of years of agriculture.
Very interesting analogy
Great analogy. “I don’t know any C++ but I hired some people on Upwork and they delivered this software demo.”
Interesting article, but it doesn't actually discuss how well it performs at playing the game. There is in fact a 1.5 hour YouTube video, but it would have been nice to get a bit of an outcome postmortem. It's like "here's the methods and setup section of a research paper, but for the conclusion you need to watch this movie and make your own judgements!"
It does discuss that? Basically it has a good grasp of finances and often knows what "should" be done, but it struggles with actually building anything beyond placing toilets and hotdog stalls. To be fair, its map interface is not exactly optimal, and a multimodal model might fare quite a bit better at understanding the 2D map (verticality would likely still be a problem).
I was told the important part of AI is the generation part, not the verification or quality.
> kept the context above the ~60% remaining level where coding models perform at their absolute best
Maybe this is obvious to Claude users, but how do you know your remaining context level? Is there UI for this?
You can also show context in the statusline within claude code: https://code.claude.com/docs/en/statusline#context-window-us...
Follow up Q: what are you supposed to do when the context becomes too large? Start a new conversation/context window and let Claude start from scratch?
Either have Claude /compact or have it output things to a file it can read in on the next session. That file would be a summary of progress for work on a spec or something similar. Also good to prime it again with the Readme or any other higher level context
It feels like one could produce a digest of the context that works very similarly but fits in the available context window - not just by getting the LLM to use succinct language, but also mathematically; like reducing a sparse matrix.
There might be an input that would produce that sort of effect; perhaps it looks like nonsense (like reading zipped data), but when the LLM works from it the outcome is close to having consumed the full context?
/context
Claude code has a /context command.
This is what I want but for PoE/PoE2 builds. I always get a headache just looking at the passive tree https://poe.ninja/poe2/passive-skill-tree
This is a cool idea. I wanted to do something like this by adding a Lua API to OpenRCT2 that allows you to manipulate and inspect the game world. Then, you could either provide an LLM agent the ability to write and run scripts in the game, or program a more classic AI using the Lua API. This AI would probably perform much better than an LLM - but an interesting experiment nonetheless to see how a language model can fare in a task it was not trained to do.
As far as a scripting API, it looks like the devs beat me to it with a JS/TS plugin system: https://github.com/OpenRCT2/OpenRCT2/blob/develop/distributi...
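For the curious, a rough sketch of what such a plugin could look like, assuming the documented registerPlugin entry point and the park/context globals (exact property names may differ from the current API):

```typescript
// Rough sketch of an OpenRCT2 plugin that logs park state once per in-game
// day, so an external agent can read structured text instead of pixels.
registerPlugin({
  name: "agent-telemetry",
  version: "0.1",
  authors: ["example"],
  type: "local",
  licence: "MIT",
  main: () => {
    // Fires once per in-game day; dump a few headline metrics.
    context.subscribe("interval.day", () => {
      console.log(`cash=${park.cash} rating=${park.rating} guests=${park.guests}`);
    });
  },
});
```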
this is cute but i imagined prompting the ai for a loop-di-loop roller coaster. If this could build a complex ride it would be a game changer.
yeah I was expecting it to... do something in the game? like build a ride
not just make up bullshit about events
Interesting this is on the ramp.com domain? I'm surprised in this tech market they can pay devs to hack on Rollercoaster Tycoon. Maybe there's some crossover I'm missing but seems like a sweet gig honestly.
Can't wait for someone to let Claude control a runescape character from scratch
People have been botting on Runescape since the early 2000s. Obviously not quite at the Claude level :). The botting forums were a group of very active and welcoming communities. This is actually what led me to Java programming and computer science more broadly--I wrote custom scripts for my characters.
I still have some parts of the old Rei-net forum archived on an external drive somewhere.
https://www.reddit.com/r/2007scape/comments/1qeh3nc/i_added_...
https://ubos.tech/mcp/runescape-mcp-server-rs-osrs/
Wouldn't that break Jagex's TOS though? Is there a way of getting caught?
I imagine Jagex must be up there with having the most sophisticated bot detection out of anyone. It's been a thing for decades.
Wonder how it would do with Myst.
The opening paragraph I thought was the agent prompt haha
> The park rating is climbing. Your flagship coaster is printing money. Guests are happy, for now. But you know what's coming: the inevitable cascade of breakdowns, the trash piling up by the exits, the queue times spiraling out of control.
This was an interesting application of AI, but I don't really think this is what LLMs excel at. Correct me if I'm wrong.
It was interesting that the poster vibe-coded (I'm assuming) the CTL from scratch; Claude was probably pretty good at doing that, and that task could likely have been completed in an afternoon.
Pairing the CTL with the CLI makes sense, as that's the only way to gain feedback from the game. Claude can't easily do spatial recognition (yet).
A project like this would entirely depend on the game being open source. I've seen some very impressive applications of AI online with closed-source games and entire algorithms dedicated to visual reasoning.
I'm still trying to figure out how this guy: https://www.youtube.com/watch?v=Doec5gxhT_U
Was able to have AI learn to play Mario Kart nearly perfectly. I find his work to be very impressive.
I guess because RCT2 is more data-driven than visually challenging, this solution works well, but having an LLM try to play a racing game sounds like it would be disastrous.
Would a way to take screenshots help? It seems to work for browser testing.
I've been doing game development, and it starts to hallucinate more rapidly when it doesn't understand things like the direction it's placing things in or which way the camera is oriented.
Gemini models are a little bit better about spatial reasoning, but we're still not there yet, because these models were not designed to do spatial reasoning; they were designed to process text.
In my development, I also use the ASCII matrix technique.
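In case it's unclear what that means in practice, a minimal sketch; the tile codes and glyphs here are made up for illustration, not taken from the article's tooling:

```typescript
// Hypothetical tile codes: 0 = empty, 1 = path, 2 = ride footprint.
type Tile = 0 | 1 | 2;
const glyphs: Record<Tile, string> = { 0: ".", 1: "#", 2: "R" };

// Render the 2D grid as an ASCII matrix the model can read as plain text.
const toAsciiMatrix = (grid: Tile[][]): string =>
  grid.map(row => row.map(t => glyphs[t]).join("")).join("\n");

console.log(toAsciiMatrix([
  [0, 1, 1, 0],
  [0, 1, 0, 0],
  [2, 1, 1, 1],
]));
// .##.
// .#..
// R###
```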
Spatial awareness was also a huge limitation to Claude playing pokemon.
It really seems to me that the first AI company getting to implement "spatial awareness" vector tokens and integrating them neatly with the other conventional text, image and sound tokens will be reaping huge rewards. Some are already partnering with robot companies, it's only a matter of time before one of those gets there.
This is also my experience with attempting to use Claude and GLM-4.7 with OpenSCAD. Horrible spatial reasoning abilities.
I disagree. With Opus I'll screenshot an app and draw all over it like a child with MS Paint and paste it into the chat - it seems to reasonably understand what I'm asking with my chicken scratch and dimensions.
As far as 3D goes, I don't have experience; however, it could be quite awful at that.
They would need a spatial reasoning or layout-specific tool to translate to English and back.
I wonder if they could integrate a secondary "world model" trained/fine-tuned on Rollercoaster Tycoon to just do the layout reasoning, and have the main agent offload tasks to it.
*OpenRCT2
next up: Crusader Kings III
Crusader Kings is a franchise where I really could see LLMs shine. One of the current main criticisms of the game is that there's a lack of events, and that they often don't really feel relevant to your character.
An LLM could potentially make events far more tailored to your character, and could actually respond to things happening in the world far more than what the game currently does. It could really create some cool emergent gameplay.
In general you are right; I expect something like this to appear in the future and it would be cool.
But isn't the criticism rather that there are too many (as you say, repetitive, not relevant) events? It's not like there are cool stories emerging from the underlying game mechanics anymore ("grand strategy"); players have to click through these boring predetermined events again and again.
> You’re right, I did accidentally slaughter all the residents of Béziers. I won’t do that again. But I think that you’ll find God knows his own.
Paradox future hire right here
Edit: HN's auto-resubmit in action, ignore.
What
So, this link is actually 5 days old, if you hover the "2 hours ago" you'll see the date 5 days ago.
HN second-chance pool shenanigans.