Wait, is this a CLI, or a GitHub Action, or a GitHub App?
Also, I thought Jules was the "coding agent" they were working on. Is this taking over from it, or is this another case of Google competing with itself?
Someone needs to take charge at this company with a strong vision, because they are all over the place and spreading themselves thin, which in turn dilutes their customer and brand equity.
At this point, as someone who:
- Has been writing Android code for about 13 years now
- Has collaborated with Google on stuff
- Led Google developer communities and conferences
- Knows many, many GDEs and has discussions with them often
- Uses the Gemini API in their product
I'm so damn confused. How, then, is a normal customer expected to understand?
- They have two SDKs for talking to the Gemini API.
- The documentation is scattered all over the place.
- Half the time I'm trying to do something, I have to dig through their code to figure out how.
- The features I really want are rate limited or available only to private testers.
- They have 3 coding agents now.
- Even though they have access to my Google Account and my phone, their Gemini app is useless.
- I tried to do a basic thing (add a service account) in Google Cloud recently, which was blocked by deprecated default rules that are confusing to change thanks to the confusing UX.
The only usable thing is AI Studio, which is a great tool for experimenting with different models and improved the DX of getting a Gemini API key by a mile.
I'd say congrats on the release, but honestly this is such a mid, low-hanging-fruit product.
They need a boundary between their research culture and their software culture. One org, two cultures.
The chaos you describe is actually a significant positive in research environments. It's not spreading oneself thin; it is diversifying and decorrelating one's efforts. You can't centrally plan all innovation.
But the interface between the customer and the research output is a software and product problem, and that definitely needs a different approach.
Completely agree - the research output should be integrated into a customer-facing product; instead they are trying to integrate customers into the research output.
My take on this is that Google has a bunch of "incubating" spaces where they have teams of people building things that may or may not take off. So, when something does take off, it sort of becomes a victim of its own success. It confuses people because it's not a "core" Google product that fits nicely among other Google products. NotebookLM seems to be another example.
Personally, I would rather Google did this sort of experimentation even if it is more confusing.
Or I could be wrong about this. But following NotebookLM, it seemed like the team developing it had a lot of autonomy.
That is so, but the problem this causes is more than just customer confusion - it is a lack of integration and responsibility. There is no "let's polish this and see if it works based on real user feedback", but it's "let's throw this out and shut it down if it doesn't work".
And if it isn't shut down, it is left in a terrible half-documented state, confusingly and poorly integrated into the rest of the product.
Considering I'm confused as a customer, a user, and a shareholder, I'd say the tactic isn't working.
If they throw it out and it's great then they get accolades; if they throw it out and it's bad, they don't. If they polish it and see if it works based on real user feedback then they also don't but it took longer. Better to just throw everything at the wall the instant it has the potential to go viral and then move on if it doesn't.
Remember that Google operates at huge scale, so even something any other company would consider wildly successful (e.g. Reader) is a waste of resources for them. That means that if you're ramping up your product over the course of a year you're wasting time and money. Go big or go home.
The teams of people want to get their work out into the public to make a big splash so they can get a sweet bonus before anyone realizes that it's not actually useful or effective. See also: Google Wave, and 80% of their other products.
They don't get a sweet bonus and promotion for helping another team improve a product, so why collaborate? Just create your own chat app according to your own team's vision/goals/available technology and release it and hope it gains more traction than the other teams' existing options.
Yeah, and they have like 50 coding agents, because everyone in the entire company pivoted to doing the same thing. There's not that much you can invent in this space.
I've come to realize that life is all about having different eggs in different baskets. Some will go bad and some will hatch into beautiful chicks.
Face it, they have hit the "Yahoo phase" of their company life. It was a good, long run. All that remains is buying larger and larger successful startups and grinding them into dust.
But the "sunsetting" of projects good or bad, the random shotgun approaches to everything, the super awesome islands of product that slowly get bled dry... it is a failure of management structure, not just management.
I don't know the guts of Google, but I imagine there are 500 VPs (or equivalent) each with their pet project, each trying to curry favor with the boss who sent an email blast to "go big on Gemini". It feels like many teams just dropped their old busted projects and moved on to the new hotness, to hell with the customers, consistency or revenue. The only metric now is "Gemini engagement".
People have been saying this for the last 10 years.
> Even though they have access to my Google Account and my phone, their Gemini app is useless.
This is the funniest thing to me. When you open the app, Gemini says:
"Hello, Vasco"
on the welcome screen. I then ask this amazing intelligence a question:
"What's my name?"
"I do not know your name. I am an AI and I don't have access to your personal information."
I know why it happens, but it's so funny.
If I didn't know better, I'd think you were joking.
To be fair, the "Hello Vasco" is a generated background image and not part of the chat context. But still, you would think they would put your name in the system prompt.
> you would think they would put your name in the system prompt
They probably do, along with "pretend to not have any personal information about the user".
Jules works in a VM, asynchronously, on a separate checkout of the code.
Gemini CLI works synchronously with the user (unless you YOLO) and in your own directory on your own machine on your own checkout.
Two different modalities.
And the Gemini CLI GitHub Action (this project) runs, again, in a VM (a GitHub Actions runner) on a separate checkout of the code. This is what OP meant by multiple coding agents.
> This is what OP meant by multiple coding agents.
It may be the same coding agent behind the GHA. I question the implicit declaration behind OP's critique: that all 160,000+ Google folk should offer a single coding agent to their billions of users (or whatever the TAM is for coding agents). This is akin to criticizing Google Cloud for having VMs, Kubernetes clusters, and App Engine; superficially, these products solve the same problem.
FWIW - this GitHub Actions integration is close to my ideal AI agent workflow[0]. I don't want to metaphorically look over my agent's shoulder as it works in a specialized, vendor-locked IDE. I want agents to work asynchronously, taking however long they need and tackling multiple tasks, with PRs/CLs as the unit of work. Current models may not be up to single-shotting this, but the task is parallelizable across multiple agent instances, with the best solution selected (climate change be damned). I suspect GitHub alone may not provide adequate context, since it may be missing previous tickets, design documents, and the back-and-forth on requirements, but it's a start, and I'm glad Google is exploring this path for agents.
0. I believe in this workflow so much that some weeks back I created a proof-of-concept project that reads tickets from Vikunja and creates PRs using Aider.
Also, if you are on Google Workspace, then everything changes there too. Activating the Gemini CLI is a smiling-while-crying-emoji kind of activity if you are trying to provide it to an entire organization [1]
[1]: https://github.com/google-gemini/gemini-cli/blob/main/docs/c...
gemini-cli is a command-line tool that calls Gemini and shells out to common text utilities and MCP for tool use.
This appears to just be a plugin: you do things on GitHub, which sends notifications to gemini-cli running in the cloud; gemini-cli responds and sends notifications back.
Basically it just saves you the hassle of cloning at a specific commit, calling gemini-cli manually, and then uploading the result manually.
100. That's essentially the function of a GH Action, which is why I'm also confused by the pomp and circumstance of the announcement.
Now if they could get Gemini (the LLM) to run on a GH Actions runner I'd be more excited.
And this can't authenticate the same way the normal Gemini CLI does; it needs an API key from the looks of it. So the free, Standard, and Enterprise plans via OAuth currently don't work for authentication, just the Google AI Studio free tier, which is different from the gemini-cli free tier and has much tighter rate limits.
> because they are all over the place and spreading themselves thin
Well, they do have a lot to spread. But yeah, intense amount of overlap.
They do, but at this point it's becoming comical, especially if they are trying to move away from search as a profit center. You need equity in people's heads if you want to conquer the market.
If instead of Google Search they made three products each called "Google Search", "Super Search", and "YaGoo!", they wouldn't be where they are today.
> I tried to do a basic thing (add a service account) in Google Cloud recently, which was blocked by deprecated default rules that are confusing to change thanks to the confusing UX.
Similarly, I tried contacting human support for billing issues but was denied: automated checks deemed me unworthy of consulting anything besides documentation pages, which I didn't understand, so I gave up and switched to another cloud provider.
I believe in Silicon Valley terms this is called "moving fast and breaking things".
I understand Google feels they need to compete in coding AI. The crazy thing to me is:
- Gemini can't make me a calendar appointment between myself and another person for 30 minutes in the next week. Heck, it can't make appointments at all yet.
- It can't edit or collaborate on Google Docs, just insert. I edit my docs in Cline or Claude Code as markdown and upload.
- Speaking of which, I don't think they have an MCP server for working with Docs or Sheets.
- Gemini is worse than a Google search at helping me with Sheets formulas.
There are all these unique places in Google's ecosystem where I feel they could and should be excelling at AI. They're not.
Hell, I noticed yesterday, while searching for my reMarkable preorder from years ago, that you can't exact-string search Gmail anymore. Searching for "remarkable" was pulling up "amazing". They're just degrading all of their products into stupidity at a time when I, and AI, could use more power tools.
I think Google doesn't turn Gemini loose on Docs for the same reason Apple doesn't turn AI loose on your phone: it's just not reliable enough to let 99.99% of the world use it. Those of us on the bleeding edge have been fine tweaking and working around inconsistencies. If you put a lot of work in, you get a productivity boost.
Think of the family member you are "tech support" for. (You know who you are.) Would you recommend this to them? Yeah. Me neither.
I 100% agree with this, and there are just _so many use cases_ that are small and useful like this.
Heck, just yesterday my partner forgot the grocery-list printout, so I took a picture of it and asked Gemini to convert it to a format I could copy and paste into a specific todo-list app that was already shared with her. INSTEAD, Gemini dumped the list into Google Keep, albeit with terrible formatting. It didn't miss a single item, but it did not recognize categories of items (produce vs. frozen food, for example).
So my read is that there are a lot of rough-around-the-edges use cases which could be tidied with better prompting/context, or just by the Gemini team prioritizing those things when they get around to it.
What they actually _need_ is a marketing team showing off useful applications of the releases more often. OpenAI is ALL OVER TIKTOK, and people under 30 I meet on that platform don't even know Gemini exists. In my experience, Gemini is better than ChatGPT at everything you need to do, and it can do the things the OpenAI marketing people are constantly showing off on TikTok.
I’ve actually been using Gemini on my phone to create appointments from details on my screen. For example, I have a delivery coming so there is an email with the date and time range. I can press and hold my power button and Gemini pops up. I press a button to use screen context. Then say, "put this in my calendar". Then it does. It isn’t perfect. Events that cross multiple days or odd location details in the description sometimes don’t get included. But that is more and more rare. I’m using an Android phone. So maybe that is why it seems to work. I do see that "mostly works" is not the same as "always works".
Also, if you are a Google Workspace customer, you can connect your workspace to the Gemini web app. It can then search and manipulate your calendar and your drive. It will also summarize documents and a few other tasks. I have less use for this but it is far from "it _can’t_ make appointments".
There's a need for better indexing. It seems like they switched search to pure embeddings, and it doesn't work. Making performant hybrid search is hard in the sense that you cannot simply combine indexes; ideally you'd merge something like embedding similarity, exact text match, and a quality vector. I had a PoC that worked great using those, but making it scale with reasonable latency is hard.
If something like this exists, please educate me, as it would make tons of products better.
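For what it's worth, the usual trick for combining signals without one merged index is rank fusion: query each index separately and fuse the rankings afterward. A minimal sketch in Python, assuming you already have retrievers that return ranked doc ids (the quality-prior term is my own addition, not a standard part of RRF):

```python
# Reciprocal rank fusion: combine rankings from separate indexes
# (e.g. BM25 and embeddings) without building one merged index.
def fuse(rankings, quality=None, k=60, quality_weight=0.1):
    scores = {}
    for ranking in rankings:  # each is a list of doc ids, best first
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    if quality:  # optional per-document quality prior in [0, 1]
        for doc_id in scores:
            scores[doc_id] += quality_weight * quality.get(doc_id, 0.0)
    return sorted(scores, key=scores.get, reverse=True)

# fuse([bm25_ranking, embedding_ranking], quality={"doc7": 0.9})
```

It doesn't solve the latency problem by itself, but it does let each index stay fast on its own instead of forcing one combined structure.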
Totally agree. It’s so surprising that I spent almost an hour trying to figure out how to make Gemini collaborate with me on a Google Document as a kind of artifact. I was sure I was just holding it wrong. I couldn’t believe it wasn’t a feature. Even when I gave up I was still unsure if maybe my account isn’t on the right tier or something.
Could not agree more. Trying to use Veo 3 via the genai/vertexai SDKs has been full of dead ends, broken specs, and confusion. Good ole curl seems to work, though.
Yeah, I figured I'd try Gemini for Google Docs, but given how restricted it is, why would I?
"Take each H1 heading and split that section off into a separate document tab"
A simple but tedious task that I wanted to do for a large document. Nope, Gemini says it can't do that. It offered to tell me HOW to do it though!
Is there something Gemini can actually do that's useful?
I did something similar after copy/pasting Markdown into a Google Doc, assuming Gemini could obviously convert the section headers and such…nope!
Combined with the separate plans needed to use the Gemini CLI, it's an incredibly goofy situation.
I asked Gemini to create a chart from the tabular data I had, and nope, it can't do that.
> Heck, it can't make appointments at all yet.
> It can't edit or collaborate on Google Docs, just insert.
I'm sure it's capable of doing those things, but they have it turned off because of the significant risks involved in automatically editing important documents like that.
I suspect this is the case, much like with Apple Intelligence as well. Case in point, see the early Apple notification summaries of text messages. "Mom: That hike killed me!" AI Summary: "Mom died on hike."
All it needs though is a sandbox to execute the action in, and an approval flow for the user to review the changes the agent wants to make, or make revisions. Why does it have to be all or nothing? "Hey Google, schedule a meeting with x for next week when we are both available" "Google: OK, here's a preview of the calendar invite - do you want me to send it, or make changes, or cancel?"
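As a rough sketch of that flow (names here are hypothetical stand-ins, not any real Google API):

```python
# Preview-then-confirm loop for agent actions: nothing touches real
# state until the user explicitly approves the drafted event.
from dataclasses import dataclass

@dataclass
class DraftEvent:
    title: str
    start: str
    end: str
    attendees: list[str]

def review_and_send(draft: DraftEvent, send) -> bool:
    print(f"Proposed invite: {draft.title}, {draft.start} to {draft.end}")
    print(f"Attendees: {', '.join(draft.attendees)}")
    choice = input("send / edit / cancel? ").strip().lower()
    if choice == "send":
        send(draft)  # the only call with side effects
        return True
    return False  # edits would loop back through the agent
```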
The amount of time I have to spend on investigation, just to understand the basics of what something ACTUALLY IS, never ceases to amaze me. Having to scrape away buzzwords, ill-conceived descriptions, and unnecessarily verbose stuff... it's tiresome.
So I THINK this is what it IS:
A GitHub Action that can be included in GitHub workflow YAML files. It executes the Gemini CLI, passing in prompts, repo context, and event data (like issue text or PR diffs) to generate responses or perform actions. In other words: it's a wrapper that installs and runs the Gemini CLI inside GitHub Actions environments.
It can use GitHub's API (via tokens or apps) to read repo data (issues, PRs, code) and write back (e.g., add labels, comments, or code suggestions). It makes calls to standard HTTPS API endpoints for the Gemini LLM (via the CLI's backend interactions with Google's Gemini API).
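If that reading is right, wiring it up would look roughly like this (a hedged sketch: the action slug matches the marketplace listing linked elsewhere in this thread, but the input names are my assumptions and may not match the current release):

```yaml
name: gemini-assist
on:
  issue_comment:
    types: [created]
jobs:
  gemini:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Installs the Gemini CLI and runs it with the comment as the prompt.
      - uses: google-github-actions/run-gemini-cli@v0
        with:
          gemini_api_key: ${{ secrets.GEMINI_API_KEY }}
          prompt: ${{ github.event.comment.body }}
```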
If you have it right: there is a brief discussion of semantic linting, related to AI-assisted CLI behavior, in this recent Latent Space podcast interview with Boris Cherny and Catherine Wu: https://www.youtube.com/watch?v=zDmW5hJPsvQ&t=1760s
I've not explored this use of CC yet. Is anyone actively using an AI-assisted CLI in CI/CD? Not automated PR review, but either semantically passing/failing an MR, or some other use of a terminal-capable, multi-context mashup during CI/CD?
It says to write this and that "in the chat interface". What chat interface?
That description is 100% correct!
In this case, the "chat" happens as a comment on an issue or PR addressing @gemini-cli.
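For example (illustrative only; the exact trigger depends on how the workflow is configured): you comment "@gemini-cli please fix the failing test in this PR" on the pull request, and the agent's reply lands as another comment.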
> 7. Google One and Ultra plans, Gemini for Workspace plans
These plans currently apply only to the use of Gemini web-based products provided by Google-based experiences (for example, the Gemini web app or the Flow video editor). These plans do not apply to the API usage which powers the Gemini CLI. Supporting these plans is under active consideration for future support.
Again with the complicated subscriptions. Please just give us a monthly developer subscription that I can pay whatever for, and then use the Gemini CLI, this GitHub Action, Gemini chat, Jules, etc. Just like Claude and their Max subscription.
This would be a game changer for me.
Sorry, congrats on the release too. This looks cool!
I need AI to understand their subscriptions.
For Google, having some end users is a tolerable side effect of their activities.
The primary goals are promotions, bonuses and stock price.
> The primary goals are promotions, bonuses and stock price.
If that's the case, last I checked they are doing pretty well on stock price.
The markets are fickle. That can change quickly.
I'm honestly a bit confused by the free tier of Gemini. I've been using it with different agents (Aider, and then Crush), and I hit the rate limits FAST. Like, after maybe 5 or 6 requests it just blows up. Then I can try again quite a few times, and it hits the limit. Then eventually I guess I hit my daily limit and it just stops working until the next day.
I mean, this has been enough to get my feet wet and have some fun exploring agent-based development, no doubt, and I appreciate it, but I'm having a hard time squaring my experience with
> generous free-of-charge quotas
as they say. It's not that generous if it stops working after 5 minutes. (This morning, literally a single sentence I typed into Crush resulted in some back and forth, I guess calling the API a few times, and it just rate-limited out. Fine, that was probably a lot of requests, but I literally gave it a single small job to do and it couldn't finish it.)
Meanwhile I seem to be able to use the Gemini web app endlessly and haven't hit any limits yet.
With Gemini CLI I blow through Pro requests in under 10 minutes and it switches to Flash. I can't trust either to be autonomous. Pro will write unit tests, get a test to 100% coverage, and then delete the test. Flash will get stuck in endless loops where it replaces a string in a file, doesn't realize the string has been replaced, and keeps failing to recognize that fact, stuck in a doom loop.
Glad I didn't add an API key. I've had friends who did and ended up with $xxx in charges because the models can't think or use tools properly.
This. I have a side project that I intend to finish in vibe coding mode, but Gemini CLI has been stuck fixing build errors for an hour, after multiple attempts to correct errors or refactor code. The interfaces don't even make any sense. Time for me to go in and fix the mess myself.
I added a key rotator to my AI coder and asked a couple of friends to make keys for me. That helped code a good chunk of http://typedai.dev when 2.5 Pro came out.
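The rotator itself can be tiny. A sketch of the idea (not the actual typedai implementation; RateLimitError stands in for whatever your client raises on a 429):

```python
import itertools

class RateLimitError(Exception):
    """Stand-in for the client's quota-exceeded error."""

class KeyRotator:
    def __init__(self, keys):
        self._cycle = itertools.cycle(keys)
        self.current = next(self._cycle)

    def call(self, request_fn, max_attempts=5):
        # On a rate limit, advance to the next key and retry.
        for _ in range(max_attempts):
            try:
                return request_fn(api_key=self.current)
            except RateLimitError:
                self.current = next(self._cycle)
        raise RuntimeError("all keys exhausted")
```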
I find their image text for the third image in the carousel funny:
> Delegate work with an "@ mini-cli" tag and the agent can complete a range of tasks, from writing bugs to fixing bugs
Surprisingly, it still hasn't been fixed. Maybe they were being honest.
I'm just here for the PR review feature
Last year, I was actually working on a bounty platform for GitHub PRs.
The low-quality human-authored PRs that came in (due to the incentive we offered), combined with the fact that a draft PR can be made for pennies with AI, made this concept dead in the water as far as I'm concerned.
The pain point of getting some attention and action on your open-source codebase is really no longer relevant; in fact, the pain point seems to be shifting to how to make the most of limited reviewer/maintainer bandwidth under the onslaught of proposed changes.
To this end, I've been experimenting with a framework that builds PRs from the major agents, with a focus on structuring the tasks and review process to optimize the review => accept/revise cycle. If you're interested, I've been writing up some case studies here: https://github.com/sutt/agro/blob/master/docs/case-studies/a...
I wonder why they call this `gemini cli`; it's not really a CLI anymore when it's primarily used through GitHub, is it?
Why not follow Claude Code naming with this and just call it `gemini github action` or `run gemini`?
My guess is that it was built by the Gemini CLI team and institutional pressures caused this name, either to make sure they get credit, or to avoid making it sound like they’re taking over a very broad product area.
Because it installs gemini-cli in the GitHub Actions VM and then passes the comment from the issue/PR to gemini-cli as the prompt.
I wondered the same thing, naming things is hard but they've royally screwed up the naming here.
Not surprising from a company that greenlit the name 'Bard' for their AI.
This is an add-on to Gemini-CLI, which is entirely local.
Maybe a skill issue, but I've tried using Gemini 2.5 Pro in Cursor several times, and each time it produces an abundance of thinking and very few (often incorrect) actions. Claude Sonnet is cheaper and much more effective for me.
Having a hard time imagining the GHA integration will be much different.
We've been having really good results with Copilot Agent. Sometimes we have to close a PR and refine the issue, or pull the branch down and work locally in Cursor, but it also jumpstarts a lot of stuff.
It seems too good to be true that this is free, unless training data is the price we'll end up paying. Also, there is no option to opt out, which is all the more sinister. I guess it should be used with caution in private/internal repos.
This sounds like Gemini Code Assist rebranded under the successful Gemini CLI banner. I'm sure this was done to "consolidate" offerings and brands, but this is just way more confusing. CLI has a meaning, and this doesn't seem to have a CLI at all? Product looks cool, but the naming is just baffling
I think this would be API-based pricing, while Gemini Code Assist is flat-rate. I'm not sure what other differences there are, though.
I tried this out last month. It was useful for summarizing big PRs, and it even found minor issues. But nothing really useful for professionals; only for overworked open-source maintainers reviewing newbies' PRs and giving them feedback.
Given the amount of setup required, this seems like a very high-friction version of the GitHub Copilot Agent that's already available for every user who could interact with this.
The Gemini assistant will need to be several times better than the existing tools to even fractionally displace them.
Which existing assistant is so good? You mean Claude? Gemini has to be about the same, only with a clear and reasonable subscription.
Isn't there a trademark issue with naming it Gemini CLI GitHub Actions?
Since Microsoft owns GitHub and is a competitor.
If that was the case, nobody but GitHub could build actions. There is a whole GitHub Actions Marketplace and Google is in there.
https://github.com/marketplace/actions/run-gemini-cli
Having seen this play out at another hyperscaler, the practical distinction is that as long as the non-GH product name comes first, that's enough to avoid confusion.
I may not have fully grasped this, but on the surface it looks like they want me to have an AI agent inserted directly into my git workflow... like right there with all my wonderful juicy code? Is that correct?
Isn't this a recipe for disaster, or is all the FUD around agents wreaking havoc getting to me? I love Claude Code, but it can be somewhat bonkers, and it is at least at arm's length from doing any real damage to my code (assuming I'm following good dev practices and don't let it loose on my wider filesystem).
What’s wrong with receiving code/security/MR review comments from AI?
Not a fan of agents that require, and can't function without, access to your GitHub repository. They should be local-first.
gemini-cli is very much local. This GH integration is new.
Sorry to be blunt, but Google needs a better Product Marketing team.
As an engineering manager with an AI budget, I'm always looking for better and cheaper tools.
I have a decade of engineering experience and consider myself fairly intelligent.
I still can't figure out what this is, who it's for, or how much it costs.
It's been going on for years
https://x.com/tomgara/status/1587640766696140800?lang=en
"It’s pretty simple: Google Meet (original) was previously Meet, which was the rebranded Hangouts Meet. Meet has been merged with Google Duo, which replaced Google Hangouts. Google Duo has been renamed Meet, and Meet has been temporarily named Google Meet (original), for clarity"
Curious to try this against the GitHub (website) Agent. The website Agent is definitely dumber than the VS Code agent (because it has to spend 20 minutes figuring out how to build and start my monorepo apps), but on the flip side, it doesn't tie up my computer, and thus any value it creates is additive.
We have tried out Gemini code review vs Copilot code review and Gemini is consistently offering better code review tips. It has officially caught multiple potential bugs, even a few that reviewers might have missed, so it's definitely been additive.
Observability looks way worse. The GitHub Agent has a full UX built into the GitHub PR that lets you dig into the agent's behavior. This requires you to egress text logs and make sense of them yourself.
Also curious about customization. Github just rolled out "agent writes its own instructions" https://github.blog/changelog/2025-08-06-copilot-coding-agen... which is super cool, how do I customize this one and teach it how to start and manage apps across my monorepo?
> Curious to try this against the GitHub (website) Agent. The website Agent is definitely dumber than the VS Code agent (because it has to spend 20 minutes figuring out how to build and start my monorepo apps), but on the flip side, it doesn't tie up my computer, and thus any value it creates is additive.
Yeah, that's on you. Add a `copilot-instructions.md` file and configure the `copilot-setup-steps.yml` workflow to set up your environment. Both have been supported more or less since Copilot Agent was released (though in "preview").
Most agents read `AGENTS.md`, I just symlink it to CLAUDE.md, and do the same for GEMINI.md
Tim from the GitHub Copilot coding agent product team here!
@artdigital is on the money here. Our quick tip for beginners is to use `copilot-instructions.md` (which we can now generate for you <3), but for more serious use, we'd strongly recommend adding `copilot-setup-steps.yml`.
That gets you a deterministic setup - and for many teams, it'll be easy, as you can just copy and paste from existing Actions workflows.
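For anyone who hasn't seen it, a minimal sketch of that file (assuming the documented convention: it lives at `.github/workflows/copilot-setup-steps.yml` and the job must be named `copilot-setup-steps`; the Node steps are just an example):

```yaml
name: "Copilot Setup Steps"
on: workflow_dispatch
jobs:
  copilot-setup-steps:  # this job name is what the agent looks for
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci  # deterministic install before the agent starts
```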
I have a well-documented copilot-instructions.md (and have used GitHub's new agentic self-documentation prompt), and the reality is that it takes about 15-20 minutes to build and start multiple React, React Native, and Express.js projects.
GitHub now appears to support defining setup tasks in a GitHub Action that runs prior to the agent, so that's the next avenue of research.
Regardless, the website agent will always be slower. My local environment is already running and fully ready to go, so the IDE agent can hit the ground running on any task. The website agent has to spin up a machine, install, and build. That takes time.