sinatra's comments | Hacker News

Oh. So when you say “May we please have terraform back?”, you mean “May we please have terraform back at my employer?” Why are you posting such an employer-specific request on a public forum?

Because it was meant as a rhetorical device, not a literal request.

Piggybacking on this post. Codex is not only finding much higher-quality issues, it’s also writing code that usually doesn’t leave quality issues behind. Claude is much faster, but it definitely leaves serious quality issues behind.

So much so that now I rely completely on Codex for code reviews and actual coding. I will pick higher quality over speed every day. Please don’t change it, OpenAI team!


Every plan Opus creates in Planning mode gets run through ChatGPT 5.2. It catches at least 3 or 4 serious issues that Claude didn’t think of. It typically takes 2 or 3 back-and-forths for Claude to ultimately get it right.

I’m in Claude Code so often (x20 Max) and I’m so comfortable with my environment setup with hooks (for guardrails and context) that I haven’t given Codex a serious shot yet.


The same thing can be said about Opus running through Opus.

It's often not that a different model is better (well, it still has to be a good model). It's that the different chat has a different objective - and will identify different things.


My (admittedly one person's anecdotal) experience has been that when I ask Codex and Claude to make a plan/fix and then ask them both to review it, they both agree that Codex's version is better quality. This is on a 140K LOC codebase with an unreasonable amount of time spent on rules (lint, format, commit, etc.), on specifying coding patterns, on documenting per-workspace README.md files, etc.


That's a fair point and yet I deeply believe Codex is better here. After finishing a big task, I used two fresh instances of Claude and Codex to review it. Codex finds more issues in ~9 out of 10 cases.

While I prefer the way Claude speaks and writes code, there is no doubt that whatever Codex does is more thorough.


Every time Claude Code finishes a task, I have it run a full review of its own work against a very detailed plan, and it catches many things it didn’t see before. It works well and it’s part of the refinement process. We all know it’s almost never a 100% hit on the first try with big chunks of generated code.


How exactly do you plan/initiate a review from the terminal? Open up a new shell/instance of Claude and initiate the review with fresh context?


It depends on the task, but I have different Claude commands for this role; usually I launch them from the same session. The command's goal is to do an analysis and generate a md file that I can then execute with a specific command, passing the md as a parameter. It works quite well. The generated file is a thorough analysis, hundreds of lines long, with specific code in it. It's more precise than my few-line prompt and helps Claude stay on rails.
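
For anyone who wants to try something similar, here's a rough sketch of that kind of setup. The file names and prompt wording below are hypothetical (not the parent's actual commands); it assumes Claude Code's custom slash commands, i.e. markdown files under .claude/commands/ that can reference $ARGUMENTS:

    mkdir -p .claude/commands

    # Hypothetical command that writes a detailed analysis to a md file.
    cat > .claude/commands/deep-review.md <<'EOF'
    Review the task just completed in this session. List every bug, missing
    edge case, and deviation from the original plan, with file and line
    references. Write the full analysis to review-plan.md.
    EOF

    # Hypothetical command that takes that md file as its parameter.
    cat > .claude/commands/execute-plan.md <<'EOF'
    Read the plan file passed as an argument: $ARGUMENTS
    Work through every item in it, one at a time, running the tests after
    each fix.
    EOF

Inside the same session you'd then run /deep-review, followed by /execute-plan review-plan.md.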


Yeah. It dumps context into various .md files, like TODO.md.


Thanks for the tip. I was dubious, but I tried GPT 5.2 on a large plan to start, and it was way better than reviewing it with Claude itself or Gemini. I then used it to help with a feature I was reviewing, and it caught real discrepancies between the plan and the actual integration!


This makes me think: are there any "pair-programming" vibecoding tools that would use two different models and have them check each other?


Have you tried telling Claude not to leave serious quality issues behind?


Let’s call it JoyScript so it still shortens to JS. And so at least the name has some joy in it, even if the language doesn’t.


Hah. It can’t be “I need to spend more time to figure out how to use these tools better.” It is always “I’m just smarter than other people and have a higher standard.”


Show us your repos.


My stack is React/Express/Drizzle/Postgres/Node/Tailwind. It's built on Hetzner/AWS, which I terraformed with AI.

It's a private repo, and I won't make it open source just to prove it was written with AI, but I'd be happy to share the prompts. You can also visit the site, if you'd like: https://chipscompo.com/



Spot on.


The tools produce mediocre code, usually working in the most technical sense of the word, and most developers are pretty shit at writing code that doesn't suck (myself included).

I think it's safe to say that people singularly focused on the business value of software are going to produce acceptable slop with AI.


I currently use GPT‑5.1-Codex High and have a workflow that works well with the 5-hour/weekly limits, credits, etc. If I use GPT‑5.1-Codex-Max Medium or GPT‑5.1-Codex-Max High, how will that compare to GPT‑5.1-Codex High in terms of cost, credits, and limits? I don't think that's clear. "Reduced tokens" makes me think it'll be priced similarly or lower, but "Max" makes me think it'll be priced higher.


In my AGENTS.md (which CLAUDE.md et al. soft-link to), I instruct them to "On phase completion, explicitly write that you followed these guidelines." This text always shows up with Codex and very rarely with Claude Code (TBF, Claude Code has been showing it more often lately).
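
For anyone copying this pattern, here's a minimal sketch of the shared-file setup (the appended wording is from the parent comment; which other filenames you link depends on the tools you use):

    # One shared instruction file, soft-linked so each tool finds it
    # under the name it expects.
    ln -sf AGENTS.md CLAUDE.md

    # The marker instruction appended at the end of AGENTS.md:
    cat >> AGENTS.md <<'EOF'
    On phase completion, explicitly write that you followed these guidelines.
    EOF

The appended line works as a cheap check: if it doesn't show up in the agent's final output, the guidelines probably weren't loaded or followed.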


I stopped having the same issue of 100s of tabs of "math videos that I was going to watch one day" when I started saving them in my private playlists. Now I just have 100s of videos in playlists that I just look at longingly but never watch.


lol I tried that once.

What works best for me now is to do my best at putting tabs in the correct group (tbh, most gather while debugging), and then I can just kill the group when I'm done.

Problem is the ADHD, and groups get contaminated. Mostly a few casualties are actually fine, but sometimes the group gets too mixed. Eventually I nuke it all.


Have you documented how you built this project using Kiro? Your learnings may help us get the best out of Kiro as we experiment with it for our medium+ size projects.


I've got a longer personal blogpost coming soon!

But in the meantime I'm also the author of the "Learn by Playing" guide in the Kiro docs. It goes step by step through using Kiro on this codebase, in the `challenge` branch. You can see how Kiro performs on a series of tasks, starting with light things like basic vibe coding to update an HTML page, then slightly deeper things like fixing some bugs that I deliberately left in the code, then even deeper to a full-fledged project to add email verification and password reset across client, server, and infrastructure as code. There is also an intro to using hooks, MCP, and steering files to completely customize the behavior of Kiro.

Guide link here: https://kiro.dev/docs/guides/learn-by-playing/


And the d20 rolled a 12 when you checked it for duration to hold? Man, lucky you! Give the dice a kiss!


Your comment seems unfair to me. We can say the exact same thing for the artist / IP creator:

Tough luck, then. You don’t have the right to shit on and harm everyone else just because you’re a greedy asshole who wants all the money and is unwilling to come up with solutions to problems caused by your business model.

Once the IP is on the internet, you can't complain about a human or a machine learning from it. You made your IP available on the internet; now you can't stop humanity from benefiting from it.


Talk about victim blaming. That’s not how intellectual property or copyright work. You’re conveniently ignoring all the paywalled and pirated content OpenAI trained on.

https://www.legaldive.com/news/Chabon-OpenAI-class-action-co...

Those authors didn’t “make their IP available on the internet”, did they?


First, “Plaintiffs ACCUSE the generative AI company.” Let’s not assume OpenAI is guilty just yet. Second, assuming OpenAI didn’t access the books illegally, my point still remains. If you write a book, can you really complain about a human (or in my humble opinion, a machine) learning from it?

