I have a lot of sympathy for the author's position, but I may have missed the point in the article where he explained why clarity of writing and genuineness of human expression were so vital to a robotics class. It's one thing for an instructor to appreciate those things; another for them to confound their own didactic purpose with them. This point seems obvious enough that I feel like I must have missed something.
As always, I reject wholeheartedly what this skeptical article has to say about LLMs and programming. It takes the (common) perspective of "vibe coders", people who literally don't care what the code says as long as something that runs comes out the other side. But smart, professional programmers use LLMs in different ways; in particular, they review and demand alterations to the output, the same way you would when doing code review on a team.
I think they summed it up well in the section "Why do we write, anyway?" — they nowhere claimed it was vital for students' success in a robotics class. On the contrary, they title a subsection there "If it’s worth doing, it’s worth doing *badly*" (emphasis mine) — so what they are looking for is to peer into the author's mind and their original thoughts.
The implication there is that this is acceptable to pass a robotics class, and potentially this gives them more information about students' comprehension to further improve their instruction and teaching ("...that they have some kind of internal understanding to share").
On that second point, I have yet to see someone demonstrate a "smart, professional programmer" using LLMs in a way that produces high quality output in their area of expertise while improving their efficiency, thus saving them time (compared to just using a good old IDE)!
A couple of examples from influential open source developers: Adam Wathan (Tailwind) agreeing with Mitchell Hashimoto that LLMs are making them more productive. "in their area of expertise" is not obvious from this post alone, but I am pretty confident from the way they talk about it that they are not exclusively using LLMs on side projects where they're inexpert.
Armin Ronacher also talks about using LLMs quite a bit, but I don't have as good of an example from his tweets of him straightforwardly saying "yes, they are useful to me!"
That's a good point, though it's worth thinking about what would count as a demonstration. Pointing to a PR that an LLM helped with is not enough, because you don't know how long it would have taken without the use of LLMs. The counterfactual never really exists. What is the hard evidence that vim or emacs makes people more productive? Using git over svn? Using search engines? Over time, we expect people to use the things that make them more productive.
I would assume honesty, so someone showing in a blog post their incremental progress, their prompting or code-assistant use, rough time spent on each iteration... would be a great start.
I am not looking for formal study level of trust (though even that is frequently debatable), but multiple accounts of this where there is clear quality of output and significant (estimated) time savings would be wonderful.
I don't need much convincing that for many engineers an LLM can bring an incremental speed-up (10-20%), though I think that really depends on personality (e.g. do you prefer to fix stuff not created by you, or to write it nicely from the start?).
Is "what has been my experience" not implied in what I am still waiting for — "someone using an LLM to produce high quality code in the field of their expertise in less time than without an LLM"?
So, observing a couple of my colleagues (I am an engineering manager, but have switched back and forth between management and IC roles for the last ~20 years), I've seen them either produce crap, or spend so much time tuning the prompts that it would have been faster to do it without an LLM. They mostly used GitHub Copilot or ChatGPT (the most recent versions as of a few months ago).
Again, I am not saying it's not being done, but I have struggled to find someone who would demonstrate it happening in a convincing enough fashion — I am really trying to imagine how I would best incorporate this into my daily non-work programming activities, so I'd love to see a few examples of someone using it effectively.
If you don't want to engage in an honest discussion, please refrain from making assumptions: nobody mentioned "merging crap". I stopped clearly at guiding an LLM to make a code change, which is, at best, a pull request.
How is that better? As a team lead, what would you think of a team member who consistently generated "crap" pull requests?
You see the same thing in every argument with LLM skeptics: "The code is bad. You don't even know what the code is doing." This is obviously false. A professional reads the code they commit and push. A professional doesn't push code they know to be bad.
"Let's identify specific, clear areas for improvement, and if they are not able to improve, let's fire them": it's as simple as that.
Teaching a human with motivation, potential and desire to learn is both easier, and more rewarding (for most humans), than attempting to teach an LLM to write good code every time — humans tend to value their personal experiences more, whereas an LLM relies more on its training corpus. So when I've seen people massage LLM output to be decent or excellent, it took them more time than it would have taken to write it from scratch without an LLM.
Which makes LLMs mostly a curiosity, and not a productivity booster. Can it get there? I hope it can, because that would be amazing.
> As a team lead, what would you think of a team member who consistently generated "crap" pull requests?
This is directly answered with my first paragraph: that's exactly what I would think of them, and how I would act on it.
Your first question was:
> How is that better?
In the second paragraph, I explained why it's better to do a code review for a crappy pull request that's human-produced vs LLM-generated: it is easier, faster, and more psychologically rewarding.
If you are talking about a case where an inexperienced human uses an LLM to start off with a crappy code change, but then adapts the output during the review process, and potentially learns through it (though research confirms people learn better when they make the mistakes themselves) — they still won't be able to use an LLM to produce comparable code the next time. They'll have to do the review and improve it by hand before putting it up for review by somebody else, thus negating any productivity gain (which was the original premise), and likely reducing the learning potential.
If there was a question I misinterpreted, please enlighten me. Thanks! :)
Jim and Toby are interviewing Darryl Philbin for the position of manager at Dunder Mifflin Scranton. Jim asks what Darryl would do to resolve a conflict between two employees in the warehouse Darryl already managed. "I'll answer that, Jim. I would use it as an opportunity to teach... about actions... and consequences of actions".
That's the answer you just gave me. Good note! (Darryl didn't get the job.)
You're dodging my point. If you are managing a team where people are using LLMs to generate pull requests full of "crap" code (your word), you have a mismanaged team, and would with or without the LLMs, because on a well-managed team people don't create PRs full of crap code.
I'm fine if you want to say LLMs are dangerous tools in the hands of unseasoned developers. Fine, you can have a rule where only trusted developers get to use them. That actually seems pretty sane!
But a trustworthy developer using an LLM isn't going to be pushed by the LLM into creating "crap" PRs, because the LLM doesn't make the PRs, the developer does. If the developer isn't reading the code the LLM is producing, they're not doing their job.
Sometimes you get people saying "ok but reading that code is work so how is the LLM saving any time", which is something you could also say about adding any human developer to a team; their code also has to get reviewed.
So help me understand how your concerns here cohere.
You seem to be arguing a point I never made: I never claimed LLMs are "dangerous", nor that crappy code only comes out of LLMs. They are just a tool, and I agree it is the responsibility of the developer wielding them to produce good work with (or without) them — this was never disputed.
I don't have any issue with someone using an LLM, but I have not observed any efficiency gain from those who do — and that's my entire point, since efficiency is the biggest selling point for coding assistants. I've either seen them produce "crappy code" faster (which, ultimately, they could do by hand as well), or be slower than doing their work "manually".
At the same time, I disagree about teams producing lousy PRs being mismanaged by definition: there are circumstances where doing that is warranted (LLM or no LLM), as long as the long term direction is improving (less crappy code over time). There are plenty of nuances there too.