There are many approaches being discussed, and the right one depends on the size of the task. You could just review a plan and assume the output is correct, but you need at least behavioural tests to verify that what was built fulfils the requirements. You can split the plan further and further until the changes are small enough to be reviewable. Where I don't see the benefit is in asking an agent to generate tests, as it tends to generate many useless unit tests that make reviewing more cumbersome. Writing the tests yourself (or defining them and letting an agent write the code), and not letting implementation agents change the tests, is also worth trying.
The truth is we’re all still experimenting and shovels of all sizes and forms are being built.
That matches my experience too - tests and plans are still the backbone.
What I keep running into is the step before reading tests or code: when a change is large or mechanical, I'm mostly trying to answer "did behavior or API actually change, or is this mostly reshaping?" so I know how deep to go.
Would you say you could draw a diagram of the application architecture from memory, or do you treat it as a black box? Do you need an AI to debug issues, or not? In my experience with spec-driven development, even if you review every single PR, it is hard to develop a mental model of the codebase structure unless you invest in it. It might be fine to treat it as a black box, and I'm not arguing otherwise, but will all software be a black box in the future?
For a completely new project it is high risk. While the AI is fantastic at brainstorming and writing detailed architecture, it is difficult to get the "big picture", and even more difficult to verify that it is being done correctly or to see which things can be improved or reused, because in this situation you don't look into the code.
I don't believe people will spend time looking at the code beyond the small blurbs they can read from the command line while talking with the AI, so I agree with you that it ends up being treated as a black box.
I did an experiment with a server implementation in Java (my strongest language): I gave the usual instructions and built up the server. When I went to look at the code, it was a far smaller and more concise codebase than what I would write myself. The AI treats the programming language the way a compiler treats its target, like a compiler emitting JavaScript: it makes the instructions super efficient and uses techniques that, even with my 30 years of experience, I'm not able to pair-review, because we tend to have our own patterns of programming while these tools use everything, no matter how exotic, to their advantage.
After that experience I no longer look at generated source code. For me, it is becoming the same as trying to read compiled binary data.
ElevenLabs and Gemelo.AI are services that both support text input streaming for exactly this use-case. I am not aware of any open-source Incremental TTS model (that's the term used in research, afaik), but you can already achieve something similar by buffering tokens and sending them to the TTS model on punctuation characters.
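Here's a minimal sketch of that buffering approach in Python, under some assumptions: token_stream and synthesize_to_audio are hypothetical placeholders for your LLM token source and your TTS client, and the punctuation set is something you'd tune per language and voice.

    # Sketch: punctuation-based buffering for incremental TTS.
    # `token_stream` (an iterable of strings) and `synthesize_to_audio`
    # (a callback that sends text to a TTS service) are placeholders.

    SENTENCE_BREAKS = {".", "!", "?", ";", ":"}

    def speak_incrementally(token_stream, synthesize_to_audio):
        """Buffer streamed tokens and flush a chunk to the TTS model
        whenever a sentence-ending punctuation character arrives."""
        buffer = []
        for token in token_stream:
            buffer.append(token)
            stripped = token.rstrip()
            # Flush on punctuation so the TTS model gets a natural
            # prosodic unit rather than a mid-sentence fragment.
            if stripped and stripped[-1] in SENTENCE_BREAKS:
                chunk = "".join(buffer).strip()
                if chunk:
                    synthesize_to_audio(chunk)
                buffer = []
        # Flush whatever remains when the stream ends.
        remainder = "".join(buffer).strip()
        if remainder:
            synthesize_to_audio(remainder)

Flushing on sentence boundaries keeps each TTS request a complete prosodic unit, which also helps with the evenness problem mentioned below.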
ElevenLabs only has streaming output available. I've had a look at both recently and ElevenLabs doesn't have streaming input listed as a feature. Would be cool if it had it, though. You could probably approximate this on a sentence level, but you would need to do some normalisation to make the speech sound even.
I've just been looking for SOTA TTS. I found coqui.ai and elevenlabs.io (and a bunch of others). They're good (and better than older TTS), but I am not fooled by any of them. Do you have recommendations?