There's definitely a method to using them well. It took me 6 months of trial, giving up, trying again, refining my approach, ... to eventually get consistently good results. It's useful to know what LLMs are good at and what kinds of errors they tend to make. It's also very useful to be a stickler about software engineering practices that keep the LLM pointed in the right direction.
Example stuff that helps:
- extensive test suites
- making LLMs put data-heavy content in YAML files instead of inlining it in code
- putting a lot of structural guard rails in place with type systems, parse-don't-validate, ... (see the sketch after this list)
- having well scoped tasks
- giving the LLM tight self-serve feedback loops
Recently I made it fix many bugs in a PEG grammar, and it worked really well at that. I made it convert a test suite from an extensive Go test table to a "golden file" approach. I made it create a search index for documentation, score the search quality using quantitative IR metrics, and then iterate until the metrics met a minimum standard.