Just want to clarify: this is not my Substack; I’m just sharing it because I found it insightful.

The author describes himself as a “fractional CTO” (no clue what that means, don’t ask me) and advisor. His clients asked him how they could leverage AI. He decided to experience it for himself. From the author (emphasis mine):

I forced myself to use Claude Code exclusively to build a product. Three months. Not a single line of code written by me. I wanted to experience what my clients were considering—100% AI adoption. I needed to know firsthand why that 95% failure rate exists.

I got the product launched. It worked. I was proud of what I’d created. Then came the moment that validated every concern in that MIT study: I needed to make a small change and realized I wasn’t confident I could do it. My own product, built under my direction, and I’d lost confidence in my ability to modify it.

Now when clients ask me about AI adoption, I can tell them exactly what 100% looks like: it looks like failure. Not immediate failure—that’s the trap. Initial metrics look great. You ship faster. You feel productive. Then three months later, you realize nobody actually understands what you’ve built.

  • MangoCats@feddit.it

    “where the massive decline in code quality catches up with big projects.”

    That’s going to depend, as always, on how the projects are managed.

    In my experience, LLMs never “get it right” on the first pass - at least not for anything of non-trivial complexity. But their power is that they’re right more than half the time, AND they can be told when they are wrong (whether by a compiler, a syntax nanny tool, or a human tester), AND they can then try again, and again, as long as necessary to reach a final state of “right,” as defined by their operators.
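
    To make that loop concrete, here’s a minimal sketch (Python, purely for illustration), where generate and check are hypothetical stand-ins I made up for the model call and for whatever gets to say “wrong” (compiler, syntax nanny, test suite, human tester):

        from typing import Callable, Optional, Tuple

        def iterate_until_right(
            generate: Callable[[str, str], str],       # stand-in for the LLM call: (task, feedback) -> candidate code
            check: Callable[[str], Tuple[bool, str]],  # stand-in for compiler/tests/human: code -> (ok, error feedback)
            task: str,
            max_attempts: int = 10,
        ) -> Optional[str]:
            """Generate a candidate, let the checker say 'wrong', feed the errors back, try again."""
            feedback = ""
            for _ in range(max_attempts):
                candidate = generate(task, feedback)   # produce a new attempt, informed by prior errors
                ok, feedback = check(candidate)        # the operator's definition of "right" lives here
                if ok:
                    return candidate                   # good enough to ship, as defined by the checks
            return None                                # still wrong after the budget - time for a human to step in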

    The trick, as always, is getting the managers to allow the developers to keep polishing the AI’s (or the human developer’s) output until it’s actually good enough to ship.

    The question is: which approach will take longer, and which will require more developer “head count” during that time, to get it right - or at least good enough for the business?

    I feel like the answers all depend on the particular scenario - in some places, for some applications, current state-of-the-art AI can deliver that “good enough” product we have always had, with lower developer head count and/or shorter delivery cycles. For other organizations with other product types, it will certainly take longer / more budget.

    However, the needle is off 0: there are some places where it really does help, a lot. The other thing I have seen over the past 12 months: it’s improving rapidly.

    Will that needle ever pass 90% of all software development benefitting from LLM agents? I doubt it. In my outlook, I see it passing 50% in the near future - but it’s not quite there yet.