• ferrule@sh.itjust.works
    1 upvote · 1 hour ago

    The issue is twofold.

    First, the scope of the project is very important. When I am working on a web app, even the most complicated project is still 90% boilerplate. You write some RESTful code on some framework using CRUD and make a UI that draws based on data. No matter what you are making, let’s be honest, it’s not novel. This is why vibe coding can exist. Most of your unit tests can be derived from the types in your functions. Do a little bit of tracing through functions and AI can easily make your code less fragile.
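
As a rough illustration of "tests derived from the types", here is a minimal Python sketch. Everything in it is an assumption made for the example: the `BOUNDARY_VALUES` table, the helper names, and the two functions under test are hypothetical, and it only handles plain `int`, `str`, and `bool` annotations.

```python
import itertools
from typing import Callable, get_type_hints

# Hypothetical boundary values per annotated type; a real tool would cover more.
BOUNDARY_VALUES = {
    int: [0, 1, -1, 2**31],
    str: ["", "a", " padded ", "ünïcödé"],
    bool: [True, False],
}


def boundary_cases(func: Callable) -> list[tuple]:
    """Cross product of boundary values for each annotated parameter."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    pools = [BOUNDARY_VALUES[tp] for tp in hints.values()]
    return list(itertools.product(*pools))


def check_return_type(func: Callable) -> list[tuple]:
    """Call func on every boundary case; collect cases that raise
    or return something other than the annotated return type."""
    expected = get_type_hints(func).get("return")
    failures = []
    for args in boundary_cases(func):
        try:
            result = func(*args)
        except Exception as exc:
            failures.append((args, repr(exc)))
            continue
        if expected is not None and not isinstance(result, expected):
            failures.append((args, f"returned {type(result).__name__}"))
    return failures


# Hypothetical functions under test.
def truncate(text: str, limit: int) -> str:
    return text[:limit]


def average_length(text: str, parts: int) -> int:
    return len(text) // parts  # raises ZeroDivisionError when parts == 0


if __name__ == "__main__":
    print(check_return_type(truncate))
    print(check_return_type(average_length))
```

This kind of check catches crashes and type mismatches on boundary inputs, but it says nothing about whether the result is the right answer for the business case.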

    When you are working on anything more complicated, making code better requires you to actually grok the business requirements. Edge cases aren’t as simple. The reasons for doing things a specific way aren’t so superficial, especially when you start having to write optimizations the compilers don’t do automatically.

    The second issue is learning material. The majority of the code we write is buggy, not just in range testing but in the solutions to problems. There is a reason why we don’t typically write once and never go back to our code.

    Now think about when you, as a human, go back over old code. The commit log and blame usually don’t give a great picture of why the change was needed. Not unless the dev was really detailed in their documentation. And even then it requires domain knowledge and conceptualization that AI still can’t do.

    When teaching humans to be better at development, we suck at it even when we can grok the language and the business needs. That is a hurdle we still need to cross with AI.

  • MonkderVierte@lemmy.zip
    8 upvotes, 1 downvote · edited 7 hours ago

    > For example, if I’m vibe-coding a quick web app with more JavaScript than I care to read

    Ah, please don’t publish that code then. It’s an experiment and not something juniors should come to learn as “good enough”.

  • 14th_cylon@lemmy.zip
    31 upvotes · 20 hours ago

    i have read it all hoping to find out what he is talking about… instead, the blog post ended 🤷‍♂️

    • gtrcoi@programming.dev
      4 upvotes · 12 hours ago

      I’m guessing he’s alluding to a bunch of asserts, data sanitization, and granular error reporting. But yea, who knows.
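
For readers wondering what those three things look like together, here is a minimal Python sketch. It is only a guess at the kind of code being alluded to, not anything from the blog post; `ValidationError`, `Signup`, and `parse_signup` are hypothetical names invented for the example.

```python
from dataclasses import dataclass


class ValidationError(ValueError):
    """Carries the offending field name so callers can report precisely."""

    def __init__(self, field: str, message: str):
        super().__init__(f"{field}: {message}")
        self.field = field


@dataclass(frozen=True)
class Signup:
    username: str
    age: int


def parse_signup(raw: dict) -> Signup:
    # Sanitization: normalize untrusted input before validating it.
    username = str(raw.get("username", "")).strip().lower()
    if not username:
        raise ValidationError("username", "must not be empty")
    if not username.isalnum():
        raise ValidationError("username", "must be alphanumeric")

    # Granular errors: each failure names the field and the reason.
    try:
        age = int(raw.get("age", ""))
    except (TypeError, ValueError):
        raise ValidationError("age", "must be an integer") from None
    if not 0 < age < 150:
        raise ValidationError("age", "must be between 1 and 149")

    signup = Signup(username, age)
    # Assert: an internal invariant, not input validation. It documents what
    # the code above guarantees and is skipped when Python runs with -O.
    assert signup.username == signup.username.strip().lower()
    return signup
```

The split matters: bad input raises an exception the caller can handle, while the `assert` only guards against bugs in this function itself.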

  • FishFace@piefed.social
    26 upvotes · 21 hours ago

    The word you are looking for is “robust”.

    Debugging isn’t the worst thing in programming. The worst thing is having a task you need to do and a solution already written, but not knowing how to use the solution to solve the task.

  • spireghost@lemmy.zip
    5 upvotes, 14 downvotes · 20 hours ago

    > Large language models can generate defensive code, but if you’ve never written defensively yourself and you learn to program primarily with AI assistance, your software will probably remain fragile.

    This is the thesis of the argument, and it’s completely unfounded. “AI can’t create antifragile code.” Why not? Effective tests and debug-time checks, at this point, come straight from Claude without me even prompting for them. Even if you are rolling the code yourself, you can use AI to throw a hundred prompts at it asking “does this make sense? are there any flaws here? what remains untested or out of scope that I’m not considering?”, like a juiced-up static analyzer.

    • TehPers@beehaw.org
      12 upvotes · 18 hours ago

      > Why not?

      Are you asking the author or people in general? If the author didn’t answer “why not” for you, then I can.

      Yes, I’ve used Claude. Let’s skip that part.

      If you don’t know how to write or identify defensive code, you can’t know if the LLM generated defensive code. So in order for an LLM to be trusted to generate defensive code, it needs to do so 100% of the time, or very close to that.

      You seem to be under the impression that Claude does so, but you presumably can tell if code is written with sufficient guards and tests. You know to ask the LLM to evaluate and revise the code. Someone without experience will not know to ask that.

      Speaking now from my experience, after using Claude for work to write tests, I came out of that project with no additional experience writing tests. I had to do another personal project after that to learn the testing library we used. Had that work project given me sufficient time to actually do the work, I’d have spent some time learning the testing library we used. That was unfortunately not the case.

      The tests Claude generated were too rigid. They didn’t test important functionality of the software. They tested exact inputs/outputs using localized output values, meaning a change to the localizations was potentially enough to break tests. They tested cases that didn’t need to be tested, like whether certain dependency calls were done in a specific order (those calls were done in parallel anyway). Claude wrote some good tests, but also a lot of tests that weren’t needed, and it skipped some that were.

      As a tool to help someone who already knows what they’re doing, it can be useful. It’s not a good tool for people who don’t know what they’re doing.
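
The localization problem described above can be sketched in Python. This is an illustrative reconstruction, not the actual work project: `MESSAGES`, `cart_total`, `render_total`, and the `passes` helper are all hypothetical.

```python
import unittest

# Hypothetical message catalog; real projects load this from locale files.
MESSAGES = {"en": "Total: {n} items", "de": "Summe: {n} Artikel"}


def cart_total(quantities: list[int]) -> int:
    return sum(quantities)


def render_total(quantities: list[int], locale: str = "en") -> str:
    return MESSAGES[locale].format(n=cart_total(quantities))


class BrittleTest(unittest.TestCase):
    def test_exact_localized_string(self):
        # Breaks when the English wording changes, even if the total is right.
        self.assertEqual(render_total([1, 2]), "Total: 3 items")


class BehaviorTest(unittest.TestCase):
    def test_total_is_sum(self):
        # Tests the logic directly, independent of any wording.
        self.assertEqual(cart_total([1, 2]), 3)

    def test_rendered_text_contains_total_in_every_locale(self):
        for locale in MESSAGES:
            with self.subTest(locale=locale):
                self.assertIn("3", render_total([1, 2], locale))


def passes(case: type[unittest.TestCase]) -> bool:
    """Run one TestCase class and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

Both classes pass as written. Rewording the English string in `MESSAGES` fails `BrittleTest` while `BehaviorTest` keeps passing, which is the kind of rigidity that is hard to spot without prior testing experience.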