Whether you deduplicate should depend on the context. For example, we had a few different API providers for the same thing, and someone refactored the separate classes into a single one that treated all of the APIs the same.
Well, it turned out that 3 of the APIs changed the way they returned data, so instead of separating the logic again, someone kept adding if statements to a single function to avoid repeating code in multiple places. It was a nightmare to maintain, and I ended up completely refactoring it. Even though some of the code was now repeated, it was much easier to maintain and to accommodate API changes.
I have, and in each sprint we always had tickets for reviewing implementations, which could take anywhere from an hour to two days.
The code quality was much better than at my current workplace, where reviews are done in minutes, although that software was also orders of magnitude more complex.
You are comparing compilers to a completely non-deterministic code-generation tool that often does not take observable behavior into account at all, and that will happily break part of your system without you noticing because you misworded a single prompt.
No amount of unit/integration testing covers every single use case in sufficiently complex software, so you cannot rely on that alone.
I just rewrote a utility for the third time - the first two were before AI.
Short version: when someone designs a call center with Amazon Connect, they use a GUI flowchart tool to create “contact flows”. You can export a flow to JSON, but it isn’t portable to other environments without some remapping. I had previously created a tool that used the API to export flows and build a portable CloudFormation template.
I always miss some nuance; half of it can be caught by running the official CloudFormation linter, and the other half only by actually deploying the template and seeing what errors come back.
This time, I did it with Claude Code. Ironically enough, it already knew some of the complexity because it had been trained on one of my older open source implementations from my time at AWS. But I told it to read the official CloudFormation spec and, after every change, to test with the linter, try a deployment, and fix whatever broke.
Again, I didn’t care about the code - I cared about results. The output of the script either passes deployment or it doesn’t. Claude iterated until it got it right based on observable behavior. It has tested whether my deployments worked as expected plenty of times by calling the appropriate AWS CLI commands and fixing things, or by reading from a dev database based on integration tests I defined.
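That lint-then-deploy feedback loop can be sketched roughly like this (the template path and stack name are made up for illustration; `cfn-lint` and the `aws` CLI are the real tools involved):

```python
# Minimal sketch of the "lint, deploy, fix" loop, assuming cfn-lint
# and the AWS CLI are installed and configured. Only the exit code
# matters - that is the "observable behavior" the loop iterates on.
import subprocess

def passes(cmd):
    """Run a command; report whether it exited cleanly."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

def validate_template(template="flow-template.yaml", stack="connect-flow-dev"):
    # 1) Static check with the official CloudFormation linter.
    if not passes(["cfn-lint", template]):
        return "lint failed"
    # 2) The other half of the errors only show up on a real deploy.
    if not passes(["aws", "cloudformation", "deploy",
                   "--template-file", template,
                   "--stack-name", stack]):
        return "deploy failed"
    return "ok"
```

An agent (or a human) just keeps editing the template and re-running `validate_template` until it returns "ok".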
You don't. I can guarantee that 90% of generated code will never receive a detailed review, simply because there's too much cognitive overhead and too little time; everything moves too fast.
I remember doing that kind of code review, pre-AI, on a highly complex component, and it took a full day of work. These days, most of the people I know take about half an hour and mostly scan for obvious mistakes, when the bigger problem is the sneaky non-obvious ones.
Exactly. It's the same as reviewing somebody else's code. How many companies did this well before LLMs came along? I know mine didn't. But these days, people who aren't senior enough review LLM output, do a quick mental pass through the code, see the happy path succeed, and approve it.
What could work: the LLM creating a very good test suite for its own code changes and for the overall app (as much as is feasible), with those tests getting a hardcore review. Then the actual code review doesn't have to be that deep. But if everybody is shipping like there's no tomorrow, edge cases will start biting hard and often.
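As a toy illustration of what a "hardcore review" of the tests would be looking for - the function and the cases below are hypothetical, not from any real codebase:

```python
# Hypothetical example: the edge-case tests that deserve close review
# when the implementation itself only gets a quick scan.
def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]

# Happy path - what a quick review usually checks.
assert chunk([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]

# Edge cases - where generated code (and humans) tend to slip.
assert chunk([], 3) == []                   # empty input
assert chunk([1, 2, 3], 5) == [[1, 2, 3]]   # size larger than input
assert chunk([1, 2, 3], 2) == [[1, 2], [3]] # uneven final chunk
```

A test suite that pins down cases like these lets the reviewer spend their attention where the sneaky bugs actually live.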
Yes. The concepts apply to all programming languages, and the exercises use several of them. For example, one exercise has you explore the Git source code (which is in C) to learn how to use a program's data structures to understand an unfamiliar codebase. Other exercises have you find bugs in Python code or write Java.