
Can LLMs Solve Software Engineering?
December 9, 2024

No doubt you have noticed the excitement surrounding LLMs and their ability to solve problems. Everyone wants to shape them into disruptive products that will revolutionize old processes, especially in the software engineering domain.

But can LLMs really get past the wall of reasoning to write good code? Honestly, I don't know for sure. There are big bets on "agentic" workflows breaking the reasoning boundary, but with my programmer background it's easy to remain skeptical. Here is one way to look at it: when you ask an LLM to multiply two very large numbers, such as 7380580207762439311 and 237196197329347341, a model on its own will almost certainly get it wrong, because the answer isn't written down anywhere in its training data. However, today's models -do- get it right, because they have been trained to generate and run code scripts, basically reaching for a calculator.

In other words, basic arithmetic is a simple enough concept for an LLM's training to handle: translate the question into code and execute it. So that level of problem you could consider to be "solved."
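To make that concrete, here's a minimal sketch (mine, not any particular vendor's pipeline) of the kind of throwaway Python script a code-running model might emit instead of guessing the digits itself:

    # Delegate the arithmetic to an interpreter rather than predicting
    # the digits token by token.
    a = 7380580207762439311
    b = 237196197329347341
    print(a * b)  # Python integers are arbitrary precision, so the product is exact

The script itself is trivial; the useful part is that the model recognizes this as a question it should not answer from memory.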

Now increase the complexity—increase it by a lot. That's software engineering. The people behind powerful AI models will tactfully ignore how complex software engineering is while promoting their product.

While chain-of-thought and other prompting techniques can help you dredge up real answers from the depths of an extremely complex model, it's easy to overlook the basics: a model can't find answers that aren't there. No matter how much power is thrown at an LLM, even with hundreds of agents running in parallel, that base problem remains.

Consider the basic arithmetic question again. How much compute would it take to brute-force an answer if the model could not generate code, if it were not given a technique for solving that sort of problem? And on top of that, it would still need to verify that the answer is actually correct.

Software engineering is a much harder problem than basic arithmetic. How would you go about making a technique to solve it? How can LLMs leverage said technique? Is it an impossible problem? I don't like to say anything is impossible, but I suspect this will not be solved for the foreseeable future.

I would like to be proven wrong. Why? Because programming is such a ridiculously time-consuming craft. I've sunk hours upon days upon years of my life into piecing logic together, achieving correctness in so many bizarre scenarios. Meanwhile, hardly anyone can appreciate that effort because the real work is invisible to the average person. If there's a future where it's easier to create as a developer, I'd love to see it.

I don't want to discredit the hard work going into changing the future of software engineering. There have been some very real advances. For one, I absolutely enjoy the time savings we get from tools like GitHub Copilot, which autocompletes much of the code I write these days. When Google says that 25% of their new code is being written by AI, I'd bet most of it comes from autocomplete. LLMs and humans go great together.

As for your regular Joe being able to create full business applications from a feedback loop, I wouldn't bet on that being viable for quite some time. It's certainly achievable to "flavor" applications, to tailor them with prompting, but, beyond that, it gets very, very tedious.

Not being correct 100% of the time is a fundamental characteristic of LLMs, and in a domain centered around correctness (software engineering), it's easy to disagree with what AI vendors are predicting for the "next 5 years."
