Ambiguity in Natural Programming

2026-06-27 · 4 min · #computing

I am once again thinking about programming in natural language.

Most programming languages can be parsed according to strict grammars that eliminate ambiguity. These grammars can get ugly, especially for languages like C++ where one symbol can mean many different things depending on its context, but the code can, in theory, always be parsed.

One of the biggest barriers to programming in natural languages, however, is their inherent ambiguity. Natural languages were optimized for efficiency and convenience in communicating between people. They were not designed for clearly communicating with machines.

This means that computer programs that try to understand natural language must do so stochastically. It is likely that a text means this. It is possible that it means that. It would be impossible to work with certainty even if we had a language model with perfect knowledge of a language and the surrounding cultures.

One answer is to embrace the probabilistic nature of language and use Large Language Models (LLMs) for natural language programming. After all, they can already write code that seems statistically likely based on natural language prompts. This approach embraces the chaos.

The other answer is to describe a strict subset of a language and use that. Instead of trying to let the computer understand the entirety of what a user is saying, make the user explain it in language the computer will understand. This is still a computer language, but something more like the user's native language. This approach clings to rigidity.
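To make the rigid approach concrete, here is a minimal sketch of what a strict subset could look like. The three sentence forms, the `PATTERNS` table, and the `parse_sentence` and `run` helpers are all made up for illustration; a real controlled language would need a far richer grammar.

```python
import re

# Hypothetical controlled-English grammar: each pattern is one allowed sentence form.
PATTERNS = [
    (re.compile(r"^set (\w+) to (-?\d+)\.$"), "set"),
    (re.compile(r"^add (-?\d+) to (\w+)\.$"), "add"),
    (re.compile(r"^print (\w+)\.$"), "print"),
]


def parse_sentence(sentence):
    """Return (operation, captured words), or raise if the sentence is outside the subset."""
    text = sentence.strip().lower()
    for pattern, operation in PATTERNS:
        match = pattern.match(text)
        if match:
            return operation, match.groups()
    raise ValueError(f"not in the controlled subset: {sentence!r}")


def run(program):
    """Interpret a list of sentences and return the values they print."""
    variables, output = {}, []
    for sentence in program:
        operation, args = parse_sentence(sentence)
        if operation == "set":
            variables[args[0]] = int(args[1])
        elif operation == "add":
            variables[args[1]] += int(args[0])
        elif operation == "print":
            output.append(variables[args[0]])
    return output
```

Here `run(["Set total to 5.", "Add 3 to total.", "Print total."])` returns `[8]`, while a perfectly understandable sentence like "Bump the total up by three." is rejected with a `ValueError`. That rejection is the rigidity in question.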

Neither of these options is ideal, in my opinion. LLMs require a large amount of resources to use, shackling access to natural language programming to those who can pay to rent from large corporations or can afford an expensive GPU while costs are rising. Another computer language might ease new programmers into the concepts of programming, but they are still forced to describe their problems in a rigid language that punishes straying from its parser.

I'd like to propose a combination of the two: a language model that parses natural language but checks back in with the user to see if it has understood. This can be done by converting high-ambiguity sentences to low-ambiguity ones when appropriate and storing the latter as source. When the interpreter is unsure which reading was intended, it asks the user what they meant.
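The check-in loop could be sketched roughly as follows. Everything here is an assumption made for illustration: the `CANDIDATES` table stands in for a real parsing model, the scores are invented, and the `canonicalize` function, its `ask` callback, and the 0.9 threshold are hypothetical names and values.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the parsing model: maps a free-form sentence to
# candidate low-ambiguity rewrites, each with an invented confidence score.
CANDIDATES: Dict[str, Dict[str, float]] = {
    "add three to the total": {"Add 3 to total.": 0.97},
    "remove the old files and logs": {
        "Delete files that are old. Delete all logs.": 0.55,
        "Delete files that are old. Delete logs that are old.": 0.45,
    },
}


def canonicalize(
    sentence: str,
    ask: Callable[[str, List[str]], str],
    threshold: float = 0.9,
) -> str:
    """Return the low-ambiguity sentence to store as source.

    If no reading is confident enough, hand the ranked options to `ask`,
    which represents checking back in with the user.
    """
    readings = CANDIDATES.get(sentence.strip().lower().rstrip("."), {})
    if not readings:
        raise ValueError(f"no reading found for {sentence!r}")
    options = sorted(readings, key=lambda text: readings[text], reverse=True)
    if readings[options[0]] >= threshold:
        return options[0]  # confident enough: no need to interrupt the user
    choice = ask(sentence, options)
    if choice not in options:
        raise ValueError("the answer must be one of the offered readings")
    return choice
```

With this sketch, "Add three to the total." is rewritten silently, while "Remove the old files and logs" triggers a question, because "old" might apply to the files only or to the logs as well. Whichever reading the user picks is what gets saved, so the stored source no longer depends on the model's guess.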

There are a few important differences between this and programming with an LLM. First, the source is unambiguous natural language, not prompts. This means we can retain deterministic pipelines and reproducible builds. Second, the language model only needs to parse, not generate. This should mean that we can use models that are less expensive to run, opening it up to be used more widely.

The goal is understandability. A system that produces consistent output based on given input is more understandable than one that doesn't.

In summary, there would be three main layers:

Natural language (parsed by a language model)
Computer language (subset of a natural language)
Language-agnostic IR

The computer language is still considered the "source code". It is a common-ground understanding between the user and the machine. Such a programming environment would be expected to convert from the natural language to the computer language live, providing immediate feedback.

Once this is done, the computer language can be compiled into a language-agnostic IR that can then be compiled or interpreted. I say language-agnostic because the top-level natural language should be any language the user desires.
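As a rough sketch of what "language-agnostic" could mean here, the example below lowers controlled sentences from two different natural languages into the same instruction type. The `Instr` class, the `FRONTENDS` table, the `lower` and `execute` functions, and the English and Spanish sentence forms are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """One IR instruction; it carries no trace of the language it came from."""
    op: str
    target: str
    value: int


# Hypothetical per-language front ends for the same two sentence forms.
FRONTENDS = {
    "en": [
        (re.compile(r"^set (?P<target>\w+) to (?P<value>-?\d+)\.$", re.IGNORECASE), "set"),
        (re.compile(r"^add (?P<value>-?\d+) to (?P<target>\w+)\.$", re.IGNORECASE), "add"),
    ],
    "es": [
        (re.compile(r"^pon (?P<target>\w+) a (?P<value>-?\d+)\.$", re.IGNORECASE), "set"),
        (re.compile(r"^suma (?P<value>-?\d+) a (?P<target>\w+)\.$", re.IGNORECASE), "add"),
    ],
}


def lower(sentence, language):
    """Lower one controlled sentence to an IR instruction."""
    for pattern, op in FRONTENDS[language]:
        match = pattern.match(sentence.strip())
        if match:
            return Instr(op, match["target"], int(match["value"]))
    raise ValueError(f"cannot lower {sentence!r} from {language!r}")


def execute(instructions):
    """Interpret the IR and return the final variable values."""
    env = {}
    for instr in instructions:
        if instr.op == "set":
            env[instr.target] = instr.value
        elif instr.op == "add":
            env[instr.target] += instr.value
    return env
```

In this sketch, `lower("Add 3 to total.", "en")` and `lower("Suma 3 a total.", "es")` produce equal `Instr` values, so everything after the IR can be shared regardless of the language the user writes in.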

I haven't made a proof of concept yet, because this is outside of the realm of my current expertise. The current state of Natural Language Processing (NLP) is focused on LLMs and generating text, not interpreting it. I don't know if there are robust solutions for machines to understand human text beyond simply transforming it directly.

If anyone reading this has any suggestions or knows of any existing research in this area, please contact me! I'd love to hear your thoughts.