ChatGPT / Agile #11: Prompt Engineering as Theory Building
Following on from the previous posts on Prompt Engineering, this post begins to weave in concepts from Agile and software development that help build a robust and helpful approach to interacting with ChatGPT.
This isn’t exactly what I thought I’d write when I started this series. But I’ve come to believe that getting high-impact results from ChatGPT requires a deeper understanding of how we interact with it.
But in researching the topics of Prompt Engineering and Prompt Structure, I’ve had an “Aha” moment: writing prompts is writing code.
With this realisation, we can bring to bear a range of theories, concepts and practices that relate to good code writing and use them to augment the existing work that focuses on the structural and linguistic aspects of prompt design.
Writing Code is Theory Building
For years, writing code was thought of as the process of producing the text of the source code, and in many organisations this framing persists today. In this framing, the measure of productivity is “lines of code”.
However, a better conceptual framework for writing code is developing an understanding of a business or technical problem and mapping that to a model of a software solution.
Writing the source code is less important than understanding the problem and its solution. Thank you, Mr Einstein.
Peter Naur, a Danish computer science pioneer and Turing Award winner, wrote a paper in 1985 called “Programming as Theory Building”, which is also reprinted in Alistair Cockburn’s excellent book “Agile Software Development: The Cooperative Game”.
In summary, Naur’s proposition is:
“…the proper, primary aim of programming is, not to produce programs, but to have the programmers build theories of the manner in which the problems at hand are solved by program execution.”
By the word “theories” in this proposition, Naur meant:
“the knowledge a person must have in order not only to do certain things intelligently but also to explain them, to answer queries about them, to argue about them, and so forth. A person who has a theory is prepared to enter into such activities; while building the theory the person is trying to get it.”
This proposition has many significant implications for how software developers are trained, how they perform their jobs, and the roles they play in projects.
Not least among these are the 4th and 6th principles of the Agile Manifesto:
Business people and developers must work together daily throughout the project.
The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
This is because it is only through direct interaction with the people who understand the problems to be solved, or at least who experience them in their day-to-day lives, that developers can build these “theories” and so develop their software.
In the world of coding, ignoring this framing leads to many of the problems we currently have in software development.
Prompt Development is Theory Building
And so it is in Prompt Engineering.
In the current wild and raw world of widely available, powerful, generic AI models such as ChatGPT, the focus is almost entirely on hacking out prompts, either by trial and error or by using collections of pre-written prompts that promise results as quickly as possible.
As in the software industry, this is a severe misfocus.
Sure, prompt developers need to learn skills and techniques in constructing a set of text characters as input to a computer-based tool.
That’s a given. It’s just an entry ticket to the game: it’s not the game.
Understanding the problem you’re trying to solve and building a “theory” of how that works is the most fundamental enabler of successful prompt development.
All the concepts behind Prompt Engineering and prompt structures are interesting, but they won’t help you if you don’t really know what you’re trying to achieve. If you don’t have a “theory” of the problem, we’re back in the world of .45 pistols in pre-school playpens: a powerful tool in the hands of the naive.
But with a good “theory”, you can select and adapt the tools you use — the prompts and different ways of constructing them — independently of the theory.
Another Word for “Theory” is “Mental Model”
Naur’s description of theory-building is very formal and academic. Reading his paper demands focus and time (though there are many excellent summaries you can search for on the web), and it’s not always clear how to adapt his ideas into our own practice.
But another way of thinking about this is to view the “theory” as a “mental model”, a well-established concept of cognitive processing in humans. A model is a representation of something else (the “thing”), constructed to make it easier to understand the relevant aspects of the “thing” and apply that knowledge to some activity or domain.
We use models all the time: they are usually much smaller than the original “thing” they represent and contain only enough information to serve the use case.
A “mental model” is a model we construct in our brains for exactly the same purpose. Our brains are model-building engines and have evolved structural elements — the cerebellum — to support model creation and manipulation. By some brain theories, this is the entire purpose of the cerebellum and the key to our intelligence and thinking abilities.
Systems, Models of the World and the Real World
Building a map between the model and the real world is an integral part of building a mental model of the problem and how to solve it. Today, there are three domains that we need to keep straight in our thinking, as illustrated below.
We have the “real world” we live in, as interpreted by our senses. We have “models of the real world”, such as the “theories” and mental models discussed in this article, and we have the systems and solutions that implement those models.
That diagram doesn’t give the full picture, because we have at least two sets of models: the models the platform developers created to build and maintain the system, and the models the user community maintains about their world and how the platform operates.
I raise this as cautionary advice: when dealing with ChatGPT, deliberately separate these three domains as you work with the interface. Your job in using ChatGPT is to use it as a tool to solve problems in your world. Models of your world are within your control and understanding, and you can validate them through experiment and dialogue.
Your understanding of how the platform works will never be as good as the platform developers’ own models, not by a country mile. So don’t waste time anticipating how the platform works or trying to “game” it to work better for you.
The Bottom Line
Prompt Engineering, both formally and informally developed, is growing at a phenomenal rate. The knowledge and ideas behind it have a solid theoretical and practical history going back many years, and they form a firm foundation for learning and becoming more skilled in your interactions with ChatGPT.
But developing the “theories”, and learning how to develop and frame those theories, is, as the ad says, “priceless”. There is nothing more valuable than the ability to understand domain problems in a fundamental and solution-independent way.
If nothing else, keep in mind that ChatGPT is the front-runner at the moment. But other AI engines and language models are coming fast.
If you have a solid theory (mental model) of your problem space, you can easily transition to new tools with this knowledge intact.
But if you don’t have that theory — or worse, your theory is bound up in ChatGPT-specific prompts — you will have a harder time transitioning to new tools.
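As a sketch of what this separation can look like in practice, here is a minimal Python illustration. The idea is that your “theory” of the problem lives in its own tool-agnostic structure, while rendering it as a prompt for any particular AI tool is a thin, replaceable layer on top. All the names here (the task class, the two rendering functions) are hypothetical illustrations, not a real API:

```python
from dataclasses import dataclass


@dataclass
class SummaryTask:
    """A tool-agnostic model of the problem: what to summarise,
    for whom, and within what limits. This is the 'theory' layer."""
    text: str
    audience: str
    max_words: int


def chatgpt_prompt(task: SummaryTask) -> str:
    """Render the task as a conversational, ChatGPT-style prompt.
    This is a tool-specific layer that can be swapped out."""
    return (
        f"Summarise the following text for {task.audience} "
        f"in at most {task.max_words} words:\n\n{task.text}"
    )


def other_model_prompt(task: SummaryTask) -> str:
    """The same theory rendered for a different, hypothetical tool
    that prefers a structured instruction format."""
    return (
        f"[INSTRUCTION] audience={task.audience}; "
        f"limit={task.max_words} words\n"
        f"[INPUT] {task.text}"
    )


# The domain model is written once; only the rendering changes per tool.
task = SummaryTask(
    text="Quarterly sales rose 12%, driven by the new product line.",
    audience="executives",
    max_words=50,
)
print(chatgpt_prompt(task))
print(other_model_prompt(task))
```

If a new model arrives tomorrow, only a new rendering function is needed; the `SummaryTask` model of the problem, your theory, carries over intact.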
In the next yarn, I will address another concept from Agile that gives a framework for knowledge acquisition.