ChatGPT / Agile Project Mgt — Part #6: Some thoughts on capabilities.
This is Part 6 of an open-ended series on ChatGPT and how to use it in Agile Project Management. For other parts, check out the master index for this series.
After five previous parts in this series describing many interactions with ChatGPT — including yesterday’s monster on User Stories (maybe I should split that bad boy into two shorter sub-yarns?) — I’m ready to pause and think about what’s happening. I’ll give you a mix of my opinions, what OpenAI says, what ChatGPT says, and what others say in other yarns on Medium. I give a shout-out to three excellent yarns that provide factual background on how ChatGPT works. I then look at the information I’ve collected on the key capabilities of ChatGPT and how they break down into underlying use cases. I discuss some limitations, both those that OpenAI lists and my interpretation of my experience with ChatGPT so far. And I counterbalance that with my view of the strengths.
Other People’s Thoughts on ChatGPT
It’s hard to open a magazine, blog platform (like Medium) or even a newspaper without seeing something about ChatGPT and AI. OpenAI is starting to do to AI chat what Google did to web search — I see and hear people talking about “googling” even when they’re using Bing or some other search engine. Likewise, the term “ChatGPT” is being generalised to cover many products and applications, and is even converging with the far bigger term “AI” (Artificial Intelligence).
Much of it is spammy, shouty, ill-advised or breathlessly enthusiastic. However, there are still some good yarns out there, ranging from the very detailed and very long, through medium-length but still very factual, to shorter and more light-hearted. In that order, here are my current favourites:
What Is ChatGPT Doing … and Why Does It Work? — Stephen Wolfram Writings
How ChatGPT Works: The Model Behind The Bot | by Molly Ruby | Jan, 2023 | Towards Data Science
Introducing ChatGPT! The Revolutionary New Tool for… | by Cassie Kozyrkov | Medium
Depending on your reading approach, those will give you a good overview of what’s really happening.
What OpenAI Says
In organising their Help content, OpenAI provides “Guides” to what seem to be the top five functions of their technology. These are illustrated in the diagram below.
Whilst this isn’t a perfect top-level breakdown of what’s available, it is how OpenAI has organised the level of documentation beyond the introductory. Check out the detail in the “Key Concepts” section of the OpenAI online documentation.
Code Completion and Image Generation are in some form of Beta release, and “Fine Tuning” is about training and enhancing models. I’ll leave those for you to explore if they interest you.
Most people are interested in the two main text processing and generating capabilities: “Text Completion” and “Embeddings”.
OpenAI describes the text completion capability of ChatGPT as:
You input some text as a prompt, and the model will generate a text completion that attempts to match whatever context or pattern you gave it.
This is what most people start with when they get going.
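At the API level, “text completion” really is just that simple: you send a prompt, you get back a completion. As a rough sketch, here is what a raw call looks like using only the Python standard library. The endpoint, model name and payload fields below reflect OpenAI’s completions API as documented at the time of writing, so treat the specifics as assumptions that may change.

```python
import json
import os
import urllib.request

# Sketch of a raw call to OpenAI's text-completion endpoint.
# Model name and payload fields follow the completions API as documented
# in early 2023; treat the specifics as assumptions.
def build_completion_request(prompt, model="text-davinci-003", max_tokens=64):
    """Package a prompt into an HTTP request for /v1/completions."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        "https://api.openai.com/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + os.environ.get("OPENAI_API_KEY", ""),
        },
    )

req = build_completion_request("Summarise this email in two sentences: ...")
# With a real OPENAI_API_KEY set in the environment, you would send it like:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["text"])
```

Everything interesting lives in the `prompt` string; the model infers the task (summarise, rewrite, list, explain) from the context and pattern you give it.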
I asked ChatGPT itself for some Text Completion use cases, and it gave me the following:
If you want to explore more of these, you can iteratively ask ChatGPT for more use cases in each category or even for use cases in other categories.
ChatGPT’s massive underlying data set (including Wikipedia content) gives it a huge fact base to help complete your prompts.
Embeddings is the technical term for what I interpret as the “internal representation” of the language ChatGPT parses from your inputs and uses to generate the required output text. When ChatGPT performs functions, it performs them on the Embeddings “version” of your input text.
ChatGPT performs two types of embeddings:
After ChatGPT generates these vector representations, it can perform a wide range of very powerful functions:
Both the above capabilities, Text Completion and Embeddings, support the enormous range of detailed use cases you see, read about, or even try for yourself.
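To make the embeddings idea concrete, here is a toy sketch in Python. The vectors below are invented three-dimensional stand-ins (real embedding models produce vectors with hundreds or thousands of dimensions), but the mechanism, ranking texts by cosine similarity between their vectors, is the standard one underlying embedding-based search.

```python
import math

# Toy illustration of embedding-based similarity search. The vectors are
# made-up 3-D stand-ins; real embeddings are far higher-dimensional, but
# the ranking mechanism (cosine similarity) is the same.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

embeddings = {
    "sprint planning": [0.90, 0.10, 0.20],      # invented values
    "iteration planning": [0.85, 0.15, 0.25],
    "chocolate cake recipe": [0.10, 0.90, 0.80],
}

# Rank the other phrases by how close their vectors are to the query's.
query = embeddings["sprint planning"]
ranked = sorted(
    (phrase for phrase in embeddings if phrase != "sprint planning"),
    key=lambda p: cosine_similarity(query, embeddings[p]),
    reverse=True,
)
print(ranked)  # the semantically closer phrase ranks first
```

Swap the made-up vectors for real model-generated embeddings and this same few lines of ranking logic gives you semantic search, clustering and recommendation.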
But, just as in motorsports, having a powerful engine doesn’t mean you’ll always win the race; having these powerful capabilities doesn’t mean you can use ChatGPT for anything or that it can compete with everything (as some pundits will assure you).
My Thoughts on ChatGPT
The key to using ChatGPT for any given “real world” application is understanding that ChatGPT is not a god, nor infallible. In fact, it is quite fallible. You would be wise to read the “Limitations” section here: ChatGPT: Optimizing Language Models for Dialogue (openai.com)
I’ll quote just two of OpenAI’s warnings:
ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers.
ChatGPT is sensitive to tweaks to the input phrasing or attempting the same prompt multiple times.
So the key to using ChatGPT is simple and perhaps daunting. You pretty much have two options:
- Have a use case with a high tolerance for error and noise, OR
- Have someone more expert than ChatGPT in the use case domain available to validate the output before using it.
So, you can play the numbers game and predict (hope) that the good answers substantially outweigh the bad ones and that the downside risk of someone using the bad info is not harmful. Or you can use ChatGPT output as the input to another workflow phase before you provide the information to its ultimate consumer.
Using ChatGPT Output in a Live Application or Use Case
Although ChatGPT has billions of data points in its model, using that knowledge is both an opportunity and a risk.
My experience so far is that I couldn’t (or won’t) use any of the output from ChatGPT in any practical application without first checking it. That notwithstanding, I have seen output from ChatGPT that — having read it — I would happily use in a practical application. For example, I tend to write long emails (bad habits die hard), and I’ve experimented with ChatGPT to give me a more succinct version. Having read the response carefully, I was happy to send the resulting email with only minor tweaks.
Perhaps I’ll become more relaxed over time as I use ChatGPT more and understand the range of outputs. We’ll have to see.
But let’s look at this more generically, and as a framework, we’ll use this chart from Kathy Sierra’s book “Badass: Making Users Awesome”.
You can see that Kathy loosely divides ability (aka expertise) into three zones:
- The “Suck” Zone: beginners and others who generally “suck” at the skill
- The “Competent” Zone: people who have gotten over the “Suck Threshold” and become confident and productive
- The “Badass” Zone: someone who is highly proficient — “world class” at this skill
So far in my dealings with ChatGPT, my opinion is that its output is that of a person in the high-end “Competent” zone, but not a “Badass”.
I read somewhere today that ChatGPT is like a “know all” friend who can tell you huge amounts of facts and figures on any topic but gets them wrong sometimes and is very picky about how they interpret your questions. Or perhaps it’s like the “idiot savant” played by Dustin Hoffman in the film “Rain Man” who memorises the phone book one night in a motel and the next day recognises the waitress’s name in the coffee shop from her name tag and can recite her address and phone number.
If your expectations are wrong, then ChatGPT will underperform. It will not provide you with anything that is both expert and complete on anything longer than a few hundred words. It won’t write a book (unless it’s done short chapter by short chapter). It won’t create a complete course.
ChatGPT will provide examples and partial outlines but not the whole thing.
It feels to me that the more you push it to give you more detail, the more repetitive it becomes and the more cyclic re-use of previous content it outputs. Check out the chart below:
If we start with some fundamental concepts and progressively ask ChatGPT to elaborate, my experience is that you get more text, but the same level of detail. It’s like there’s an asymptotic barrier at the “high-end competent” level: you’re getting more content, but it’s just saying the same thing in different ways.
To become “Badass”, you must do the heavy lifting and go beyond ChatGPT’s outputs.
Notwithstanding the potential errors and the expertise ceiling that governs its outputs, ChatGPT has some enormous strengths:
- Natural Language Parsing: one key strength is ChatGPT’s ability to understand what you type, both the concepts you’re providing and the actions you want it to perform. This kind of interface has many benefits over a GUI front-end to a data set: there is only one entry point, no modal screens or forms to enter your requests, and none of the issues with how those are implemented (bad form validation … ugh!).
- Knowledge Base: ChatGPT’s ability to parse your requests and map those into the underlying factual and other data it has in its model is spectacular. You might argue that the data is a bit stale (the model stops in early 2022) or that there’s no web interface. But for many applications, what is there is more than enough. I can always hope this is improved, but we also have to recognise how much work went into setting up this knowledge base and training it to its current level of utility. Extending it further will cost someone big time.
- Output flexibility: I love that you can describe the output you want and ChatGPT will produce it. Sometimes it’s a bit flaky, but it’s very flexible.
The Bottom Line
ChatGPT has many benefits and capabilities, but it won’t make you an expert or produce expert-level output that you can use without editing, checking or enhancing. But it can provide the scaffolding for you to “bootstrap yourself to badass” if you actively participate.
I’ve probably said enough for this yarn. I have some additional perspectives that I’ll include in the next yarn before we start back on our actual use case examples.
And don’t forget to check out the master index for this open-ended series.