ChatGPT / Agile #12: GPT-4 and Wordle — a bad day at the office
Ok, I know this yarn is *way* off-topic for this series, but I thought it would be interesting to share as an example of why your interactions with GPT need to be carefully thought through.
TL;DR
I thought GPT-4 would crush this simple task, but it did no better than my “par” average over the past few months. It repeatedly suggested words that were inconsistent with the prior feedback or the rules of the game. I tried a second time with a much clearer up-front dialogue, but it was no better.
Overall, a bad day at the office for GPT-4.
Spoiler alert: this yarn includes the answer to game #638, which I played very early on Sunday 19th March, 2023.
The Bottom Line
I’m a bit confused by this outcome. It is absolutely the opposite of what I thought would happen.
Firstly, I tried pasting screenshots of the game into the Chat window based on the vague idea that GPT-4 could accept images as input. But there’s no real change to the Chat interface other than being able to select the language model. No dramas.
Secondly, I really thought Wordle would be in the “sweet spot” for GPT: it’s a language game that requires the ability to keep a running state of which letters have particular attributes (in the word or not, and in the correct location or not).
I had visions of crunching the game in 2 turns (and, of course, I would disclose my GPT assistance!) or maybe 3 with my high-powered helper.
But in the end, I hacked out what I call a “par” game of 4, meaning that 4 turns is by far my modal result (see below).
I was only able to get a result because I corrected GPT-4 multiple times. At various points, GPT’s suggested word:
- included letters that Wordle had flagged as not in the target word
- included letters in positions that were already known to be incorrect
- wasn’t a word in the English language
- was an obscure, rarely used word (what the hell is “CRODE”?)
If I had uncritically used GPT’s suggestions, I would have gone out the back door at six turns, nowhere near an answer.
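The first two failure types can be caught mechanically. Here is a minimal sketch, assuming the G/Y/X feedback encoding from my second chat (G for green, Y for yellow, X for grey); the function names `score` and `is_consistent` are my own illustration, not anything Wordle or GPT provides. The idea is that a candidate word is only possible if it would have produced exactly the feedback I saw for every earlier guess.

```python
from collections import Counter


def score(guess: str, answer: str) -> str:
    """Return Wordle-style feedback for `guess` against `answer` as a G/Y/X string."""
    guess, answer = guess.lower(), answer.lower()
    marks = ["X"] * len(guess)
    # Answer letters that are not already matched by a green tile.
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "G"
    # Yellow only while unmatched copies of the letter remain (handles repeats).
    for i, g in enumerate(guess):
        if marks[i] != "G" and unmatched[g] > 0:
            marks[i] = "Y"
            unmatched[g] -= 1
    return "".join(marks)


def is_consistent(candidate: str, history: list[tuple[str, str]]) -> bool:
    """A candidate is possible only if it reproduces every observed feedback."""
    return all(score(guess, candidate) == feedback for guess, feedback in history)


# Feedback from my first game, re-encoded as G/Y/X strings.
history = [("adieu", "XYXYX"), ("demon", "YYXYX"), ("codex", "GYYYX")]

for word in ["dozen", "cider", "cloud", "cored", "crode", "credo"]:
    print(word, is_consistent(word, history))
```

Run against my first game, this rejects DOZEN, CIDER, CLOUD and CORED and accepts CREDO. It also accepts CRODE, because it only checks the tiles and knows nothing about English; a real helper would need a word list as well.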
So what is going on here? Is it because I didn’t engineer my prompts very well? I thought that might be the case. If you can be bothered reading through the transcript, you’ll see that I tried twice in separate chats.
In the first chat, I admit I started very casually.
I assumed GPT-4 knew what Wordle was because it didn’t object to my first input. In my past experience, GPT has been pretty frank about what it knows or doesn’t know. Inputting the feedback from Wordle to GPT-4 in laborious descriptive English was painful.
But I had a second bite at the apple by starting a new chat in GPT and a new game in Wordle. Both GPT and I were much more structured in how we interacted with each other. GPT suggested a much more efficient way for me to communicate the progressive results. I thought I was onto a winning process in the second chat sequence, but the results were just as bad, if not worse. I killed it on the second turn after some pretty bad responses from GPT.
Is this a good use of the astounding resources used to develop and operate GPT-4? Not really.
Does it illustrate the pitfalls of not understanding GPT’s strengths and weaknesses? Absolutely.
The Transcript
Below is the transcript of one full game and a partial second re-run of Wordle #638. I’ve highlighted my inputs in bold because GPT’s are quite long and my inputs summarise the status.
First Game
AOP: I’m playing wordle. My first word is ‘ADIEU’. The D is in the word but in the wrong location. And the E is in the word but mislocated. what should be my second word?
GPT4: Given the feedback from your first guess, we know that the word has a ‘D’ and an ‘E’ in different positions. A good second guess could be ‘DEMON’. This choice incorporates the ‘D’ and ‘E’ in different positions and uses other common letters to increase the chances of discovering more correct letters.
AOP: second word is ‘DEMON’ — the ‘D’ and ‘E’ are still mislocated and the ‘O’ is present but mislocated also. What should be my 3rd word?
GPT4: Now that you have more information, a good third guess could be ‘DOZEN’. This word includes the ‘D’, ‘E’, and ‘O’ in different positions and adds new letters ‘Z’ and ‘N’ to help uncover the correct word.
AOP: but ‘DOZEN’ had the ‘D’ in the same location as the 2nd word and we know that’s the wrong location, so ‘DOZEN’ is wrong.
GPT4: Apologies for that oversight. Given the information, a better third guess would be ‘CODEX’. It incorporates the ‘D’, ‘E’, and ‘O’ in different positions and introduces the letter ‘C’ and ‘X’ to increase the chances of discovering more correct letters.
AOP: the results for ‘CODEX’: ‘C’ is in the correct location so we know the answer starts with a ‘C’ — congratulations. ‘X’ is not in the word. ‘O’, ‘D’ and ‘E’ are still in the wrong position
GPT4: Thank you for the update. Given the information, a good fourth guess would be ‘CIDER’. It starts with ‘C’, and incorporates the ‘D’, ‘E’, and ‘O’ in different positions, while introducing the letter ‘I’ and ‘R’ to increase the chances of discovering more correct letters.
AOP: I’m sorry ‘CIDER’ is a wrong guess, because we know that ‘O’ is in the word, just not in the 4th or 2nd position. and ‘CIDER’ has no ‘O’
GPT4: Apologies for the mistake. With the new information, a good guess would be ‘CLOWD’. It starts with ‘C’ and incorporates the ‘D’, ‘E’, and ‘O’ in different positions, while introducing the letter ‘L’ and ‘W’ to increase the chances of discovering more correct letters.
AOP: I’m sorry. we know that the word includes an ‘E’, just not in the 2nd or 4th location. In fact, now that I look at it, ‘CODEX’ was not a correct word either because Wordle already told us that ‘E’ isn’t in the 4th location. According to the accumulated results we know that the word starts with a ‘C’. it includes the letters ‘O’, ‘D’ and ‘E’. it does not include the letters ‘A’, ‘I’, ‘U’, ‘M’ and ‘X’. ‘D’ cannot be in the 1st, 2nd or 3rd locations. ‘O’ cannot be in the 2nd or 4th locations. and the ‘E’ cannot be in the 2nd or 4th locations. what is the solution?
GPT4: Apologies for the confusion in my previous responses. Considering the accumulated results and constraints, a correct guess would be ‘CRODE’. It starts with ‘C’, includes the letters ‘O’, ‘D’, and ‘E’ in the appropriate positions, and does not include any of the excluded letters.
AOP: what does ‘CRODE’ mean? the answer needs to be a word in English
GPT4: Apologies for my previous response. I misunderstood the constraints. Based on the information provided, a better guess would be ‘CORED’. It starts with ‘C’ and includes the letters ‘O’, ‘D’, and ‘E’ in the appropriate positions, and it is a valid English word. “Cored” means to remove the core or central part from a fruit or vegetable.
AOP: do you understand the rules of Wordle?
GPT4: Yes, I understand the rules of Wordle. In Wordle, you have to guess a five-letter English word within six attempts. After each guess, the game provides feedback using three colors:
- Yellow: The letter is correct but is in the wrong position.
- Gray: The letter is not in the target word at all.
- Green: The letter is correct and is in the correct position.
I apologize for my previous responses, as they didn’t fully adhere to the game’s rules. Based on the information provided so far, let’s attempt another word:
AOP: The word starts with ‘C’. It includes ‘O’, ‘D’, and ‘E’. ‘D’ cannot be in the 1st, 2nd, or 3rd positions. ‘O’ cannot be in the 2nd or 4th positions. ‘E’ cannot be in the 2nd or 4th positions.
GPT4: A valid guess based on this information would be ‘CLODE’. This word adheres to the constraints and is an English word, though not commonly used. In English, “clod” is a lump of earth or clay, and “clode” is the past tense of “cleave” in its meaning related to splitting or cutting something apart.
AOP: ok, ‘CLODE’ adheres to the rules, but in my experience, it’s rare for the answer to be an uncommon word or obscure. is there are more common word that fits the rules and our accumulated answers?
GPT4: I apologize for the confusion in my previous responses. Based on the information provided, a more common word that adheres to the constraints would be ‘CLOUD’. It starts with ‘C’, includes the letters ‘O’, ‘D’, and ‘E’ in the appropriate positions, and is a valid and common English word.
AOP: ‘CLOUD’ does not include the letter ‘E’
GPT4: Apologies for the oversight. Based on the constraints provided, the correct word should be ‘CREDO’. It starts with ‘C’, includes the letters ‘O’, ‘D’, and ‘E’ in the appropriate positions, and is a valid and common English word. “Credo” means a statement of the beliefs or aims that guide someone’s actions, often used in a religious context.
AOP: ok, I’ll go with ‘CREDO’
GPT4: Great! ‘CREDO’ fits the constraints and is a valid English word. Good luck, and I hope it’s the correct answer in your game of Wordle. If you need further assistance or have any other questions, feel free to ask.
AOP: ‘CREDO’ is correct! thank you.
GPT4: You’re welcome! I’m glad that ‘CREDO’ was the correct answer for your Wordle game. If you have any other questions or need assistance in the future, please don’t hesitate to ask. Enjoy playing!
Second Game
AOP: Hello, I’m playing the word game “Wordle” — do you understand the rules and strategies of Wordle?
GPT4: Yes, I’m familiar with the word game “Wordle.” In this game, players have six attempts to guess a five-letter word chosen by the system. After each guess, the system provides feedback using colored tiles:
- Yellow tiles: A correct letter is in the word, but it is in the wrong position.
- Gray tiles: A letter is not in the word.
- Green tiles: A correct letter is in the word and in the correct position.
The objective is to guess the correct word within the six attempts. Here are some general strategies to improve your chances of success:
- Start with common five-letter words that have a mix of vowels and consonants, such as “apple,” “table,” or “plant.” This will help you quickly identify some of the letters in the target word.
- Pay attention to the feedback. Use the color-coded tiles to inform your next guess. Try to use the correct letters in different positions to narrow down their placement.
- Eliminate letters that are not in the word. If a letter is gray, avoid using it in future guesses.
- Consider letter frequency. Common letters like ‘e,’ ‘a,’ ‘r,’ ‘s,’ ‘t,’ ‘n,’ and ‘i’ tend to appear more frequently in words, so it’s a good idea to incorporate them in your guesses.
- Look for common patterns or word endings such as “-ing,” “-est,” “-ate,” or “-ent.”
Remember, practice makes perfect. The more you play, the better you’ll get at identifying patterns and refining your guessing strategy.
AOP: I would like you to help me play a game of ‘Wordle’ and suggest high probability answers based on my results. how do I input my results to you? the game is visual
GPT4: Sure, I’d be happy to help you! You can provide the information from the game as text input. For each of your guesses and the feedback you receive, please use the following format:
Guess #: Word: [your guessed word] Feedback: [color-coded feedback with position]
For example:
Guess 1: Word: apple Feedback: GGYYY
Here, “GGYYY” indicates two green tiles (correct letters in the correct positions) and three yellow tiles (correct letters in the wrong positions).
You can provide me with the guesses and feedback one at a time or in a list, and I’ll help you come up with the next guess based on the information provided.
AOP: ok, that sounds like a good strategy and protocol.
GPT4: Great! When you’re ready to start a game of Wordle, simply provide your guess and the feedback you receive following the format I mentioned earlier. I’ll then help you come up with the next guess based on the information you provide. Good luck, and let’s have some fun!
AOP: Guess 1: ADIEU
GPT4: Please provide the feedback you received for your first guess using the format I mentioned earlier. For example:
Guess 1: Word: adieu Feedback: GYYGG
Once you provide the feedback, I can help you come up with your next guess.
AOP: what letter do I give you to indicate a grey letter (not in the word)?
GPT4: Apologies for the confusion. To represent the gray tiles, you can use the letter “X.” So the feedback format would be like this:
Feedback: GXYYX
This example feedback indicates one green tile (correct letter in the correct position), two yellow tiles (correct letters in the wrong positions), and two gray tiles (letters not in the word).
Please provide the feedback for your first guess using G, Y, and X, and I’ll help you come up with your next guess.
AOP: Guess 1: Word: adieu Feedback: XYXYX
GPT4: Thank you for providing the feedback. Based on your input:
Guess 1: Word: adieu Feedback: XYXYX
This indicates that the word contains two of the letters ‘d,’ ‘i,’ ‘e,’ and ‘u,’ but they are not in the correct positions.
Here’s a suggested word for Guess 2:
Word: guide
This word uses three of the letters from the first guess and places them in new positions. Good luck!
AOP: I’m sorry but we already know that ‘I’ and ‘U’ are not in the target word
GPT4: Apologies for the oversight. Based on the provided feedback:
Guess 1: Word: adieu Feedback: XYXYX
We know that ‘a’ and ‘d’ are in the word but not in the right positions. Also, ‘i’ and ‘u’ are not in the word. Considering this information, here’s a suggestion for Guess 2:
Word: dread
This word uses the known letters ‘a’ and ‘d’ and introduces new letters ‘r’ and ‘e.’ Good luck!
AOP: I’m sorry, but ‘A’ is not in the word
GPT4: Apologies for the confusion. Based on the correct feedback:
Guess 1: Word: adieu Feedback: XYXYX
We know that ‘d’ is in the word but not in the right position, and ‘a,’ ‘i,’ and ‘u’ are not in the word. Considering this information, here’s a suggestion for Guess 2:
Word: blend
This word uses the known letter ‘d’ and introduces new letters ‘b,’ ‘l,’ ‘e,’ and ‘n.’ Good luck!
AOP: Guess 2: Word: BLEND Feedback: XXGXY
GPT4: Based on the provided information:
Guess 1: Word: adieu Feedback: XYXYX
Guess 2: Word: blend Feedback: XXGXY
We know the following:
- ‘d’ is in the word and is in the correct position (the third position).
- ‘b,’ ‘l,’ ‘e,’ ‘n,’ and ‘a’ are in the word but not in their current positions.
- ‘i’ and ‘u’ are not in the word.
Considering this information, here’s a suggestion for Guess 3:
Word: shade
This word maintains the correct ‘d’ in the third position, brings back ‘a’ and ‘e,’ and introduces new letters ‘s’ and ‘h.’ Good luck!
AOP: i’m sorry but your point #2 in your response is incorrect: ‘B’, ‘L’, ‘N’ and ‘A’ are not in the word. Your points 1 and 3 are correct
GPT4: Apologies for the confusion, and thank you for pointing that out. Based on the correct information:
Guess 1: Word: adieu Feedback: XYXYX
Guess 2: Word: blend Feedback: XXGXY
We know the following:
- ‘d’ is in the word and is in the correct position (the third position).
- ‘a,’ ‘b,’ ‘l,’ ‘e,’ ‘n,’ ‘i,’ and ‘u’ are not in the word.
Considering this information, here’s a suggestion for Guess 3:
Word: crowd
This word maintains the correct ‘d’ in the third position and introduces new letters ‘c,’ ‘r,’ ‘o,’ and ‘w.’ Good luck!
At this point I abandoned this chat sequence.
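For what it’s worth, the running state I expected GPT-4 to keep is mechanical to derive from those G/Y/X lines. Below is a minimal sketch, assuming that encoding and 1-based positions; `summarise` and the shape of what it returns are my own illustration, not part of Wordle or GPT.

```python
def summarise(history):
    """Fold (word, feedback) pairs into what is known about the target word."""
    placed = {}      # position -> letter confirmed by a green tile
    present = set()  # letters known to be in the word
    banned = {}      # letter -> positions where it is known NOT to be
    absent = set()   # letters known not to be in the word
    for word, feedback in history:
        tiles = list(enumerate(zip(word.lower(), feedback), start=1))
        # First pass: greens and yellows establish which letters are present.
        for pos, (letter, mark) in tiles:
            if mark == "G":
                placed[pos] = letter
                present.add(letter)
            elif mark == "Y":
                present.add(letter)
                banned.setdefault(letter, set()).add(pos)
        # Second pass: a grey tile means "absent" unless the letter is already
        # known to be present (a surplus repeat), which only rules out that spot.
        for pos, (letter, mark) in tiles:
            if mark == "X":
                if letter in present:
                    banned.setdefault(letter, set()).add(pos)
                else:
                    absent.add(letter)
    return placed, present, banned, absent


# The two guesses from the second game.
placed, present, banned, absent = summarise(
    [("adieu", "XYXYX"), ("blend", "XXGXY")]
)
print(placed, present, banned, absent)
```

For the two guesses above, this gives ‘e’ fixed in the third position, ‘d’ present but not in the second or fifth position, and ‘a’, ‘i’, ‘u’, ‘b’, ‘l’ and ‘n’ ruled out.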
[END]