The Thing About Using ChatGPT for API Design
Net API Notes for 2022/12/28 - Issue 208
The hyperbole around "AI"-generated output reached new heights in 2022. Perhaps you've seen art created by one of the many popular image models (DALL-E 2, Midjourney, and Stable Diffusion). Or maybe you've listened to music generated from spectrograms (Riffusion). In October, I explored the possibility of using GitHub's Copilot to create an OpenAPI description.
However, the latest head turner has been OpenAI's ChatGPT. ChatGPT is a Large Language Model (LLM) that excels in conversational dialog, generating all sorts of plausible-sounding text. The web has been filled with experiments testing ChatGPT's capabilities, from cringeworthy lyric generation to workshop ideation.
Those examples are impressive (and problematic), but how "large" is ChatGPT's language model? Can it "speak" OpenAPI? Can a million virtual, cloud-based monkeys typing on keyboards eventually write Shakespeare? And, if so, are API designers' days numbered?
I attempt to answer that and more in this edition of my Net API Notes.
Is ChatGPT A Potential Tool for API Designers?
ChatGPT Writes Human-Like Responses to Chat Prompts
OpenAI trained ChatGPT to generate human-like text in response to prompts it receives. It is built on OpenAI's GPT-3.5, another model used to process and render text. If the name "GPT" sounds familiar, you might remember OpenAI describing previous models as "too dangerous to release to the general public". Safety improvements (along with marketplace pressure from open-source alternatives, like Stable Diffusion) have resulted in greater public access.
ChatGPT generates text from a prompt that a user types into a web form. The response is not pre-written or predetermined; instead, ChatGPT generates text on the fly based on the model's understanding of language and its ability to predict the next word in a sequence. It is important to note here that coherent text is not necessarily accurate text.
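To make "predict the next word in a sequence" concrete, here is a toy sketch. It is nowhere near how GPT-3.5 actually works (that model is a large neural network); it only counts which word follows which in a tiny, made-up corpus and then greedily picks the most frequent follower. The corpus and the function names are my own illustration.

```python
from collections import Counter, defaultdict

# Tiny, made-up training text; a real model learns from vastly more data.
CORPUS = "the pet store lists pets . the pet store ships pets . the store adopts the moon ."


def train_bigrams(text):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows, seed, max_words=6):
    """Greedily append the most frequent next word, one word at a time."""
    output = [seed]
    for _ in range(max_words - 1):
        candidates = follows.get(output[-1])
        if not candidates:
            break  # nothing ever followed this word in training
        # most_common(1) returns the likeliest continuation seen so far
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)


model = train_bigrams(CORPUS)
sentence = generate(model, "the")
print(sentence)
```

Every word pair in the result was seen during training, so the output reads smoothly. Nothing in the process checks whether the sentence is true, which is the gap between coherent and accurate.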
Also notable is that ChatGPT retains "memory" between prompts, allowing users to iterate on a response. The bot "remembers" earlier comments and applies them to subsequent information provided.
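One common way for a chat interface to provide this kind of "memory" is to keep the running transcript and hand all of it to the model with each new prompt. Whether ChatGPT does exactly this is an assumption on my part; the `Conversation` class below is a hypothetical helper, and the messages are abbreviated stand-ins.

```python
class Conversation:
    """Hypothetical helper: keeps the transcript so later prompts can build on earlier ones."""

    def __init__(self):
        self.messages = []  # list of {"role": ..., "content": ...} dicts

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

    def context_for_next_reply(self):
        """Everything said so far, flattened into the text a model would see."""
        return "\n".join(f"{m['role']}: {m['content']}" for m in self.messages)


chat = Conversation()
chat.add("user", "Write JTBD stories for a pet store.")
chat.add("assistant", "1. As a pet owner, I want to search available pets...")
chat.add("user", "Now write an OpenAPI description for the stories above.")

# The last prompt never repeats the stories; they arrive as part of the context.
context = chat.context_for_next_reply()
print(context)
```

Under this assumption, "the stories above" works as a reference because the earlier exchange is still part of the input.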
ChatGPT has written short stories, poems, and even tricky emails. But can it write an OpenAPI description?
JTBD Are Powerful Prompts for OpenAPI Generation
Before ChatGPT can attempt to write an OpenAPI description, it needs text input in the form of a prompt. If we're doing API-design first, there is the chance that we have the Jobs To Be Done (or JTBD) stories already written.
Many API folks familiar with OpenAPI are also probably familiar with the Pet Store example (the 3.0 minimal YAML syntax is on GitHub). In my test, I want to compare this "canonical" version, written by experts, with the output rendered by ChatGPT.
Unfortunately, I couldn't find a set of JTBD examples that might have preceded the sample YAML OpenAPI description. (If Pet Store JTBD are indeed out there, somewhere, please let me know). We need a detailed prompt before we can compare our descriptions.
Time to "galaxy brain" the problem:
- We need JTBD text to prompt ChatGPT
- ChatGPT writes text
What if I back up a step and ask ChatGPT to create an OpenAPI description AND also the use cases for that description?
Writing detailed yet succinct stories is a skill. For my Pet Store request, ChatGPT generated the following JTBD stories:
- As a pet owner, I want to be able to search for and view a list of available pets so that I can find a suitable pet to adopt.
- As a pet owner, I want to be able to filter the list of available pets by various criteria (e.g. type, breed, age, size) so that I can narrow down my search and find a pet that meets my specific needs.
- As a pet owner, I want to be able to view detailed information about a specific pet so that I can make an informed decision about whether or not to adopt it.
- As a pet owner, I want to be able to add a pet to my shopping cart so that I can proceed with the adoption process.
- As a pet owner, I want to be able to place an order for a pet so that I can complete the adoption process and bring my new pet home.
- As a pet owner, I want to be able to view the status of my order (e.g. pending, processing, shipped) so that I can stay informed about the status of my adoption.
- As a pet owner, I want to be able to cancel an order if I change my mind or encounter any issues during the adoption process.
- As a pet store employee, I want to be able to add new pets to the available inventory so that I can expand the options for potential adopters.
- As a pet store employee, I want to be able to update the information for a specific pet (e.g. change its status from available to adopted) so that I can keep the pet inventory up to date.
- As a pet store employee, I want to be able to view the orders placed by customers so that I can process them and ensure that the pets are delivered to their new homes.
Honestly, that is much better than I had anticipated. After a quick skim, I'd change all references to "pet owner" to "prospective pet owner" (or "adopter"). And there are a few use cases that could be added. But if I just run down the list, I am easily able to sketch out the following endpoints:
- JTBD 1 and 2: POST /search where sensitive information is provided for lookup (or a GET /pets with query parameters)
- JTBD 3: GET /pets/{petId}
- JTBD 4: POST /carts with a body having /pets/{petId} information, among other things
- JTBD 5: POST /orders with a body specifying /carts/{cartId}
- JTBD 6 and 10: GET /orders/{orderId}
- JTBD 7: DELETE /orders/{orderId} (or a PATCH to the status property of /orders/{orderId})
- JTBD 8: POST /pets
- JTBD 9: PATCH /pets/{petId} (or maybe a PUT, depending on the complexity of the pet schema)
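The endpoint list above can be roughed into an OpenAPI 3.0 skeleton. This is a minimal sketch, not the description ChatGPT produced: the paths and methods come from my list (taking the GET /pets option for JTBD 1 and 2), while the title, version, operationIds other than "listPets", the string-typed path parameters, and the placeholder responses are all assumptions of mine.

```python
import json


def op(operation_id, summary):
    """Build a placeholder operation; real responses still need to be designed."""
    return {
        "operationId": operation_id,
        "summary": summary,
        "responses": {"default": {"description": "TODO"}},
    }


def path_param(name):
    """Path parameters are always required; the string type is my assumption."""
    return {"name": name, "in": "path", "required": True, "schema": {"type": "string"}}


spec = {
    "openapi": "3.0.0",
    "info": {"title": "Pet Store (sketch)", "version": "0.1.0"},
    "paths": {
        "/pets": {
            "get": op("listPets", "JTBD 1 and 2: search and filter pets"),
            "post": op("addPet", "JTBD 8: add a pet to inventory"),
        },
        "/pets/{petId}": {
            "parameters": [path_param("petId")],
            "get": op("getPet", "JTBD 3: view one pet"),
            "patch": op("updatePet", "JTBD 9: update a pet"),
        },
        "/carts": {"post": op("createCart", "JTBD 4: add a pet to a cart")},
        "/orders": {"post": op("createOrder", "JTBD 5: place an order")},
        "/orders/{orderId}": {
            "parameters": [path_param("orderId")],
            "get": op("getOrder", "JTBD 6 and 10: view an order"),
            "delete": op("cancelOrder", "JTBD 7: cancel an order"),
        },
    },
}

print(json.dumps(spec, indent=2))
```

Writing the skeleton by hand first gives a baseline to hold ChatGPT's output against, which is the comparison the canonical Pet Store example can't fully provide.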
Returning to our canonical reference, you'll notice that some of those proposed endpoints are missing; ChatGPT created plausible use cases, ones that a real store would probably request, that are not found in the original, 'minimal' example. Even the expanded Pet Store example falls short of our generated use cases, only adding a DELETE method to the /pets/{id} path. Unless ChatGPT knows of another, more detailed OpenAPI Pet Store description somewhere, it has already demonstrated value by creating a compelling set of use cases.
But can it write an OpenAPI description from them?
ChatGPT Can Write OpenAPI Descriptions. Mostly.
Time to input our text. Notice, I'm alluding to ChatGPT's "memory"; rather than retype the JTBD, I refer to what has come before. Here is a screenshot of the results. At first blush, this is very impressive.
The "listPets" operation could be somewhat unwieldy. Can ChatGPT add some pagination?
Not only did ChatGPT generate the YAML, but it also explained how "page" and "per_page" would work. While not my favorite implementation, it isn't the worst, either.
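For readers who haven't worked with this style, here is a minimal sketch of how "page" and "per_page" parameters typically behave on the server side. The 1-based page numbering, the default of 20, and the response envelope are my assumptions, not something taken from ChatGPT's output.

```python
def paginate(items, page=1, per_page=20):
    """Return one page of items plus the totals a client needs to keep paging."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive integers")
    start = (page - 1) * per_page
    total_pages = -(-len(items) // per_page)  # ceiling division
    return {
        "page": page,
        "per_page": per_page,
        "total_pages": total_pages,
        "items": items[start:start + per_page],
    }


pets = [f"pet-{n}" for n in range(1, 46)]  # 45 made-up pets
second = paginate(pets, page=2, per_page=20)
last = paginate(pets, page=3, per_page=20)

print(second["items"][0], len(last["items"]), last["total_pages"])
```

Part of why this isn't my favorite implementation: offsets computed from a page number can skip or repeat items when the underlying list changes between requests.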
At this point I noticed that the generated Pet Store description seemed "cut off". For example, the schema objects defined do not appear. What happens if I ask ChatGPT to write something that exists "below the fold"? It turns out, it easily knows not only what I'm talking about but how to style it in the same manner as the rest of the conversation.
I can't say for sure, but there seems to be a limit on the size of the response returned to the user. A valid OpenAPI object, however, seems to be rendered and accessible within ChatGPT's model.
It is also worth noting, in the inclusion of the error object, that what ChatGPT produces is not perfect. It still requires human intuition to identify omissions and “true up” passable, but problematic items (numeric identifiers, for example).
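Some of that review can be assisted with a small script. The sketch below flags id-like properties typed as numbers in a schemas object so that a human can decide what to do with them. The schema fragment is made up for illustration (it is not ChatGPT's output), and the name-matching rule is a crude heuristic of my own.

```python
def find_numeric_ids(schemas):
    """Return 'Schema.property' names for id-like properties typed as numbers."""
    flagged = []
    for schema_name, schema in schemas.items():
        for prop_name, prop in schema.get("properties", {}).items():
            # Heuristic: "id", "petId", "orderId", and so on.
            looks_like_id = prop_name.lower().endswith("id")
            if looks_like_id and prop.get("type") in ("integer", "number"):
                flagged.append(f"{schema_name}.{prop_name}")
    return flagged


# Made-up fragment in the shape of an OpenAPI components.schemas object.
schemas = {
    "Pet": {
        "type": "object",
        "properties": {
            "id": {"type": "integer", "format": "int64"},
            "name": {"type": "string"},
        },
    },
    "Order": {
        "type": "object",
        "properties": {
            "orderId": {"type": "string"},
            "petId": {"type": "integer"},
        },
    },
}

flagged = find_numeric_ids(schemas)
print(flagged)
```

A check like this only surfaces candidates; deciding whether a given identifier should change still takes the human judgment described above.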
Automated Code Generation From OpenAPI Descriptions Is Also Possible
Thus far, what is generated is tremendously helpful as a kickstarter to a healthy design process. While there are enough omissions or unique preferences that I would want to modify, starting from the suggestions provided puts me miles ahead of those starting with a blank IDE window.
Now, for the final step: using ChatGPT to write code based on the OpenAPI description and JTBD that it generated.
Again, there seems to be a cut-off in what is rendered to the screen; the model generated an entire file, but I am not getting the entirety of that returned to the screen. Much of what can be seen is a restating of the OpenAPI description to be used by the ‘openapi3-ts’ package. That is disappointing, because I can’t actually fire up Node to test the validity of the code.
That said, I've gotten more support here from an inanimate language model than I ever did from a senior engineering "mentor" during my career. That’s something.
There Are Caveats To This Approach
Hopefully, with this brief introduction, you can see the possibilities of having a tool like ChatGPT in the API design process. While I wouldn't put the results directly into production, a tailored theoretical basis for raising questions and jump-starting progress is a powerful aid.
Lots of folks have recognized that power, too. At the time of this writing, getting in more than a request or two at a time has been difficult because the free trial has been swamped; within a few days of launch, ChatGPT had already reached its first 1 million users.
That amount of processing power isn't cheap. OpenAI CEO, Sam Altman, called the compute costs "eye-watering". Some estimates put each operation at just under a dime per chat, equating to nearly $3 million per month. While much of that has been underwritten by OpenAI's partner, Microsoft's Azure Cloud, at some point the "free lunch" will end. When that happens, the paid-for version could have a significant price tag.
For now, it is a compelling tool that individual API designers should be experimenting with. But, given the unpredictability of availability and pricing, it is too soon to declare how ChatGPT fits into an overall design lifecycle.
Have you experimented with ChatGPT in your design process? Were there surprises? I'd love to hear about your experiences.
Milestones
In the last edition, I talked about how exciting the ActivityPub space is. Since publishing, word comes that Mozilla is spinning up a Mastodon effort. In addition, Medium is rumored to also be considering adding ActivityPub. Still talk thus far, but I'm excited to see this play out nonetheless. If you're interested in seeing some experimental Node.js code, check out the Cubiti GitHub project.
Wrapping Up
First, thanks to Bruno Pedro for his YouTube video demonstrating how he used ChatGPT to create an OpenAPI document. That and several other folks experimenting in this space inspired this deep dive.
Secondly, I need to thank all the readers - new and old - for their continued attention this past year. This year was a big one: I celebrated the 200th issue, announced a book project, moved newsletter providers, abandoned one social media presence and embraced another, had to change my writing pipeline and bot APIs (thanks, Heroku, grrrrr), and continued to explore a new format. Whew!
I remain flattered that my notes remain relevant to others. I am so fortunate that as I continue to grow and change there are others that are also interested in the journey. Thank you.
If you're interested in supporting the caffeine necessary to keep this enterprise going, and these emails free of ads, paywalls, or information sharing, check out my Patreon page.
That's all for now. Till next time,
Matthew (@matthewreinbold and matthewreinbold.com)