There are a whole lot of promises being made about Generative AI tools, alongside some pretty clear hazards — as well as the whole awkwardness of trying to figure out whether people are getting work done, or just dodging the assignment. Perhaps you're currently trying to figure out what to say to an engineer who's been using ChatGPT to write pull requests (tests and description included). Or you have an executive who came back from a conference excited to "add AI" now.
The pitch you might hear from the top is "AI will help us work faster" or "using AI will cut costs." It may even be "everyone is doing it," which is kinda hard to argue against rationally. Let's look at what GAI can actually do, then where it may fit your needs. That will prepare you to have productive discussions about how to move forward.
One idea I continue to return to is "let computers do the things that they're good at, and have humans do the things we do best". Let’s see how the design of GAI systems can show us its strengths and weaknesses.
The first part of a GAI system is the source data set. A good body of data will be broad enough to cover the nuances of how the tool is going to be used, but focused enough to not get sidetracked. Careful data choices can increase relevance, but they can't entirely eliminate hallucinations: when an edge case the data just didn't cover comes up, or when someone deliberately manipulates the inputs, the AI has to interpolate an answer.
Biases will be amplified — if everything in a category happens to share an unrelated attribute, the AI has no way to know that the two aren't actually connected in a broader context. The results can be racist, sexist, and abusive. That's not because those are sensitive topics — things will get just as off track if you're trying to clean up variable naming conventions across different parts of the code base.
When looking at code, those biases might lead it to gravitate toward a particular style or conventions. That could be helpful if there’s a strong consensus about how to write things, but not so great if you’re picking up unintended norms. When using a commercial GAI product you don’t really know what the quality of that source data is like, whether the code is well-written, or if it contains major security flaws. That means the output is going to require some oversight.
Next, you need the algorithm that will make connections between each tiny piece of data and the things around it. This is where a common misconception comes up — GAI doesn't know anything about what an apple or a pear is in a real-world sense. What it knows is that if someone types "a p p l" there's a good chance the next letter they want is "e". If the source data contains a lot of recipes, it might know that "apples" and "pears" can be used to make "pie" as well as "fruit salad". I'm simplifying, but this is an important detail. It's all just bits of data.
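That next-letter idea can be sketched with a toy model: count which character follows each short context in a tiny corpus, then predict by frequency. The corpus and the context length here are made up for illustration — real systems are vastly larger, but the principle is the same.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "training set" standing in for real source data.
corpus = ["apple pie", "apple fruit salad", "pear pie", "pears and apples"]

k = 4  # context length: how many preceding characters we look at
follow = defaultdict(Counter)
for text in corpus:
    for i in range(len(text) - k):
        # Record which character followed this 4-character context.
        follow[text[i:i + k]][text[i + k]] += 1

def predict_next(context):
    """Return the most frequent next character for a context, or None."""
    counts = follow.get(context[-k:])
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("appl"))  # prints 'e': pure pattern matching, no concept of fruit
```

The model has no idea what an apple is — it only knows which characters tended to follow "appl" in its data, which is exactly the point.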
Data annotation is a key part of building these associations. You already have a glimpse of this if you've filled out a CAPTCHA. I remember the ones Google used years ago during their book scanning process to improve the interpretation of text. Now we get grids of images with traffic lights and crosswalks, things that are needed for image AI systems in self-driving vehicles. But much of this data labeling is done by low wage workers:
"Even the most impressive AI chatbots require thousands of human work hours to behave in a way their creators want them to ... The work can be brutal and upsetting. Data annotators in Kenya, for example, were paid less than $2 an hour to sift through reams of unsettling content on violence and sexual abuse in order to make ChatGPT less toxic. These workers are now unionizing to gain better working conditions."
If you're using or developing a GAI system, make sure that both the data sources and the processing work are handled in a way that aligns with your goals. Making someone else look at grotesque content so the rest of us don't have to see it kinda sucks. But there are less severe examples, too. Those CAPTCHAs do not spark joy in users who are just trying to get work done.
Now that you understand the components that go into it, you can think about whether GAI is an effective solution for your needs. Where it really excels is in tasks with specificity and volume. That's why translation, content moderation, and categorization have been key commercial areas. Remember, predictive algorithms don't assess truth, just similarity.
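A small illustration of similarity without truth: Python's standard difflib will happily map a query to the closest-looking topic, whether or not that mapping is correct. The topic list and cutoff here are hypothetical, but the mechanism — scoring resemblance, not accuracy — is how this class of matching works.

```python
import difflib

# Hypothetical topics a ChatOps bot might route free-form queries to.
topics = ["reset password", "deploy to staging", "rotate api keys"]

def route(query):
    """Map a query to the most similar topic, or None if nothing is close."""
    matches = difflib.get_close_matches(query.lower(), topics, n=1, cutoff=0.4)
    return matches[0] if matches else None

print(route("reset my password"))  # prints 'reset password'
```

Note that the function would route a query to the best textual match even if that topic were the wrong answer for the user — similarity is all it measures.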
We already have a lot of tools available that may be a better fit in our work. You don't really need AI to explain how to do something that you already know yourself and can describe step-by-step. If the work is tedious, there's a good chance that something needs to be automated. Or maybe there are signs that more training is needed, or a shuffling of responsibilities. Sometimes people dodge tasks because they lack confidence in their abilities. GAI is not going to make that better; what's needed is more coaching.
The best GAI system is going to augment what your team is already rocking — not competing with their creativity or growth, but wrangling the things that computers never complain about, like sorting apples from oranges or generating large datasets for testing. There are interesting opportunities for ChatOps too. Prediction algorithms can do fairly well at mapping a range of possible queries to the information a user is looking for.
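As one example of work computers handle without complaint, here's a sketch of generating a large synthetic dataset for testing. The record fields are hypothetical — swap in whatever shape your own system expects.

```python
import random
import string

def fake_user(rng):
    """Build one synthetic user record with made-up, test-safe values."""
    name = "".join(rng.choices(string.ascii_lowercase, k=8))
    return {
        "id": rng.randrange(10**6),
        "email": f"{name}@example.com",
        "age": rng.randint(18, 90),
    }

rng = random.Random(42)  # seeded so test runs are reproducible
users = [fake_user(rng) for _ in range(10_000)]
print(len(users))  # prints 10000
```

Seeding the generator means a failing test can be reproduced exactly, which is the kind of reliability a predictive model can't promise.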
Doing GAI well requires a significant investment — a large enough body of relevant source data, the labeling process, all that algorithm tuning, and a strong user interface. AI-generated code is going to miss many of the nuances of your system, and the tools now on the market don't really have the ability to understand that. Go ahead and experiment, but keep asking: is there anything here that we can't do with a template?
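To make the template question concrete: when the structure of an output is already known, a plain string template often covers it with no model at all. The section names and wording here are just an example, not a prescribed format.

```python
from string import Template

# Hypothetical PR-description template; the sections are placeholders.
pr_template = Template("## Summary\n$summary\n\n## Testing\n$testing\n")

body = pr_template.substitute(
    summary="Fix off-by-one in pagination.",
    testing="Added unit test for the last-page case.",
)
print(body)
```

A template is deterministic, cheap, and reviewable — worth ruling out before reaching for a generative tool.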
Writing this made me realize that there are aspects of writing code and documentation that we never really examine. With so much emphasis on the output, we can overlook what the impact is of having people do the work. That’s what I’ll be writing about for next time.
Further reading: The 2022 ACM conference paper “The Fallacy of AI Functionality” goes into a ton of detail on the gaps between how AI is expected to work and its actual capabilities. The authors point out that most work on AI policy doesn’t examine the basic issue of function. If you’re planning to incorporate GAI into your products, this one is worth a read.
Photo credit: CC-BY #WOCinTech