We Thought GenAI Could Handle It: We Were Wrong... Until We Weren’t

Sep 28, 2025

Like everyone else, we’ve been experimenting with GenAI at work. In our Content Design team, that meant investing serious time in a CustomGPT trained on our product copy guidelines. We refined it again and again, testing and retraining until it could reliably generate compliant copy.

When we fed it the right prompts, it worked. It gave us copy that followed our standards and helped boost efficiency. Other teams used it to insert compliant product copy earlier in the design process. Of course, our content designers still did the heavy lifting. They polished language, made sure everything flowed, and kept the voice consistent. But the tool was showing real promise.

We were so confident, in fact, that we committed to rewriting error messages in our Designer product. We'd wanted to tackle this project for some time and asked Engineering for an extract, thinking we could quickly bang out a few hundred. Easy peasy, right?

Wrong.

First, the extract contained over 8,500 messages. Then, our CustomGPT failed miserably. The spreadsheet format broke it. The volume broke it. Instead of fixing the copy, the model introduced more errors: typos, skipped rows, inconsistencies. What we thought would be a showcase initiative quickly looked impossible. We never would have suggested the project if we'd thought we would have to do it by hand.

The team tried everything: breaking the spreadsheet into smaller files, sorting by error type, feeding the model different formats. The only approach that produced even somewhat reliable results was the most painstaking one: copying and pasting 15 error messages at a time. At that pace, it would have taken nearly 50 hours. Too long!

Then came the breakthrough.

In parallel, our teams had been working on new GenAI-enabled workflow tools for Alteryx Designer. One of them allows you to submit text prompts to an LLM at scale. So, the Content Design team decided to dogfood it. Instead of forcing our CustomGPT to chew through messy spreadsheets, they built a workflow that submitted each error message individually.
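
For readers who think in code, here is a rough sketch of the "one message per request" idea. It is not our actual workflow, which was built in Alteryx Designer rather than Python. The function names, the `message` and `rewritten` column names, and the `placeholder_rewrite` stand-in for the LLM call are all hypothetical, chosen only to show why per-row processing avoids skipped or misaligned rows.

```python
import csv
import io
from typing import Callable, Iterable


def rewrite_rows(
    rows: Iterable[dict],
    rewrite: Callable[[str], str],
    column: str = "message",  # hypothetical column name
) -> list[dict]:
    """Apply `rewrite` to one message at a time so every input row
    yields exactly one output row, in the same order."""
    results = []
    for row in rows:
        updated = dict(row)  # copy so the original row is left untouched
        updated["rewritten"] = rewrite(row[column])
        results.append(updated)
    return results


def placeholder_rewrite(message: str) -> str:
    # Stand-in for a real LLM request (assumption: one prompt per message).
    # It only tidies whitespace and casing so the sketch runs offline.
    return message.strip().capitalize()


# Tiny in-memory extract standing in for the real spreadsheet.
sample = io.StringIO("id,message\n1,  file not found \n2,invalid input\n")
rows = list(csv.DictReader(sample))
output = rewrite_rows(rows, placeholder_rewrite)

# Because each row is handled on its own, nothing can be skipped or merged.
assert len(output) == len(rows)
```

The design choice that matters is the loop: the model never sees the spreadsheet, only a single message, so spreadsheet format and volume stop being its problem.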

This time, it worked beautifully.

The workflow updated all 8,500+ messages in just 3.5 hours. Accurate, consistent, and compliant.

Compared to the copy/paste method: 1,243% faster
Compared to manual rewriting: 3,928% faster
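
If you want to sanity-check those percentages, here is a small sketch of the arithmetic. It assumes "X% faster" means the old duration divided by the new one, minus one, expressed as a percentage. The baselines it prints are back-calculated from the figures above, not separately measured numbers.

```python
WORKFLOW_HOURS = 3.5  # stated in the post


def percent_faster(baseline_hours: float, new_hours: float) -> float:
    """Speed-up as a percentage, assuming (old / new - 1) * 100."""
    return (baseline_hours / new_hours - 1) * 100


def implied_baseline_hours(pct_faster: float, new_hours: float) -> float:
    """Invert percent_faster: the baseline duration a given percentage implies."""
    return new_hours * (1 + pct_faster / 100)


# Baselines implied by the two stated percentages.
copy_paste_hours = implied_baseline_hours(1243, WORKFLOW_HOURS)
manual_hours = implied_baseline_hours(3928, WORKFLOW_HOURS)

print(f"Implied copy/paste baseline: {copy_paste_hours:.1f} hours")
print(f"Implied manual baseline: {manual_hours:.1f} hours")
```

Under that reading, the copy/paste figure implies a baseline of roughly 47 hours, which lines up with the "nearly 50 hours" estimate earlier in the post.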

For me, this was one of those leadership moments that swelled my heart with pride and gratitude. I got to watch a team run straight into a wall, refuse to quit, and then come back with a solution better than anything we imagined at the start.

The best part: the feature that made it possible was something our teams designed and built. And it’s already available in private preview to Alteryx customers!

It’s a win-win-win:

More readable error messages for Designer users
Huge efficiency gains without sacrificing quality
A chance to validate a feature we designed and built by using it ourselves

Better, human-readable content, a massive leap in efficiency, and a proud case of drinking our own champagne. That’s a win I’ll celebrate any day.

Elizabeth Benker
