Confluence for 11.17.24
What would it mean if generative AI progress slows? Employees are uncomfortable talking about AI. Is prompt engineering becoming less important?

Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI and corporate communication:
What Would It Mean If Generative AI Progress Slows?
Employees Are Uncomfortable Talking About AI
Is Prompt Engineering Becoming Less Important?
What Would It Mean If Generative AI Progress Slows?
Even if progress comes to a complete halt, the power of today’s generative AI models remains largely untapped.
On November 11, Reuters published an article titled “OpenAI and others seek new path to smarter AI as current methods hit limitations.” The past two years have seen rapid advances in the capabilities of frontier large language models. The question raised in the Reuters article and elsewhere is whether that trend can continue based on current methods (which, to date, have mostly been driven by increasing the scale of the data and computing power used to train the models). There’s no doubt that this question is important. Whether or not generative AI capabilities continue to advance at the pace we’ve seen for the past two years will have serious economic and societal implications. Our view on questions about future capabilities is that time will tell. Upcoming model releases from OpenAI, Anthropic, and Google may represent a step change to a new capability level (as happened with the jump from GPT-3.5 to GPT-4), or they may only represent an incremental improvement. We’ll have to wait and see.
There is, though, a question that we can answer now: what would it mean for corporate communications (and most other professional fields) if such a slowdown or halt in progress were to occur? Our answer is that, in the near term, it probably would not mean much. Even if progress on frontier models were to completely stall — which we do not think is likely — the power of the current models available to us remains largely untapped. That value is there for the taking whether or not we see a “GPT-5 class” of models someday in the future. What’s holding most teams and organizations back from harnessing the power of these models is not so much the models’ capabilities but the (relative lack of) human utilization of them.
There’s mounting evidence of the power of these tools when utilized effectively. It’s been over a year since the publication of “Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality,” the landmark study that we reference frequently in Confluence and with clients. One of the major findings of that study is that subjects (consultants at Boston Consulting Group) who used generative AI for tasks for which it was well-suited saw a 40% increase in the quality of their work (as evaluated by human judges), a 25% decrease in task completion time, and a 12% increase in task completion rates. They got more done, faster, and at a higher level of quality. And that’s with a model — GPT-4 — that is substantially less capable than the improved models available today. One key, of course, is that those improvements only came when using generative AI for tasks for which it was well-suited — to use the parlance of the researchers, tasks that were “inside the jagged frontier” of generative AI’s capabilities. When subjects used generative AI for tasks outside of the frontier (i.e., for which generative AI was not well-suited), their performance (specifically, their accuracy) could actually get worse. Knowing where these models excel and where they struggle is essential to using them well. The good news is that these models excel at a lot — much of which is directly relevant to the work of corporate communications — and similar gains are available to anyone who decides to take advantage of them.
Another example of the power of the current models is a paper by Brian Porter and Edouard Machery, published this week in Scientific Reports, finding that “AI-generated poetry is indistinguishable from human poetry and is rated more favorably.” The AI-generated poems in question were written in the style of prominent poets like Shakespeare, Walt Whitman, and T.S. Eliot. In effect, the study found that the AI-generated poems were rated as “more Shakespeare than Shakespeare, more Whitman than Whitman, more Eliot than Eliot,” and so on. Why does this matter for corporate communications? One point of skepticism and resistance we often hear is that today’s generative AI models aren’t capable of handling the nuances of voice, tone, and other subtleties that make for effective written communication. Studies like this one show — and our own experience suggests — that this is likely not the case. Achieving this level of sophistication is not particularly difficult. All it takes is some decent prompting and some examples. The power is there; it’s a matter of eliciting it.
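To make that concrete, here is a minimal sketch of what “some decent prompting and some examples” can look like when working with a model through code rather than a chat window. This is our illustration, not the study’s method: the client library and model name are assumptions, and the placeholders are where you would paste writing in the voice you want. The same pattern works just as well pasted into ChatGPT or Claude.

```python
# A minimal sketch of example-driven prompting for voice and tone (our illustration).
# Assumes the OpenAI Python SDK (openai >= 1.0) and an API key in OPENAI_API_KEY;
# the model name below is a placeholder for whatever model your organization has approved.
from openai import OpenAI

client = OpenAI()

voice_examples = [
    "<paste a short passage your team has written in the target voice>",
    "<paste a second passage in the same voice>",
]

prompt = (
    "You are helping draft internal communications.\n"
    "Match the voice and tone of the examples below as closely as you can.\n\n"
    "Examples of our voice:\n\n"
    + "\n\n".join(voice_examples)
    + "\n\nNow draft a short announcement about an upcoming office move in that same voice."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

The point is less the code than the pattern: show the model what good looks like instead of only describing it.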
We’ll share one last data point from a recent study that closely aligns with our own experience. Last week, we wrote about an MIT study that found that the introduction of an AI tool to materials scientists increased their discovery of new materials (by 44%) and patent filings (by 39%). Those data points are fascinating in their own right, but the paper also contains some striking findings about the human effects of this experience. Among them was how the scientists changed their beliefs about AI after working with this tool. We think the findings are important enough to include the full chart below:
[Chart from the MIT study: changes in scientists’ beliefs about AI after working with the tool]
Each of these data points is interesting enough to warrant its own Confluence post. But the bigger point is that extended, hands-on experience with the power of today’s AI substantially changed these scientists’ beliefs about the technology and what it means for their work. As the author of the paper notes,
These results show that hands-on experience with AI can dramatically influence views on the technology. Furthermore, the responses reveal an important fact: scientists did not anticipate the effects documented in this paper. This fits a recurring pattern of domain experts underestimating the capabilities of AI in their respective fields.
The emphasis above is ours. Speculation about the pace and extent of generative AI’s improvement will continue, and again, time will tell what future progress looks like. What we know today is that the models available to us right now are not just powerful, but useful. It takes some effort to understand the strengths and limitations of these models, and it takes some skill with prompting. But ultimately, what’s most important is to spend some time using the tools yourself. Those who haven’t done that already may find that — like the scientists in the MIT study — they’re underestimating the capabilities of today’s models. We’d advise readers not to let speculation about the future distract them from what’s possible right now.
Many Employees Are Hesitant to Discuss Their Use of Generative AI
To normalize the use of generative AI, we need to normalize the conversation.
Slack’s Fall 2024 Workforce Index, published on November 12, includes several interesting insights. One stood out to us more than the others:
Nearly half (48%) of all desk workers would be uncomfortable admitting to their manager that they used AI for common workplace tasks. The top reasons for workers’ discomfort are 1) feeling like using AI is cheating, 2) fear of being seen as less competent, and 3) fear of being seen as lazy.
This data point is eye-catching but not necessarily surprising. In June of 2023, Ethan Mollick wrote about the phenomenon of “secret cyborgs” in organizations — those who use generative AI for their work but do not openly talk about it. As he put it then,
People are streamlining tasks, taking new approaches to coding, and automating time-consuming and tedious parts of their jobs. But the inventors aren’t telling their companies about their discoveries; they are the secret cyborgs, machine-augmented humans who keep themselves hidden.
The Workforce Index data gives us a picture of just how prevalent the secret cyborg phenomenon is, and some of the human factors contributing to it. It should also lead us to bring a healthy dose of skepticism to polls and surveys exploring generative AI utilization, as there is a decent likelihood that participants could underreport their use due to similar concerns.
This phenomenon is likely playing out to some extent in nearly every organization. In the absence of clarity on an organization’s generative AI strategy, policy, and guidelines, employees will be hesitant to discuss how they’re using it. As we’ve written before, there’s increasing evidence of a gap between leadership and employee perceptions of organizations’ approaches to generative AI, which one would expect to exacerbate the “secret cyborg” dynamic.
Individuals keeping their use of generative AI secret is suboptimal for a number of reasons, but perhaps the biggest is that it prevents teams and organizations from learning from each other. Most of the best use cases for a given role or team are discovered organically, through individual trial-and-error. The best way to amplify the impact of those discoveries is to share them. When individuals — again, often for good reasons — keep those discoveries to themselves, it prevents the team and the organization from reaping the full reward of their investment in generative AI.
In his original post about secret cyborgs, Mollick concludes that “at least for now, the only way for an organization to benefit from AI is to get the help of their cyborgs, while encouraging more workers to use AI.” Communication leaders have an important role to play in shaping the organizational conditions that allow this to happen. This includes clear communication about the organization’s AI strategy and guidelines from senior leadership, but also equipping and empowering managers and supervisors to have productive conversations with the individuals on their teams. Normalizing the productive use of generative AI within organizations requires normalizing the conversations about it.
Is Prompt Engineering Becoming Less Important?
Recent research suggests that it might be.
A new study suggests the sophistication of our prompts may become less critical as AI language models advance. The research, titled “ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs,” comes from a team at the Shanghai AI Laboratory with researchers from multiple institutions, including The Chinese University of Hong Kong. The researchers propose a framework, ProSA, for measuring prompt sensitivity: how much a model’s performance changes when the same request is phrased in different ways. Their findings show that newer models are less sensitive to how a request is phrased, suggesting we may need less precise prompt engineering in the future.
We’ve long considered prompt construction to be central to successful collaboration with generative AI — look no further than our Prompt Engineering Guide. And while this study’s ProSA framework might offer interesting methods for analyzing generative AI’s behavior, the key takeaway remains clear: the data shows that as models become more sophisticated, they grow more adept at understanding our intent, regardless of phrasing. For example, the study finds that providing the model with just one good example can prove more valuable than crafting the perfect prompt.
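As an illustration of that finding (our sketch, not the paper’s code), here is what the difference between a carefully engineered, instruction-only prompt and a simpler prompt that leans on a single strong example might look like in practice. Both prompts are hypothetical placeholders.

```python
# Our illustration of "one good example can beat the perfect prompt."
# Both prompts are hypothetical; send either to whichever model you use.

# Option A: instruction-only, carefully engineered rules.
instruction_only = (
    "You are an expert corporate communicator. Summarize the update below for employees. "
    "Use active voice, keep sentences under 20 words, lead with the key message, "
    "avoid jargon, and end with a clear call to action.\n\n"
    "Update:\n<paste the update>"
)

# Option B: minimal instructions plus one strong example (one-shot prompting).
one_shot = (
    "Summarize the update below for employees.\n\n"
    "Here is an example of a summary in the style we want:\n"
    "<paste one past summary your team considers strong>\n\n"
    "Update:\n<paste the update>"
)

# The ProSA result, as we read it: across many phrasings, prompts like Option B
# tend to hold up better, because the example carries more signal than the rules.
for name, prompt in (("instruction-only", instruction_only), ("one-shot", one_shot)):
    print(f"--- {name} ---\n{prompt}\n")
```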
We maintain a measured perspective on these findings. The research offers insights about the direction of AI interaction, and we’re monitoring how the results might translate to real-world applications. We’ll be interested to see whether this trend toward AI adaptability continues as new models become available to the general public. For now, we’ll balance our best practices in prompt engineering with an open mind toward evidence about evolving AI capabilities.
We’ll leave you with something cool: how Midjourney power user Nick St. Pierre is using the tool, along with other generative AI models, to create physical objects.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.