Confluence for 10.13.2024
Insights from our annual client retreat. The Turing Trap. Sequoia’s latest paper on generative AI. Navigating the question of cost versus capability in LLMs. A quick prompt design tip.

Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI and corporate communication:
Insights from Our Annual Client Retreat
The Turing Trap
Sequoia’s Latest on Generative AI
Navigating the Question of Cost Versus Capability in LLMs
A Quick Prompt Design Tip
Insights from Our Annual Client Retreat
What we took away from our conversations with clients and a leading generative AI developer.
Last week, we hosted our annual client retreat. And, like last year when we hosted Wharton economist Dan Rock, generative AI was once again an area of focus. This year, we welcomed Dean Thompson from Eloquence AI, a leading AI developer and collaborator for our firm.
Throughout the week we learned much from Dean, not only about the current state of generative AI and its applications, but also about where we go from here. While we’ve previously shared some of what we’ve learned from Dean, we want to highlight several new insights from the past week.
When individuals and organizations start using generative AI, they shouldn’t expect an immediate return on the investment. There will be trial and error, and people will need the space to experiment, learn, and define use cases without the pressure of producing obvious returns right away. Although this process may seem inefficient in the short term, it lays the foundation for substantial long-term value and innovation.
Communication professionals, and those with expertise in human interaction, will need to play a role in developing generative AI applications. Interacting with generative AI is much more like interacting with a human than with a computer program. If we want these applications to be useful to people, we need to involve those who understand how to work with and communicate with people in the development process.
The next few years will resemble a global change management program, with people around the world adapting to having generative AI at their fingertips. The closest precedent for the change we’re likely to see with generative AI is the introduction of the internet — but this shift will happen much more quickly and could be far more disruptive. We will not simply be adapting to and learning how to use enterprise tools that make us better or more efficient. We will be learning how to live in a world where generative AI permeates many facets of our lives.
As disruptive as generative AI is likely to be, we can also expect the adoption to be uneven. The AI haves and have-nots — those who can and can’t effectively wield generative AI — will create real performance gaps across sectors. The gap may take time to appear, but expect its consequences to be meaningful.
Generative AI is going to keep getting better, even if the first application of an idea doesn’t live up to expectations. Consider AI personal assistants. The first wave, which will likely come in some form within the next year, will be pretty terrible and feel useless. A year or two later, they’ll start to get good and have real day-to-day utility. A year or two after that, they will be scary good. While the technology and the applications that use it have their limitations and can feel incomplete, they are going to evolve rapidly.
These insights hardly scratch the surface of the week’s conversations. We’re finding that our clients are becoming more eager to dive headfirst into the world of generative AI. It will be fascinating to see how the conversation evolves between now and our next client retreat in the fall of 2025.
The Turing Trap
The economic concept has timely implications for corporate communication.
In a 1950 paper titled “Computing Machinery and Intelligence,” Alan Turing first articulated what we now refer to as the Turing Test. As many readers of Confluence will know, the test posits that if a human evaluator cannot reliably distinguish between responses from a machine and a human, the machine has demonstrated human-like intelligence. The notion is that machines are intelligent to the extent that they can imitate humans (the first section of the paper is titled “The Imitation Game”). Seventy-two years later, in a 2022 essay, economist Erik Brynjolfsson introduced a related concept: the Turing Trap, which points out the limitations and risks of an excessive focus on having machines imitate humans (or, in economic terms, replace human labor).
We came across this concept in a recent lecture Brynjolfsson delivered as part of the Wharton School’s second annual Business & Generative AI Workshop. Early in the lecture, Brynjolfsson uses an example to illustrate one dimension of the Turing Trap, which we’ve used NotebookLM to summarize:
Brynjolfsson asks us to imagine that Daedalus, the legendary craftsman and inventor, had somehow succeeded in building human-like machines capable of automating every job performed by the ancient Greeks. He lists several examples of tasks that could be automated, including:
Herding sheep
Making clay pottery
Weaving tunics
Repairing horse-drawn carts
Bloodletting victims of disease
The takeaway, as summarized by NotebookLM, is that “focusing solely on automating existing human tasks, as embodied in the pursuit of human-like AI, is a trap. It limits AI’s potential to revolutionize our lives and create a truly prosperous future. He advocates for shifting our focus to developing AI that augments human abilities, enabling us to explore new frontiers, create new industries, and solve complex challenges in ways that were previously unimaginable.”
Brynjolfsson is an economist, and he views this challenge through a macroeconomic lens. But we can apply the Turing Trap to corporate communication from at least two different angles. The first is how we integrate generative AI into our work. The natural inclination is to focus on how generative AI can augment (or, in some cases, replace) the tasks our teams do today. This is a good starting point — it’s something we’re doing ourselves and on which we’re advising our clients — but if we’re mindful of the Turing Trap, there are more questions we should begin to ask. What are we not doing today that generative AI might enable us to do? What might be possible for us today that wasn’t possible without the assistance of generative AI? If we focus solely on our existing tasks, we’re susceptible to the Turing Trap — and are likely to leave substantial value on the table.
A second implication of the Turing Trap for corporate communication has to do with the framing of organizational messaging around generative AI. We recently cited Arvind Karunakaran’s research on the impact that message framing has on organizational adoption of AI. What might the Turing Trap tell us about how to craft messaging related to organizational implementations of generative AI? For one thing, it suggests that we should position these technologies as much about creating new possibilities for tomorrow as about simply finding productivity gains today. For organizations, the Turing Trap lies in failing to see these new possibilities and missing out on opportunities to innovate — the equivalent, in the Daedalus example, of simply finding more efficient ways to repair horse-drawn carts. Communication teams have an important role to play in making sure that doesn’t happen.
Sequoia’s Latest on Generative AI
“The agentic era begins.”
Sequoia Capital, the renowned venture capital firm founded in 1972 and headquartered in Menlo Park, California, has issued its most recent white paper on generative AI, “Generative AI’s Act o1,” which you may read here. Known for its early investments in Apple, Google, Oracle, and Airbnb, Sequoia has established itself as one of the most successful and influential venture capital firms in Silicon Valley and globally. Given its reputation for identifying and nurturing groundbreaking companies, we take careful note of its view on where generative AI goes from here.
The paper is written for a Silicon Valley/tech investor audience, but don’t let that deter you from reading it (although if you prefer a podcast version, we’ve created one with NotebookLM that you can listen to here). The gist is that OpenAI’s new o1 model represents an important threshold in generative AI development, one that Sequoia describes as a shift from “System 1” thinking, in which a model provides pre-trained, instinctual responses, to “System 2” thinking: deeper, deliberate reasoning in which the model can “pause, evaluate and reason through decisions in real time.” Describing o1, they note:
… something fundamental and exciting is happening that actually resembles how humans think and reason. For example, o1 is showing the ability to backtrack when it gets stuck as an emergent property of scaling inference time. It is also showing the ability to think about problems the way a human would (e.g. visualize the points on a sphere to solve a geometry problem) and to think about problems in new ways (e.g. solving problems in programming competitions in a way that humans would not).
Another Sequoia observation got our attention, and it’s about what comes next with generative AI applications. Their contention is that the large model companies (OpenAI, Google, Meta, Anthropic) will not invest significantly in specific generative AI applications beyond their base models and will instead rely on application developers to create those tools. As they note:
As the research labs further push the boundaries on horizontal general-purpose reasoning, we still need application or domain-specific reasoning to deliver useful AI agents. The messy real world requires significant domain and application-specific reasoning that cannot efficiently be encoded in a general model.
We’ve been forecasting this for over a year: while you will use tools like ChatGPT and Claude for all sorts of general things, if you want a version that is, say, an incredibly strong and accurate ghostwriter that knows your brand voice, style guide, and so on, someone else is going to need to write that application, using a large language model (LLM) like ChatGPT as its core engine — the base model itself won’t be expert enough to provide the specificity and quality you need. And for very smart agents, these applications will include a cognitive architecture (how the generative AI agent should “think”), combined with searches of proprietary data, specific prompts working behind the scenes, and more.
This is in essence what we’ve done with our leadership AI, ALEX: while it runs on Claude to interpret and generate responses, we’ve tried to code into it the thought processes that we believe our advisors use when serving clients, we’ve included a search of over one million words of our proprietary text, and we’ve written thousands of lines of prompting code that outline our advisory philosophy, ALEX’s personality, response preferences, and more. Without all of that, the leadership advice general models give just isn’t very special.
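For readers who want to see what that layering looks like mechanically, here is a minimal sketch. It is not how ALEX is actually built; it simply illustrates the general pattern Sequoia describes: a system prompt standing in for the cognitive architecture, a placeholder function standing in for retrieval over proprietary text, and Claude serving as the base model. The prompt, the search function, and the model name are assumptions for illustration, and the example presumes the Anthropic Python SDK with an API key configured.

```python
# A minimal, hypothetical sketch of an "agent on top of a base model."
# This is not how ALEX is implemented; it only illustrates the pattern:
# a hidden system prompt (the "cognitive architecture"), retrieval over
# proprietary text, and a call to a base model such as Claude.

import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

ADVISORY_SYSTEM_PROMPT = """You are a leadership communication advisor.
Before recommending an approach, think through the client's situation step
by step: clarify the audience, the outcome they need, and the risks."""


def search_proprietary_text(question: str) -> str:
    """Placeholder for retrieval over a firm's own knowledge base.
    In practice this would be a vector or keyword search over indexed documents."""
    return "Relevant excerpts from the firm's advisory materials would go here."


def advise(question: str) -> str:
    context = search_proprietary_text(question)
    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",  # illustrative model name
        max_tokens=1024,
        system=ADVISORY_SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"Context from our own materials:\n{context}\n\nClient question: {question}",
        }],
    )
    return response.content[0].text


print(advise("How should I announce a difficult reorganization to my team?"))
```

Real applications layer on much more: evaluation of what gets retrieved, multi-step reasoning, memory, tool use, and the thousands of lines of prompting described above.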
In this way, ALEX is a form of AI agent (in this case, your leadership advisor). If Sequoia is right, in the coming few years we will see significant investment in agent development. This won’t make the base models obsolete, as we will still want to use a powerful general intelligence for all sorts of things. But for very specific use cases, from engineering to medical research to booking travel, agents written to run on top of the base models will be what we turn to. A year ago we were saying “pay attention to generative AI.” Now we’re saying, “pay attention to agents.”
Navigating the Question of Cost Versus Capability in LLMs
When it comes to generative AI, you might just get what you pay for.
We are often asked about the practical differences between various large language models. It’s a topic we wrote about in March, when we explored the widening gulf in AI capabilities. A new study, “Not All LLM Reasoners Are Created Equal,” validates our earlier observation: when it comes to AI, investment often correlates with capability.
The researchers tested AI models on two types of problems: standard grade-school math word problems (the GSM8K test, which consists of about 8,000 problems that each take a few reasoning steps to solve), and more complex, chained challenges (the Compositional GSM test, a variant of GSM8K that evaluates an LLM’s ability to solve problems whose solutions build on one another, requiring compositional, multi-step reasoning). The results are striking. While smaller, cost-efficient AI models perform comparably to their larger counterparts on the simpler problems, they fall notably short on the advanced reasoning challenges — like the difference between a competent home cook and a professional chef. Both can prepare quality meals, but one excels at creating complex dishes. For tasks that require maintaining context and performing multi-step reasoning, larger models consistently outperformed their smaller counterparts — often by significant margins. This difference has real implications for how these tools perform in complex, real-world scenarios.
As communication professionals use generative AI for increasingly complex challenges, we need capable tools. For tasks requiring comprehensive analysis and the ability to connect dots and generate insights, you’re going to get what you pay for. This isn’t to suggest that every task requires the most expensive LLM available — in fact, many free models continue to perform at a level adequate for many tasks. The key to a successful cost-benefit analysis lies in a careful assessment of your specific needs and challenges. While simpler AI models may suffice for the basics, complex tasks stand to benefit from the advanced reasoning capabilities of premium models.
This performance gap underscores that in the realm of AI, increased investment often correlates with enhanced capabilities. In an era where AI plays an increasingly significant role in strategic decision-making and operational efficiency, understanding these distinctions could prove valuable in maintaining a competitive edge.
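As a practical illustration of matching the model to the task, here is a simplified, hypothetical sketch of routing work by complexity. The model names and the keyword heuristic are assumptions for the example, not recommendations; in practice the assessment would be richer than a keyword check.

```python
# A simplified, hypothetical illustration of routing tasks to models by
# complexity. The model names and the keyword heuristic are assumptions
# for the example, not recommendations.

CHEAP_MODEL = "claude-3-haiku-20240307"       # fast and low-cost
PREMIUM_MODEL = "claude-3-5-sonnet-20240620"  # stronger multi-step reasoning

MULTI_STEP_SIGNALS = ("analyze", "compare", "synthesize", "recommend", "strategy")


def choose_model(task_description: str) -> str:
    """Return a premium model for tasks that look like multi-step reasoning,
    and a cheaper model for routine, single-purpose work."""
    text = task_description.lower()
    needs_reasoning = any(signal in text for signal in MULTI_STEP_SIGNALS)
    return PREMIUM_MODEL if needs_reasoning else CHEAP_MODEL


print(choose_model("Proofread this paragraph for typos."))                       # cheap model
print(choose_model("Analyze this survey data and recommend a messaging plan."))  # premium model
```

The point is not the heuristic itself but the habit it represents: matching the capability, and the cost, of the model to the complexity of the work.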
A Quick Prompt Design Tip
Write your prompt for an educated layperson who knows nothing about the topic.
This post by Garry Tan on X caught our attention this week, as it crystallized much of our own thinking on how to write strong prompts for an LLM: picture an educated layperson who knows nothing about the topic, and write your prompt as a set of instructions that would let them do what you need done. Tan elaborates a bit more:
Show your prompt to a real human being. If they don’t understand, speak to them until they do. Then anything you needed to tell them verbally, add it to the prompt.
Great advice. Try it — it works.
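To make the tip concrete, here is a small, hypothetical before-and-after. The scenario and the wording are ours, not Tan’s; the point is that the second prompt reads like instructions you could hand to an educated layperson who has never seen your organization’s work.

```python
# A hypothetical before-and-after for the "educated layperson" test.
# The scenario is invented for illustration.

VAGUE_PROMPT = "Write an announcement about the reorg."

LAYPERSON_PROMPT = """You are helping draft an internal announcement.

Context: Our company is consolidating three regional marketing teams into one
global team. About 40 people will report to a new leader; no roles are being
eliminated.

Task: Write a 300-word announcement from the Chief Marketing Officer to all
marketing employees.

Requirements:
- Lead with why the change is happening (faster decisions, one global brand voice).
- State plainly that no jobs are being cut.
- Use [DATE] as a placeholder for the effective date and [CHANNEL] for where
  to ask questions.
- Tone: direct and warm, with no corporate jargon."""

print(LAYPERSON_PROMPT)
```

If a colleague could read the second prompt and do the task without asking you anything, a model probably can too.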
We’ll leave you with something cool: Pika AI has released their latest video model, Pika 1.5. You can use it to inflate, crush, or squish any object in an uploaded image. Or turn it into cake.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.