Confluence for 9.17.23
The future of professional work. More disclosure declarations. AI produces better innovation ideas than Wharton students. A guide to writing with ChatGPT. Mustafa Suleyman on the coming AI wave.
Welcome back to Confluence. As you can see above, we begin each issue with an image generated by Midjourney’s generative AI tool, and we always note the prompt used as a caption. We do this to bring some imagery to each issue, and to offer a sense of Midjourney’s capabilities. Each image takes roughly 30 seconds to create. The one above evokes the painting style of Georgia O’Keeffe, and relates to our first item of interest in today’s issue.
With that said, here’s what has our attention at the intersection of AI and corporate communication:
The Future of Professional Work
More Disclosure Declarations
AI Produces Better Innovation Ideas than Wharton MBA Students
A Guide to Writing with ChatGPT
Mustafa Suleyman on the Coming AI Wave
The Future of Professional Work
A new working paper provides important insights into the implications of generative AI for professionals.
Our first item in this issue is one to which we hope readers will pay significant attention. Just yesterday a very impressive set of researchers released, through Harvard Business School, a working paper on the largest pre-registered experiment on the future of professional work in the context of generative AI. Ethan Mollick, who contributed to the study, has a writeup here, and you may read the working paper here. We strongly suggest reading both, even if the technical elements of the paper are not to your taste.
Here is Mollick’s headline:
[F]or 18 different tasks selected to be realistic samples of the kinds of work done at an elite consulting company, consultants using ChatGPT-4 outperformed those who did not, by a lot. On every dimension. Every way we measured performance … consultants using AI finished 12.2% more tasks on average, completed tasks 25.1% more quickly, and produced 40% higher quality results than those without.
The research team ran the study, which has real methodological rigor, at Boston Consulting Group with 758 of its consultants. We won’t restate all the detail that Ethan provides, but there is much more nuance — and importance — to what the study suggests than just an “AI makes professionals better” conclusion (and we predict many will miss the nuance). Here are the main points we wish to spotlight:
There are tasks AI does well and for which it is powerful, and there are tasks it does poorly or inaccurately.
It’s difficult to know which tasks are in which group, as we don’t really know how generative AI does what it does, and there is no real documentation for its use.
As a result, one can think of AI’s capabilities as a “jagged frontier”: tasks inside the frontier the AI does well, tasks outside it the AI does poorly, and the boundary between the two is uneven and often unpredictable.
When consultants in the study used generative AI on tasks inside the frontier they were significantly more effective — the AI augmented their speed, productivity, and creativity, by a lot.
When consultants used generative AI on tasks outside the frontier, they were often faster and more productive, but they reached incorrect conclusions and produced worse solutions with high confidence.
There is precedent for this last point in the research literature. The term is “falling asleep at the wheel”: with automation, humans tend to become less situationally aware and cede too much trust to the automated process. There are other implications as well, and we have an essay in development that goes much deeper into this important consequence of AI use.
“In our experiment, we also found that the consultants fell asleep at the wheel – those using AI actually had less accurate answers than those who were not allowed to use AI (but they still did a better job writing up the results than consultants who did not use AI). The authoritativeness of AI can be deceptive if you don’t know where the frontier lies.” — Ethan Mollick
Another important insight from the study was that while the use of generative AI improved the effectiveness of participants overall, it had a much larger positive effect on the less-skilled consultants:
The consultants who scored the worst when we assessed them at the start of the experiment had the biggest jump in their performance, 43%, when they got to use AI. The top consultants still got a boost, but less of one. Looking at these results, I do not think enough people are considering what it means when a technology raises all workers to the top tiers of performance.
In essence, the use of AI acted as a skill-leveler, closing the gap between the lowest and highest performers. Given how most talent-development and merit systems work inside organizations, we agree with Ethan that a possibly flatter, more homogeneous distribution of talent across professional workforces is an area of second-order consequences we will need to watch.
Finally, the team learned that the loss of effectiveness outside the frontier — where AI tended to make the consultants produce worse output — was not universal. The study shows that some consultants who used AI on tasks outside the frontier still produced high-quality work, and how they chose to interact with the AI and oversee its work seems to have had much to do with it:
[A] lot of consultants did get both inside and outside the frontier tasks right, gaining the benefits of AI without the disadvantages. The key seemed to be following one of two approaches: becoming a Centaur or becoming a Cyborg … two approaches to navigating the jagged frontier of AI that integrates the work of person and machine.
Centaur work has a clear line between person and machine, like the clear line between the human torso and horse body of the mythical centaur. Centaurs have a strategic division of labor, switching between AI and human tasks, allocating responsibilities based on the strengths and capabilities of each entity … In our study at BCG, centaurs would do the work they were strongest at themselves, and then hand off tasks inside the jagged frontier to the AI.
On the other hand, Cyborgs blend machine and person, integrating the two deeply. Cyborgs don't just delegate tasks; they intertwine their efforts with AI, moving back and forth over the jagged frontier. Bits of tasks get handed to the AI, such as initiating a sentence for the AI to complete, so that Cyborgs find themselves working in tandem with the AI.
Some participants were Centaurs, some were Cyborgs, and some switched between the two modes. The research team notes that this finding, which surprised them, requires much more study, as how one chooses to use AI appears to have lasting effects on whether it makes a professional better — or worse. We will be following that examination as it unfolds.
“In my mind, the question is no longer about whether AI is going to reshape work, but what we want that to mean. We get to make choices about how we want to use AI help to make work more productive, interesting, and meaningful. But we have to make those choices soon, so that we can begin to actively use AI in ethical and valuable ways, as Cyborgs and Centaurs, rather than merely reacting to technological change. Meanwhile, the Jagged Frontier advances.” — Ethan Mollick
More Disclosure Declarations
Google and Amazon require disclosures on AI-generated content.
In the first edition of Confluence, we explored the question of disclosure for the use of generative AI. As we said then, it’s an important question, and one that organizations and corporate communication teams should be thinking deeply about right now — both for internal and external communication.
In the past week, Amazon and Google announced disclosure policies of their own, elevating the topic. Amazon is requiring self-publishers to disclose whether their content is AI-generated, and Google is requiring disclosure for the use of “synthetic content” in political ads.
We expect to see more “digital transparency” policies like these in the coming months. As the technology grows more capable (making it harder to distinguish human-generated from “synthetic” content) and its use proliferates (flooding the internet, social media, and the market with AI-generated content), audiences will likely grow more skeptical about what is human-made and what is not.
While the Amazon and Google policies are external-facing, disclosure will be increasingly important for internal communication as well. Our guidance today is the same as it was a month ago: having a clear perspective on disclosure, and providing dependable assurances about content origins, is something communication functions will have to address in the interest of preserving trust in “official” voices.
AI Produces Better Innovation Ideas than Wharton MBA Students
Creativity is emerging as a powerful use case for generative AI.
In the Wall Street Journal, two Wharton business school professors describe their experience, and their data, from a competition between MBA students and generative AI as sources of ideas for innovation and product design. Wharton students and ChatGPT-4 each generated ideas for new products and services from the same prompt, and those ideas were then evaluated against three dimensions of creative performance: the quantity of ideas, the average quality of ideas, and the number of truly exceptional ideas.
The machines won the day, on all three dimensions.
First, on the number of ideas per unit of time: Not surprisingly, ChatGPT easily outperforms us humans on that dimension. Generating 200 ideas the old-fashioned way requires days of human work, while ChatGPT can spit out 200 ideas with about an hour of supervision.
Next, to assess the quality of the ideas, we market tested them. Specifically, we took each of the 400 ideas and put them in front of a survey panel of customers in the target market via an online purchase-intent survey. The question we asked was: “How likely would you be to purchase based on this concept if it were available to you?” The possible responses ranged from definitely wouldn’t purchase to definitely would purchase.
The responses can be translated into a purchase probability using simple market-research techniques. The average purchase probability of a human-generated idea was 40%, that of vanilla GPT-4 was 47%, and that of GPT-4 seeded with good ideas was 49%. In short, ChatGPT isn’t only faster but also on average better at idea generation.
Still, when you’re looking for great ideas, averages can be misleading. In innovation, it’s the exceptional ideas that matter: Most managers would prefer one idea that is brilliant and nine ideas that are flops over 10 decent ideas, even if the average quality of the latter option might be higher. To capture this perspective, we investigated only the subset of the best ideas in our pool—specifically the top 10%. Of these 40 ideas, five were generated by students and 35 were created by ChatGPT (15 from the vanilla ChatGPT set and 20 from the pretrained ChatGPT set). Once again, ChatGPT came out on top.
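For readers curious about the mechanics behind that translation, here is a minimal sketch of how purchase-intent responses are commonly converted into a purchase probability. The intent-to-probability weights and the purchase_probability helper below are illustrative assumptions on our part; the article does not report the specific weights the authors used.

```python
# A minimal sketch (not the authors' published method) of converting
# purchase-intent survey responses into a purchase probability.
# The weights below are illustrative assumptions only.

# Assumed mapping from a five-point purchase-intent scale to the
# probability that a respondent actually buys.
INTENT_WEIGHTS = {
    "definitely would purchase": 0.75,
    "probably would purchase": 0.35,
    "might or might not purchase": 0.10,
    "probably wouldn't purchase": 0.03,
    "definitely wouldn't purchase": 0.01,
}

def purchase_probability(responses: list[str]) -> float:
    """Average the assumed buying probability across all panel responses for one idea."""
    return sum(INTENT_WEIGHTS[r] for r in responses) / len(responses)

# Example: a small survey panel for a single product concept.
panel = [
    "definitely would purchase",
    "probably would purchase",
    "probably would purchase",
    "might or might not purchase",
    "definitely wouldn't purchase",
]
print(f"Estimated purchase probability: {purchase_probability(panel):.0%}")
```

The idea is simply a weighted average: each stated intent is assigned an assumed probability of an actual purchase, and the per-idea score is the mean across the panel, which is what allows the 40%, 47%, and 49% figures above to be compared directly.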
The authors draw three conclusions from the study. First, these tools represent a novel source of idea creation; not using them would be, in their words, “a sin.” Anyone can benefit from complementing their own ideas with those generated by generative AI. Second, the work in the early stages of creativity can shift from generating ideas to evaluating them. Third, rather than seeing ideation like this as a competition between humans and machines, we should view it as a complementary opportunity. Using generative AI to improve a creative process like ideation while keeping humans at the center is an excellent example of what AI and labor scholars have described as “augmentation” — AI improving, but not replacing, a human skill set. We agree, and communication professionals are well-served to begin using AI as an augmentative resource in this and other arenas of their work.
“Generative AI has brought a new source of ideas to the world. Not using this source would be a sin. It doesn’t matter if you are working on a pitch for your local business-plan competition or if you are seeking a cure for cancer—every innovator should develop the habit of complementing his or her own ideas with the ones created by technology. Ideation will always have an element of randomness to it, and so we cannot guarantee that your idea will get an A+, but there is no excuse left if you get a C.”
— Christian Terwiesch and Karl Ulrich
A Guide to Writing with ChatGPT
A writer offers a helpful guide to how creatives can use ChatGPT.
Writer David Perell has a helpful post online about writing with ChatGPT here. It’s a long piece, but worth reviewing in full, in part because David approaches the use of ChatGPT not just as a writer, but as a creative. He offers advice on using ChatGPT to consume information, create content, repurpose content, edit, and identify gaps in thinking. He notes three “cardinal tricks and use cases”:
Write from conversation. GPT is your sparring partner. Use it to beat writer’s block, discover new ideas, color-up dull sentences, and find examples that breathe life into your writing.
Use it to fix local problems, not global ones. The more specific the problem you’re trying to solve, the more GPT can help. Give it constraints and specific details.
Enhance, don’t replace. GPT can’t do your thinking for you, but it can definitely improve it. It’s like a super-genius intern, but you’re still the boss.
There is a lot here, with many examples that are helpful as one builds a skill set with this AI tool. Read the whole thing.
“As a creative, GPT lets me focus more on coming up with interesting ideas and less on executing them. It’s made me a better consumer and a better creator, and it can do the same for you.” — David Perell
Mustafa Suleyman on the Coming AI Wave
The DeepMind co-founder believes society isn’t ready for what’s coming.
Mustafa Suleyman is a British entrepreneur and co-founder of DeepMind Technologies, which was acquired by Google in 2014. His current project is Inflection AI and its chatbot Pi (short for “personal intelligence”).
Suleyman recently published The Coming Wave: Technology, Power, and the 21st Century’s Greatest Dilemma (more on the book in a future edition), and he’s been making the rounds of podcast interviews and press appearances. He aims to raise awareness of just how impactful the next wave of technological advances (AI, synthetic biology, quantum computing, and more) is likely to be for society — and to spur industry and government to begin preparing for and mitigating the attendant risks now.
His wide-ranging interview on the 80,000 Hours podcast is worth your time to better understand what we might expect in the next five, ten, or twenty years. You’ll hear about his expectations for computation increases and what that will mean for AI capabilities, near-term risks of misuse versus longer-term risks of superintelligence, governance and alignment challenges, and initial thoughts on what a workable legal and regulatory framework could look like.1
If you just want the highlights, here’s the Claude 2 summary of the transcript:
Here is a summary of the key points from the transcript:
Mustafa Suleyman, co-founder of DeepMind and Inflection AI, discusses the opportunities and risks of advancing AI capabilities.
He believes advanced AI could greatly improve human conditions but also lead to disaster if misused or if it becomes misaligned. Policy is lagging behind technological capabilities.
Suleyman tried to get more external oversight at Google/DeepMind but faced challenges. Broader representation in AI decision making is difficult but important.
At Inflection AI, Suleyman is focused on building helpful personal AIs. He argues this doesn't directly speed up capabilities he thinks are risky, like autonomy.
He supports some AI safety research but believes “superintelligence” fears have been a distraction. The main risk now is proliferation of power if AI becomes widely accessible.
Suleyman advocates for government audits, oversight, and some mandatory commitments on AI companies to manage risks. But he notes benefits often require some tradeoffs with harms.
Overall he argues we must take safety seriously but also continue advancing capabilities to understand the technology and work towards beneficial outcomes.
That’s all we have for this edition of Confluence. We will leave you with something cool: For those of us who love the idea of creating music but lack the skill to do so, check out Stable Audio …
AI Disclosure: We used generative AI in creating imagery for this post. We also use it selectively as a summarizer of content and as an editor and proofreader.
1. On the legal and regulatory front, government and industry leaders in the U.S. took some initial steps in that direction at the September 13 AI Insight Forum on Capitol Hill.