Confluence for 12.22.24

Why you must read this issue of Confluence. Communication leaders must pay attention to o1 Pro. OpenAI announces o3. Overhang is increasing.

Dec 22, 2024

Midjourney prompt: A person standing on the edge of a massive canyon overhang that stretches cartoonishly for miles over the canyon, profile view, dramatic depth, golden hour lighting, detailed watercolor style, loose brushstrokes, muted earth tones with plenty of white and negative space, running paint texture, panoramic composition, sense of scale

This may be the most important Confluence you read this year. If you read only one edition, read this one. That’s not because of any one item in this week’s missive – it’s because of what, collectively, they represent. It has been a very, very busy week in the world of generative AI. Google continued its trend of impressive releases (which we wrote about last week) with the release of Veo 2, their new video model. It’s very impressive. OpenAI brought its video generation model, Sora, to a large portion of their users. We have access and have been playing with it. It does some things very well, and others, not so much. Here’s a video we created from a prompt asking for “A flying, cinematic shot over a Northwoods Wisconsin lake in the fall. The camera is low to the water. It approaches a lodge on the shore. There are a few high cirrus clouds in the sky picking up the color of a low sun. Wind on the water and in the trees. Move-filmic quality”:

It took Sora about a minute to generate that video.

With so many updates, we looked back to what we wrote about this time last year:

X Launches Grok AI
More AI Plagiarism Shoes Drop
Stability AI Releases Stable Video Diffusion
Google Releases NotebookLM
A Lesson in Change Communication and AI

That list, today, seems almost quaint. In the Stability AI piece, we wrote:

You can get a brief glimpse of the capabilities in this video. Limitations of the current model include the length of videos users can generate (less than four seconds) and some of the same issues we see with image generation (problems with text, human features and faces not always rendering properly, etc.). As we’ve written many times, however, we would advise readers not to mistake current limitations with lasting ones.

Amen to that. Now, just 12 months later, anyone with enough money for a cocktail in a major city can create much better videos, up to 20 seconds long, at will. Consider what’s happening in video an example of what’s happening across the board. Things are advancing at a fast — and increasing — pace. Example: o1 Pro from OpenAI, and, announced this week, their new frontier model, o3. o3’s benchmark tests blew close observers of the field away, and in competitive testing it already qualifies as one of the most talented software engineers alive. We cover both models below. The future is coming sooner than you probably expect. With that said, here’s what has our attention at the confluence of generative AI and communication this week (and yes, only three pieces, in part because of some additional reading we’re going to suggest you do):

Communication Leaders Must Pay Attention to o1 Pro
OpenAI Announces o3
Overhang Is Increasing

Communication Leaders Must Pay Attention to o1 Pro

We think the leading o1 model can do much of the work many communication professionals do.

We posted the other week about o1 Pro, OpenAI’s leading available model. It’s not cheap: $200 a month. This fee gives you unlimited access to o1 Pro, and pretty much unlimited access to everything else OpenAI offers (their other models, Sora, etc.). One of us plunked down the fee this week to start exploring o1 Pro, and we’ve found it very smart, and very capable at communication work.

This is a “reasoning” model, meaning it takes time to plan and work iteratively, improving its thought process as it goes, prior to providing output. And it does take a while to think, sometimes a few seconds, sometimes minutes or more. As we noted the other week, it seems that scientists and academics are doing the most impressive work with o1, but we took some time to test it on a communication challenge this week and found the results striking.

Here’s the initial prompt we gave o1 Pro (an amalgamation of real-world matters facing several of our clients):

Here’s the brief. You are an experienced internal communication leader working for a quasi-independent regulatory agency in Washington DC. 8,000 employees, all intelligent and generally committed to the organization’s vision. Almost all live in the DC area. All have had total work from home flexibility since Covid in 2020. The organization has touted this flexibility as a strength of the organization, noting that work is about what people produce more than where they produce it. In mid-2024, senior leadership came to believe that the WFH policy was actually harming productivity, culture, relationships, and talent development. They have made the decision to require two days a week in the office. The new policy is employees must be in the office two days a week or 8 days a month, the employee picks the days. Those are minimums. They are welcome to be in as much as they like. The policy will go into effect on April 1, 2025, giving people time to plan and adjust. It is not up for debate or negotiation. This will be a very unpopular decision with many employees. There is a town hall next week, on Wednesday, in which the CEO plans to announce this new policy. The organization has 92 people leaders in four bands of management. The town hall is virtual. Plan and create a comprehensive communication strategy for the organization to execute. It should include activity before and after the announcement. You determine the timeline and framework. It should include core messaging (not scripts, but the core message set), message sets for discrete audiences, channels, activity, and timeline. First develop the strategy. Then we will go from there.

o1 Pro thought about this prompt for one minute, 38 seconds, and developed a first take of its strategy. We then worked with it through a series of prompts to improve the strategy and messaging, develop a list of deliverables to support the strategy, draft those deliverables, create a detailed “shot clock” of the communication timeline, and write an executive summary to cover the entire thing. You can see the full prompt chain in the footnote at the end of this post. We copied all that content into a Word file and applied some basic formatting. The file is here for you to download and review. You should do so.

O1 Pro Communication Strategy (396KB PDF file)

The full document is 59 pages long and includes over 14,000 words. We developed it in 90 minutes: 60 spent working with the model and 30 spent pasting and formatting. At first reading, we would say the quality is on par with most communication materials we see inside client organizations. The strategy is relatively insightful, and the quality of the deliverables is quite good. Especially impressive was o1 Pro’s ability to retain context throughout the process, keeping messaging and principles consistent, recognizing earlier steps in the communication process in later deliverables, etc. Is it as good as what we would produce? No, but if we were to also prime the model with our philosophy on change communication, a set of cogent reputational and relational objectives to honor in the communication, a style guide, and more, we have no doubt the work would be even stronger.

We ask that you reflect on how long it would take people on your team to develop a 59-page communication strategy, change approach, and deliverables packet. This is what o1 Pro represents. And as we’ll describe below, it’s a piker compared to o3.

Most users won’t pay $200 a month for o1 Pro. Compared to an intern’s salary, however, it’s a bargain. But the price isn’t really the point. The point is that the technology behind o1 Pro is coming soon to models more widely available at lower price points (be those models from OpenAI, Anthropic, Google, or others). The state of the art spreads quickly in this field. Just as Stable Diffusion’s video capability was novel 12 months ago but is now more broadly and cheaply available, we can expect the same of the reasoning and ability of o1 Pro, with o3 (and whatever the frontier models look like after that) soon to follow.

Which is why this is the most important Confluence you might read this year. 2025 is the year in which communication teams are going to have to grapple with what this technology means for the work they do, who does it, and how. The adage is that the future is already here, but that it’s just unevenly distributed. That distribution is coming, fast, to corporate communication.

OpenAI Announces o3

If the benchmarks hold, we are on the cusp of another new frontier.

And now for the second reason that this may be the most important Confluence you read this year. We’ve grown accustomed to the rapid pace of advancement in generative AI, but OpenAI’s latest announcement suggests we’re witnessing something more profound than mere progress. On the final day of its “12 Days of OpenAI” event, the company unveiled o3 (and no, you didn’t miss o2; OpenAI skipped right past it for legal reasons associated with the name). While o3 is currently restricted to safety testers, its performance on key benchmarks signals that we’re entering new territory.

The numbers tell a striking story. On the ARC-AGI benchmark, one of the most challenging tests of AI reasoning capability, o3 achieved scores that dwarf previous models. Here’s a chart that compares o3’s performance against o1.

The “low-compute” version scored 75.7%, while the high-compute version reached 87.5%, surpassing typical human performance of 85%. For perspective, GPT-4o, which was the leading model when it was released barely seven months ago, scored just 5% on the same test. But perhaps even more striking is o3’s competitive programming performance. On Codeforces, a website that hosts competitive programming contests, o3 ranks as the 175th best software engineer on the planet, based on Elo ratings, putting it in the 99.7th percentile.

If these benchmarks hold, we are not merely seeing incremental gains in pattern matching or text prediction. According to the technical analysis, o3 appears to have cracked a fundamental limitation of large language models — their inability to recombine knowledge effectively during use. When mathematicians and software engineers — professionals who deal in precise logic and rigorous thinking — start describing this as a new paradigm, we need to pay attention. It’s as if we’ve moved from AI that can follow complex instructions to AI that can actually think through problems with human-level reasoning.

While the current generation of models like Claude 3.5 Sonnet, GPT-4o, and Gemini 2.0 Flash can meet our needs in many cases, o3 suggests something more profound is coming. As this technology begins to diffuse into mainstream models, and it will, we’re looking at tools that don’t just augment human capability but can operate at an entirely different level.

While little will change in the very short-term, the long-term implications may be profound. When you have AI systems that can reason at the level of the world’s top software engineers, that capability likely isn’t limited to coding. The same logical frameworks that make o3 exceptional at programming will almost certainly translate to other domains: strategy, analysis, problem-solving, and yes, communication.

This puts a different spin on our usual advice about AI adoption. While it remains true that many organizations are still struggling to effectively deploy current AI tools, the window for getting comfortable with this technology on an individual level may be shrinking. The capabilities of generative AI are increasing at a rapid pace, and organizations that haven’t built the muscle for working with AI today will find themselves increasingly disadvantaged as the models get smarter and more capable.

The practical implications are clear. Yes, start with what’s available now. Learn it. Master it. Integrate it thoughtfully into your workflows. But do so with the understanding that we’re not just preparing for better versions of what we have — we’re preparing for something fundamentally different. The organizations that will thrive in the coming years won’t just be those that use generative AI effectively — they’ll be those that understand how to partner with AI systems that can reason and think at human levels or beyond.

Overhang Is Increasing

The gap between the capabilities of leading models and how we leverage them continues to grow.

Picture a Formula 1 race car constrained to city streets, bound by speed limits and stop lights. That’s where we find ourselves with today’s frontier AI models — immensely powerful systems whose full capabilities remain largely untapped. The past few months have brought a cascade of advances from major AI labs, culminating in the announcement of o3, yet many aren’t even tapping into the most basic capabilities of generative AI.

We often discuss the concept of “overhang” in conversations about generative AI — the gap between what these systems are capable of and what we’re achieving with them. This overhang manifests in two ways. First, at the application layer. While the chatbot interfaces provide tremendous utility, they represent a narrow channel for accessing these models’ capabilities. Think of trying to drink from an ocean through a coffee straw. The underlying models can process complex multi-modal inputs, reason across vast amounts of information, and generate sophisticated outputs — but our current interfaces often reduce these capabilities to simple text exchanges. Yes, we’re seeing rapid development of more specialized applications, but the gap between what’s possible and what’s accessible remains vast.

The second constraint lies in our understanding and comfort level. In our work with organizations, we consistently find that even sophisticated professionals have only a superficial grasp of these tools’ capabilities. Many have tried ChatGPT, perhaps bumped into its limitations, and walked away with an incomplete picture. It’s like judging a smartphone’s utility after only using it to make phone calls. The depth of capability of the frontier models requires investment in learning how to effectively work with these tools.

With the release of o1 and announcement of o3, we’re seeing the gap between theoretical capability and practical application widen. As the leading AI labs build and train new models capable of solving some of the most challenging math and engineering problems we face, most of the world lacks the applications and understanding needed to complete far more basic tasks. They’re building increasingly powerful Formula 1 engines in a world where many are just learning to parallel park.

What does this mean for leaders and organizations? First, take these rapid advances seriously — each new model release offers a window into future capabilities. The generality of these models means that AI labs can’t anticipate your specific use cases. Only through hands-on experience with frontier models can you identify potential applications for your work and organization. This isn’t just about reading case studies — it’s about developing the muscle memory of working with these tools.

The overhang in generative AI isn’t just growing — with the release of o1 and announcement of o3, it’s safe to say it’s accelerating. As applications improve and user sophistication grows, we’ll see rapid unlocking of this built-up potential. The transition will feel jarring to most. Those who invest in understanding and experimenting with these tools today will be ready when the constraints begin to fall away. The rest risk being shaken by a technological acceleration they barely saw coming.

We’ll leave you with something cool: a jaw-dropping example of cinematic video produced by Google’s Veo 2.

AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.

The full list of prompts we gave o1 Pro as we worked through the communication strategy and its deliverables:

The town hall is next week, on Wednesday. Today is Thursday the week before. Revise.

Make it better.

Excellent. Now create a list of all communication deliverables we must create to support this strategy.

Create these: Senior Leader Notification Email: Brief note to Executive Directors/Division Heads explaining the upcoming policy announcement, its rationale, confidentiality requirements, and what to expect next. Manager Briefing Invitations (Calendar Invites): Invitations for the Friday manager briefing sessions, including a short description of the purpose (confidential pre-briefing on an upcoming major policy change).

Great. Now we will create the toolkit. We will create each deliverable in turn so I can give you feedback on each. Here is the list. Be sure to honor the strategy and its principles when drafting the content. Remember it’s a skeptical and probably disappointed audience. Manager Toolkit (Private, Pre-Announcement): Key Messaging Guide: One-page summary of the policy, rationale, and must-know points. FAQs for Managers: Anticipated employee questions with recommended response framing. Talking Points Deck: Short PowerPoint or PDF to help managers structure post-announcement team meetings. Policy Overview Document (Draft): So managers know precisely what will be announced and can answer questions accurately after the Town Hall.

Make the key messaging guide better. Include message sets for specific audiences. Make the FAQ document significantly longer. Anticipate far more likely questions. Add suggested design layout concept descriptions for each slide in the talking points deck. Make the policy Overview better.

Update the FAQ to have a single, declarative, non-bs answer to each question before providing additional detail.

Now create the Town Hall Agenda & Moderator Notes: Detailed run-of-show for the Town Hall, including speaker order, time allocations, and Q&A instructions.

Add an agenda at the top. Make the whole thing better.

Now create the CEO Announcement Script (Talking Points): High-level speaking points for the CEO’s Town Hall segment that cover rationale, details of the policy, and tone guidelines (transparent, empathetic). Note that they should fill 15 minutes of time.

Make them better. Remember your strategy.

Back to the CEO town hall talking points. Don’t bury the lede. Treat them like fully formed adults. No BS.

Now draft the All-Employee Email from CEO (Immediately After Town Hall): Summarize the key points, provide links to resources, acknowledge concerns, and reinforce support.

Now create whatever text we need for the Intranet Landing Page: ○ Dedicated “Future Ways of Working” intranet hub featuring: § The new policy summary § Comprehensive employee FAQs § Downloadable manager toolkit and Q&A documents Links to commuting resources, flexible scheduling tips, and HR contact details

Now create the 9. Employee FAQs (Public Version): A user-friendly, clearly organized FAQ document addressing common employee questions about the policy, timing, logistics, and rationale. Be sure to reflect the earlier Q&A for managers as much as possible.

Remove the more detail headers. Put the main answer to each question in bold. Think of other questions and answer. Make the whole thing better.

Now draft content for: 12. HR/IC Office Hours Invitation & Schedule: Emails and intranet notifications announcing optional “office hours” sessions for employees. Include a recurring calendar invite link or sign-up instructions.

Now draft content for: 13. Commuter Resources & Logistics Guide: Document or intranet page outlining parking information, public transportation benefits, flexible arrival/departure times, and any available subsidies or support. Don’t make anything up. Use placeholders.

Now draft content for 14. Well-Being & Work-Life Balance Tip Sheet: Guidance on integrating in-office days with personal schedules, stress management resources, and EAP (Employee Assistance Program) contacts. Remember your audience. Treat them like adults.

Now create four posts for this: 15. Weekly Intranet Updates/News Posts: Short posts highlighting success stories, shared best practices, reminders of the April 1 start date, and any updates to FAQs.

Now draft 16. Optional Pre-Implementation Pulse Survey Questionnaire: Brief online survey (via intranet or survey tool) to gauge initial sentiment, identify concerns, and measure comprehension of the new policy. Keep it brief. Make it reliable and valid.

Now create six Periodic All-Hands Reminder Emails (Monthly or Bi-Weekly): These will reiterate policy details, note the countdown to April 1, and spotlight any new resources.

Now create the materials to support execution of 18. Manager Coaching Webinar or On-Demand Video (Optional): Resource for managers on handling continued concerns, optimizing hybrid collaboration, and maintaining team morale. Presume the webinar will be 20 minutes of coaching and 10 minutes for Q&A. Plan and then create everything we will need for this, including the slides (which we will then use to create in PowerPoint).

Take out all the cliches and BS like “change is hard.”

Now create the materials and creative treatments for 19. “Welcome Back” Campaign Materials: ○ Intranet banners celebrating “Back in the Office” moments ○ Digital signage or posted flyers in the office welcoming teams on-site Photo essays or short videos shared on the intranet featuring leadership walking the floors, highlighting team interactions. Include tagline concepts and graphic design concepts. Remember the audience: no cliche and no BS. The work should be detailed enough to give to our creative services team and have them run with it.

Now create the Post-Launch Survey Questionnaire (May–June 2025): A brief online survey to assess how employees are adjusting, whether the policy is meeting intended goals, and what further support might be needed. Keep the same questions as the prior survey.

Excellent. Now, create a detailed day-by-day, and if needed, minute-by-minute, shot-clock that is the execution schedule for the entire communication strategy. It should account for every deliverable and step in the process. Be detailed.

Make it all one timeline, folding in the details. Presume today is Thursday December 12 and the Town Hall is Wednesday, December 18. BE SPECIFIC AND DETAILED. Account for every step. Also designate who should be responsible (roles, not names).

Excellent. Now write a briefing document that will accompany this strategy, timeline, and deliverables as a cover document. It will go to all the senior leaders and stakeholders involved in the process. It should summarize the strategy and messaging and forecast what is included in the package. Make it excellent.

Make it better.

CRA | Admired Leadership

Apologies to our email-based readers for our consistent typo in this edition -- we misspelled "Communication" in three different places. At least we are consistent!


David C Morris

The video at the end of the post is impressive. It's funny about the eyes, though. It's as if everyone is Gollum from LOTR.

Confluence: AI, Leadership, and Communication
