Confluence for 7.20.25
Learning from Anthropic's approach to AI in hiring. OpenAI releases Agent Mode. Axios HQ data reveals deep leader-employee disconnect on AI success. A new model explainer.

Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI, leadership, and corporate communication:
Learning from Anthropic’s Approach to AI in Hiring
OpenAI Releases Agent Mode
Axios HQ Data Reveals Deep Leader-Employee Disconnect on AI Success
A New Model Explainer
Learning from Anthropic’s Approach to AI in Hiring
An example of how to account for generative AI when selecting talent.
Identifying and selecting talent poses a persistent challenge for leaders and organizations. Generative AI can make this harder to navigate. When we read cover letters or candidate assessments, we have to wonder: how much of the output represents the candidate’s thinking versus that of an LLM?
Recently, Anthropic published specific guidance for candidates on how to collaborate with Claude at each stage of its hiring process. The guidance serves as a model that other teams and organizations can start from. Their approach is clear and direct, as you can see below.
How to collaborate with Claude during each stage of our process
When applying (resume, cover letter, application questions): Please create your first draft yourself, then use Claude to refine it. We want to see your real experience, but Claude can polish how you communicate about your work.
Example prompt: “Please review my resume and the job description. Identify the experiences I should highlight in my cover letter that align most with the job requirements.”
During take-home assessments: Complete these without Claude unless we indicate otherwise. We’d like to assess your unique skills and strengths. We’ll be clear when AI is allowed (example: “You may use Claude for this coding challenge”).
Preparing for interviews: Use Claude to research Anthropic, practice your answers, and prepare questions for us.
Example prompt: “Create a study guide for interviewing for this job. Outline key topics I should review, including AI safety concepts, Anthropic's research focus, and typical technical or behavioral questions I might encounter.”
During live interviews: This is all you. No AI assistance unless we indicate otherwise. We’re curious to see how you think through problems in real time. If you require any accommodations for your interviews, please let your recruiter know early in the process.
This guidance benefits both candidates and those assessing them. For candidates, clear expectations eliminate second-guessing and reduce the chances of awkward conversations around generative AI use. For leaders, explicit guidelines allow them to probe deeper into a candidate’s thinking, including their approach to working with generative AI.
The reality is candidates have LLMs at their fingertips, and working with LLMs will be required in most roles. We need hiring practices that assess both fundamental skills and the ability to work effectively with generative AI tools. The guidance from Anthropic provides a useful starting point, but the right balance will vary by organization and role.
Three considerations should guide your approach. First, be explicit about when and how AI can be used throughout your hiring process. Ambiguity serves no one. Second, design assessments to reveal both independent thinking and a candidate’s facility with generative AI tools. The best talent will excel at both. Third, recognize prior hiring practices may no longer serve their intended purpose. Some traditional assessments may need fundamental rethinking rather than minor adjustments.
Organizations must adapt their hiring practices thoughtfully to evaluate candidates effectively in a world with generative AI. The exact approach will depend on your organization’s needs, but the time to start this work is now.
OpenAI Releases Agent Mode
The future is here … almost.
Note: Claude Opus 4 is a co-author of this piece.
The big news this week was OpenAI’s launch of ChatGPT Agent, marking a significant shift in how we interact with large language models. The new feature, available to Pro, Plus, and Team users, combines the capabilities of Operator (OpenAI’s browser-controlling tool), Deep Research (their synthesis engine), and ChatGPT’s conversational abilities into a unified system. In OpenAI’s words: “ChatGPT now thinks and acts, proactively choosing from a toolbox of agentic skills to complete tasks for you using its own computer.” (Mark that word, “agentic,” down. It’s the “synergy” of the next 24 months.) You can watch the OpenAI stream of the announcement here (and it’s worth watching):
With Agent, users can delegate multi-step tasks—from analyzing data and creating slide decks to navigating websites and filling out forms—within a single conversation. The agent uses its own virtual computer, choosing from a suite of tools including visual and text browsers, a terminal, and API connections to complete work autonomously while keeping users in control.
The launch represents more than incremental progress. ChatGPT Agent can interrupt its work when you need to redirect it, seek clarification when uncertain, and transition seamlessly between research and action. The system adapts its approach based on the task, and this flexibility, combined with the ability to maintain context across multiple tools and steps, suggests we’ve crossed a threshold in what these systems can accomplish.
Yet our testing reveals the technology isn’t ready for every use case. PowerPoint design remains clunky, with the agent struggling to create visually compelling presentations despite technically completing the task. Complex form navigation can trip it up, and OpenAI has wisely blocked high-risk actions like financial transactions unless the user gives explicit approval. The company acknowledges these limitations, with CEO Sam Altman calling it “cutting edge and experimental” while advising caution with personal information. While the future has arrived, it’s still in beta.
As a test, we gave ChatGPT Agent some 3,000 anonymized employee feedback comments about a strategy cascade we helped a client run a number of years ago. Our prompt:
Attached are comments from a recent employee survey about a communication process that cascaded the organization’s strategy to all employees. The question is about if there is anything else leadership should know. Read the comments. Conduct a qualitative analysis of the sentiment and primary themes. Then write a memo summarizing the feedback to the senior leadership team (this is a large commercial bank). Then create a PowerPoint deck that I can use to present the findings at a town hall. That presentation will only be 10 minutes so keep it tight. I want minimalist slides. Graphs, simple statements on a slide, no images.
ChatGPT Agent churned away for a while, and you can see some of that thinking here (it opened that “Terminal window” and spit out the code you see before we started recording):
And it’s interesting to see it revising its own work here:
In six minutes Agent had a memo and PowerPoint file for us. The memo is in the footnotes.¹ It’s quite good, although for client work we would ensure Agent had analyzed either all the comments or a large enough representative sample to conduct a valid thematic analysis. The PowerPoint was, well, not good. We’d describe it as “Copilot Quality,” for those who have asked Copilot to create a slide presentation. Here’s a click-through:
But this is where it gets interesting. We told Agent the presentation was not good, and asked it to improve it consistent with our initial instructions. The second draft was as poor as the first:
But then we asked ChatGPT, in another chat, to “Become Nancy Duarte and give a set of PowerPoint design rules to an LLM that’s going to be creating slides for an employee town hall. The presentation is summarization of recent employee feedback about a communication process.” We then fed those instructions to Agent, and it created this draft:
That is much closer to what we would want. Certainly something we can work with.
ChatGPT Agent is designed for iterative, collaborative workflows, far more interactive and flexible than previous tools. While it may stumble on slide aesthetics, its ability to research, synthesize, and execute represents a real capability shift.
The question here is what you choose to focus on in all of this. One can choose to say (and many will), “These things are terrible. That presentation was awful, and you had to go back and forth to get anything close to usable. What a sham!” Or, you can choose to say (as we do), “I went from 3,000+ raw comments to a sentiment analysis, thematic analysis, a solid memo first draft, and a solid presentation draft, in 10 minutes.” And that work would have been even closer to production-ready if we had given Agent strong design parameters from the start.
Of course, there is a danger in this, and it’s a big one: falling asleep at the wheel, and trusting all this output. We seem to hold computers to a different standard than people. We believe their outputs are deterministic and necessarily correct. Large language models (LLMs) are probabilistic, and have an element of randomness and variability, just like people. None of us would write exactly the same memo if given two attempts, and neither would an LLM. The sentiment analysis in this example was code. Agent did math and arrived at deterministic outcomes. But the selection of comments, the sentences to describe the results—they are the product of a machine inventing output a few letters at a time. We would never trust work like this from a colleague without subjecting it to human editing, review, and verification, and we need to do the same with agents.
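The distinction is worth making concrete. Below is a toy illustration of our own (not a representation of how OpenAI’s or Anthropic’s tools are built): the first step is deterministic, like the code Agent ran for its counts, while the second step is a stand-in for how an LLM samples each next word from a probability distribution, which is why two runs can phrase the same finding differently.

```python
import random
from collections import Counter

# Deterministic step: tallying labeled comments always yields the same totals,
# no matter how many times you run it.
comments = ["positive", "neutral", "neutral", "concern", "positive", "neutral"]
totals = Counter(comments)
print(totals["neutral"])  # always 3, every run

# Probabilistic step: a toy stand-in for next-token selection in an LLM.
# Candidate words get probabilities, and one is sampled, so repeated runs
# can produce different (but all plausible) phrasings of the same result.
candidates = ["favorable", "positive", "encouraging"]
weights = [0.5, 0.3, 0.2]
word = random.choices(candidates, weights=weights, k=1)[0]
print(word)  # varies between runs
```

The tally would pass any audit twice in a row; the sentence describing it might not read the same way twice. That asymmetry is exactly why the narrative portions of an agent’s output need human review even when the underlying arithmetic is correct.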
These are our views after just a few trials. Our process, as it always is, will be to start using ChatGPT Agent for everything we can think of. It’s the only way to understand the uneven nature of its abilities. We suggest you do the same.
Oh, and while all that was happening, over in another browser tab, we asked Claude Opus to read back issues of Confluence and write four paragraphs summarizing the Agent news, which it happily did in about two minutes. Those four paragraphs served as a foundation for much of what’s in this piece. That is the future, managing multiple agents assisting you in parallel. It’s closer than you might think.
Axios HQ Data Reveals Deep Leader-Employee Disconnect on AI Success
The numbers point to more than just a perception problem.
This week, Axios HQ released its 2025 report on the state of internal communications. One finding in particular piqued our interest: a significant disconnect between leadership and employees when it comes to AI implementation. It’s not new that leaders and employees see AI in their organizations differently, but the Axios report puts a finer point on a major barrier to successful AI adoption in organizations today: not the technology itself, but the communication gap around it.
Axios reports that just 27% of leaders believe they’ve achieved “effective” AI implementation, “with clear guidelines and benefits.” This is, in itself, a telling number. Nearly three-quarters of leadership teams don’t feel they’re succeeding with AI implementation, despite being the ones who should champion these initiatives. The numbers don’t point to a communication problem here, but rather an implementation problem. When fewer than one in three leaders can confidently point to effective AI adoption with tangible benefits, it’s a fundamental challenge that goes beyond messaging.
But there’s more. The data shows that when employees were asked the same question, only 11% agreed that their organization had achieved effective implementation. Organizations have more than a perception gap here. They have a real problem. The 16-point difference between leadership and employee perspectives indicates a breakdown in how AI initiatives are designed, communicated, and experienced across organizational levels. Leaders may be discussing AI transformation in boardrooms and strategy sessions, but those conversations aren’t translating into meaningful, identifiable change for employees.
To reverse this trend, organizations must treat communication as a foundational pillar of AI transformation, not an afterthought. This means acknowledging uncertainties, sharing both successes and setbacks, and making AI a regular part of organizational discussion. Most importantly, it means connecting AI initiatives to concrete changes employees will experience. Leaders need to talk the talk and walk the walk when it comes to AI implementation, and encourage their teams to do the same. Without this alignment, the disconnect will only grow.
A New Model Explainer
Zvi Mowshowitz provides a helpful breakdown of the leading LLMs.
Back in March, we wrote about our then-current model stack, noting that “even we [were] having difficulty keeping up, and we have a lot of work to do to figure out where each model excels.” Four months later, the pace of developments and releases has only accelerated. Three of the models we mentioned in that post (Claude Sonnet 3.7, OpenAI’s o1, and Grok 3) have already been surpassed by newer, more powerful models from the same labs: Claude Sonnet 4 and Opus 4, OpenAI o3 and o3 Pro, and Grok 4, respectively. In the same timeframe, Midjourney has released a new image generation model as well as video generation capabilities. And those new releases are only a fraction of the major developments of the past four months. Things were moving fast in March and are moving even faster now, with rumors that OpenAI is preparing to release the long-awaited GPT-5 soon.
Even for those who, like us, follow these developments closely, it can be a bit much. So we read with interest Zvi Mowshowitz’s recent post “OpenAI Model Differentiation 101”, which provides a helpful breakdown not only of the different OpenAI models (with their maddening naming conventions) but also of Anthropic’s Claude, Google’s Gemini, and xAI’s Grok model families. It’s a “back to basics” overview that aims to answer some of the most foundational questions about working with generative AI:
How did we get such a wide array of confusingly named and labeled models and modes in ChatGPT? What are they, and when and why would you use each of them for what purposes, and how does this relate to what is available elsewhere? How does this relate to hallucinations, sycophancy and other basic issues, and what are the basic ways of mitigating those issues?
Our current point of view, which we outlined in May, is that the capability differences between Claude, ChatGPT, and Gemini are mostly at the margins. Whichever one you choose probably will not matter too much. But choosing a tool is only the first step. As Zvi’s post shows at length, you’ll still face a confusing range of options within the tool. It’s worth taking some time to understand the different models and capabilities available in whichever tool you are using. That’s where Zvi’s explainer is useful, particularly for ChatGPT.
With the pace of development, explainers like these (and like ours from a few months ago) inevitably have a limited shelf-life. But for those looking to catch up on where things stand right now, and to get the most out of the tools available to you today, this is a good place to start.
We’ll leave you with something cool: The ARC Prize Foundation released a preview of their forthcoming benchmark, ARC-AGI-3. See if you can solve three puzzles designed to stump leading generative AI models.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.
To: Executive Committee (Senior Leadership Team)
From: [Your Name]
Date: 20 July 2025
Subject: Qualitative analysis of employee feedback on the 2024 strategy cascade process
Overview of the feedback
The employee survey invited open-ended comments on the bank’s updated strategy and the cascade process. We reviewed 2,893 comment lines (many respondents simply replied “no” or “N/A”). Using a simple sentiment lexicon, roughly two-thirds of the comments were neutral (no further feedback), about one-quarter were positive, and ~15% expressed concerns. Overall, the cascade process was well received and is seen as a significant improvement on previous years, but respondents also highlighted several areas where the strategy and cascade can be strengthened.
Primary themes and insights
1 – Strong support for clarity and the cascade process
Many respondents praised the clarity and conciseness of the strategy and the cascade mechanism. They felt the placemat and supporting slides were the best articulation they had seen and appreciated being engaged rather than simply receiving an email. One manager noted that “this was the best way they had seen the whole picture laid out” and another said the team found the process “positive”. Others described the cascade as “clear and easy to follow” and emphasised that the process increased engagement and ownership.
Employees welcomed the transparency and consistency of the message. Several comments suggested making the cascade an annual or bi‑annual ritual with periodic refreshes to reinforce progress and to schedule strategy sessions earlier in the year.
2 – Need for greater detail and inclusion
Respondents asked for clearer differentiation between vision, strategy, priorities and goals. One group noted confusion “about strategy vs ‘the how’ vs priorities vs goals” and asked for templates and examples to help managers translate high‑level themes into team‑level objectives. Others requested that support functions be represented and that each Executive Committee (EC) member share their own OKRs so that all functions can see how they fit into the pyramid.
Several comments called for more explanation of certain terms – for example, defining how “ESG capabilities” underpin the strategy, clarifying the wealth‑management offering, or providing definitions for “private assets” and “services platform”. Respondents working in Global Advisors asked for the GA strategy to be included in the cascade.
Managers in non‑client‑facing or “enabling” functions felt that the materials available to them were thinner than those provided to senior leaders and asked for more publicly available examples. Translation of materials into local languages was also requested.
3 – Execution challenges and resource constraints
While teams were excited about the strategy, many expressed scepticism about the bank’s ability to execute. One manager reported that the team was “very aligned and excited … however there were significant reservations about the firm’s ability to execute and concerns … about whether we would be able to get decisions signed off quickly enough”. Another observed that employee morale and engagement are low and that without concrete actions, “there is limited confidence in the firm’s ability to execute”.
A recurring theme was resource and capacity constraints. Respondents highlighted “staff fatigue”, projects funded for multiple years but cut early, a shortage of subject‑matter experts and insufficient project‑management resources. Others noted that client‑facing teams bear the brunt of remediation efforts and retention issues and that the firm must retain top talent to execute. Concerns about compensation, promotions and limited opportunities were frequently linked to morale and engagement.
Many comments urged leadership to provide adequate funding for technology and business priorities, ensure budgets and headcount align with ambitions, and prioritise decisions quickly to avoid bottlenecks.
4 – Technology and tools are a major pain point
Numerous respondents challenged the idea of being a “technology‑led” company, citing outdated systems and lack of investment. They asked for a clear roadmap to expand existing infrastructure (e.g., big data, private cloud VMs) while accelerating migration to public cloud and emphasised the need for clear criteria, workflow tracking and dedicated support. Others said the “tech‑led” aspiration felt incongruous because current technology (e.g., Salesforce, shared mailboxes) does not support the business.
The largest impediment to executing the strategy, according to one team, is “technology issues” such as lost access and non‑functional shared mailboxes, which “cause lost productivity … and provide a very poor employee experience”. Another example cited running out of Jira licences mid‑migration and excluding consultants from strategic discussions.
Respondents called for integrated data platforms (e.g., a central data lake), modern tools to tie top‑level OKRs to individual objectives and better systems for tracking progress.
5 – Culture, collaboration and “breaking through silos”
The survey surfaced cultural impediments. Employees said the trait they struggle with most is “Break through Silos” because feedback can be poorly received and sometimes “fall on deaf ears”. Others worried that the cascade process could become a “bla‑bla‑bla story” unless the leadership demonstrates daily behaviours to break silos.
Comments urged greater transparency and two‑way communication. Teams want to “cascade up” and “cascade across” – to give feedback to senior management and to understand how other business units are executing. Another team noted that the cascade is mainly vertical and that a horizontal component is needed to “share, execute and harmonise approaches” and to build stakeholder mapping across teams.
Respondents also flagged bureaucracy and slow decision making as cultural obstacles and suggested that empowerment, accountability and ownership need reinforcement. Several comments proposed using strategy ambassadors or formal job‑shadowing programmes to strengthen cross‑functional understanding.
6 – Timing and cadence of the cascade
Timing surfaced repeatedly. Managers struggled to schedule meetings during year‑end priorities and asked for strategy roll‑out earlier in the year (e.g., October) to avoid last‑minute rushes. Many supported making the cascade an annual or mid‑year event and suggested ongoing check‑ins to maintain momentum.
To integrate strategy with performance management, respondents proposed aligning the cascade with goal‑setting cycles and providing templates or tools to connect OKRs to Workday and other systems.
7 – Metrics and measurement
A number of comments highlighted the need for better ways to measure progress. Respondents asked for clear metrics for each business priority, dashboards that show how the bank is performing against its strategy and more systematic ways to monitor success down through the organisation. They also noted that none of the key business priorities explicitly focus on employee engagement and called for targets around retention and morale.
Implications and recommendations
The feedback suggests that the strategy cascade process is a meaningful improvement in communication and engagement but that execution and credibility depend on addressing several practical issues:
Deepen the content and tailor it for different audiences. Provide templates and examples that help managers translate strategy into team‑level OKRs; clarify terminology (e.g., ESG, private assets, “services platform”) and ensure all functions – especially support and enabling teams – are represented.
Schedule and sustain the cascade. Move the cascade away from year‑end and align it with goal‑setting cycles; incorporate mid‑year refreshes and monthly touchpoints to keep the strategy alive; continue to offer manager training and translations.
Invest decisively in technology and data. Address operational pain points (e.g., access issues, outdated systems, lack of licences) and provide a clear roadmap for expanding infrastructure and migrating to cloud. Build a unified data platform and deploy tools that link top‑level OKRs to individual goals.
Resource the strategy and manage workloads. Ensure budgets, staffing and timelines match the ambition of the strategy; provide dedicated support staff and project‑management resources; address compensation, promotion and workload concerns to maintain morale.
Cultivate a culture of collaboration and accountability. Demonstrate behaviours that break through silos; encourage “cascade up” feedback and cross‑functional collaboration; consider appointing strategy ambassadors and implementing job‑shadowing programmes.
Define metrics for success. Establish and communicate clear KPIs for each priority and incorporate employee engagement and retention metrics; provide dashboards that show progress and empower teams to course‑correct.
Closing
The cascade process has opened a valuable dialogue and generated widespread enthusiasm. Continuing to refine the content, address execution barriers and maintain regular two‑way communication will be essential to translating the strategy into tangible outcomes.