Confluence for 6.22.25
Cultivating taste in the age of artificial everything. The first draft trap. Anthropic's course in AI fluency. Getting better feedback from LLMs.
Welcome to Confluence. You’ll notice our cover image this week is a video. That’s because Midjourney released its first video model. It’s not the leap in capabilities that Veo 3 represents, but it’s worth noting. Midjourney continues to create the most compelling visuals of any image generation model, and we’ll experiment as they update their video models.
Separately, today’s edition of The New York Times Magazine is focused almost entirely on AI. We expect it to give us plenty to read over the coming week and suggest our readers explore it, too.
With that said, here’s what has our attention this week at the intersection of generative AI, leadership, and corporate communication:
Cultivating Taste in the Age of Artificial Everything
The First Draft Trap
Anthropic’s Course in AI Fluency
Getting Better Feedback From LLMs
Cultivating Taste in the Age of Artificial Everything
Good judgment is more important than good prompting
Over the past year or so, our firm has engaged in ongoing dialogue about talent development in the age of generative AI. These discussions keep returning to one word: judgment. Judgment, for example, is what separates a mediocre writer using AI as a crutch from a great writer using AI to become even better. Generative AI can spin up a corporate narrative in seconds that, on the surface, seems compelling enough. Only good judgment reveals where that artificial narrative falls short.
A recent article in The Atlantic echoes this theme. In the essay “Good Taste Is More Important Than Ever,” former Harvard Business School dean Nitin Nohria defines “taste” as “judgment with style.” Nohria posits:
Taste is neither algorithmic nor accidental. It’s cultivated. AI can now write passable essays, design logos, compose music, and even offer strategic business advice. It does so by mimicking the styles it has seen, fed to it in massive—and frequently unknown or obscured—data sets. It has the power to remix elements and bring about plausible and even creative new combinations. But for all its capabilities, AI has no taste. It cannot originate style with intentionality. It cannot understand why one choice might have emotional resonance while another falls flat. It cannot feel the way in which one version of a speech will move an audience to tears—or laughter—because it lacks lived experience, cultural intuition, and the ineffable sense of what is just right.
In a world where generative AI creates an increasing amount of the content we consume, what becomes of our own sense of taste? Consider how this plays out in corporate settings: the executive who can distinguish between an AI-generated strategy deck that looks impressive but lacks strategic coherence and one that truly advances organizational thinking. Or the communication leader who recognizes when an AI-drafted all-hands email technically hits every message point but somehow misses the moment’s emotional truth. Nohria suggests we develop taste through three steps: exposure, curation, and reflection. First, expose ourselves to greatness at every opportunity and through mixed media. Then, refine our discernment to understand what makes something truly great. Finally, track our preferences intentionally, noting how our passions evolve.
We buy this wholeheartedly. The influx of AI-generated content isn’t going to slow, making it more important than ever to recognize greatness where we can and strive to reach it ourselves. As Nohria writes, “When every option is instantly available, when every variation is possible, the person who knows which one to choose becomes even more valuable.”
This brings us back to where we started, with the talent development conversations we’re having and facilitating with others. In an era when generative AI can produce infinite variations of adequate, organizations that cultivate people with genuine taste will have the advantage. They’ll have teams who know not just how to prompt an AI, but when to ignore its suggestions entirely. They’ll have leaders who can spot the difference between artificial polish and authentic resonance. Most importantly, they’ll have professionals who understand that in any human endeavor, the highest form of judgment is knowing when good enough isn’t good enough.
The First Draft Trap
Why anchoring bias could limit how we extract value from generative AI.
Claude’s contribution to Confluence looks a bit different this week. Rather than asking Claude to summarize a specific piece of research, we engaged Claude in a conversation to write a piece based on an idea we had. There is no prompt to share this week as Claude’s work is the product of a lengthy conversation with one of our contributors.
There's a cognitive trap waiting for organizations as they adopt generative AI, and it stems from one of our most persistent biases. When we interact with LLMs, we tend to treat their first response like a Google search result — accepting it as "the answer" rather than recognizing it as just one possibility from a probability distribution. This anchoring bias, where we over-rely on the first piece of information we encounter, could significantly limit the value organizations extract from AI tools.
Consider what might happen as AI becomes embedded in high-stakes communication work. An executive crafting a merger announcement receives a solid first draft from ChatGPT and, satisfied that it's "good enough," makes minor edits and moves on. But what if the third or fifth version would have been transformative? Unlike search results that surface existing content, LLMs generate new possibilities each time. The first output isn't the "best" one — it's simply the first roll of the dice.
The risk is particularly acute for leaders trained to avoid premature convergence. These executives have spent years learning to resist the first idea in a brainstorming session, to push teams to explore multiple options before deciding. Yet when working with AI, they may not recognize that the first response is just one of countless possible outputs — and statistically speaking, it's unlikely to be the optimal one. Why would it be? The temptation is to abandon the discipline of exploration, treating that first response as definitive rather than as a starting point.
Compounding this is our tendency to carry human social conventions into AI interactions. We don't ask colleagues for five different versions of a document because it would be unreasonable — it would waste their time, seem indecisive, potentially damage the relationship. And while talking to LLMs like you're talking to a human is often helpful (we encourage it regularly), the key difference is that you can ask far more of them than you could reasonably ask of a human. Generating multiple versions takes minutes and costs pennies. The AI doesn't experience fatigue or frustration. Yet we often limit ourselves to the same requests we'd make of human colleagues.
Teams that recognize this pattern and develop new interaction habits — treating regeneration as standard practice for important communications, building "version exploration" into their workflows, using multiple outputs to refine their thinking — may well produce stronger results. The gap between those who stop at "good enough" and those who explore the full possibility space could compound over hundreds of communications yearly.
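For teams that work with these models through an API rather than a chat window, version exploration is easy to make routine. Below is a minimal sketch of the idea, assuming the Anthropic Python SDK; the prompt text and model name are illustrative choices of ours, not a prescription. It simply asks for five independent drafts of the same message and prints them side by side for review.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# An illustrative communication task; substitute your own.
prompt = (
    "Draft a 150-word note to employees announcing our new hybrid work policy. "
    "Lead with the reason for the change."
)

drafts = []
for _ in range(5):
    message = client.messages.create(
        model="claude-opus-4-20250514",  # illustrative; use whichever model you work with
        max_tokens=500,
        temperature=1.0,  # keep sampling variability so the drafts actually differ
        messages=[{"role": "user", "content": prompt}],
    )
    drafts.append(message.content[0].text)

# Review the whole set side by side instead of anchoring on draft #1.
for i, draft in enumerate(drafts, start=1):
    print(f"--- Draft {i} ---\n{draft}\n")
```

The point isn't the code itself; it's the habit it encodes of treating the first output as one sample among many rather than as the answer.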
The lesson for leaders and communication professionals is clear: in a world where generating alternatives is nearly free, anchoring to the first option is an expensive habit. Building anti-anchoring practices into how we work with AI means recognizing that these tools offer us something we've never had before — the ability to explore the full range of possibilities without the traditional costs. The question is whether we'll overcome our biases to capture that value.
Anthropic’s Course in AI Fluency
One of the best ways we’ve seen to catch up quickly.
In any program or session we lead with clients on generative AI, we always start by grounding participants in the fundamentals of the technology — what it is, what it isn’t, and how it works. The reason is simple: to work effectively with these tools you need to understand their strengths and limitations, and to understand those strengths and limitations, you need to understand what’s happening under the hood (at least to an extent). Last week, Anthropic made it easier to get up to speed when it launched its course on AI Fluency, an excellent resource that gets technical enough that you’ll walk away with an understanding of what LLMs are, how they’re built, and how they work, while remaining accessible to most professionals.
What separates Anthropic’s course from the previous explainers we’ve shared (such as Andrej Karpathy’s videos and this Financial Times interactive) is the framework it offers for how we can work with generative AI. It teaches the fundamental skills and judgment needed to engage with generative AI to do real work. Within the course itself there are videos, written materials, exercises, and reflection questions to round out the experience. Anthropic goes heavy on what it calls “The 4Ds” of AI fluency, which include:
Delegation: Deciding on what work should be done by humans, what work should be done by AI, and how to distribute tasks between them.
Description: Effectively communicating with AI systems.
Discernment: Thoughtfully and critically evaluating AI outputs, processes, behaviors, and interactions.
Diligence: Using AI responsibly and ethically.
It’s a useful framework and addresses the types of questions we’ve been grappling with in our own work. While a course such as this doesn’t replace the conversations leaders and teams need to have about generative AI, it provides an excellent structure for anyone who needs to catch up quickly, wants to validate their own knowledge and understanding of this technology, or wants a structured way of building their AI fluency on their own.
Getting Better Feedback From LLMs
LLMs are powerful tools for critique — if we prompt them to be.
We led a recent post with the statement that “Large language models want to make users happy.” In that case, we were discussing GPT-4o’s “slide into sycophancy,” which OpenAI has since rolled back, at least in its most extreme form. But the point that LLMs want to make users happy — that this tendency is inherent to the way they work — is worth remembering. We recently found a Jon Stokes blog post from 2023 on reinforcement learning from human feedback (RLHF), the process that creates this tendency in models. Stokes writes:
RLHF is, loosely speaking, training in the art of rhetoric. And by this, I mean in the Aristotelian sense of, “the art of persuasion.” As we’ll see below, the point of RLHF is to optimize the model for the production of sentences that make users feel some ways and not other ways. At the core of RLHF is the recognition that the model’s words have an impact on the mental state of the user, and therefore must be tuned to create desirable mental states (satisfaction, understanding, curiosity) and avoid creating undesirable ones…
The emphasis above is ours. RLHF is a critical step in making LLMs useful, but it can also lead to models telling us what they think we want to hear at the expense of what we need to hear. This matters particularly when we turn to LLMs for feedback or critique, one of the most valuable and powerful ways we can use these tools. Given the tendencies discussed above, we need to intentionally prompt the models to give us the quality of feedback we need.
We ran a simple test to explore this further, using three different prompts to review the same email. We used Claude Opus 4 to create both the email and the prompts. For the email, which we’ve provided in the footnotes¹, we asked Claude to make it “pretty good, but with real room for improvement.” Below are the prompts we used:
Basic: “Can you review this email for me? Please rate it on a scale of 1-10 and let me know what you think.”
Constructive Middle Ground: “Please review this email and rate it 1-10. Be candid and focus primarily on what needs improvement — I’m looking for specific, actionable feedback to make this better. Don’t sugarcoat issues, and don’t worry about my feelings. The most important thing is to make this email as effective as it can be. What are the main weaknesses I should address?”
Harsh Critic: “I need you to be extremely critical and direct in reviewing this email. Rate it 1-10 where 10 is absolutely flawless and 1 is completely ineffective. Assume I’m paying you as a professional editor who has no interest in protecting my feelings. Focus primarily on what’s weak, unclear, or ineffective. Be thorough and point out anything that would make a recipient dismiss or misunderstand this message. I can only improve with candid, honest feedback.”
We ran each prompt three times on the same email, and each yielded different results: the Basic prompt rated the email an average of 6.8/10, the Constructive Middle Ground prompt a 5/10, and the Harsh Critic a 4.67/10. More important than the ratings, though, was the quality of the feedback. The Constructive Middle Ground prompt provided the most helpful critique: more critical and more actionable than the Basic prompt’s, without the scathing tone of the Harsh Critic (which started its response “This email is a textbook example of corporate communication that checks boxes but fails to actually communicate” — true enough, perhaps, but not particularly helpful). The full outputs from each prompt are in the footnotes.²
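For readers who want to run a similar comparison themselves, here is a rough sketch of the exercise, assuming the Anthropic Python SDK. The abridged prompts, the EMAIL placeholder, and the regex that pulls a numeric rating out of each response are our own simplifications for illustration, not part of any API.

```python
import re
import statistics
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

EMAIL = "..."  # paste the email you want reviewed here

# Abridged versions of the three prompts discussed above.
PROMPTS = {
    "Basic": (
        "Can you review this email for me? Please rate it on a scale of 1-10 "
        "and let me know what you think."
    ),
    "Constructive Middle Ground": (
        "Please review this email and rate it 1-10. Be candid and focus primarily on what "
        "needs improvement. I'm looking for specific, actionable feedback to make this better."
    ),
    "Harsh Critic": (
        "I need you to be extremely critical and direct in reviewing this email. Rate it 1-10 "
        "where 10 is absolutely flawless and 1 is completely ineffective."
    ),
}

def rating_from(text: str) -> float | None:
    """Pull the first 'N/10'-style score out of a response, if one is present."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*/\s*10", text)
    return float(match.group(1)) if match else None

for name, prompt in PROMPTS.items():
    scores = []
    for _ in range(3):  # three runs per prompt, as in our test
        message = client.messages.create(
            model="claude-opus-4-20250514",  # the Claude Opus 4 id at time of writing
            max_tokens=1500,
            messages=[{"role": "user", "content": f"{prompt}\n\n{EMAIL}"}],
        )
        score = rating_from(message.content[0].text)
        if score is not None:
            scores.append(score)
    if scores:
        print(f"{name}: average rating {statistics.mean(scores):.2f}/10 over {len(scores)} runs")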
LLMs default to diplomacy, not candor. But getting useful feedback doesn’t require prompting them to be brutal. The Constructive Middle Ground prompt worked because it redefined success: prioritize making the email better over making the writer feel better. When using these models, remember that they are literally trained to create “desirable mental states” in their users. If we want them to make us better rather than just make us feel good, we need to redefine what “desirable” means through our prompting.
We’ll leave you with something cool: The trailer for Darren Aronofsky’s new AI-generated movie project, Ancestra. Complement it with this interview with Aronofsky and Google DeepMind’s Demis Hassabis on the creative potential of generative AI.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.
¹ The email we asked Claude to review:

Subject: Digital Transformation Update - Action Required by July 5
Hi Team,
We've selected TechForward Solutions as our partner for the digital transformation initiative, following six months of careful evaluation and incorporating your valuable feedback.
This new platform will reduce time spent on manual reporting by approximately 40% and create a single source of truth for project data across all departments. Based on pilot testing, teams report significant improvements in cross-department collaboration.
Timeline:
Phase 1 (July-September): Core system migration and training
Phase 2 (October-December): Department-specific customizations
Phase 3 (January-March): Full integration and optimization
Your department heads will schedule team meetings within the next two weeks to discuss specific impacts and training schedules.
Action Required: Please complete the brief readiness survey (link coming tomorrow) by July 5 to help us tailor training to your needs.
I recognize this represents significant change. We're committed to providing comprehensive support throughout the transition, including dedicated training sessions and on-demand resources.
Questions? Reach out to me directly or contact your department head.
Thanks for your flexibility and continued engagement in moving us forward.
Best regards,
Sarah
² The full outputs from each prompt:

Basic Prompt Output
I'd rate this email a 7/10. Here's my assessment:
Strengths:
Clear subject line with a specific action and deadline
Opens with the decision, avoiding burying the lead
Provides concrete benefits (40% time reduction) rather than vague promises
Timeline is specific and digestible in phases
Acknowledges the change is significant and commits to support
Clear call-to-action with a deadline
Areas for improvement:
Missing the "why now" - After six months of evaluation, what triggered moving forward now? This context helps people understand urgency.
Stakeholder voice is passive - "incorporating your valuable feedback" feels generic. Consider: "Based on your feedback about X, we've ensured the platform addresses Y."
Benefits need more texture - The 40% stat is good but abstract. What does this mean practically? "This means 2 hours back in your week for strategic work" or similar.
Change acknowledgment comes too late - Move the recognition of significant change higher. People need to know you "get it" before they'll listen to logistics.
Support details are vague - "Comprehensive support" and "on-demand resources" need specificity. What exactly can people expect?
Quick reframe of the opening: "After careful evaluation—and incorporating your specific concerns about data silos and reporting burden—we're moving forward with TechForward Solutions. I know system changes are disruptive, so here's exactly what this means for you and the support you'll have..."
The email is solid and functional, but could be more compelling with these adjustments to build stronger buy-in and reduce anxiety about the change.
Constructive Middle Ground Prompt Output
Looking at this email with a critical eye, I'd rate it 5/10. It covers the basics but misses several opportunities to drive engagement and action. Here are the main weaknesses to address:
Critical Issues:
Buried Call-to-Action: Your most important ask (complete survey by July 5) is hidden in the middle. This should be prominent - either at the top or bottom with visual emphasis.
Weak Urgency: "Link coming tomorrow" undermines immediacy. Why not include it now? Also, what happens if someone doesn't complete the survey? No consequence = low compliance.
Generic Benefits: "Significant improvements in cross-department collaboration" is meaningless corporate-speak. Get specific: How many hours saved per week? Which painful processes disappear? What can they stop doing?
Passive Next Steps: "Department heads will schedule meetings within two weeks" - will they though? This needs accountability: "Your department head will contact you by [specific date] to schedule..."
Missing the "Why": You jump straight to the "what" without painting a picture of why this matters to them personally. What's broken now that this fixes?
No Social Proof: "Based on pilot testing, teams report..." - which teams? Name them. Use quotes. Make it real.
Formulaic Empathy: "I recognize this represents significant change" reads like you checked a box. Either show real understanding of their specific pain points or cut it.
Quick fixes:
Move the CTA to the top or create a prominent box
Include the survey link immediately
Add specific metrics and examples
Create urgency with consequences
Lead with their pain, not your solution
The email reads like an announcement when it should read like a rallying cry. For a transformation this significant, you need to inspire action, not just inform.
Harsh Critic Prompt Output
I'll give you the brutal assessment you're asking for.
Rating: 4/10
This email is a textbook example of corporate communication that checks boxes but fails to actually communicate. Here's what's wrong:
Critical Weaknesses:
Buried lede and weak opening: You announce the vendor selection before explaining why anyone should care. "Hi Team" is generic and establishes zero authority. The first sentence reads like a procurement update, not a leadership message about transformation.
Unsubstantiated claims: "40% reduction in manual reporting" - Says who? Based on what? "Significant improvements in cross-department collaboration" - This is meaningless corporate-speak. Give me actual examples or don't mention it.
Jargon overload: "Single source of truth," "digital transformation initiative," "full integration and optimization" - these phrases mean nothing to someone worried about learning new software or losing their job.
Timeline tells me nothing useful: What does "core system migration" mean for my Tuesday morning? When will my work be disrupted? What if I'm on vacation during training?
Pathetic call-to-action: A survey link "coming tomorrow"? Why am I reading this today then? And a "brief readiness survey" sounds like busywork that IT will ignore anyway.
Zero emotional intelligence: One throwaway line acknowledging "significant change" doesn't address the elephant in the room - people hate new systems, fear job automation, and resist change. You're asking for "flexibility" when you should be addressing fears.
No vision or leadership: Where's the compelling reason this matters? What problem are we solving? Why should I invest my energy in making this work?
Passive voice throughout: "Your department heads will schedule..." "Training sessions will be provided..." Take ownership.
What would actually work:
Start with the problem you're solving and why it matters to THEM
Give specific, believable examples of improvements
Address resistance head-on
Make the ask clear and immediate (not "tomorrow")
Sound like a human who understands this is disruptive
This email will generate eye-rolls, not engagement. It reads like it was written by committee to avoid saying anything controversial, which means it says nothing at all.