Confluence for 11.9.25
The case for AI fluency. Digging in with Claude skills. Why context windows matter. Wharton's latest report on generative AI adoption.
Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI, leadership, and corporate communication:
The Case for AI Fluency
Digging In With Claude Skills
Why Context Windows Matter
Wharton’s Latest Report on Generative AI Adoption
The Case for AI Fluency
Use cases follow fluency, not the other way around.
Last week, one of our Confluence writers met with a client team for a conversation on how they were using generative AI, where they were getting traction, and where they were running into challenges. The team shared that they saw an immediate opportunity to use generative AI to help them with qualitative analysis of open-ended survey responses, but they were dissatisfied with the output they were getting from their organization’s approved tool, Gemini. They’d been more impressed with ChatGPT’s capabilities in this and other areas.
When we heard this challenge, we immediately hypothesized the most likely cause: a task like qualitative analysis calls for a reasoning model, and the team was likely not using one. So we asked the team if they were familiar with reasoning models and if they had attempted to use one (in this case, Gemini 2.5 Pro) for the analysis. As we expected, the team was not familiar with reasoning models and had not attempted to use one. So we did it live on the call, and in a few minutes the team was able to get the type of analysis they had been attempting to coax out of this tool for months.1
It was a textbook example of why, for any team in any function, some level of baseline AI fluency is the most important starting point. Within our own firm and with clients, we always start there. To get the most out of these tools – and to mitigate the risks that come with them – we need to have a foundational understanding of how the models that power them work. In the two most recent editions of Confluence (today’s and last week’s), we’ve written about four different technical topics: next token prediction, reinforcement learning, Claude Skills, and context windows. That’s not because we’re nerdy people who enjoy this stuff (though, of course, we are nerdy people who enjoy this stuff), but because these are core concepts that anyone using these tools needs to understand.
Going back to last week’s client conversation, we could’ve taken two different approaches to addressing the challenge the team surfaced. One approach would’ve been to take that moment to teach them how to use generative AI for qualitative analysis as a distinct use case. There would have been merit to that, and we have no doubt it would’ve been helpful. But we chose instead to focus on the concept that mattered most: what reasoning models are and when (and how) to use them. All the team needed was that concept. Equipped with that understanding, they were off to the races, and they could apply it to much more than just qualitative analysis.
What we see playing out in many organizations is the reverse: a focus on identifying the perfect AI use cases without first establishing an understanding of the fundamentals. Our view is that the people who are best suited to identify, build, and refine the best use cases for any team or organization are the people doing the work every day. But those people can’t identify opportunities if they don’t understand what’s possible. Without foundational fluency, teams can spend months wrestling with tools that could solve their problems in a few minutes, if they only knew how to tap into the right capabilities.
This kind of fluency has two layers. The foundational concepts – like next token prediction, reinforcement learning, and context windows – are relatively stable. We’ve been teaching clients these same fundamentals for two years because they haven’t changed (and they likely will not change until we see a major architectural breakthrough). But on top of that foundation, the technology is evolving rapidly. Reasoning models were not publicly available 18 months ago. Claude Skills, which we write about in today’s edition, didn’t exist a month ago. The aim should be to establish that stable foundation first, then maintain awareness as new capabilities emerge.
There are many ways to stay current. You might assign someone on your team to be the AI lead who tracks developments and shares what matters. You might schedule monthly team learning sessions to explore new capabilities together. But whatever the approach, our advice to leaders is to make sure you’re investing at least as much time building fluency as you are hunting for use cases. Without the former, the latter can end up being guesswork.
Digging In With Claude Skills
Making the shift from custom prompts to abilities.
We wrote about Claude’s new “skills” capability a few weeks back. In the time since, a group of us has been digging in with them, and just this past Friday we led a training for our practice on their use. We’ve come to believe skills are the future of using generative AI (at least until that future changes, which it has a tendency to do), and that they’ll soon be a significant part of how you use your model of choice.
Per Anthropic, “Skills are folders of instructions, scripts, and resources that Claude loads dynamically to improve performance on specialized tasks. Skills teach Claude how to complete specific tasks in a repeatable way, whether that’s creating documents with your company’s brand guidelines, analyzing data using your organization’s specific workflows, or automating personal tasks.” They are abilities designed for the large language model (LLM), by the user, that the LLM can use when it sees fit, or when asked to do so by the user.
Skills represent an interesting progression in how we get LLMs to do the things we wish them to do. Two years ago we did this with custom prompts, and most of us kept a file of prompts we could paste into the chat window whenever we wanted the model to perform a very specific task. Then Projects came along, which allowed us to save a set of custom instructions in an online folder that the model would use whenever we clicked into that Project. Now, with skills, specific LLM abilities are always available, and you (or the LLM on its own, if it so chooses) can bring a specific ability to bear at any time on whatever work you might be doing. The thought of an intelligent chatbot like ChatGPT or Claude having dozens, or many dozens, of specific abilities you’ve given it, ready to come to bear when needed, is exciting to us and to others in this space. We’ve been talking about agents for some time, and skills make Claude (in this case, as an equivalent has not yet come to ChatGPT or Gemini, though one soon will) a general agent that can become a specialist when needed.
Skills take a bit of programming knowledge to create unassisted, but Anthropic has kindly made a skill-builder skill part of every Claude account’s base skill set. We’ve been using it to build our Claude skills, and we want to profile the journey of creating one such skill in this space.
Our goal was to build a proofreader skill. We did so by first saying to Claude:
Hi Claude. We’re going to build a new skill for you. It will be a proofreader skill, which is meant to find errors in text I give you: spelling, grammar, typography, and violation of our corporate style guide. I will also give you several stylistic rules that I try to follow so you can look for them, but it won’t be a copy editor — it’s a proofreader meant to find errors in text just prior to publication. I’ll have several bits of reference material for you to help you with this.
Claude asked for those materials, and we gave it our internal style guide. It then launched its skill-builder skill and churned away for a few minutes, providing updates in the chat window about code it was writing, files it was creating, and more. When it was done, we had a .ZIP file for download, and we added the skill to our Claude capabilities.
We then tested the skill with a document we’ve designed for testing proofreading in our AI projects and agents, one with a number of intentional errors. In our first round of testing, Claude missed a few things. We pointed this out and asked Claude how we could improve the skill. There followed about 30 minutes of back and forth: making changes, testing, and improving.
One thing that came out of this was our decision that Claude should make six separate proofreading passes, each with a different focus, logging every finding in a scratchpad. This minimizes the chances of Claude “forgetting” errors it found early on as its context window fills. At the end of this process, it reads the text a final time to verify the errors it has found. When this is done, Claude presents a list of all mistakes, with suggested changes.
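For readers who think in code, the structure of the skill can be sketched in Python. This is purely illustrative: the actual skill is a natural-language prompt that Claude follows, not a program, and `check_pass` below is a stand-in for Claude performing a single focused pass.

```python
# Illustrative sketch only, not Anthropic's implementation. It mirrors the
# skill's structure: several focused passes log findings to a scratchpad,
# then a verification step re-checks each finding against the original text.

PASSES = [
    "spelling and typography",
    "grammar and usage",
    "style guide compliance",
    "firm terminology",
    "numbers and facts",
    "formatting consistency",
]

def proofread(text: str, check_pass) -> list[dict]:
    scratchpad = []
    for focus in PASSES:
        # One focus per pass; re-read the full text each time rather than
        # relying on what earlier passes "remember."
        scratchpad.extend(check_pass(text, focus))

    # Verification pass: keep only findings whose exact quoted text actually
    # appears in the original, discarding false positives.
    return [finding for finding in scratchpad
            if finding["quoted_text"] in text]
```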
In testing we have found the skill to be very thorough. We suspect it is not perfect, but we believe it is more reliable than human proofreading, especially if one runs it multiple times against the same text.
Here’s a video of the skill at work (sped up to double speed):
Through it all, Claude does all the programming work. It creates the skill, makes any changes, adds any tools, creates reference materials for itself, and more. The final result is a text file containing the prompt for the skill, plus two folders: one with references the LLM may choose to consult when running the skill (in our case, it converted our style guide into a reference file), and another with any assets. Assets might be programming scripts, templates, or other materials. In this case there are none, but our branding skill, for example, has an assets folder that contains our company document templates. The model does an excellent job of crafting its prompt, and it’s worth reading skill files to understand what Anthropic (and Claude) believe good prompts look like. We’ve posted the text of the proofreading skill in the footnotes.2
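If you’re curious what Claude actually packaged for you, you can peek inside the downloaded .ZIP before installing it. Here’s a minimal sketch using Python’s standard zipfile module (the file name below is a placeholder, and the exact contents will vary from skill to skill):

```python
import zipfile

# "proofreader.zip" is a placeholder; use whatever file Claude gave you.
with zipfile.ZipFile("proofreader.zip") as archive:
    for name in archive.namelist():
        print(name)

# Expect to see the skill's prompt file plus the references (and, for some
# skills, assets) described above.
```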
Today only Claude has skills, but the other models will have them soon enough. Skills represent an evolution in how to work with LLMs, not a proprietary technology, and as they diffuse, users everywhere will suddenly find themselves with more capable models on their hands. We suspect something else will change, too. After building a number of skills and putting them to use in our own work, we noticed that our way of working with Claude changed. Rather than a series of set pieces, each a chat asking Claude to do a specific thing, our work with Claude has become more of a flow: we treat Claude like a very capable colleague with many specializations, which we ask it to employ in different ways and at different times as we move a piece of work forward. It’s a difficult thing to describe, but it was notable (and very effective), and we expect you’ll have the same experience soon enough.
Why Context Windows Matter
Understanding context windows will help you make the most of new features.
We’ve written about context windows before, most recently in relation to “context rot,” a performance issue that occurs when a chat exceeds the model’s context window. We don’t have an exact solution for that problem yet, but as new features and integrations (like Claude Skills, which we write about above) become available, understanding context windows will help you make sense of how and when to use them, because nearly all of these features are designed as tools for managing and expanding context.
First, a quick refresher: a model’s context window is the number of tokens it can process at once. Think of it as the working memory the model has within an individual chat — what it can see (hence, “window”) at any particular time. The longer the chat gets and the more files you share, the more likely the model is to forget, misunderstand, or misuse that information. But context is critical for making these models useful for specific tasks. If you need help with a complicated project, the model needs enough context to be helpful, but not so much that performance suffers.
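To make “tokens” concrete, here’s a quick way to count them in Python using OpenAI’s open-source tiktoken tokenizer (Claude and Gemini use their own tokenizers, so their counts differ a bit, but the idea is the same):

```python
import tiktoken  # pip install tiktoken

encoder = tiktoken.get_encoding("cl100k_base")

text = "Context windows are measured in tokens, not words or characters."
tokens = encoder.encode(text)

print(len(text.split()), "words")  # 10 words
print(len(tokens), "tokens")       # typically a bit more than the word count

# Everything in a chat counts against the window: your prompts, attached
# files, and the model's own responses so far.
```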
It’s a tricky balance, and striking it remains a challenge. That’s why so many new capabilities are designed to expand the amount of context these models can work with, either by making the windows themselves bigger, or by creating sophisticated ways to bring relevant information into that window at the right time. We can think about some of the most prominent features available today according to how they access and manage different kinds of context:
Projects in Claude and ChatGPT expand situational context. You’ve been working on something for months and need the model to understand the nuances without explaining each time. The project holds that background so it doesn’t consume your active chat window.
Skills, like our proofreader skill, let your chat reference detailed instructions and examples for specific actions exactly when needed. You don’t repeat instructions or clog up the conversation with guidance that’s only relevant to one task.
Persistent memory and custom instructions in Claude, ChatGPT, and Gemini carry facts about your work, your communication style, your ongoing projects. The models can pull this context in as needed, lessening the need to fill your context window with this information in each conversation.
System integrations like Claude Code, Microsoft 365 Copilot, and Gemini Workspace offer environmental context. They see your actual files, run commands, and interact with your working environment rather than just discussing it from the outside.
Each feature improves the models’ abilities to bring the right context in at the right time without overloading the active window. If you understand what kind of context a tool is designed to access, or how it introduces it into the window, it becomes easier to figure out when and how to use that tool.
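To picture what these features have in common, imagine the work that has to happen before each request: something decides which reference material earns a spot in the window. Here is a toy sketch of that idea (entirely hypothetical names and logic; real products use far more sophisticated retrieval and relevance ranking):

```python
def build_prompt(question: str, references: dict[str, str],
                 token_budget: int, count_tokens) -> str:
    """Assemble a prompt from only the references that are relevant and fit."""
    selected = []
    remaining = token_budget - count_tokens(question)

    for name, content in references.items():
        # Naive relevance check: does the reference share words with the question?
        relevant = any(word.lower() in content.lower()
                       for word in question.split())
        if relevant and count_tokens(content) <= remaining:
            selected.append(content)
            remaining -= count_tokens(content)

    return "\n\n".join(selected + [question])
```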
Wharton’s Latest Report on Generative AI Adoption
What we can learn from three years of data.
The piece below was written by Claude Sonnet 4.5 after a short exchange with a Confluence writer. We provided Claude with the PDF of the full report and guidance on key insights after reading the report ourselves. The only edit we made was to add a link to the report itself in the first sentence.
Wharton just published the third edition of its annual report tracking generative AI adoption in the enterprise. The story emerging from three years of data is one where the technology has moved from curiosity to daily habit for most business leaders, with predictably uneven progress and some warning signs about how organizations are approaching this transition.
Some findings won’t surprise anyone paying attention. Usage continues its steady climb, with 82% of enterprise leaders now using generative AI at least weekly, up from 72% last year. Daily use jumped 17 percentage points to reach 46% of leaders. Budgets are growing, with 88% of organizations expecting to increase Gen AI spending over the next 12 months. Three-quarters of companies report positive ROI on their investments, though the numbers are rosier among smaller, more agile firms than among large enterprises still working through integration complexity. The most common use cases remain practical and productivity-focused: data analysis, document summarization, editing and writing, presentation creation. These are the core workflows where Gen AI has proven itself capable and where returns show up quickly.
Two patterns in the data deserve more attention from anyone leading AI adoption in their organization. The first is a stark divide in optimism by leadership level. Executives holding titles of Vice President or higher are twice as likely as mid-level managers to believe their organizations are adopting generative AI “much faster” than peers—56% of VPs versus 28% of managers. VPs are also far more bullish on ROI, with 45% reporting significantly positive returns compared to 27% of managers. This gap suggests that the view from the C-suite may not match what’s happening on the ground. Mid-managers, who see day-to-day implementation and have closer relationships with actual usage patterns, appear to have a more tempered read on both adoption speed and returns. The optimism gap between leadership levels should prompt questions about whether strategic enthusiasm is running ahead of operational reality.
The report’s authors frame accountability as the lens for this year’s findings, noting that 72% of companies now formally measure ROI on their Gen AI investments. This represents a maturation from early-stage experimentation to more disciplined investment. But accountability can be limiting if it pulls focus too narrowly toward near-term metrics. Use cases for generative AI remain emergent. We’re still discovering new ways of working with existing models, and the models themselves keep improving their capabilities. Take what we wrote this week about Anthropic’s skills feature, which fundamentally changes how Claude can handle complex workflows. The technology is not standing still, which means the full range of valuable applications hasn’t been mapped yet. Organizations that prematurely lock into narrow definitions of value risk missing opportunities that don’t fit established measurement frameworks. The distinction between accountability and experimentation need not be binary. You can maintain rigor around what you’re tracking while preserving space to explore what you’re not yet tracking.
More concerning is the report’s finding that investment and confidence in training both declined this year. Only 48% of organizations report investing in training programs for employees, down 8 percentage points from last year. Confidence that moderate to extensive training will lead to fluency dropped 14 percentage points, while the share of leaders saying they’ll need to hire entirely new talent instead jumped 8 points to reach 14%. This represents a troubling retreat from capability building at precisely the moment when deeper understanding matters most. Our experience working with clients confirms what the research suggests: people need substantial, ongoing exposure to the technology to work with it effectively and to reasonably keep pace with its advances. Training—or whatever you choose to call structured learning—remains essential. The declining investment signals that many organizations may be underestimating what’s required to realize value from their Gen AI deployments.
Related to the training decline is growing concern about skill atrophy. The report found that 43% of leaders worry about declines in employee skill proficiency even as 89% believe Gen AI enhances employee capabilities. This tension is real. We’ve been discussing this risk with clients for years, particularly around how organizations develop the senior talent they’ll need in the future. When AI handles work that would typically fall to junior employees, those employees lose opportunities to develop foundational skills and judgment. The report notes that leaders anticipate AI will have its greatest impact on junior roles, with 17% expecting fewer intern hires compared to 10% for mid-level positions. Yet 49% expect more intern hires versus 40% for mid-level roles. The uncertainty about hiring combined with concerns about skill degradation points to an unresolved tension in workforce strategy. Organizations need to design ways of working with generative AI that allow people to develop the capabilities required for more senior roles. This means being deliberate about which work gets delegated to AI and which work humans continue to do themselves, not for efficiency reasons but for development reasons.
The Wharton report tells a story of technology moving from the margins to the mainstream of enterprise work. Usage is up, budgets are growing, and early returns appear positive. But the human factors—training, skill development, realistic assessment of adoption progress—lag behind the technical deployment. These gaps between technology adoption and organizational capability represent the real constraint on realizing value from Gen AI investments. The challenge for leaders is to maintain momentum on deployment while simultaneously investing in the harder, slower work of building the understanding, skills, and cultural readiness that convert usage into sustained performance improvement.
We’ll leave you with something cool: Project Suncatcher is Google’s self-proclaimed “moonshot” to see if they can train AI models in space.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.

1. As a technical aside, the reason the team had success with ChatGPT but not with Gemini is that ChatGPT now takes a “routing” approach, sending each query to the appropriate model. So when a user prompts ChatGPT for analysis, ChatGPT should automatically invoke a reasoning model for that kind of task. Gemini, on the other hand, has 2.5 Flash (which is not a reasoning model) as its default. To use Gemini’s reasoning model, 2.5 Pro, the user has to select it manually. And to do that, of course, the user first has to know that different models are available and understand the relative strengths and limitations of each.
2. The text of the proofreader skill:

---
name: proofreader
description: Use this skill when the user requests proofreading of text before publication. The skill performs systematic proofreading checks for spelling, grammar, typography, and style guide compliance. It focuses on finding errors, not copy editing or rewriting content. Trigger when user asks to “proofread,” “check for errors,” “review before publishing,” or similar requests for final quality control on written materials.
---
# Proofreader
## Overview
This skill provides a systematic approach to proofreading text before publication. It identifies spelling errors, grammatical mistakes, typographical errors, and violations of the CRA | Admired Leadership Style Guide. This is proofreading, not copy editing—the focus is on finding errors in nearly-final text, not on improving writing quality or restructuring content.
## When to Use This Skill
Apply this skill when:
- User explicitly requests proofreading of text
- User asks to “check for errors” or “review before publishing”
- User provides text that needs final quality control before distribution
- User mentions finding mistakes or ensuring accuracy
## Proofreading Mindset
Effective proofreading requires discipline and humility:
**Assume errors exist.** If you complete a pass and find nothing, you probably missed something. Re-read more carefully.
**Don’t trust your memory.** Your brain fills in gaps and corrects errors automatically as you read. The only defense is to re-read the actual text during each pass, not recall what you think you remember.
**Accept that you’ll make mistakes.** The verification step is not optional. It’s specifically designed to catch errors in your error-finding process. False positives happen - the goal is to catch them before presenting to the user.
**One thing at a time.** The six-pass structure exists because trying to check everything at once guarantees you’ll miss things. If you find yourself noting punctuation during a spelling pass, stop and refocus.
**Slow down.** If it takes twice as long to find an error, that’s still faster than sending the document back with the error still in it.
## Proofreading Workflow
### Step 1: Initial Read-Through
Begin with a complete read-through of the text without making corrections. This establishes familiarity with the content, tone, and structure. Note areas that may require closer attention in subsequent passes.
### Step 2: Load the Style Guide
Read `references/style-guide.md` to refresh understanding of CRA | Admired Leadership style requirements and common error patterns. Pay particular attention to:
- Firm-specific terminology and capitalization rules
- Divergences from AP style (Oxford commas, bullet punctuation, date formatting)
- Voice requirements (active voice, strong verbs, limited punctuation)
### Step 3: Systematic Error Checking
**CRITICAL DISCIPLINE REQUIREMENTS:**
- **NEVER work from memory.** Re-read the complete text during each pass.
- **ALWAYS quote exact text** when documenting errors, including 3-5 words before and after for context.
- **ONE FOCUS PER PASS.** Resist the temptation to note other error types during a focused pass.
- **Trust the process, not your recall.** If you think you remember seeing something, you must verify it by re-reading.
**Common Failure Modes to Avoid:**
- ❌ Relying on initial read-through memory instead of re-reading during each pass
- ❌ Documenting errors without quoting the exact text from the document
- ❌ Skipping verification because you “already checked”
- ❌ Mixing error types (e.g., noting punctuation issues during a spelling pass)
**MANDATORY WORKFLOW:**
1. **Create the todo list:** Use TodoWrite to create 7 distinct todos:
- 6 todos for the six pass types listed below
- 1 todo for verification of all findings
2. **Create scratch pad:** Write a temporary markdown file at `/tmp/proofread_scratch_[timestamp].md` with headings for each pass
3. **For each of the six passes:**
- Mark the todo as `in_progress`
- **READ THE ENTIRE TEXT from beginning to end** with ONLY this pass’s focus in mind
- When you find an error, document it IMMEDIATELY in the scratch pad with:
- Location (section name, bullet number, or paragraph)
- **Exact quoted text** from the original (5-10 words minimum for context)
- Error type and explanation
- Proposed correction with exact quoted text
- Mark the todo as `completed` only after the pass AND documentation are done
4. **Verification pass (MANDATORY):**
- Mark verification todo as `in_progress`
- Re-read the original text one more time
- For EACH error documented in your scratch pad, verify it actually exists by finding the exact quoted text
- Create a new file `/tmp/proofread_verified_[timestamp].md` with only confirmed errors
- Retract any errors you cannot locate in the original text
- Mark verification todo as `completed`
5. **Present findings:** Present only the verified findings to the user
Perform these six focused passes through the text, each targeting specific error categories:
**Pass 1: Spelling and Typography**
- Check for misspelled words
- Verify proper names are spelled correctly throughout
- Look for typographical errors (transposed letters, missing letters, extra spaces)
- Ensure consistent spacing between sentences (one space only)
- Check for missing or duplicate small words (“and,” “for,” “it,” “the”)
**Pass 2: Grammar and Usage**
- Verify subject-verb agreement
- Check for consistent tense throughout
- Ensure active voice is used
- Confirm proper punctuation (periods, commas, question marks)
- Check that sentences beginning with “However” indicate means of action, not contrast
- Verify proper use of i.e. vs. e.g. with correct punctuation
**Pass 3: Style Guide Compliance**
- Apply Oxford comma rules (use unless client style prohibits)
- Check title capitalization (capitalize all words except articles, short prepositions, and short conjunctions)
- Verify punctuation placement in quotation marks (periods and commas inside)
- Check bullet point punctuation (full sentences get punctuation, fragments don’t)
- Verify number formatting (words for one through nine, numerals for 10+)
- Confirm hyphenation of compound modifiers ONLY when they directly modify a noun (e.g., “results-focused leader” YES, but “results orientation” NO because orientation is not being modified)
- Check dash spacing for consistency within the document
- Verify “communication” vs. “communications” usage is appropriate
**Pass 4: Firm Terminology**
- Use “CRA | Admired Leadership” not “CRA, Inc.”
- Capitalize all CRA titles (Research Assistant, Consultant, Managing Director, Partner)
- Capitalize “Admired Leader”
- Capitalize “Strategic Communication Practice” when referring to the firm’s practice
- Check date formatting matches DD Month YEAR format (e.g., 5 January 2025)
**Pass 5: Numbers and Facts (if applicable)**
- Verify mathematical calculations are correct
- Ensure percentages sum to 100% where appropriate
- Confirm dates are accurate with correct days and times
- Double-check any referenced numbers, invoice numbers, or project numbers
- Verify data in text matches data in any charts, tables, or graphs
**Pass 6: Formatting Consistency (if applicable)**
- Check consistent formatting of dates throughout
- Verify consistent bullet point formatting
- Ensure punctuation formatting (if bold/italic, punctuation should match)
- Check for consistent font usage
- Verify page numbers are sequential and match table of contents
- Look for hanging initials or numbers that should move to the next line
### Step 4: Present Verified Findings
Present findings clearly from your **verified findings file only** using a bulleted format for easy tracking:
- **If you retracted errors during verification, briefly note how many false positives you caught at the beginning**
- Present each error as a single bullet with the format: **Location | Error Type:** Found “quoted text” → Should be “corrected text”
- Group similar errors together under category headings (e.g., “Spelling and Typography Errors”, “Grammar Errors”, etc.)
- For complex errors requiring explanation, add a brief note after the arrow
**Example format:**
```
I found 9 verified errors. During verification, I caught and retracted 1 false positive.
**Spelling and Typography Errors**
- **Communication Principles, bullet 2 | Missing space:** Found “providingas information” → Should be “providing information”
- **”We will help them believe” section, bullet 2 | Typo:** Found “Chris’asd and COMPANY” → Should be “Chris’s and COMPANY”
- **Main Framing Message paragraph | Misspelled word:** Found “The singulared message” → Should be “The singular message”
**Grammar Errors**
- **Communication Objectives opening line | Extra word:** Found “we want to stakeholders and audiences to:” → Should be “we want stakeholders and audiences to:”
- **Communication Principles, bullet 6 | Missing article:** Found “and role ORG will play” → Should be “and the role ORG will play”
```
### Step 5: Clarify Ambiguities
When encountering situations where the style guide allows discretion or where client-specific preferences may apply:
- Flag the issue for the user
- Present the options available
- Reference relevant style guide guidance
- Ask for clarification rather than making assumptions
## Resources
### references/style-guide.md
Contains the complete CRA | Admired Leadership Style Guide including:
- Voice and writing principles
- Common scenarios with AP and firm-specific guidance
- Firm terminology and grammar rules
- Proofreading best practices
- Formatting guidelines
Load this reference at the beginning of each proofreading task to ensure accuracy and consistency with firm standards.
