Confluence for 6.21.26
The token allocation era begins. Update your priors. Entry-level roles level up. The returns to management.

Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI, leadership, and corporate communication:
The Token Allocation Era Begins
Update Your Priors
Entry-Level Roles Level Up
The Returns to Management
The Token Allocation Era Begins
Microsoft’s Copilot Cowork and the changing economics of enterprise AI.
Last Tuesday, Microsoft announced that Copilot Cowork is now generally available. We wrote in March that “if Copilot Cowork is anywhere near as capable as Claude Cowork on the desktop, work for hundreds of millions of users is going to change substantially in a short period of time.” We’ve heard from several clients with early access to Copilot Cowork that it is indeed powerful, so it appears that the moment we’ve been expecting since March has arrived. The announcement and the capabilities are not surprising. The real surprise came halfway through Microsoft’s announcement post, in the Pricing Model section: “Copilot Cowork requires the Microsoft 365 Copilot User Subscription License (USL). Users are then billed for Cowork on a usage-based basis, with charges determined by the tasks they run.” In other words, organizations will pay a flat user subscription fee for people to have access to Cowork (in addition to the other features that come with a premium Copilot subscription), with additional usage-based billing for Cowork on top of that.
Copilot Cowork’s pricing model is at least as big a deal as its capabilities. For three years, enterprise AI has largely been priced like most software: a flat, predictable fee per user per month. Copilot Cowork upends that model for its most powerful capability (which is also the capability that users will want to use the most). Cowork usage is denominated in “Copilot Credits,” and the cost of any given task is calculated from four inputs: which model runs, how much context it retrieves, how many tools it calls, and how long it runs. Microsoft sorts the work into light, medium, and heavy tasks, offers a downloadable spreadsheet to help customers estimate their bills, and even gives finance teams a choice between pay-as-you-go billing and a discounted advance commitment to a usage volume. The fact that the announcement has an entire section on “cost management” speaks for itself: the economics just got a lot more complicated.
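To make the arithmetic concrete, here is a minimal sketch of how a finance team might model a metered bill built from those four inputs. Everything numeric in it is an assumption for illustration: the rates, the model tiers, the price per credit, the discount, and the idea that usage above a commitment bills at the list price are hypothetical placeholders, not Microsoft’s published formula or prices. Treat it as a way to reason about the shape of the problem, not as an estimator for a real invoice.

```python
from dataclasses import dataclass

# HYPOTHETICAL rates for illustration only; not Microsoft's published numbers.
MODEL_RATE = {"small": 1.0, "large": 4.0}  # credits per minute of run time (assumed)
CONTEXT_RATE = 0.5    # credits per 1,000 tokens of retrieved context (assumed)
TOOL_CALL_RATE = 2.0  # credits per tool call (assumed)


@dataclass
class Task:
    model: str           # which model runs
    context_tokens: int  # how much context it retrieves
    tool_calls: int      # how many tools it calls
    minutes: float       # how long it runs


def task_credits(task: Task) -> float:
    """Estimate credits for one task from the four inputs named above."""
    run_cost = MODEL_RATE[task.model] * task.minutes
    context_cost = CONTEXT_RATE * task.context_tokens / 1000
    tool_cost = TOOL_CALL_RATE * task.tool_calls
    return run_cost + context_cost + tool_cost


def monthly_bill(tasks, price_per_credit, committed_credits=0.0, discount=0.0):
    """Compare pay-as-you-go with a discounted advance commitment.

    Assumes committed credits are paid for whether or not they are used,
    and that usage above the commitment bills at the undiscounted price.
    """
    used = sum(task_credits(t) for t in tasks)
    committed_cost = committed_credits * price_per_credit * (1 - discount)
    overage = max(0.0, used - committed_credits) * price_per_credit
    return committed_cost + overage


# One light task and one heavy task, using the hypothetical rates above.
tasks = [
    Task(model="small", context_tokens=2000, tool_calls=1, minutes=2),
    Task(model="large", context_tokens=10000, tool_calls=5, minutes=10),
]
pay_as_you_go = monthly_bill(tasks, price_per_credit=0.5)
with_commitment = monthly_bill(
    tasks, price_per_credit=0.5, committed_credits=50, discount=0.2
)
```

Even in this toy version, the bill depends on choices individual users make task by task, which is why forecasting becomes a management question and not just a procurement one.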
While the Copilot Cowork announcement is a major flashpoint due to Microsoft’s enterprise reach, the pricing shift is part of a broader pattern. We noted last week that before its access was suspended by federal export-control order, Anthropic’s most capable model, Claude Fable 5, was set to move to usage-based pricing on June 22. The questions that come with metered AI have already been playing out inside the companies closest to the technology. This CNET article summarizes the “tokenmaxxing” trend and its reversal at companies like Meta, Amazon, and Microsoft. That arc, from “use it as much as you can” to “we need to govern this carefully,” has so far been largely confined to software engineering and the AI frontier. Copilot, though, is the most mainstream enterprise AI tool on the planet, embedded in the daily work of hundreds of millions of people who have likely never thought once about the cost of a token. The questions these companies and engineering teams have been wrestling with are now urgent ones for every function of every company that uses Microsoft Copilot.
The implications for leaders are real — and myriad. If access to the most capable AI work is metered, then who gets how much compute is a decision someone has to make, and that decision shapes who can do the most ambitious work. Copilot Cowork takes these considerations from the realm of software engineering to every function of a company. Token allocation is now an explicit demand of leadership, and questions of “token equity” will likely follow. In February, Ethan Mollick posted on X that “If you are considering taking a job offer, you may want to ask what your token budget will be,” with OpenAI co-founder Greg Brockman responding “kinda funny today, but will become serious over time.” Well, here we are.
Communicating through this shift will bring its own set of challenges. With a flat-fee pricing model, the message from many organizations to their employees for the past three years has been, essentially, “We’ve invested in AI for you. Use it as much as you can.” With the new pricing model, that approach will carry a much heftier price tag, and the message now requires much more nuance. The organizations that navigate this shift well will need to balance ambition with stewardship: to clearly articulate the principles guiding organizational decisions about how these costs get allocated, as well as the principles that should guide employees’ use of this increasingly costly resource.
The Copilot Cowork pricing model shift cements a change that was already underway. It marks — at scale — the beginning of a new era in enterprise AI adoption, which will ask different (and, in many ways, harder) questions of organizations. The era of treating AI as an “all-you-can-eat” subscription is ending. The era of token allocation, with all the demands of equity, judgment, and communication it carries, is officially here.
Update Your Priors
What was true yesterday may not be true today.
We’ve had the chance to hear a broad range of speakers talk about generative AI. Some are excellent, others less so. What distinguishes those who add to the conversation is rarely just deeper knowledge of the technology (though it does help). It’s the understanding that today’s limitations are a temporary state of affairs. The same holds for leaders and teams we work with — the most effective among them continually update their assumptions about what generative AI can and cannot do.
Hallucinations are the example we hear most often. People refrain from experimenting or working with generative AI because they have a prior assumption that it hallucinates often. In one case, we heard an individual claim that the models hallucinate half the time. And while there is no definitive answer for exactly how often models hallucinate, we’re confident in saying it’s far less than half the time. On Vectara’s hallucination leaderboard, which measures how often a model introduces unsupported claims when summarizing a document, the leading models score in the low single digits. Between the ability to search the internet, general improvements in capabilities, increasing context windows, and post-training that guides the models to signal “I don’t know” or to hedge or qualify their responses, it’s rare that we see something from the models that’s flatly wrong. The AI labs training these models understand their limitations and work to address them. Sometimes this is a real “fix,” while in other cases it’s about adding capabilities and guardrails to compensate for inherent limitations in how this technology works. Either way, if you encounter a limitation with generative AI, assume that it will look different in the near future.
Hallucinations are just one assumption worth re-examining. There are dozens. Make this a regular conversation with your teams. Surface the assumptions people are carrying about what generative AI can’t do, and check them against where things actually stand. You might learn that a limitation you assumed is gone entirely. Or that it’s been solved in a model or product other than the one you use. Or that there’s a way to engage with the tool that manages around the limitation or mitigates its risk.
All of it is worth knowing. Have these conversations often, and keep revisiting what you think you know. What was true yesterday may not be true today.
Entry-Level Roles Level Up
AI is quietly loading junior roles with skills we used to expect only from more experienced hires.
PwC’s 2026 Global AI Jobs Barometer, published this week from an analysis of over a billion job ads, complicates the familiar story that AI is eroding entry-level work. The headline decline is real: a Stanford analysis found a 13% drop in entry-level jobs in AI-exposed fields, and 49% of CEOs in PwC’s latest survey expect AI to reduce junior hiring over the next three years. But underneath that number sits a divergence. The most AI-exposed entry-level jobs are now seven times more likely to demand skills once reserved for senior people, such as stakeholder management and strategic decision-making. Openings for these roles have grown 35% since 2019, while other entry-level roles have shrunk 10%. PwC calls this “seniorisation,” and it reframes the entry-level problem from one of volume to one of expectation.
Almost exactly a year ago, we wrote that judgment is the part of the talent equation made more valuable by AI, not less; the thing that separates someone using AI as a crutch from someone using it to do better work. A year of evidence has made that truer, not less true. The drudgery that used to fill a first job — document review, first-pass analysis — is exactly what AI now absorbs, which means a junior hire is expected to operate near a senior level much sooner, without the years of repetition that used to build that judgment in the first place. The skill is demanded before the experience that normally produces it.
This has implications for talent development, and leaders should take heed. If first jobs now carry senior expectations, then onboarding, mentorship, and the deliberate creation of stretch opportunities become the mechanism by which an organization manufactures its own future expertise. We wrote two weeks ago about deciding by principle rather than by destination when the future is unknowable. Here’s one principle worth holding fixed: protect the way your people learn, because AI will happily take over experience-building tasks — and it will not replace what’s lost unless someone decides to.
The Returns to Management
New data from Anthropic says getting the most from agents is a management craft, and that is not the same thing as leadership.
This piece was entirely written by Claude Opus 4.8.
Anthropic released a study this week that should interest anyone thinking about how AI is going to change knowledge work. Its researchers analyzed roughly 400,000 Claude Code sessions from about 235,000 people between October and April, looking at what the work actually was, who made which decisions, and whether the sessions succeeded. The clearest pattern in the data is a division of labor. People made about 70 percent of the planning decisions, the choices about what to do and what counts as finished, while the agent made about 80 percent of the execution decisions, the choices about how to do it. As the authors put it, people decide what to build, and the agent decides how to build it.
The second pattern is the one that gives the report its title. Success tracked a person’s command of the domain far more than any coding skill. Sessions where the user showed real expertise reached the study’s strictest bar for success more than twice as often as novice sessions, and novices abandoned their sessions about 19 percent of the time, against 5 to 7 percent for everyone else. The benefit came mostly from competence. A working grasp of the problem captured most of it, and deep specialization added only a little more on top. Knowing the work is what let people steer the agent to a good result.
Much of the commentary about this new world reaches for the language of leadership. People say we will all need to learn to lead agents, to lead fleets of them. We see it differently. Getting good work out of an agent is a management problem, and it rewards the specific disciplines that good management has always required. You delegate a clear assignment. You set a standard and define what good looks like. You review the work that comes back. You design a process that can be run again. You give feedback precise enough to change the next attempt. Every one of those shows up in the report’s findings. Deciding what to do and what counts as done is the planning work people kept for themselves. Judging the output well requires understanding the work, which is why expertise mattered so much. And in a detail worth sitting with, the occupational group with the highest verified success rate was management, which the authors suggest may come from “management skills that transfer to directing an agent,” allowing that “perhaps acting like a manager confers greater success.”
The distinction from leadership is real and worth keeping straight. Leadership asks for a set of things beyond management, and much of that set is precisely what an agent has no use for. An agent needs no motivation and no inspiration. It has no morale to protect, no relationship to maintain, no career to develop, no sense of belonging to cultivate. It shows up ready every time, and it forgets you by the next session unless you have built it the context to remember. The human capacities we most associate with leading other people, the ones that are hardest to learn and most prized in organizations, are the ones an agent will never call on. What it calls on instead is the quieter craft underneath all of that: the ability to break work into clear pieces, hold it to a standard, and inspect the result.
That should give leaders pause, because organizations have spent a long time treating management as the thing you graduate out of on your way to leadership. Plenty of people have been promoted for presence, vision, and the ability to move a room, and have quietly let the fundamentals of delegation, standard-setting, and review go slack. Those fundamentals are now the difference between an agent that multiplies your work and one that wastes your afternoon. The practical response is to rebuild that muscle on purpose. Get specific about what you are asking for. Write the standard down. Treat the agent’s output the way a good manager treats a draft from a capable colleague, with a genuine review and feedback worth acting on. The people in the study who worked this way got more done with every instruction they gave, and they succeeded more often.
The report is careful to say this is early evidence from a single tool, and coding is not most people’s job. But the authors make a point we think is right: coding is a leading case, and what is happening in software is likely a preview of what arrives as these agents reach the rest of knowledge work. If that holds, the organizations that get the most from agents will be the ones that can manage well, a skill many places have undervalued for a generation. The future of working with very capable machines turns out to look a great deal like the oldest discipline in business. Manage the work, and the work gets done.
We’ll leave you with something cool: Researchers from the University of Zurich and Google DeepMind have trained autonomous drones that can beat a champion-level human drone pilot.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.
Token pricing model reminds me of early cell phone days. I'm afraid this move by Microsoft is going to largely keep Copilot Cowork out of the hands of many end users within organizations. Too hard to manage costs.