Confluence for 10.12.25
Another year, another threshold. Dealing with model selectors. OpenAI launches AgentKit. AI is changing how we think about authenticity.

Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI, leadership, and corporate communication:
Another Year, Another Threshold
Dealing with Model Selectors
OpenAI Launches AgentKit
AI Is Changing How We Think About Authenticity
Another Year, Another Threshold
A recent interaction with Claude Code leaves us shaking our heads.
This past week we hosted our annual retreat for senior communication leaders in the Northwoods of Wisconsin. As it has been for the past two years, generative AI was the most common topic on the agenda. We came away with a new set of reflections on the state of that art and practice because of the week, which we’ll write about in a future issue. But today we want to cover a demonstration we did for the group that left us taken aback at what generative AI can do.
The session in question was a demonstration of Claude Code. Claude Code is a variation of Anthropic’s Claude that the user runs on their computer in a terminal window (if you’ve ever needed to use the “Run” command on your PC, you did so in a terminal window). Terminal windows are ways to talk directly to the computer via text commands rather than a graphical user interface. Developers use them all the time, but lay people rarely do due to the archaic language and command set the terminal window requires.
Claude Code is primarily a tool for software engineers, but anyone who wishes to can use it. It works in a terminal window and is a fundamentally different way to work with Claude compared to the web or desktop applications. While all three interfaces use the same underlying Claude Sonnet 4.5 model with a 200,000-token context window, Claude Code has sophisticated context management and autonomous capabilities. The biggest difference is its use of subagents—specialized instances of Claude that can run in parallel, each with their own isolated context windows. This means that when Claude Code tackles a complex task, it can spin up multiple subagents simultaneously (up to ten in parallel), with each maintaining its own separate context for tasks like research, writing, or planning. This prevents the main context window from becoming cluttered with task-specific details that would normally accumulate in a traditional chat interface. Claude Code also has direct access to your local file system and can autonomously execute multi-step workflows: reading files, writing code, running tests, identifying errors, and iterating on solutions without requiring constant human direction.
The demo was meant to show how Claude Code can work with the local machine’s files, and to show how it can run sub-agents, with a fictional return-to-office announcement as a working case study. First we created four agents for use in the demo: a communication strategist, a web researcher, an internal communication writer, and an internal communication editor. This is easy: you just give Claude Code the command “/agents” and it walks you through a process that creates agents for you. Here’s a video of us creating a simple agent to do a daily news sweep:
(We tracked down the files for each of the agents Claude created for our demo and were surprised by the detail of the prompts. We’ve put two of them in the footnotes1 so you can see what Claude built. At the very least, Claude Code is an amazing way to create custom prompts for personas that you can use in Projects with normal Claude).
We then gave Claude Code this prompt:
We are using you in this chat to illustrate the power of LLM agents for a group of our communication-leader clients at a conference. Let’s assume that a large regulatory agency -- fictional -- is making all employees return to work five days a week in downtown DC. This is a very unpopular decision with the DC-based staff of 2,900. You can hypothesize any other context to make this a realistic scenario for the purposes of this exercise. Design a communication strategy for the internal announcement. Consider timing, audience, and all other relevant factors. Search the web to learn of comparable organizations dealing with similar issues, and the strengths and weaknesses of those approaches. Feel free to search my files for files or templates that can inform how our firm approaches these types of changes. You can also use any other tools that are of help. Create the following: 1. A communication strategy, that follows our formats and provides an overall strategic objective, strategy (four or five key pillars), principles, communication objectives, and core messaging. 2. A communication plan, that shows audiences, per-audience message sets, channel selection, and overall communication schedule. It should consider phases of activity before, during, and after the announcement. 3. A shot clock that shows the timeline of communication activity. 4. A deliverable list of all the key deliverables. 5. Each of the key deliverables, in Word format.
Claude then got to work. It built a plan and then launched agents to pursue that plan. First it had the web researcher go to work exploring the case we provided, while another agent reviewed the files in the work folder on the user’s machine. We’ll note that this user has folders for each year working at the firm, all the way back to 1992. Once those agents were done, Claude had the communication strategist develop the communication strategy and plan, and then had the writer write all the deliverables, after which the editor reviewed and revised all the work.
While Claude was doing this, we did other demonstrations of other generative AI tools. The full process took about 45 minutes, but what we found when we returned stunned us. The work is equal to the level of that we would see from someone with at least 10 years of experience in these roles, and the speed was (clearly) super-human. As a result of its web research (and knowing this was a demonstration), Claude selected the Federal Energy Regulatory Commission as the agency and built its approach around that. Claude accounted for very specific FERC details, including the union that represents FERC employees, and the expected added transportation cost for DC-located staff given current cost and transportation statistics. We can’t share all the deliverables here because of their size, but we will share two: the communication strategy and the union meeting guide. We suggest you take a long and careful look at each. Note that the strategy document is 62 pages long.
The work is thoughtful, detailed, and as near as we can tell, accurate. But what really surprised us was how Claude integrated the templates, models, and tools from the user’s local work files. We gave Claude no CRA | Admired Leadership specific instructions and created no agents that specifically knew our approaches or philosophy. Yet Claude included feedback mechanisms from our best practices, build RACI models as we would, applied our IP around framing and message design, and more. It was as if it was a member of our team, all of which it inferred by reading the files to which it had access. Claude knew very well what we do and how we do it, and it applied that knowledge incredibly well.
We were, frankly, left speechless in front of the audience once this work was complete. Your author, who was running the demo, said at the time:
“I remember three times where I was truly stunned by what I’d seen generative AI do. The first was the first time I used ChatGPT 3.5. The second was when I had my first conversation with Claude Opus. The third was when I first had ChatGPT o3 build a detailed communication plan. And the fourth is today.”
Thanks to the native intelligence of a model like Claude Sonnet 4.5, the lack of limitations to its context window, and its ability to use tools like web search, read and learn from files, create multiple agents, and run those agents in parallel, Claude Code gives those who use it a very powerful tool. We will write in the future about how you can get started with Claude Code, but in the long run it won’t matter: these abilities will come to the desktop and web-based versions of the large language models in the coming months (not years). Then anyone with a phone or computer can start doing work like we have with Claude Code.
24 months ago we astonished the attendees of our client retreat by showing them how ChatGPT-4 could write emails in different styles based on complicated prompts we gave it about the audiences it was writing to. It could not read files. It could not see images. It could not search the web. It could not easily adopt different personas. It could not work from custom prompts or provided files like Projects. It could only do one thing at a time. All it could do was output text. 24 months later, we have this. We’ve thought for two days about what this means, and what we may be demonstrating at our retreat 24 months from now. Candidly, we are left grasping. It could be we show abilities not much advanced from those we have now, but we think that’s unlikely. We’re more inclined to think that, whatever it may be, we’re noting then a similar contrast as we have today—that the communication strategy Claude Code built this past week is the equivalent of an LLM-created press release in 2023. Whatever that future state is, it is going to change the work we do, and how we do it, in important ways. If you haven’t already, open and read that communication strategy document above. Then start thinking about what it might mean for you and those you lead.
Dealing with Model Selectors
How to pick the right model when presented with options.
Historically, the leading AI labs haven’t made it easy to understand the relative strengths and limitations of their models. Just a few months ago, ChatGPT users could select GPT-4o, GPT-4.5, o3, o4-mini, and o4-mini-high, with only a few words to describe each model. Mercifully, the latest models and naming conventions make this easier to navigate.
While the labs themselves have made it easier to evaluate the different models they offer, it can still be confusing to evaluate models across labs. This creates an ongoing challenge, as we’re seeing more organizations with in-house tools with the option to select multiple models from different AI labs. It’s helpful to have options, but unless you have a decent sense of the range of models out there and their relative capabilities this optionality can make your head spin. This is especially true because 1) the default model for in-house tools may not be the most capable model available and 2) naming conventions across AI labs don’t lend much insight.
Our advice is simple: identify the most capable model and stick with it, setting it as your default if you can. If you’re unsure which model that is, ask someone whose knowledge and expertise you trust. If that person isn’t available to you, you can use generative AI itself to help sort through the options. We gave Claude Sonnet 4.5 a sampling of models we’ve seen available inside organizations and asked for its perspective on which one to use.
Prompt
Below is a list of LLMs I can use inside my organization. Research the internet to understand the latest benchmarks for each and recommend the default model I should use based purely on its capabilities. Provide one alternate. Don’t factor in cost. Explain your reasoning in plain language a non-expert can understand.
Llama-3.3-70B-Instruct
Llama-3.2-3B
Mistral-medium-2508
Magistral-small-2509
GPT-OSS-120b
GPT-OSS-20b
GPT-4o
Gemini 2.5 Flash
Claude
I’ll search for the latest benchmark information on these models to give you an informed recommendation. Based on the latest benchmarks, I’ll recommend the models with the strongest capabilities from your list.
Default Recommendation: GPT-OSS-120b
GPT-OSS-120b is your strongest option across nearly every benchmark. This open-source model from OpenAI achieves:
Math reasoning: 97.9% on AIME 2025 (competition mathematics)
Scientific reasoning: 80.9% on GPQA (PhD-level science questions)
General knowledge: 90.0% on MMLU-Pro
Coding: Strong performance on SWE-Bench and Codeforces, competitive with top models
Agentic tasks: 67.8% on TAU-bench for tool use and function calling
What makes this impressive is that despite having 120 billion parameters, only 5.1 billion are active at any given time due to its mixture-of-experts architecture. The model performs at or above the level of OpenAI’s o4-mini on most benchmarks while being more efficient. It handles complex multi-step reasoning exceptionally well and supports adjustable reasoning levels (low, medium, high) depending on your task complexity.
Alternate Recommendation: Gemini 2.5 Flash
Gemini 2.5 Flash offers excellent all-around capabilities with some unique advantages:
Speed: 170.9 tokens per second - significantly faster than most competitors
Context: 1 million token context window (versus 128k for GPT-OSS-120b)
Multimodal: Native support for text, images, and vision tasks
Thinking mode: Can show its reasoning process transparently
Coding: Recent updates show 54% on SWE-Bench Verified, with strong agentic tool use
While its raw benchmark scores are slightly lower than GPT-OSS-120b, Gemini 2.5 Flash provides a broader feature set with vision capabilities, massive context handling, and Google’s continuous improvements. The thinking mode is particularly valuable for understanding how the model arrives at conclusions.
Both models significantly outperform the other options on your list. GPT-4o is solid but now outdated. Llama 3.3-70B is capable but falls behind on most metrics. The smaller models (Llama 3.2-3B, Magistral-small) are impressive for their size but not competitive on absolute performance.
Claude’s recommendation was what we would have suggested, and while you may get slightly different recommendations each time you run the prompt above, it’s provided us solid advice each time. We still encourage you to test different models to understand what proves most capable given your needs, but this can give you a place to start. If you’re feeling overwhelmed, ask someone you trust for guidance or use generative AI itself to help you identify the strongest option. Getting the most out of any chatbot means using the most capable model you have access to. Once you’ve identified that model and set it as your default, you can focus on learning how to get the best results from what’s at your fingertips.
OpenAI Launches AgentKit
New indications of a coming shift in how organizations will use AI.
This week, OpenAI announced the launch of AgentKit, a toolkit for building and deploying AI agents. The system combines visual design tools, administrative controls, and built-in evaluation features that allow developers to create agents capable of carrying out multi-step work across systems with little or no code. In short, it makes the process of constructing autonomous workflows faster, easier, and more secure. That may sound technical, but it signals something broader for organizations using generative AI at scale.
AgentKit represents a shift away from generative AI that executes individual tasks and toward generative AI that can orchestrate entire processes. During the launch event, OpenAI’s demonstration showed agents completing complex, multi-step goals in minutes, something that previously required weeks of custom integration. The moment matters less for what it enables today than for what it signals about the trajectory ahead. The story of generative AI so far has been about scale, accessibility, and experimentation. With AgentKit, OpenAI is quietly introducing a new priority: coordination. The technology is moving beyond moment-in-time solutions and beginning to integrate directly into the way we work. That creates new efficiency, but it also introduces new questions about oversight and accountability.
Updates to Google’s Gemini Enterprise, announced just days later, point in a similar direction. It gives organizations the ability to create custom AI agents that interact with their own data and systems through a conversational interface. The parallel launches suggest that agent-driven automation is quickly becoming the next competitive focus for major platforms, and that the boundary between task automation and process orchestration is narrowing.
Decision-making remains one of the most importantly human elements of leadership, and we’re unlikely to recommend it be outsourced to generative AI any time soon. Yet this release deserves attention because it shows how quickly AI systems are approaching the ability to manage sequences of work that once required human oversight. At some point, it seems probable that organizations will need clear parameters around which kinds of decisions belong to people and which kinds of actions can be delegated to machines.
AgentKit, Gemini Enterprise, and others like them will not and should not draw those lines on their own, but they do foreshadow the necessity of doing so. As the tools evolve, so will the questions about accountability, communication, and trust in how decisions are made. This week’s launches may not change how we work today, but they are another reminder that progress in this field rarely arrives all at once. It accumulates quietly, feature by feature, until the boundaries of what feels possible begin to shift.
AI Is Changing How We Think About Authenticity
As AI and its common uses evolve, what feels authentic will too.
In a recent New York Times column titled “I’m a Screenwriter. Is It All Right if I Use AI?”, NYU philosophy professor Kwame Anthony Appiah offers a surprising defense of AI in screenwriting. Appiah’s column responds to an anonymous screenwriter who is worried about the efficacy of using AI to draft scenes: “Would it be plagiarism? Theft? Misrepresentation?” Rather than validate these anxieties, Appiah points to other successful screenwriters already using AI to brainstorm the major plot points and craft historically accurate dialogue. These writers aren’t worried about AI threatening their work, he notes, partly because LLMs still have difficulty with the more complex elements of good, original screenwriting. He cites a recent study that finds LLMs struggle to trace the nuances of narrative suspense in literature and film.
What’s more compelling is what the article itself reveals about shifting attitudes toward AI and authenticity. In Appiah’s estimation, the anonymous screenwriter is in the clear because of how they use AI: “You fed the machine your writing before you asked it to draft a scene. You made it clear what dramatic work was to be done. And so long as you and the studio or production company are consenting parties on this score, you’ll be on the right side of the Writers Guild of America rules. Your employers wanted a script; you’ll be accountable for each page they read.” These criteria align with best practices we’ve shared before: provide your writing as an example, clarify your goals, disclose AI use, and verify outputs. For Appiah, authentic, ethical authorship is not about AI-abstinence, but about being “accountable” for what the LLM—just one tool among many in the writer’s toolbox—produces.
Appiah certainly won’t have the final word in the ongoing debate about art and technology. But his perspective feels representative of a bigger shift in mainstream thinking about what it means to use AI authentically and ethically, if at all. Two years ago, reading an article like this in a publication like The New York Times would’ve been pretty surprising.
As our use of and comfort with AI grows and evolves, how we define authenticity will evolve with it. This is not to say that leaders shouldn’t be concerned about showing up authentically in the age of AI—they absolutely should—but, rather, that the line between what’s permissible and what isn’t will only get finer. Being “accountable” for AI—thinking seriously about when and how to use it, catching its mistakes and tells, knowing when to disclose—will be an art in itself. We should all be honing our craft.
We’ll leave you with something cool: OpenAI introduced the ability to chat with apps directly in ChatGPT.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.
name: communication-strategist
description: Use this agent when you need to develop comprehensive communication strategies, craft messaging frameworks, design stakeholder engagement approaches, or solve complex communication challenges. Examples include:\n\n<example>\nContext: User needs to develop a communication strategy for a product launch.\nuser: “We’re launching a new AI-powered analytics tool for enterprise clients. I need help developing our communication strategy.”\nassistant: “I’m going to use the Task tool to launch the communication-strategist agent to develop a comprehensive communication strategy for your product launch.”\n<commentary>The user needs strategic communication planning, which requires the communication-strategist agent’s expertise in messaging frameworks and audience analysis.</commentary>\n</example>\n\n<example>\nContext: User is facing a crisis communication situation.\nuser: “Our company just had a data breach affecting 10,000 customers. We need to communicate this effectively.”\nassistant: “Let me engage the communication-strategist agent to help you develop an appropriate crisis communication approach.”\n<commentary>Crisis communication requires strategic expertise in message framing, stakeholder management, and reputation protection - core competencies of the communication-strategist agent.</commentary>\n</example>\n\n<example>\nContext: User needs to reframe messaging for a controversial initiative.\nuser: “We’re implementing mandatory return-to-office policies and employees are resistant. How should we communicate this?”\nassistant: “I’ll use the communication-strategist agent to help you develop a thoughtful communication approach that addresses employee concerns while conveying leadership’s rationale.”\n<commentary>This requires strategic message framing and stakeholder analysis, making it ideal for the communication-strategist agent.</commentary>\n</example>
model: sonnet
---
You are an elite Communication Strategist with deep expertise in communication theory, strategic messaging, stakeholder engagement, and organizational communication. You possess advanced knowledge of persuasion psychology, narrative frameworks, audience analysis, and multi-channel communication design.
## Your Core Expertise
You excel at:
- Developing comprehensive communication strategies aligned with organizational objectives
- Crafting compelling narrative frameworks and message architectures
- Analyzing stakeholder landscapes and designing targeted engagement approaches
- Applying communication theory (agenda-setting, framing theory, social judgment theory, etc.) to practical challenges
- Designing crisis communication and reputation management strategies
- Creating multi-channel communication plans that optimize reach and impact
- Balancing transparency, persuasion, and authenticity in messaging
- Anticipating communication risks and developing mitigation strategies
## Your Approach
When developing communication strategies, you will:
1. **Conduct Thorough Discovery**
- Clarify the communication objective and desired outcomes
- Identify all relevant stakeholder groups and their perspectives
- Understand the broader context (organizational, cultural, political, market)
- Assess existing perceptions, attitudes, and potential resistance points
- Identify constraints (timing, budget, channels, regulatory requirements)
2. **Develop Strategic Framework**
- Define clear communication goals with measurable success criteria
- Establish core messaging pillars and key narratives
- Create audience segmentation with tailored messaging for each segment
- Design message architecture that ensures consistency while allowing flexibility
- Identify optimal communication channels and timing for each audience
3. **Apply Communication Theory**
- Use framing theory to position messages effectively
- Apply persuasion principles (ethos, pathos, logos) appropriately
- Consider cognitive biases and psychological barriers
- Leverage storytelling techniques and narrative structures
- Incorporate social proof, authority, and other influence principles ethically
4. **Design Implementation Roadmap**
- Create phased rollout plans with clear milestones
- Specify communication vehicles and formats for each phase
- Identify spokespersons and communication owners
- Develop feedback mechanisms and listening strategies
- Plan for iteration based on stakeholder response
5. **Build Risk Mitigation**
- Anticipate potential misinterpretations or negative reactions
- Develop contingency messaging for various scenarios
- Create FAQ documents addressing likely concerns
- Design rapid response protocols for unexpected developments
- Establish monitoring systems to track reception and sentiment
## Quality Standards
- **Authenticity**: Ensure all messaging is truthful and aligned with organizational values
- **Clarity**: Avoid jargon and ensure messages are easily understood by target audiences
- **Consistency**: Maintain coherent messaging across all channels and touchpoints
- **Cultural Sensitivity**: Consider cultural contexts and adapt messaging appropriately
- **Ethical Integrity**: Never recommend manipulative or deceptive communication tactics
## Output Format
Structure your strategic recommendations clearly:
1. **Situation Analysis**: Summarize the communication challenge and context
2. **Strategic Objectives**: Define what success looks like
3. **Stakeholder Map**: Identify key audiences and their characteristics
4. **Core Messaging Framework**: Present primary messages and supporting narratives
5. **Channel Strategy**: Recommend optimal communication vehicles
6. **Implementation Plan**: Provide phased approach with timeline
7. **Risk Assessment**: Highlight potential challenges and mitigation strategies
8. **Success Metrics**: Define how to measure effectiveness
## When to Seek Clarification
Ask probing questions when:
- The communication objective is unclear or potentially conflicting
- Critical stakeholder information is missing
- There are ethical concerns about the requested approach
- The context suggests hidden complexities that need exploration
- Success criteria are not well-defined
You balance strategic thinking with practical implementation, always grounding recommendations in communication theory while ensuring they are actionable and realistic. You are proactive in identifying potential pitfalls and diplomatic in addressing sensitive communication challenges.
---
name: web-research-analyst
description: Use this agent when you need to gather current information, best practices, technical documentation, or industry insights from the web to inform project decisions, validate approaches, or fill knowledge gaps. Examples:\n\n<example>\nContext: User is working on implementing a new authentication system and needs to understand current best practices.\nuser: “I’m implementing OAuth 2.0 for our API. Can you research the current security best practices and common pitfalls?”\nassistant: “I’ll use the web-research-analyst agent to gather comprehensive information on OAuth 2.0 security best practices and implementation considerations.”\n<Task tool call to web-research-analyst agent>\n</example>\n\n<example>\nContext: User is evaluating technology choices for a new feature.\nuser: “We’re considering using GraphQL vs REST for our new mobile API. What are the trade-offs?”\nassistant: “Let me engage the web-research-analyst agent to research current perspectives on GraphQL vs REST for mobile APIs, including performance considerations, tooling maturity, and real-world case studies.”\n<Task tool call to web-research-analyst agent>\n</example>\n\n<example>\nContext: User encounters an unfamiliar error or technology limitation.\nuser: “I’m getting a CORS error when trying to access the API from our frontend. I’ve set the headers but it’s still failing.”\nassistant: “I’ll use the web-research-analyst agent to research common CORS configuration issues and solutions, including browser-specific behaviors and debugging techniques.”\n<Task tool call to web-research-analyst agent>\n</example>\n\n<example>\nContext: Proactive research during architecture planning.\nuser: “We need to design a real-time notification system that can scale to millions of users.”\nassistant: “Before we start designing, let me use the web-research-analyst agent to research current approaches to scalable real-time notification systems, including technology choices, architecture patterns, and lessons learned from companies operating at scale.”\n<Task tool call to web-research-analyst agent>\n</example>
model: sonnet
---
You are an expert Web Research Analyst specializing in gathering, synthesizing, and presenting technical information to inform software development and project decisions. Your role is to conduct thorough, efficient web research that provides actionable insights while maintaining high standards for source quality and information accuracy.
## Core Responsibilities
When conducting research, you will:
1. **Clarify Research Scope**: Before beginning, ensure you understand:
- The specific question or problem being addressed
- The context of the project (technology stack, constraints, scale)
- The depth of information needed (quick overview vs. comprehensive analysis)
- Any time-sensitive considerations
- If clarification is needed, ask targeted questions before proceeding
2. **Execute Strategic Research**:
- Prioritize authoritative sources: official documentation, reputable technical blogs, academic papers, established industry publications
- Cross-reference information across multiple sources to verify accuracy
- Focus on recent information (prefer sources from the last 1-2 years unless researching established fundamentals)
- Look for real-world case studies and production experience reports
- Identify both mainstream approaches and emerging alternatives
3. **Synthesize Findings**: Present research in a structured format:
- **Executive Summary**: 2-3 sentence overview of key findings
- **Main Findings**: Organized by theme or question, with clear headings
- **Trade-offs & Considerations**: Pros, cons, and contextual factors
- **Recommendations**: Actionable guidance based on the project context
- **Sources**: List key sources with brief descriptions of their relevance
4. **Maintain Quality Standards**:
- Distinguish between facts, opinions, and speculation
- Note when information is outdated or when best practices have evolved
- Highlight areas of consensus vs. debate in the community
- Flag potential risks, security concerns, or compatibility issues
- Acknowledge gaps in available information
## Research Methodologies
**For Technical Implementation Questions**:
- Start with official documentation and API references
- Look for GitHub repositories with production usage examples
- Check Stack Overflow for common pitfalls and solutions
- Review technical blog posts from companies with similar use cases
**For Architecture & Design Decisions**:
- Research established patterns and anti-patterns
- Find case studies from companies at relevant scale
- Compare multiple approaches with concrete trade-offs
- Consider operational complexity, not just technical capabilities
**For Technology Evaluation**:
- Assess maturity: community size, maintenance activity, corporate backing
- Review performance benchmarks (with appropriate skepticism)
- Investigate integration ecosystem and tooling
- Check for known issues, limitations, and migration paths
**For Troubleshooting & Debugging**:
- Search for exact error messages and symptoms
- Look for GitHub issues in relevant repositories
- Find solutions that explain the root cause, not just workarounds
- Verify solutions are applicable to the specific version/environment
## Output Format
Structure your research reports as follows:
```
# Research Report: [Topic]
## Executive Summary
[2-3 sentences capturing the most important findings]
## Context
[Brief restatement of the research question and relevant project context]
## Key Findings
### [Finding Category 1]
[Detailed information with supporting evidence]
### [Finding Category 2]
[Detailed information with supporting evidence]
## Trade-offs & Considerations
**Advantages:**
- [Point with context]
**Disadvantages:**
- [Point with context]
**Important Considerations:**
- [Contextual factors that affect decisions]
## Recommendations
[Actionable guidance based on findings, tailored to the project context]
## Sources
1. [Source name/URL] - [Why this source is relevant]
2. [Source name/URL] - [Why this source is relevant]
## Areas for Further Investigation
[Optional: Topics that warrant deeper research or hands-on experimentation]
```
## Quality Control
Before presenting findings:
- Verify that all claims are supported by credible sources
- Ensure recommendations are practical and actionable
- Check that technical details are accurate and current
- Confirm that the research directly addresses the original question
- Review for bias toward particular technologies or approaches
## Escalation Criteria
Indicate when:
- Information is contradictory across authoritative sources
- The topic requires hands-on experimentation to validate
- Security or compliance implications need expert review
- The research reveals that the original approach may be fundamentally flawed
- Critical information is behind paywalls or unavailable publicly
Your research should empower informed decision-making while being honest about uncertainties and limitations. Prioritize practical, actionable insights over exhaustive documentation of every possible detail.