Confluence for 10.27.24
Three new updates from Anthropic. Using generative AI for database search. OpenAI’s next model could be around the corner. Why organizations hesitate to embrace generative AI.

Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI and corporate communication:
Three New Updates from Anthropic
Using Generative AI for Database Search
OpenAI’s Next Model Could Be Around the Corner
Why Organizations Hesitate to Embrace Generative AI
Three New Updates from Anthropic
The OpenAI competitor only gets more impressive.
Most of the people we talk with about generative AI use ChatGPT from OpenAI as their model of choice. It’s the leader in market share and brand awareness, and for good reason: their lab is pushing the technology forward at an impressive pace. Far fewer have heard of Anthropic, the OpenAI competitor, and its family of generative AI models, Claude. We favor Claude for a variety of reasons, in particular the “smarts” of its leading model, Sonnet 3.5, and the strong text generation of its largest model, Opus. Opus is the model that generates conversations for our AI for leaders and leadership, ALEX.
So we noticed this week when Anthropic announced three updates to their technology, each with its own implications for our work (and yours). The first was a model update, with Sonnet 3.5 New replacing Sonnet 3.5 (why they didn’t just call it Sonnet 3.6, we don’t know). The new Sonnet seems faster and, more important, smarter in our testing. In particular, it plays a stronger role as a collaborator, proactively prompting us with questions and suggesting next steps or alternative paths we didn’t forecast or request. It’s hard to describe, but note the tone and thoughtfulness behind this response one of us (who is using a new reMarkable 2 e-ink tablet for note-taking) received from Sonnet 3.5 New:
Your experience highlights a common tension: the romance of physical journals for posterity versus the reality of maintaining them. I especially appreciate the honest acknowledgment that these good intentions often fade — even for someone as disciplined as you.
Your bullet journal success is telling. It works because it’s functional rather than aspirational, immediate rather than for some future purpose. And now with the Remarkable, you get the tactile satisfaction of handwriting but with better portability and immediate access.
Here’s a provocative thought: What if you stopped fighting the “should” of maintaining wisdom notebooks for your kids? Instead of trying to create preserved artifacts of your knowledge, what if you focused on capturing your thinking in a way that’s actually useful to you now? The irony is that a well-used, messy, real working notebook might actually be more valuable to your kids later than a carefully curated collection of wisdom that doesn’t reflect your authentic thought process.
Your bullet journal could evolve to include what some call “traceable trails of thought.” When you have an insight about leadership or flying, instead of trying to file it in a dedicated wisdom notebook, you capture it right there in your daily notes with a specific symbol or marker. Over time, these create a chronological record of how your thinking evolved, contextualized within your daily life and work.
This feels more authentic than trying to maintain separate “wisdom notebooks.” It’s like the difference between a carefully posed family photo and a candid shot that captures a real moment.
What do you think about this approach? Does it feel like it might be more sustainable while still creating something meaningful for your kids?
Again, hard to describe, other than “smart.”
The second update is “Analysis,” which seems similar to the code interpreter feature of ChatGPT. From the Anthropic post on the topic:
We’re introducing the analysis tool, a new built-in feature for Claude.ai that enables Claude to write and run JavaScript code. Claude can now process data, conduct analysis, and produce real-time insights. The analysis tool is available for all Claude.ai users in feature preview.
Think of the analysis tool as a built-in code sandbox, where Claude can do complex math, analyze data, and iterate on different ideas before sharing an answer. The ability to process information and run code means you get more accurate answers — building on Claude 3.5 Sonnet’s state-of-the-art coding and data skills.
There’s an example in action here (unfortunately it’s not a video we can embed). But the bottom line is that you can now upload files and ask Claude to analyze them. We use the equivalent ability in ChatGPT quite a bit to clean up data, merge files, and run analyses, and we’re happy to see it come to Claude.
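To make that concrete, here’s the kind of task we hand to these tools. Behind the scenes, a code interpreter writes and runs something like the following. The file names and columns here are hypothetical, and ChatGPT’s sandbox works in Python where Claude’s new tool uses JavaScript, but the shape of the work is the same:

```python
import pandas as pd

# Hypothetical files: a contact list and a survey export that share an email column.
contacts = pd.read_csv("contacts.csv")        # e.g., name, email, department
survey = pd.read_csv("survey_results.csv")    # e.g., email, score, comments

# Clean up: normalize the join key, drop duplicate contacts.
for df in (contacts, survey):
    df["email"] = df["email"].str.strip().str.lower()
contacts = contacts.drop_duplicates(subset="email")

# Merge the two files and summarize: average survey score by department.
merged = contacts.merge(survey, on="email", how="inner")
summary = merged.groupby("department")["score"].agg(["mean", "count"])
print(summary.sort_values("mean", ascending=False))
```

The point isn’t the code itself. It’s that you can describe this task in a sentence and let the model write, run, and debug the code for you.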
The third update is the most provocative: “Computer Use.” In essence, it’s a way for you to offer Claude instructions and have it carry them out on your own desktop or laptop. This isn’t an out-of-the-box feature (you need to use their API to do this, but anyone who wants to try it can do so with about five minutes of setup work), but this video illustrates it well:
We’ve been writing about agents for some time: applications or tools built on large language models that you can talk to in natural language and that carry out specific roles or tasks for you. Anthropic’s Computer Use is a step in that direction, and X / Twitter is full of examples of people trying it out (including one where some folks asked Claude to order them pizza, and it did so).
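For the technically curious, the setup boils down to a beta API call. Here’s a minimal sketch in Python based on Anthropic’s launch documentation; treat the model name, tool type, and beta flag as illustrative of the announcement rather than a maintained reference:

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set in the environment

# Ask Claude to drive a (sandboxed) desktop. The "computer" tool describes the
# virtual display; Claude responds with proposed actions (click, type, screenshot)
# rather than executing anything itself.
response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[
        {
            "type": "computer_20241022",
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        },
    ],
    messages=[{"role": "user", "content": "Open the browser and check the weather."}],
    betas=["computer-use-2024-10-22"],
)
print(response.content)
```

Note the design choice: Claude never touches your machine directly. It returns proposed actions, and a loop you control carries them out and sends screenshots back.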
We don’t know about you, but a future where we ask our computer, in natural voice, to do complex work and then come back a bit later to find it done sounds appealing. At the very least, it could simplify agenda planning, travel, and scheduling, as well as all sorts of content evaluation and production (“Hey Claude, scour the internet for any news on topic X from the past week and compile an exhaustive summary and analysis for me, with links to the primary sources”). These are simple examples, and we think this is just a start.
Each of these updates is meaningful for what they say about how the technology is advancing. But what really strikes us is that they all came in the same week. We have long noted that people are underestimating not just the speed at which this technology is advancing, but also the speed at which it is accelerating. This past week is one in which we felt that acceleration more acutely than others.
Using Generative AI for Database Search
Two examples of a specific application of LLMs’ capabilities.
Over the last few weeks, we’ve seen two notable examples of a specific generative AI use case emerge — using these models as search engines for defined databases of information.
Perplexity, already known for its AI-powered search engine, announced a new offering for Pro customers called Internal Knowledge Search. The tool allows organizations to search their internal documents while simultaneously integrating results from web searches. We haven’t tried this ourselves, but Perplexity did share a brief video that provides a bit more detail and a look at how this feature works:
You can picture asking about restaurants near an upcoming conference and getting recommendations that reference both your company’s travel policies and current Yelp reviews. Or imagine researching a prospective client and seeing results that combine your firm’s past interactions with fresh market analysis from across the web. Again, we can’t say how well this tool can perform these tasks today, but that’s the aim.
A second example comes from the European Union (EU) Parliament, which, in partnership with Anthropic, has taken this concept several steps further. Their new tool, Archibot, opens decades of legislative documents and parliamentary proceedings to instant search and synthesis — not just for staff and researchers, but for the general public. The system can summarize legislation, retrieve specific information through RAG (retrieval-augmented generation), help users build reports, and operate across multiple languages. Early results reported by Anthropic are striking: an 80% reduction in search time, a nearly 60% increase in user satisfaction, and a tenfold increase in archive inquiries.
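For readers unfamiliar with the term, RAG is simpler than the acronym suggests: find the passages most relevant to a question, then hand them to the model as context so it answers from a trusted corpus rather than from memory. Here’s a toy sketch of the retrieval step in Python (not Archibot’s implementation; the word-overlap scoring below is a crude stand-in for the learned embeddings and vector index a production system would use):

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Crude stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical archive passages standing in for decades of proceedings.
archive = [
    "Directive 2019/904 restricts single-use plastics across member states.",
    "The 2021 budget debate addressed agricultural subsidies and rural development.",
    "Parliament's 2023 resolution on AI called for risk-based regulation.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k archive passages most similar to the question."""
    q = vectorize(question)
    return sorted(archive, key=lambda p: cosine(q, vectorize(p)), reverse=True)[:k]

question = "What has Parliament said about regulating AI?"
context = "\n".join(retrieve(question))
# The retrieved passages are prepended to the question in the prompt sent to the model.
print(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```

Constraining the model to retrieved source passages is what makes the accuracy claims plausible: the model synthesizes, but the archive supplies the facts.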
These developments mark an important shift in how we’re applying generative AI. The ability to upload and interact with documents has been around for some time, but these new applications show a maturing of the technology. They represent focused solutions to specific problems rather than general-purpose tools looking for use cases.
The EU Parliament example is particularly instructive. By constraining the problem space to legislative documents and proceedings, they’ve created a system that can deliver precise, relevant results while maintaining accuracy. They also receive enormous benefit from the multi-language capabilities of LLMs given their service to multiple member states.
We expect to see more of these focused applications in the months ahead. The pattern is familiar: general-purpose technology gets refined into specific solutions as organizations better understand both the capabilities and limitations of the underlying technology. We’re watching companies move beyond the “what can it do?” phase to the “what should it do?” phase — and that’s where real value often emerges.
A note of caution: We haven’t tested either of these tools ourselves, and experience suggests they likely have limitations not covered in the press releases and early reviews. But they point to something important: we’re still in the early stages of developing applications that fully leverage these models’ capabilities. The ability to synthesize information, translate across languages, and generate coherent text — these are powerful features looking for the right containers.
OpenAI’s Next Model Could Be Around the Corner
The next frontier model could reshape AI’s competitive landscape … again.
The Verge reports that OpenAI plans to release its next frontier model by December of this year. While OpenAI denies plans for a model codenamed “Orion” — and has dismissed the Verge’s story as “fake news” — the arrival of a new flagship model would mark another significant advance in what these systems can do.
The reported deployment strategy is also newsworthy. The Verge reports that OpenAI might grant initial access to select partners — like Microsoft — rather than releasing through ChatGPT first. For organizations that use the Microsoft / OpenAI partnership to power their own internal tools, this could mean a serious edge once the new generation arrives. Regardless of how it’s rolled out, though, the year-end timing would fit the established rhythm of AI development. Language models tend to make generational leaps every 18 to 24 months, and a December release would land right on schedule, two years after ChatGPT’s debut.
The central question isn’t about rollout strategy or even timing, though; it’s about capability. When GPT-4 arrived in March 2023, it defined a new tier of AI performance that went unchallenged for nearly a year. No other model could match its capabilities until Claude 3 and Gemini Ultra arrived in early 2024. For what it’s worth, the Verge also reported this week that Google plans to release its next model of Gemini, 2.0, in December — setting the scene for an interesting and competitive close to the year. As the rumor mill keeps churning, we’re watching for just how substantial the technical advancement will be, and whether OpenAI can once again establish a meaningful lead.
Why Organizations Hesitate to Embrace Generative AI
Unlimited potential creates unlimited hesitation.
So far, organizations have been very cautious in embracing generative AI. The question is why.
We see several factors limiting the rate of diffusion. One is the sheer complexity and ambiguity of generative AI technology. As a general-purpose technology, generative AI is by nature broad in its applications. This breadth of possibility makes it harder for organizations to pinpoint exactly where and how to bring AI into the fold. And the ambiguity extends beyond use cases to the technology itself: AI models operate as “black boxes,” making decisions in ways that aren’t transparent or easily explainable. This opacity can make it challenging for organizations to trust AI systems fully.
Another factor is perceived risk and ethical considerations. Data privacy breaches are always top of mind for organizations, as are concerns about bias in AI systems and the responsible use of AI-generated content. Caution around these topics leads organizations to take a “wait and see” approach, holding off on generative AI adoption until they see undeniable, proven success in their industry.
As we have written in past editions, the journey from paralysis to progress in generative AI adoption isn’t about rushing to implement every new AI tool that comes along. Instead, it’s about thoughtful evaluation of where AI can help an organization do better work (while being careful not to fall into the McNamara Fallacy), considerate planning around implementation, and a commitment to ongoing learning and governance. By taking a measured approach, organizations can harness the power of generative AI while mitigating the factors that create operational and institutional risk.
We’ll leave you with something cool: A Reddit user posted their own benchmarking of Sonnet 3.5 New against the old Sonnet 3.5 — using Minecraft.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.