Confluence for 12.15.24
Google’s quiet revolution. Google launches "Deep Research." Managing "feature overload." Mollick’s timely reminders.

Welcome to Confluence. Here’s what has our attention this week at the intersection of generative AI and corporate communication:
Google’s Quiet Revolution
Google Launches “Deep Research”
Managing “Feature Overload”
Mollick’s Timely AI Reminders
Google’s Quiet Revolution
Why Gemini 2.0 Matters More Than You Think.
We don’t write much about Google’s Gemini in this space. That’s not because it isn’t capable — it very much is — but because most of our work and that of our clients centers on GPT-4 and Claude. Yet Google’s announcement of Gemini 2.0 this week demands our attention, not just for what it delivers today, but for what it signals about the future of generative AI and the competitive landscape that will shape that future.
First, some context. When Google initially released Gemini 1.0 last December, followed by 1.5 in February, the response was measured. Yes, it demonstrated impressive capabilities, particularly in handling multiple types of input — text, images, audio, and code — and its massive context window of one million tokens set a new standard. But for many, including us, it felt like Google playing catch-up rather than leading the field.
Gemini 2.0 changes that narrative.
The new release brings three significant advances worth our attention. The first is raw capability: Google claims 2.0 Flash (their base model) outperforms 1.5 Pro (their previous top-tier model) on key benchmarks while running at twice the speed. The second, and more interesting, is that it does this while expanding its capabilities: 2.0 can now generate images and create synthesized speech in multiple languages, abilities that put it in direct competition with OpenAI’s GPT-4o in multimodal generation. And in both cases, Google is bringing something new to the table. For image generation, you’ll be able to describe and edit images just by typing prompts. Watch this:
And for voice, you’ll be able to generate narration that says not only what you want the model to say, but how you want it to say it — again just using simple prompts. Watch:
These aren’t just technical achievements — they’re practical tools that provide significant, if mundane, utility.
The third and most significant advance is Google’s clear commitment to the development of agents (what they’re calling “the agentic era”). We’ve written more than once about AI agents — software that can understand your intent, plan multiple steps ahead, and take action on your behalf. Google is making a big bet here, introducing several agent-focused projects alongside Gemini 2.0. Project Mariner, for instance, is an experimental Chrome extension that can understand and interact with web content. Project Astra, which we’ve mentioned before, is being updated with improved dialogue capabilities and tool integration. It allows a device with a camera (like your phone) to see what you see, and to talk about it with you in real time. Watch this:
The implications for corporate communication and leadership are significant. As these models become more capable and more agentic, they will increasingly serve as interfaces between organizations and their stakeholders. Imagine an AI agent that can understand your company’s policies, procedures, and cultural context, using that understanding to help employees navigate complex organizational systems or help customers interact with your services. Or another that brokers introductions and relationships between subject matter experts across the organization based on their email traffic, Microsoft Office work, or SharePoint search queries. Or even one that just knows how to book all your travel for you. The potential is enormous.
So, first, pay attention to Google. While OpenAI and Anthropic may get more use in many corporations, Google is pushing the field forward in meaningful ways, particularly in multimodal capabilities and practical applications. Their work sets standards that other frontier labs will need to match or exceed.
Second, start thinking seriously about AI agents. The technology is maturing faster than many anticipated, and organizations need to begin considering how they might leverage (or compete with) these capabilities. This isn’t just about automation — it’s about augmenting and enhancing human capabilities in ways that could reshape how we work and communicate.
Finally, keep experimenting. Each of the frontier models — GPT-4, Claude, and now Gemini 2.0 — has distinct strengths and capabilities. Work with all three. Give them the same tasks, and use them for different things that play to their different strengths. Understanding these differences, and how they might serve different needs within your organization, will be increasingly important as the technology continues to evolve.
The release of Gemini 2.0 marks another significant step forward in the rapid evolution of generative AI. While it may not immediately change how most of us work with these tools day-to-day, it signals important shifts in the competitive landscape and the capabilities we can expect to become standard in the months ahead. The future of AI isn’t just about better language models — it’s about more capable, more practical, and more responsible AI agents that can truly understand and act on our behalf.
This may all look like the future. But Google’s latest release suggests that future may be just months away.
Google Launches “Deep Research”
A Different Kind of AI Assistant.
We’ll keep going with Google. This week they launched Deep Research, a “personal AI research assistant” built into Gemini Advanced. We’ve been testing it, and while it’s early days and some of our results have been uneven, we see immediate applications for our own work and for leaders and corporate communication professionals.
Deep Research works differently from the other AI tools we’re using every day. Instead of crafting an immediate response, it presents you with a research plan for approval. You may edit this if you like, and then it goes to work, spending several minutes searching and analyzing information from across the web, creating new searches based on what it learns. Think of it as a research assistant that works methodically rather than a chatbot that responds instantly. We asked it to research the implications of the ironies of automation for corporate professionals, and it gave us this plan to do so:
It then spent several minutes searching and reading 60 (60!) websites, creating a four-page research summary backed by over two pages of citations. You may see that paper here:
In our testing, we find Deep Research promising if uneven. That report, if you care to read it, is a somewhat clunky combination of automation research, generative AI trends, and generative AI implications. We think that with a better research question we’d get a better answer. But having a strong research assistant at your disposal, 24/7? We’re all in on that. Ask it to analyze emerging trends in a specific industry, and it often comes back with a well-organized brief that would have taken hours to compile manually. (And, if you like, give that paper to NotebookLM and have it create a podcast about the findings for your morning walk.) The citations are clear, making fact-checking straightforward — which is essential, as we’ve found that, like all large language models, it can make factual errors that could carry reputational risk if not caught.
This isn’t entirely new territory. Perplexity offers a similar capability, albeit with less depth of research. But Deep Research’s methodical approach — planning, searching, synthesizing, and then searching again based on what it learns — feels different. For many tasks where you want to get smarter, faster, with authoritative citations to back things up, this kind of capability is invaluable. For us, Deep Research has immediately joined our growing AI toolkit. But we will remember to verify any facts that matter — this is still an AI tool, after all, and accuracy isn’t guaranteed — and you should, too.
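For readers who like to look under the hood, that “plan, search, synthesize, then search again” loop is easy to picture in miniature. The sketch below is not Google’s implementation; it is a toy illustration in Python, and every helper in it (draft_plan, run_search, follow_up_queries, synthesize) is a made-up stand-in for the model calls and web searches the real product performs.

# Illustrative only: a miniature "plan, search, synthesize, search again" loop.
# None of these helpers exist in any real product; they stand in for model calls and web searches.

def draft_plan(question):
    # A real assistant would ask a language model to break the question into research steps.
    return [f"Define key terms in: {question}",
            f"Find recent sources on: {question}",
            f"Identify open questions about: {question}"]

def run_search(step):
    # Stand-in for a web search; returns placeholder "sources" with titles and URLs.
    return [{"title": f"Source found for '{step}'", "url": "https://example.com"}]

def follow_up_queries(findings):
    # A real system would read what it found and propose new searches to fill the gaps.
    return ["one more angle worth checking"] if len(findings) < 5 else []

def synthesize(question, findings):
    # Produce a short summary backed by a citation list.
    citations = "\n".join(f"- {f['title']} ({f['url']})" for f in findings)
    return f"Summary of research on: {question}\n\nCitations:\n{citations}"

question = "Implications of the ironies of automation for corporate professionals"
plan = draft_plan(question)                    # 1. Draft a plan (the user could edit it here)
findings = []
for step in plan:                              # 2. Work through the plan
    findings.extend(run_search(step))
for step in follow_up_queries(findings):       # 3. Search again based on what was learned
    findings.extend(run_search(step))
print(synthesize(question, findings))          # 4. Deliver a cited summary

Again, this is only a sketch under our own assumptions, not how Deep Research is actually built; it simply makes concrete why a planned, iterative search feels different from a single chatbot reply.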
Managing “Feature Overload”
Don’t let the flurry of end-of-year announcements and feature launches paralyze you.
We write at length about Google above, but Google isn’t the only company pushing major generative AI updates as we wrap up 2024. OpenAI is now seven days into its “12 Days of OpenAI,” with each day featuring a new announcement. Thus far, those announcements have included the full o1 model and ChatGPT Pro (which we wrote about last week), public availability of its video generation model Sora, advanced voice mode with video, updates to ChatGPT’s integration with Apple Intelligence, and a lot more. If it feels like a lot to keep up with, it’s because it is. As readers know, we follow these developments closely, and even we’re having a hard time keeping up with all of these features, how to use them, and what they mean.
We’re not the only ones. In response to OpenAI’s announcement of new code-writing and code-execution capabilities in ChatGPT’s Canvas feature, Simon Willison — a well-respected programmer, entrepreneur, and blogger — wrote the following:
Do you find this all hopelessly confusing? I don’t blame you. I’m a professional web developer and a Python engineer of 20+ years and I can just about understand and internalize the above set of rules.
I don’t really have any suggestions for where we go from here. This stuff is hard to use. The more features and capabilities we pile onto these systems the harder it becomes to obtain true mastery of them and really understand what they can do and how best to put them into practice.
Maybe this doesn’t matter? I don’t know anyone with true mastery of Excel — to the point where they could compete in last week’s Microsoft Excel World Championship — and yet plenty of people derive enormous value from Excel despite only scratching the surface of what it can do.
I do think it’s worth remembering this as a general theme though. Chatbots may sound easy to use, but they really aren’t — and they’re getting harder to use all the time.
A large part of the “magic” of tools like ChatGPT early on was their ease of use. Anyone with a decent command of everyday language could use them to do powerful things. You didn’t need to “speak the machines’ language” — they spoke our language. With the addition of so many new features, it can feel like we’re drifting away from that — what was a very low barrier to using these tools is getting higher.
While we agree with Willison’s assessment of the situation (these features are getting more confusing and more complex), we don’t necessarily agree with his conclusion that the current generation of chatbots (ChatGPT, Claude, Gemini, etc.) is “not easy to use” or that “they’re getting harder to use all the time.” Parts of these tools are indeed hard to use and confusing, including the two different ways to write and execute code in Python using ChatGPT, which is what prompted Willison’s post.
Most of these confusing features exist for complex and highly specific use cases, however, and the good news is that no one has to use them. This means that most users, for most use cases, can simply ignore them. The original “magic” is still there and is in no way diminished by the addition of new features, at least in our experience.
We often use the analogy of a car. You can think of ChatGPT (or Claude, or Gemini) as a car, with the underlying large language model as its engine. The engines are getting better all the time, and the cars are getting more powerful. The features — like ChatGPT’s Canvas, Projects, and others announced recently — are the car’s “bells and whistles.” They can be helpful, but they’re generally not necessary to get you from point A to point B. For most individuals, the real value is in just putting in the time to learn to drive the car. Once you reach a certain level of comfort and capability, then it may make sense to spend more time learning about the features. But don’t let a feeling of needing to understand and master all of the features stop you from getting in the car and learning to drive.
Mollick’s Timely AI Reminders
Guidelines worth keeping in mind.
As we’ve noted above, it can be daunting to keep up with the flurry of developments over the past few weeks. And as important as it is to “get in the car” and learn how to drive, it also helps to know where you’re going and to have directions in hand for how to get there.
Ethan Mollick recently put out a piece that does just that. In “15 Times to use AI, and 5 Not to,” he offers clear guidelines for using this technology effectively. The piece provides valuable context, especially given all the noise of the past few weeks. While we encourage reading the full article to understand his reasoning, we’re sharing his key points here for reference.
Mollick writes that we should use generative AI for …
1. Work that requires quantity.
2. Work where you are an expert and can assess quickly whether AI is good or bad.
3. Work that involves summarizing large amounts of information, but where the downside of errors is low, and you are not expected to have detailed knowledge of the underlying information.
4. Work that is mere translation between frames or perspectives.
5. Work that will keep you moving forward.
6. Work where you know that AI is better than the Best Available Human that you can access, and where the failure modes of AI will not result in worse outcomes if it gets something wrong.
7. Work that contains some elements that you can understand but need help on the context or details.
8. Work where you need variance, and where you will select the best answer as an editor or curator.
9. Work that research shows that AI is almost certainly helpful in — many kinds of coding, for example.
10. Work where you need a first pass view at what a hostile, friendly, or naive recipient might think.
11. Work that is entrepreneurial, where you are expected to stretch your expertise widely over many different disciplines, and where the alternative to a good-enough partner is to not be able to act at all.
12. Work where you need a specific perspective, and where a simulated first pass from that perspective can be helpful, like reactions from fictional personas.
13. Work that is mere ritual, long severed from its purpose (like certain standardized reports that no one reads).
14. Work where you want a second opinion.
15. Work that AIs can do better than humans.
Using generative AI for any one of the above use cases is likely to yield good, usable results that allow us to spend less time on tactical work and more time on the deep, strategic work that’s uniquely human and adds value that AI can’t (yet). Mollick continues, writing that we should avoid using generative AI …
1. When you need to learn and synthesize new ideas or information.
2. When very high accuracy is required.
3. When you do not understand the failure modes of AI.
4. When the effort is the point.
5. When AI is bad.
The fourth point against AI use stands out as particularly significant — one that Matt Beane makes well in his book The Skill Code (which we recommended back in August). For example, in our own work we see the act of writing as something that builds our writing capacity and is essential to producing the best possible output. The effort we put into crafting phrases, sentences, and paragraphs helps us maintain and develop our writing skills each time we put ideas into prose. And writing is only one example — there are plenty of things we all do where the process of working through challenges develops our expertise and our skills, and ultimately leads to better outputs.
There’s no question that we’ll continue to see developments that reshape how we work — and soon. Having these reminders at hand will prove useful in making informed choices about when to use AI tools and when to rely on human expertise. As we move forward, they offer a practical framework for evaluating new generative AI capabilities while staying focused on meaningful work.
We’ll leave you with something cool: OpenAI added “Santa Mode” to ChatGPT’s voice chat for the holiday season.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.