Confluence for 12.10.23
A guide to which AI to use. Two updates on Microsoft's Copilot(s). Google announces Gemini. GenAI interfaces beyond chatbots. The paradox of AI disclosure in journalism.

Hello and welcome to our second December issue of Confluence. Before we get to the primary items for this week, Meta has joined Midjourney, OpenAI, Adobe, and others in providing an AI image generator. You can try the Meta tool here, and our image up top this week is an example of its output. We’ve played with it just a little, and our initial reaction is that it’s very “DALL-E”-like, though faster. It replicates photography quite well, although it appears worse than DALL-E at rendering numbers and text. It also adds a watermark. At least for now, Midjourney remains our preferred tool for creating truly impressive AI-generated images. With that out of the way, here’s what has our attention at the intersection of AI and communication this week:
An Opinionated Guide to Which AI to Use
Two Updates on Microsoft’s Copilot(s)
Google Announces Gemini
GenAI Interfaces Beyond Chatbots
The Paradox of AI Disclosure in Journalism
An Opinionated Guide to Which AI to Use
Ethan Mollick offers a summary that’s worth sharing with anyone whom you’re helping to understand and use generative AI.
Sometimes we worry that we reference Wharton’s Ethan Mollick too often, and that readers may tire of our constantly pointing to his work. That said, he continues to be one of the best-informed and most thoughtful sources on generative AI, and we expect to keep reading and referring to him often. This includes his recent “An Opinionated Guide to Which AI to Use.” If you’re trying to understand and navigate the multiple large language models on the market, it’s a must-read. If you’re trying to help others do the same, it’s a must-share.
The shortest version is this: use OpenAI’s GPT-4, ideally through ChatGPT Plus. Mollick underscores the limitations of free large language models like GPT-3.5, which, while popular, don’t provide the quality or advanced experiences that GPT-4 offers. As the ChatGPT Plus subscriptions that offer GPT-4 are not currently available (though you can and should get on the waiting list), Mollick suggests Microsoft Bing Chat (which you can access through Bing or as a standalone application at copilot.microsoft.com) as the most accessible free option, particularly in its Precise or Creative settings, which use GPT-4. And as we note below, more functionality is on the way for Bing Chat / Copilot that will bring it closer to OpenAI’s version of GPT-4.
The article also notes alternatives like Perplexity and Poe for specialized needs, briefly touches on other models like Claude 2.1 (which we tend to use quite a bit when working with text summarization) and Pi for specific contexts, and offers an overview of Google’s AI efforts, particularly Bard, powered by the new Gemini Pro model (which we also note below). Mollick concludes with a notion we remind clients of every day: today’s models are just the beginning, with more advanced AIs on the horizon that will redefine how we interact with information and knowledge. These current models and applications are the most basic we will ever use.
Read the full post here:
Two Updates on Microsoft’s Copilot(s)
Microsoft releases its early findings on productivity with 365 Copilot, and announces more advanced features coming to Copilot (formerly Bing Chat).
Microsoft released the first report from its research on AI and productivity, with a focus on its 365 Copilot tools.¹ This initial report sought to address a question that’s been top of mind for us for some time:
What impact does Copilot have on productivity for common enterprise information worker tasks for which LLMs have been hypothesized to provide significant value?
This report focuses on tasks expected to be well within the “jagged frontier,” and its findings are striking. The data revealed a significant reduction in task completion time for 365 Copilot users, with tasks taking between 26% and 73% less time than for those without access to the tool. Importantly, this efficiency didn’t come at the cost of quality, with the performance of Copilot users statistically comparable to that of non-users.
Just as notable is the perceived value of these AI tools among users. Seventy percent of participants perceived an increase in productivity after using Copilot, and reported willingness to pay for 365 Copilot was 35% greater than among those who had only heard a description of the tool.
Tools like 365 Copilot can create meaningful productivity gains for all knowledge workers, and once workers begin using these tools, their perceived value grows. Despite these gains and the potential of 365 Copilot, however, it’s worth noting that its use is only just beginning. As of this writing, 365 Copilot is only available to enterprises that purchase at least 300 licenses.
While access to 365 Copilot remains relatively limited (at least for the near term), Microsoft also announced some major upgrades coming to its publicly available Copilot web application. With the introduction of GPT-4 Turbo, Data Analytics, and other leading features into Copilot, more people than ever will have access to powerful generative AI tools for free. And as more people gain real experience using them, we can expect two things. First, common tasks will take most people significantly less time to complete. Second, as people see firsthand how AI can benefit them, the perceived value of these tools will increase.
Google Announces Gemini
The best version of Google’s frontier model is yet to come.
Google’s recent announcement of its Gemini AI model marks a significant advancement in the realm of large language models (LLMs). The current version, Gemini Pro, which powers Bard, is akin to a GPT-3.5-level model, offering more capability than PaLM 2 but not quite reaching the sophistication of GPT-4. On that basis alone, Gemini wouldn’t merit much attention, but there is more to the story.
Gemini’s standout feature is its direct integration with Google’s suite of services, including Google Search, Gmail, Google Maps, Drive, and others. This integration showcases the potential of a truly connected AI. We have already experimented with use cases like creating itineraries with embedded Google Maps and summarizing events from email content. While the results have been variable, the glimpses of quality and accuracy demonstrate the transformative potential of Bard when powered by a more advanced model. The integration with Google’s services could turn Bard into a substantial asset, unlocking new capabilities and giving users a clearer sense of the long-term potential of AI.
Gemini isn’t yet complete. Early next year, Google plans to release Gemini Ultra, a model that aims to rival or even surpass GPT-4. The benchmarks Google shared suggest it has every right to be confident that Gemini Ultra will mark a significant advancement in AI and will put the company right at the frontier with OpenAI. We will keep you informed with updates and insights as we learn more about all that Gemini brings to the table.
GenAI Interfaces Beyond Chatbots
Some new applications offer a glimpse of what might be coming.
As we’ve noted, the current generation of generative AI technology is the worst we will ever use. This is true at both the model level (e.g., GPT-4, Gemini) and the application level (e.g., ChatGPT, Bard, Copilot). For the foreseeable future, we can expect continued development both in the models powering genAI applications and in the applications themselves.
If some new product announcements are any indication, one area to watch for future application innovation is user interfaces beyond chatbots. Microsoft appears to be building something called Notebook, which “removes dialogue so you can focus on prompt creation, refinement, and iterating on output as it remembers the previous version,” according to Microsoft executive Jordi Ribas. In the image generation space, Visual Electric has entered the market with an application it claims will “liberate AI art generation from chat interfaces” through an interface explicitly designed to optimize the creative process. On the question of model versus application development, Visual Electric’s launch announcement notes that its executives “believe that image generation AI models are in the process of being commoditized, and that it is the front-end user interface that will most differentiate companies and separate the successes from the failures.”
As with all of the innovations to come, time will tell what sticks and what doesn’t. The chatbot interfaces that dominate today are effective, and we’re only just scratching the surface of their potential. That said, it is very likely that the future of genAI is not just a future of chatbots. These two products and others like them may offer a glimpse of what that could look like.
The Paradox of AI Disclosure in Journalism
A new working paper from Oxford suggests disclosure and trust may not be as simple as we might think.
We’ve been saying since the beginning that an area of second-order consequences for generative AI and communication involves trust, transparency, and disclosure. Last week we noted the use of undisclosed AI-generated journalism (and journalists, in fact) at Sports Illustrated, and this week we point to a working paper out of the University of Oxford on the use of artificial intelligence in journalism. The paper² (not yet peer-reviewed), “Or they could just not use it?”: The Paradox of AI Disclosure for Audience Trust in News by Benjamin Toff and Felix Simon, raises critical questions about trust, transparency, and the impact of AI disclosure on audience perceptions. Its findings, which focus on audience reactions to AI-generated news, suggest possibly subtle dynamics with implications for professionals in corporate communication, including PR, media relations, executive communications, internal communications, and marketing.
The study’s key finding is a paradox: audiences, on average, perceive news labeled as AI-generated as less trustworthy, even though they do not judge the content to be any less accurate or any more unfair. This perception is more pronounced among those with higher initial trust in news and those with greater journalism knowledge. While accompanying the disclosure with a list of sources can mitigate some of these negative trust perceptions, the act of disclosing content as AI-generated seems, on its own, to reduce trust.
For corporate communication professionals, these findings, should they hold, have several potential implications. Press releases or media content, executive communication, internal communication, marketing messages: all could be seen as less credible should the use of AI in their creation be disclosed. That said, the Sports Illustrated example suggests that learning AI was used in creating content without disclosure may be even more threatening to trust and credibility. Our particular view is that things may be a bit more nuanced: disclosing that content was entirely created by AI is different, we suspect, than disclosing that AI was involved in the process of creating that content. The first draft of the post you’re reading, for example, was created by GPT-4. It was drafted using a style guide that mimics our voice, and it underwent significant editing and revision, including this sentence (and most of this paragraph, and the one that follows, in fact), which was written by human hand. Does that make this post any less trustworthy? You can tell us, but we don’t think so … and we disclose our use of AI at the bottom of every issue.
All that said, there’s clearly a challenge in using large language models for content generation. Disclose the use, and it may affect perceptions of authenticity, sincerity, trustworthiness, and more. Don’t disclose the use, and even greater reputational harm could result should the truth come out. The challenge will lie in leveraging AI’s benefits while navigating the complexities of audience trust and perception. As AI becomes more embedded in how we work, understanding and addressing these nuances will become a matter of skill and principle independent of any skill or principle implicit in using the tools themselves. In the short term, we will continue to live by our rule of thumb on the topic: If it’s going to be awkward should people find out, disclose the use.
Now we’ll leave you with something cool: an AI-powered resume builder.
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.

¹ We recognize that Microsoft’s Copilot nomenclature can be confusing. Microsoft Copilot for Microsoft 365, which we are calling “365 Copilot” for short, is the integration of generative AI (specifically, GPT-4) into Microsoft 365 applications like Outlook, PowerPoint, Word, Excel, and more. The more generally named Microsoft Copilot is a standalone web application with an interface similar to ChatGPT.
² We put this paper through our “Digest GPT,” which digests articles and other content. If you’d like to use this GPT yourself, it’s free to use and share here. Here’s the summary:
Factual Summary
The preprint titled “‘Or they could just not use it?’: The Paradox of AI Disclosure for Audience Trust in News” by Benjamin Toff and Felix M. Simon, not peer-reviewed as of December 1st, 2023, explores the impact of disclosing AI involvement in news production on audience trust. The study focuses on the United States, where news trust is polarized along partisan lines.
Main Points:
AI in Journalism: The use of generative AI in news has grown, with tools like large language models (LLMs) reshaping news dynamics. However, this raises concerns about journalistic authority, autonomy, misinformation, and audience trust.
Experiment Design: The authors conducted a survey-experiment with U.S.-based participants, presenting them with actual AI-generated journalistic content. The content was labeled as AI-generated in some instances. Three news stories on varying topics were used, each with different political contentiousness.
Findings:
Trustworthiness: On average, news labeled as AI-generated was perceived as less trustworthy, even if the content wasn’t seen as less accurate or more unfair.
Impact on Different Audiences: The negative trustworthiness impact was mainly among those with higher pre-existing trust in news and higher journalism knowledge.
Source Disclosure: Negative effects on perceived trustworthiness were counteracted when articles disclosed their sources.
Implications: These findings have implications for how news organizations use AI and disclose its involvement, impacting audience trust in journalism.
Audience Attitudes: The study included a post-experiment survey capturing audience attitudes towards AI in journalism and disclosure preferences. Most participants favored disclosure when AI is used in news creation.
Sample Characteristics: The sample was less racially diverse, more educated, and more politically engaged than the general public, with a substantial portion having heard about AI in news production.
Limitations and Future Research: The study acknowledges limitations like its focus on the U.S. and the nature of its sample. It suggests further research on audience reactions to AI disclosure in different contexts and news organizations.
Critique
The study conducted by Toff and Simon provides valuable insights into the complex relationship between AI-generated news and audience trust, yet it has several limitations and areas for potential improvement:
Experiment Design and Sample Bias: The survey-experiment, while methodologically sound in its approach to testing audience reactions to AI disclosure, is limited by its sample. The participants are more educated, politically engaged, and less racially diverse than the general U.S. population, potentially skewing results. This selection bias could influence the generalizability of the findings, as attitudes toward AI and news trust might differ significantly in a more representative sample.
AI-generated Content: The study uses actual AI-generated content, but it doesn't specify the sophistication or type of AI models used. The quality and nature of AI-generated journalism can vary greatly, which might have influenced audience perceptions. A clearer description of the AI's capabilities and limitations would provide a more nuanced understanding of the results.
Political Contentiousness: While the study varies the political contentiousness of the news stories, it doesn't delve deeply into how different political ideologies might interpret AI involvement in news production. Given the polarized nature of the U.S. media landscape, deeper analysis in this area could reveal more about how political biases influence trust in AI-generated content.
Source Disclosure Impact: The finding that source disclosure counteracts negative trust perceptions is intriguing but needs further exploration. Understanding why and how source disclosure mitigates distrust in AI-generated news could inform best practices for news organizations.
Future Research Directions: The authors rightly point out the need for further research in diverse contexts and among different news organizations. Expanding this research globally and across different types of news media could provide a more comprehensive understanding of the global implications of AI in journalism.
Broader Implications for Journalism: While the study focuses on audience trust, it touches less on the broader implications for journalistic practice, ethics, and the potential for AI to reshape news production. A more in-depth discussion in this area could be beneficial.
Methodology Transparency: For academic rigor, providing more detailed methodology, such as the exact survey questions and statistical methods used, would enhance the study's transparency and reproducibility.
Interpretations and Inferences
The findings and context of the study by Toff and Simon allow for several inferences and broader interpretations regarding the intersection of AI, journalism, and public trust:
The Paradox of AI in Journalism: The study highlights a paradox where the use of AI in journalism, aimed at enhancing content creation, may inadvertently erode trust among certain audience segments. This paradox suggests a complex balancing act for news organizations, wherein the efficiency and capabilities of AI must be weighed against potential trust implications.
Implications for News Consumption Patterns: The differential impact on audiences with higher pre-existing trust in news and greater journalism knowledge implies that AI's role in news production might contribute to changing news consumption patterns. Audiences more critical of media might further distance themselves from AI-integrated news sources, potentially exacerbating media polarization.
Ethical and Practical Considerations for Newsrooms: The study’s findings necessitate a reevaluation of ethical and practical considerations in newsrooms regarding AI usage. Transparency and disclosure practices become crucial in maintaining audience trust. News organizations might need to develop clear guidelines and ethical standards for AI use in news production.
Educational Aspect: The negative trust impact mainly among those with higher journalism knowledge points to a potential educational gap. There is an opportunity for news organizations and educators to inform the public about AI’s role and limitations in journalism, potentially mitigating distrust.
Future of AI in Journalism: The preference for AI disclosure among most participants signals an emerging normative expectation in journalism. As AI becomes more integral to news production, its acceptance by the public may hinge on how transparently its role is communicated.
Broader Societal Implications: The study indirectly touches on broader societal implications of AI integration in professions traditionally reliant on human judgment and expertise. It raises questions about how AI integration in various fields might be perceived and the importance of managing these perceptions.
Global Perspective: While this study is U.S.-centric, its implications have a global resonance. Different cultural contexts might yield varying reactions to AI in journalism, pointing to the need for culturally sensitive approaches to AI integration and disclosure practices.

