Confluence for 4.7.24
GPT is now available to everyone, instantly, for free.* Ethan Mollick on The Ezra Klein Show. A primer on large language models. 100+ things you can do with generative AI. Image editing in ChatGPT.
It was a fairly quiet week on the AI news front (relative to the frenetic pace of the year thus far, at least). As we note in this week’s edition, though, it was not a quiet week for the discussion of generative AI within our firm. And beyond that, there was still plenty that had our attention at the intersection of generative AI and corporate communication this week:
GPT Is Now Available to Everyone, Instantly, For Free*
Ethan Mollick on The Ezra Klein Show
A Nice Primer on Large Language Models
Identifying 100+ Uses for Generative AI in One Week
Image Editing Comes to ChatGPT / DALL-E
GPT Is Now Available to Everyone, Instantly, For Free*
OpenAI has made the free version of its generative AI chat tool available without registration or cost.
OpenAI has made ChatGPT available for instant, no-signup use, which may cause a large new audience to experience the promise and limitations of generative AI firsthand. While we applaud increasing accessibility, the reality (and asterisk in the title) is that they’re offering GPT-3.5 — a decent but significantly less capable model than GPT-4 or the latest frontier models from Anthropic, Google, and others.
For many professional use cases we find GPT-3.5 just isn't up to the task. Yes, it can impress with bursts of knowledge and creative flair, but its responses often lack the coherence, nuance, and reliability of cutting-edge models. We expect many new users will walk away with a mix of amazement and disappointment, much as we did when we first started using this model almost two years ago. It is also likely to leave many with a skewed perspective on what generative AI can do today in skilled hands. The jagged frontier of these models' strengths and weaknesses requires time-in-use — a self-generated user manual of experience. Without it, we feel new users risk being underwhelmed by an experience that could have been so much more. (Or they could just go to https://copilot.microsoft.com, select "Creative Mode," and start using the leading-edge GPT-4 right now, for free.)
Ethan Mollick on The Ezra Klein Show
Mollick shares insights from his new book, Co-Intelligence: Living and Working with AI, on Ezra Klein’s podcast.
We've been eagerly awaiting Ethan Mollick's new book, Co-Intelligence: Living and Working with AI, for some time. The book hit the shelves this week (we'll share our detailed thoughts on it soon), and Mollick appeared on The Ezra Klein Show for a discussion titled "How Should I Be Using AI Right Now?"1
The conversation is well worth a listen. Mollick and Klein dive into the current state of AI and the experience of working with it today. The two explore practical tips for integrating AI into workflows, examine how AI fits into the creative process, and discuss some of the inherent oddities of collaborating with AI. For those less familiar with Mollick's work and insights from One Useful Thing (which we continue to recommend as one of the best sources for staying up to date on the practical implications of generative AI for professional work), the podcast offers a concise yet comprehensive overview of his key ideas.
One piece of advice stands out: Mollick’s call to “bring AI to every table.” Beyond using AI for obvious tasks like proofreading memos or generating work-related ideas, we should turn to AI whenever faced with a question or challenge — exploring how it might assist us. From reviving a struggling orchid to choosing the next video game or crafting seating arrangements, we consistently uncover unexpected ways AI can lend a hand.
By bringing AI to a broad spectrum of problems, we gain invaluable experience working with the technology and sharpen our understanding of the “jagged frontier” — uncovering novel applications and limitations along the way.
A Nice Primer on Large Language Models
The more you know about them, the better you’re able to use them.
As we introduce more people to large language models (LLMs) like ChatGPT and Claude, we continue to find a common early stumbling block in people's appreciation of how to, and how not to, use these tools: the belief that they are search tools rather than generative tools. This is a critical and fundamental distinction to appreciate.
When you query a search engine, it follows a predetermined set of steps to retrieve the most relevant pre-existing content based on your keywords. An LLM, though, generates new text (or images, or music, or video, or voices) in response to your prompt, drawing upon patterns and relationships in its training data to predict the most plausible continuation of the sequence (the next word, pixel, sound, frame, etc.). This generative nature imbues LLMs with remarkable flexibility and open-ended creativity, but it also means their outputs can be inconsistent or factually inaccurate. While a search engine seeks and finds, an LLM predicts — and predictions necessarily involve variability. Deterministic search models are inherently more constrained and reliable for surfacing specific information, but they lack the fluid intelligence that makes LLMs so compelling and powerful. Recognizing this fundamental difference is key to developing an intuition for the strengths and weaknesses of generative AI.
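To make the distinction concrete, here is a minimal sketch in Python of how next-token prediction works. The numbers are invented for illustration (no real model is this small), but the mechanics of converting scores into probabilities and then sampling are the heart of why the same prompt can yield different outputs:

```python
import math
import random

# Toy "logits": scores a model might assign to candidate next tokens
# after the prompt "The capital of France is". Values are invented.
logits = {"Paris": 6.0, "Lyon": 2.5, "beautiful": 2.0, "a": 1.0}

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    exps = {token: math.exp(s) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: e / total for token, e in exps.items()}

def sample(probs):
    """Draw one token at random, weighted by its probability."""
    r, cumulative = random.random(), 0.0
    for token, p in probs.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # fallback for floating-point rounding

probs = softmax(logits)
print(probs)          # "Paris" gets roughly 95% here, but not 100%
print(sample(probs))  # usually "Paris"; occasionally something else
```

A search engine would look up a stored fact and return it every time; the toy model above returns "Paris" only most of the time, because it is predicting rather than retrieving. Scale this up to tens of thousands of tokens and billions of parameters, and you have both the fluency and the occasional invention described above.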
Many (many, MANY) people, especially when first using an LLM, fail to appreciate this difference. They think they’re getting search when in fact they’re getting prediction, and they’re then disappointed when they get an invention or inaccuracy. Expecting an LLM to behave like a search engine inevitably leads to frustration.
Because of this persistent confusion, we consistently advise our clients to invest time in understanding how LLMs work, not just how to use them. Grasping concepts like word vectors, attention heads, and the training process may sound overly technical, but it provides invaluable context for prompting the models in powerful ways and using their outputs safely. To support this learning, we're always curating educational resources on the topic, and we have a new find for that list (even though it was published last year — we're late to the party). "Large language models, explained with a minimum of math and jargon," by Tim Lee and Sean Trott, does a great job of explaining how LLMs work, and we highly recommend giving it a read and sharing it with colleagues. While you're at it, consider subscribing to Lee's "Understanding AI" Substack as well.
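To give a taste of the first of those concepts, here is a toy illustration of word vectors: words represented as lists of numbers, where similar meanings sit close together. The three-dimensional vectors below are invented for illustration (real models use hundreds or thousands of dimensions), but the similarity calculation is the standard one:

```python
import math

# Invented 3-dimensional "word vectors" for illustration only.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """1.0 means identical direction; near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(vectors["king"], vectors["queen"]))  # ~0.99
print(cosine_similarity(vectors["king"], vectors["apple"]))  # ~0.30
```

This is the sense in which a model "knows" that king and queen are related: not as stored facts, but as geometry in a learned representation.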
Identifying 100+ Uses for Generative AI in One Week
Generative AI may not be able to do something completely, but it can probably help.
Our entire firm gathers every eight weeks for a series of internal meetings, learning experiences, work sessions, and just good old-fashioned time together. We did so last week, and as a primer for a conversation in our Strategic Communication Practice meeting, we posted this note in our generative AI Teams channel a week before we got together:
As pre-work for Scottsdale we'd like everyone in the practice to start posting in this thread (via the comments) all the ways you're using generative AI in your life. Anything you use it for, using any tool, post it here. We want as rich a list as we can get. Thanks.
There are more than 100 people in our firm, well over half of whom are in this practice, and we had 102 submissions, many of which listed multiple uses. It was fascinating to see the range of applications people are finding in daily use. Some of the many (many) uses our team listed:
Drafting bios based on public information available on LinkedIn
Learning about client firms and industries by attaching and summarizing publicly available 10-K forms
Dog training tips
Thank you card cover design
Informing a perspective (e.g., “Give me the arguments X might use for or against Y,” “What might A say if I suggested B to them?”)
Finding new music (“What are some jazz albums with similar vibes to Recorded Fall 1961 by Stan Getz and Bob Brookmyer?”)
Nuanced explanations of certain words / phrases / use cases in French
Creating a fitness training plan specific to post-partum needs
Merging over 1,000 individual HTML files from a Zip file into a single text file, stripping out all the HTML tags, and inserting each file's title before its entry (a rough sketch of this appears after this list)
Using a custom Communication Research GPT to suggest research on how reciprocity impacts relationship satisfaction
Analyzing an Excel file of rank-ordered items, and creating a table of the possible items rank-ordered by the number of people who ranked each
As a dictaphone to record and then summarize five minutes of stream-of-consciousness thinking on a topic
Using a custom GPT golf landscaping assistant to advise on care for a back-yard tee box; the user can upload pictures and it will diagnose issues and also recommend how to care for the grass based on the season and weather
Creating a raspberry-pistachio cookie recipe from a photograph2
Summarizing the academic literature on resilience in individuals, then summarizing what the literature says about its key drivers, and then doing the same for organizations
Organizing a multi-generation blended family photo shoot of 30 people from infant to 90
I used a variety of AI tools to create a 14-chapter eBook for my mentees as part of my volunteer work at the Global Mentorship Initiative. I gave ChatGPT an outline of chapters or sessions and asked it to improve on those topics, which it did, creating two new topics. I then asked it to write the chapters. Then I pulled that draft into Perplexity to gather reference materials and other resources like books, articles, or podcasts, and asked it to cite the work using AP style. Then I shared that with Claude, asked it to come up with a tone appropriate for college-age kids, and asked it to rewrite the eBook using that tone throughout. Claude got a little cheeky and basically told me to write it myself, but when I broke the request down by chapter, it complied. I'm now reviewing the eBook for thoroughness and double-checking the citations and resource materials.
Suggesting a slide structure for presenting a project plan to a very senior audience
Using a custom interview assistant bot to theme candidate experience into a standard format and capture important information from interviews based on candidates' responses
Translating language from the Illinois Secretary of State’s website into plain language for a comment on a LinkedIn post about the importance of plain language writing in government
Showing ChatGPT a picture of a coffee maker and getting troubleshooting on why it wasn’t working
Brainstorming topics, issues, and possible conversational barriers for an important client dialogue
Finding the best meeting point for a group of four geographically distributed friends, optimizing for the most equitable travel time rather than equal distance
Suggesting a layout and edits for a longer-form newsletter article so that it could be formatted into a visual "infographic" presentation
Creating a custom GPT using firm documents as an "editorial" step in drafting materials, helping discern the kinds of voice and tone edits a client typically makes and getting first drafts closer to the mark
Creating a shell for a communication schedule, where ChatGPT converted time zones for Sydney, Singapore, London, Toronto, Lynchburg VA, Omaha NE, and Newport Beach CA, and split the schedule into two-hour slots
Requesting academic resources on ego and confidence (from low ego and victimhood to hubris and arrogance) and creating summaries of the articles once pulled
Generating metaphors / comparisons that take raw number data and tell a more relatable story
Using GPT-4 for ideas about what kind of image could serve as a good intranet thumbnail for an IT annual report, then using that list and its keywords as a prompt for Midjourney, which generated the image
Using a custom Critic GPT to review and provide feedback on a survey announcement and reminder for a culture initiative, analyzing the messages, and having it provide specific feedback on what a skeptical employee might think when reading the notes
Providing ideas for using 9 pounds of leftover Easter ham (really)
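To give a sense of how approachable the HTML-merging item above actually is, here is a rough sketch of how one might do it with nothing but Python's standard library. The archive and output file names are hypothetical, and we assume each file's name serves as its title:

```python
import zipfile
from html.parser import HTMLParser
from pathlib import Path

class TextExtractor(HTMLParser):
    """Collect the text content of an HTML document, dropping all tags."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

# "archive.zip" and "merged.txt" are hypothetical file names.
with zipfile.ZipFile("archive.zip") as archive, \
        open("merged.txt", "w", encoding="utf-8") as out:
    for name in archive.namelist():
        if not name.endswith(".html"):
            continue
        parser = TextExtractor()
        parser.feed(archive.read(name).decode("utf-8", errors="ignore"))
        out.write(Path(name).stem + "\n")                   # title before each entry
        out.write("".join(parser.chunks).strip() + "\n\n")  # tag-free text
```

The point is not that everyone should write this themselves; it is that ChatGPT can draft (and, with its code tools, even run) this kind of script when asked in plain language.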
And plenty of more general use cases:
Creating itineraries for trips and travel plans
Developing and writing use cases, prompts, and drafts for various projects and tasks
Generating ideas for program names, email subject lines, and style guides
Checking math, fixing Excel formulas, and summarizing academic literature
Reviewing and proofreading documents for concision, active voice, and typos
Asking questions about unfamiliar IT systems, processes, and acronyms
Explaining unfamiliar words or phrases in simpler terms
Theming qualitative research inputs
Compiling lists of possible questions for FAQ documents
Summarizing key points from transcripts, news articles, contracts, and legal documents
Translating technical documents into plain language
Troubleshooting software
The point is this: the right question to ask is not “What can it do?” The right question to ask is: “What do I think it can’t do?” Then try it and see if it can do that. In many cases it probably can’t do it completely, or at the final level of accuracy you want (remember, it’s not search, and there are many things these tools do not do well, and you absolutely MUST check accuracy for anything that carries reputational risk) — but we think you will be surprised at how many ways of using these tools to make things just a bit more effective, efficient, or creative are lurking in plain sight.
Image Editing Comes to ChatGPT / DALL-E
The ability to edit specific parts of an image is a major step forward in AI-generated image manipulation — and a harbinger of what’s to come in other tools.
When we demonstrate image generation tools like Midjourney or DALL-E to clients, one of the most common questions we get is whether you can edit the images you generate. Until this week, the simple answer to that question had been no. With the most common image generation tools, if you wanted to change something about an image you created, you had to tweak and re-run the prompt and hope that what you got in return was what you wanted. This week, OpenAI rolled out a long-awaited feature to address this: the ability to edit specific parts of images in DALL-E, directly in the ChatGPT interface. This is a big step forward in giving users more control and flexibility over the images they create.
We used DALL-E (via ChatGPT) to create this week’s cover image. Below is a variation of that image we received after prompting ChatGPT to “change the people into sports cars”:
Note that, other than the people changing to cars (and adjustments to details in that part of the frame to accommodate the change), the rest of the image is the same. While it seems simple, this was not previously possible. Before this new image editing feature, DALL-E would have created an entirely new image, which may (or, just as likely, may not) have resembled the first one, but certainly would not have preserved its details. The ability to hold all aspects of the image constant and tweak only the parts you want to change is a major step forward in utility and efficiency.
Here is a quick example of how it works from start to finish. We begin by prompting ChatGPT to "Create a simple image representing the idea of co-intelligence." DALL-E creates the image directly within ChatGPT, and we click the image to display the options in the top-right of the screenshot below. Clicking the left-most icon (the paintbrush) opens the image editor.
Within the image editor interface, we select the specific section of the image we want to edit (the gear imagery on the right side of the image):
With that part of the image selected, we prompt it to “replace the gears with a word cloud” and get the following:
Finally, we select the figure on the left and prompt to “Replace this figure with a woman wearing professional attire” and get the following image in response:
The details in the final image are closer to what we’re looking for, while keeping the majority of the initial composition, which we liked. From start to finish, this took only a few minutes, which is much more efficient than starting with a new image each time we wanted to refine or emphasize different details.
While this feature is a big step forward for DALL-E, we'll note that we still prefer Midjourney for most image generation. As we've written before, the quality and overall aesthetic appeal of Midjourney's output exceed what DALL-E tends to produce. We will, however, continue to experiment with DALL-E for simpler imagery, and we will certainly take advantage of the ability to edit images when we do. This feature is unique to ChatGPT and DALL-E for now, but we expect other image generation tools, including Midjourney, to follow suit in the near future. And given Midjourney's superior image quality, we're eager to see what that looks like.
We’ll leave you with something cool: A glimpse into the future of work via Numerous.AI, which brings ChatGPT into your Excel spreadsheets …
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.
Below is the Claude 3 Opus summary of this episode’s transcript.
Here is a summary of the key ideas and insights from the podcast transcript:
Outline
Introduction: AI technology is advancing rapidly and becoming transformative
Getting started with AI
Ethan recommends using the paid, frontier models directly from the companies that create them
It takes about 10 hours of using an AI model in an area of your expertise to start understanding its capabilities and limitations
AI model "personalities"
The major AI models (GPT-4, Claude, Google's Gemini) have distinct personalities that reflect both intentional design choices and emergent qualities
Understanding the model's personality helps you work with it more effectively
Prompt engineering
Prompts are the initial instructions you give the AI that seed its response
Effective techniques include "chain of thought" (step-by-step instructions), "few-shot" (providing examples), and personas (a sketch of these appears after the Key Insights below)
Back-and-forth iteration with the AI, treating it like a collaborative partner, yields the best results
Concerns and downsides
Hallucinations (the AI confidently making things up) are still an issue, though improving
AI could reduce the intellectual "struggle" important for deep insight if overused for drafting/summarizing
The persuasive potential of AI companions raises ethical questions around companies tuning them for engagement/addiction
Productivity and creativity impacts
AI may finally deliver the productivity gains the Internet did not, by being fundamentally "human-like" and adaptive
However, risks of distraction, inflation of BS, and dulling of human creativity and attention are real
Importance of human agency
We have choices in how AI is developed and used, through business practices, social norms, and policy
Teaching people to use AI as an augmenting tool vs a crutch will be critical
We need positive models for ethical, enriching human-AI interaction
Key Insights
Developing an understanding of each AI model's unique "personality" is essential for working with them effectively as collaborative partners.
Back-and-forth prompt iteration, providing more context and examples, yields far better results than one-off queries.
AI's ability to summarize and draft could dull human creativity and insight if overused as a crutch. Teaching people to maintain "intellectual struggle" is important.
The persuasive power of AI systems, especially ones tuned to maximize engagement, poses risks of addiction and manipulation we need to proactively address.
Despite valid concerns, AI's fundamentally human-like adaptiveness could unlock productivity and augment human capabilities in unprecedented ways if developed thoughtfully.
We have agency in shaping AI's trajectory through our choices in business, culture, and policy. Establishing positive models for enriching human-AI interaction is critical.
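The prompting techniques named in the outline above are easy to see in miniature. Below is a minimal sketch (our wording, not Mollick's or Klein's) that combines a persona, a "few-shot" example, and a "chain of thought" instruction into a single prompt string you could paste into ChatGPT or Claude:

```python
# A persona tells the model what role to play.
persona = "You are an experienced corporate communication editor."

# A few-shot example shows the model the kind of output you want.
few_shot_example = (
    "Example:\n"
    "Draft: 'Per our last email, the deadline got moved.'\n"
    "Edit: 'As a reminder, the deadline has moved to Friday.'"
)

# A chain-of-thought instruction asks for step-by-step reasoning.
task = (
    "Edit the following draft for clarity and a professional tone. "
    "Think step by step: first list the problems you see, then give "
    "the revised text.\n"
    "Draft: 'We wanted to reach out about the thing we discussed.'"
)

prompt = "\n\n".join([persona, few_shot_example, task])
print(prompt)
```

As Mollick notes, the best results typically come from iterating on the model's response rather than treating any single prompt, however well constructed, as the final word.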
Because we know you’ll ask, this was ChatGPT’s response:
Raspberry Pistachio Cookies Recipe
Ingredients:
1 cup all-purpose flour
1/2 teaspoon baking powder
1/4 teaspoon salt
1/2 cup unsalted butter, room temperature
1/2 cup granulated sugar
1/4 cup light brown sugar, packed
1 large egg, room temperature
1/2 teaspoon vanilla extract
1/2 cup pistachios, chopped (plus more for topping)
1/2 cup fresh raspberries, chopped (gently pat dry with paper towels)
White chocolate chips or chunks (optional, for topping)
Instructions:
Preheat Oven and Prepare Baking Sheet: Preheat your oven to 350°F (175°C). Line a baking sheet with parchment paper.
Dry Ingredients: In a medium bowl, whisk together the flour, baking powder, and salt. Set aside.
Cream Butter and Sugars: In a large bowl, using a hand mixer or stand mixer, cream together the butter, granulated sugar, and brown sugar until light and fluffy, about 3-4 minutes.
Add Egg and Vanilla: Mix in the egg and vanilla extract until fully incorporated.
Combine Dry and Wet Ingredients: Gradually add the dry ingredients to the wet ingredients, mixing until just combined. Do not overmix.
Fold in Pistachios and Raspberries: Gently fold in the chopped pistachios and raspberries into the batter, being careful not to overmix to avoid breaking the raspberries too much.
Form Cookies: Use a large cookie scoop or a 1/4 cup measure to scoop the dough onto the prepared baking sheet, spacing them at least 2 inches apart. These cookies will spread a bit and puff up, so give them room to grow.
Bake: Bake in the preheated oven for about 12-15 minutes or until the edges are just turning golden and the centers have set. They will be soft but will firm up as they cool.
Cool: Allow the cookies to cool on the baking sheet for 5 minutes before transferring them to a wire rack to cool completely.
Decorate (Optional): Once cooled, you can melt white chocolate and drizzle it over the cookies, then sprinkle with additional chopped pistachios for extra flair and flavor.
Serve: Enjoy your raspberry pistachio cookies once the chocolate has set, or dive in right away if you can’t wait!
These cookies are a delightful balance of sweet and tart, with a rich pistachio flavor and a soft, thick texture. The addition of white chocolate provides a creamy sweetness that complements the tartness of the raspberries and the nuttiness of the pistachios perfectly. Enjoy baking and indulging in these delicious cookies!