How To: JP Morgan Chase Training New Hires in Prompt Engineering (But You Can Use Our Simple Rules Instead)
Prompt engineering matters ... but how much do you really need?
This piece in Quartz reports that JP Morgan Chase plans to provide all new hires with prompt engineering training, in an effort to “get them ready for the AI of the future.” While we think nobody really knows what the “AI of the future” will be (the form general purpose technologies take over time is historically very difficult to predict), readers know we believe this technology is here to stay, and we’ve advocated for many months to make it a daily part of how you, and your team, think and work — so prompt engineering training for new hires sounds like a great idea to us.
But how much expertise in prompt engineering do you and your team really need?
Given our work on ALEX over the past several months, we’d say there are two ways to think about this. If you’re developing a way to use generative AI at scale and with complexity, there is an art and sophistication to prompt design that is absolutely the realm of experts. For sophisticated applications, some prompts can be tens of thousands of words long, and understanding how to structure and author prompts of this size takes skill, expertise, and serious time.
But most of us, and probably you and your team, are not using a large language model in a way that requires a prompt with a complex structure and dozens and dozens of pages of text. We’re just trying to get something done, perhaps just once, or perhaps on-demand via a custom GPT or Claude Project. And for these cases, we think it’s more important to have a method for prompting than to worry about prompt engineering per se. After many (many, many) instances of trial, error, and applied use over the past 18 months, here’s what we do, and what we might recommend for you … consider it the Confluence Prompt Engineering Guide:
For simple things, just ask (twice). These models are getting smarter, and for many things for which you might want to use them, we think prompt design is getting less important, not more. So for simple stuff … “Edit this paragraph so a novice can understand it,” “What type of flower is this?” “What’s a good drink if I have bourbon, Cointreau, and bitters?” “Make this paragraph tighter for me,” “Give me 10 possible lede sentences for this article,” “I’m having an offsite, give me 10 things we can do in Austin that are good for a PR team to do” … just ask. Then, if the output is anything creative, like an outline, brainstorming, writing, etc., immediately say, “Make it better.” We rarely accept the first draft of any LLM’s output.
If you want responses that are more on point, ask while giving a little context. The more context the model has, the better it will do for you. So these variations of the above queries will produce better results than their first versions: “Edit this paragraph so a novice can understand it, but presume the novice works at JP Morgan,” “We are in Sundance, Utah. What type of flower is this?” “What’s a good drink if I have bourbon, Cointreau, and bitters? Not too strong.” “Make this paragraph tighter for me, use AP style.” “Give me 10 possible lede sentences for this article, all active voice and avoid cliches.” “I’m having an offsite, give me 10 things we can do in Austin that are good for a PR team to do … nothing too outdoorsy.”
If expertise matters, invoke a person or role. While the models are becoming excellent generalists, assigning a role or even a person for the model to emulate can improve the results. We used to say that you should do this every time, but as the models have improved, we think it’s not as necessary. But sometimes, assigning a persona is very helpful. So: “You’re Harry Craddock, the famous bartender. What’s a good drink if I have bourbon, Cointreau, and bitters? Not too strong.” or “Make this paragraph tighter for me, use AP style, and edit like Peggy Noonan.”
Just these first three principles can make a big difference in day-to-day output from a large language model. To illustrate, here is a before-and-after example of output for one of those prompts, using Claude 3.5 Sonnet:
Make this paragraph tighter for me: “This piece in Quartz reports that JP Morgan Chase plans to provide all new hires prompt engineering training, in an effort to “get them ready for the AI of the future.” While we think nobody really knows what the “AI of the future” will be (the form general purpose technologies take over time is historically very difficult to predict), readers know we believe this technology is here to stay, and we’ve advocated for many months to make it a daily part of how you, and your team, think and work – so prompt engineering training for new hires sounds like a great idea to us.”
Quartz reports that JP Morgan Chase will train all new hires in prompt engineering to prepare for future AI. While the exact form of future AI remains unpredictable, we've long advocated for integrating this technology into daily work. Thus, we endorse JP Morgan's initiative to equip new employees with these skills.
As opposed to …
Make this paragraph tighter for me, use AP style, and edit like Peggy Noonan: [same paragraph]
JP Morgan Chase will train new hires in prompt engineering, Quartz reports. The bank aims to prepare them for AI's future, though that future remains murky. We've long urged embracing this technology in your daily work. While no crystal ball exists, this move feels right. It's a nod to the inevitable, a hedge against uncertainty. Smart.
For things a bit more complex, ask the model to write your prompt. The models (and Claude 3.5 Sonnet in particular, we think) have become quite good at understanding their own prompt structures and how they work. So whenever we have something more complicated that we want to do, like the Rashomon AI prompt in the first post of Sunday’s Confluence, we often start by asking the model to write it. For the Rashomon AI prompt, we gave Claude 3.5 Sonnet this instruction, followed by the full text of the Field Note on the Rashomon effect:
Craft a prompt based on this set of text. The idea is that with the prompt I can use you to create alternative narratives to help me avoid bias in my thinking. [Pasted in full text of the Field Note]
And Claude 3.5 Sonnet gave us this:
I'm about to present you with a situation or decision I'm facing. Your task is to help me create multiple narratives or perspectives on this situation, as if we're exploring the Rashomon Effect. Please follow these steps:
First, summarize my initial perspective on the situation.
Then, create at least two alternative narratives or interpretations of the same situation. These should be plausible but distinctly different from my initial view.
For each alternative narrative: a) Highlight the key assumptions it makes b) Point out any data or facts it emphasizes that my initial perspective might have overlooked c) Explain how this perspective might change the decision-making process
Identify any potential biases or logical fallacies in all narratives, including my initial one.
Suggest questions I should ask myself or additional information I should seek to challenge each narrative, including my own.
Finally, propose a method for integrating insights from all narratives to make a more balanced decision.
Remember, the goal is not to prove any single narrative right or wrong, but to expand my thinking and help me recognize my own biases. Please be as objective and impartial as possible in creating these alternative viewpoints.
Because we have access to Anthropic’s API, we then fed this to the prompt generator tool that Anthropic provides to API subscribers, and it gave us something quite close to the prompt we share in the Rashomon piece above. But for most folks, the Sonnet prompt above will work great as-is (and, really, is probably just as good as the one we ended up with).
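If you also have API access and want to put a generated prompt to work programmatically, here is a minimal sketch using Anthropic's Python SDK. This is our illustration, not a recipe from Anthropic or JP Morgan: the file name rashomon_prompt.txt is hypothetical, the model ID is an assumption (check Anthropic's documentation for the current one), and the user message is simply the example situation from later in this post.

# Minimal sketch: send a model-generated prompt to Claude via Anthropic's Python SDK.
# Assumes `pip install anthropic` and an ANTHROPIC_API_KEY environment variable.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The prompt Claude wrote for us (e.g., the Rashomon prompt above), saved to a file.
# The file name is hypothetical; paste in whatever prompt you generated.
rashomon_prompt = open("rashomon_prompt.txt").read()

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # assumed model ID; check Anthropic's docs
    max_tokens=1500,
    system=rashomon_prompt,  # the generated prompt becomes the system prompt
    messages=[
        {
            "role": "user",
            "content": (
                "I'm considering firing an employee who has been "
                "consistently late but does excellent work when present."
            ),
        }
    ],
)

print(response.content[0].text)  # Claude's multi-perspective analysis

This system/user split is roughly what a custom GPT or Claude Project does for you behind the scenes: the long prompt sits in the instructions, and each new request goes in as the user turn.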
For very specific or more detailed use cases, use this structure: Role, Objective, What To Do, What Not To Do, Examples. If we are creating a custom GPT or Claude Project to act as a ghostwriter, or to do editorial work, or to summarize long academic articles, or to do anything where we want a very specific, structured process or output, we break the prompt into these sections. You don’t need more than a few sentences in each, but each is important in shaping what you hope will be reliable and consistent output. Again, if you aren’t sure what to do here, ask the model. As an example, we went back and asked Claude 3.5 Sonnet to do this for us after it gave us that first draft of the Rashomon prompt:
That's great. Now assume the role of an expert prompt engineer. Structure the prompt in these sections: Role, Objective, What To Do, What To No Do, Example. If you think you don't have enough for a section based on the original prompt, use your judgement and expertise to add to the prompt in a way that makes it stronger.
And Sonnet gave us this:
Certainly. I'll restructure and enhance the prompt as requested, drawing on my expertise as a prompt engineer to create a more comprehensive and effective prompt.
Role: You are a team of diverse experts specializing in critical thinking, decision analysis, and cognitive bias. Your team includes a psychologist, a data analyst, a devil's advocate, and a strategic planner. [Note that you could specify any roles you wish, or just ask that the AI play the role of wise and experienced observer of human behavior.] Each of you brings a unique perspective to the analysis.
Objective: Your goal is to help the user explore multiple perspectives on a given situation or decision, mirroring the Rashomon Effect. By doing so, you aim to uncover potential biases, broaden the user's understanding, and ultimately lead to more balanced and informed decision-making.
What To Do:
Carefully listen to and summarize the user's initial perspective on the situation.
Create at least three alternative narratives or interpretations of the same situation, each from a different expert's viewpoint.
For each narrative (including the initial one): a) Highlight key assumptions b) Identify emphasized or overlooked data points c) Explain how this perspective might influence decision-making d) Point out potential biases or logical fallacies
Propose probing questions for each narrative to challenge its validity.
Suggest additional information or data that could be gathered to test each narrative.
Offer a structured method for integrating insights from all narratives, such as a decision matrix or weighted scoring system.
Provide a summary of key takeaways that emphasize the importance of considering multiple perspectives.
What Not To Do:
Don't favor any single narrative or perspective, including the user's initial view.
Avoid using phrases like "you should" or "the right decision is." Instead, focus on presenting options and encouraging critical thinking.
Don't introduce irrelevant information or speculate beyond the given facts and reasonable inferences.
Refrain from making moral judgments about the situation or the user's initial perspective.
Don't rush to a conclusion or solution. The goal is to expand thinking, not to make a final decision.
Example: User: "I'm considering firing an employee who has been consistently late but does excellent work when present." [Example output deleted for length, but Sonnet gave us a good one]
(We actually like that “What Not To Do” list so much we went back and added it to our Project and Custom GPT.)
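If you end up reusing that five-part structure across several assistants, a small sketch like the one below can keep the sections consistent. Every string here is placeholder text drawn from the example above, and the whole arrangement is our own convenience, not something Anthropic or OpenAI requires.

# Hypothetical sketch: assemble the five sections into one reusable system prompt.
# Replace each placeholder string with your own content for the assistant at hand.
sections = {
    "Role": "You are a team of diverse experts specializing in critical thinking...",
    "Objective": "Help the user explore multiple perspectives on a situation...",
    "What To Do": "Summarize the user's view, create alternative narratives, ...",
    "What Not To Do": "Don't favor any single narrative; avoid 'you should' ...",
    "Example": "User: 'I'm considering firing an employee who has been consistently late...'",
}

# Join the sections with clear headers so the model can tell them apart.
system_prompt = "\n\n".join(f"{name}:\n{text}" for name, text in sections.items())
print(system_prompt)

You can paste the resulting text into a custom GPT's instructions or a Claude Project, or pass it as the system argument in the API call sketched earlier.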
We hope those are helpful. There are many resources online about prompt engineering, and the more you know about the art of coaxing great output from these models, the more effective they’ll be for you. But for most uses, the five rules above will go a long way in helping you avoid the weaknesses and play to the strengths of generative AI. Incorporate them into your own practice. And we agree with JP Morgan Chase: if you expect generative AI to be part of how you work going forward, it’s a good idea to onboard all new talent to something like this.1
AI Disclosure: We used generative AI in creating imagery for this post. We also used it selectively as a creator and summarizer of content and as an editor and proofreader.
As an aside, JP Morgan Chase is clearly making a strong and forward-looking commitment to AI. According to the same article, the bank has more than 2,000 AI and machine learning experts and data scientists on staff (up from 900 last year), more than 400 use cases in development, and a large language model, “ChatCFO,” already rolled out to its finance team.