Eight Tips to Make GenAI Do What You Want

Article · 5 min read

Key Takeaways

An AI-first consultant with a PhD in astrophysics shares what he has learned from querying GenAI chatbots up to 100 times a day. Three big takeaways:
  • Try to write specific prompts that solve specific problems or answer specific questions.
  • By trying to do less with each prompt, you improve reliability.
  • These small building blocks can then become the foundation for tremendous productivity enhancement.

I’ve been accused of being a futurist. My background as an astrophysicist speaks to my deep interest in the forces driving space and time. In my PhD research, I looked at clouds of gas billions of light years from Earth with the world’s largest optical telescopes to test whether the fundamental constants of physics differ depending on time and space. This work led me into the fields of statistics and programming that set me up for my career at BCG, deep in the AI revolution.

During the height of the pandemic, I led BCG’s epidemiology modeling work supporting the firm’s response to COVID-19. I’m a founding member of BCG X, the firm’s tech design and build unit, where I work to realize the potential from AI agents.

Given the tremendous leap in AI capabilities in recent years, I’m benefiting greatly from an AI-first approach to work and productivity. I always consider AI as the first option for most tasks. Can I do it better or faster with AI? Can I achieve more? On a typical day, I may have 30 to 100 conversations with a GenAI service.

If AI will reduce my cognitive load from, say, ten minutes of complex thinking to two, I’ll put a large language model (LLM) to work in the background while I solve other problems. If I’m in a taxi, I’ll converse with a GenAI chat service to sharpen my thinking as I prepare for an upcoming meeting.

But AI will not automatically do your work for you. Getting great answers from LLMs requires a rigorous approach. The following eight best practices break down my most useful insights from the last few years:

Always ask twice. If you query ChatGPT or another GenAI tool and the output feels satisfying, resist the impulse to be done. Make a habit of prompting the model a second time, every time. “Answer/refine” is the name of the game.

Mathematically, your chances of an accurate output improve substantially after two queries. Suppose your first prompt generates an answer with an error rate of 30%, and your refining second prompt has a 20% error rate. If the two passes fail independently, the combined error rate falls to 30% × 20% = 6%, not bad for a single extra query.
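Under the (strong) assumption that the two passes err independently, the arithmetic is a one-liner. The rates below are the illustrative figures from the example above:

```python
# Chance that an error survives BOTH passes, assuming the passes
# fail independently (an idealized, illustrative assumption).
first_pass_error = 0.30   # error rate of the initial answer
second_pass_error = 0.20  # error rate of the refining pass

combined_error = first_pass_error * second_pass_error
print(f"{combined_error:.0%}")  # prints "6%"
```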

The lesson: people expect computer levels of precision from what is essentially a virtual person, and people are not perfect. People rework, edit, and improve most of their work output. Why should we not expect the same from a virtual person?

Keep your prompts simple. Just like people, LLMs cannot readily multitask. If you feed a model too many tasks at once, the quality of the output will suffer.

This observation is closely related to studies on air traffic controllers showing that their performance can degrade notably even if cognitive load increases only incrementally. Plan your work and implement it through a logical sequence of interactions.

Sequence your questions properly. The order in which you query the model significantly affects the answers it provides.
LLMs are autoregressive: they generate text one token at a time, conditioned only on the context that precedes it. Because the model operates in a strictly left-to-right fashion, it does not know what comes next—it can only make a best guess based on the past. That’s another reason why a generate-and-refine or generate-critique-update approach makes sense.
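A generate-critique-update loop can be sketched in a few lines. Here `ask` is a hypothetical stand-in for whatever chat-completion call your LLM provider exposes, and the prompt wording is illustrative, not a prescribed template:

```python
def generate_critique_update(ask, task: str) -> str:
    """Three passes: draft, critique, rewrite. `ask(prompt)` is a
    hypothetical callable that sends one prompt to an LLM and
    returns its text response."""
    draft = ask(f"Complete this task:\n{task}")
    critique = ask(
        f"Task:\n{task}\n\nDraft answer:\n{draft}\n\n"
        "List any errors, gaps, or weaknesses in the draft."
    )
    return ask(
        f"Task:\n{task}\n\nDraft:\n{draft}\n\nCritique:\n{critique}\n\n"
        "Rewrite the draft, fixing every issue the critique raises."
    )
```

Each pass sees everything the previous passes produced, so the final rewrite is conditioned on both the draft and its critique.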

Specify your tasks clearly. LLMs cannot read your mind. If your request is imprecise, you might get an unexpected output. When this happens, the most common cause is failure to provide sufficiently precise instructions and information.

To generate reliable outputs, be clear and precise in what you are asking. Don’t contradict yourself by, for example, asking for more detail and brevity at the same time. Define what is in and out of scope. Provide sample outputs. It’s okay if your prompt is several paragraphs long. By formalizing your task, you will improve the quality of the output.

Prompt for reasoning before recommending. You must ask the model to provide reasoning before making a recommendation, not after. If the recommendation comes first, the reasoning will be post hoc justification.

When AI first considers all potential choices and provides reasoning for them, its ultimate recommendation will be more grounded in logic and evidence. The latest “reasoning models” are built to do exactly this: think before acting.

Consider the difference: “Recommend a vendor, then justify your choice” invites post hoc justification, while “Weigh the pros and cons of each vendor, then recommend one” forces the reasoning to come first.

Prioritize output, then structure. If you impose strong constraints that focus on the structure of the output, the actual output is less likely to be accurate. The LLM is so focused on producing syntactically valid code or answers that it has less cognitive capacity to solve the actual task.

The key is to separate content from form. Get the content right, then fine-tune your overall output with a follow-up request for structural elegance.

LLMs tend to reason less effectively when asked to think and produce strict formats like JSON (a format for storing and exchanging data) at the same time. To avoid this trap, ask for a natural language response, and then query the model a second time to convert it to JSON.
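The same two-pass idea, sketched in code. As before, `ask` is a hypothetical stand-in for your provider's chat call, and the key names are illustrative:

```python
import json

def answer_then_json(ask, question: str) -> dict:
    """Pass 1: reason in plain language. Pass 2: convert to JSON."""
    prose = ask(question)  # unconstrained, so the model can think freely
    raw = ask(
        "Convert the following answer into a JSON object with the keys "
        f'"answer" and "rationale". Output JSON only:\n{prose}'
    )
    return json.loads(raw)  # the second pass only reformats, so it parses
```

The first prompt carries all the cognitive load; the second is a near-mechanical conversion the model rarely gets wrong.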

Ask for bite-size answers. All models have limits on the length of output they can generate. Many of the key commercial LLMs will not output more than about 1,000 words, no matter how hard you beg. And quality tends to fall as output length grows.

To combat this reality, break down the writing or coding into manageable chunks. You can still create long documents by stitching together a series of shorter ones.
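Stitching can be as simple as a loop over an outline, carrying recent sections forward as context. Again, `ask` is a hypothetical LLM call and the prompt wording is illustrative:

```python
def write_in_chunks(ask, title: str, outline: list[str]) -> str:
    """Generate one section per prompt, then join the results."""
    sections = []
    for heading in outline:
        # Pass only the most recent sections as context, to keep
        # each prompt short and each response focused.
        context = "\n\n".join(sections[-2:])
        sections.append(ask(
            f"Document: {title}\nSection to write: {heading}\n"
            f"It follows these sections:\n{context}\n"
            "Write this section in under 500 words."
        ))
    return "\n\n".join(sections)
```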

Check for hallucinations. While LLMs are trained on vast amounts of data, they can still fabricate details or misinterpret inputs. Review all output carefully.

Hallucinations can often be detected by simply asking if the LLM has erred. Take the LLM to task by asking: Are you sure your answer is factually correct? Have you made a claim that is untrue?


As a scientist, I base my hypotheses on a series of small observations—bite-size inquiries—that reveal larger patterns. As consultants, we take large, amorphous problems and break them down into small, solvable problems. The art of prompting is not that different. Try to write specific prompts that solve specific problems or answer specific questions. By trying to do less with each prompt, you improve reliability. These small building blocks can then become the foundation for tremendous productivity enhancement.


Authors

Julian King
Partner and Director, BCG X
Sydney
