Basic Tenets of Prompt Engineering in Generative AI

Saptarshi Dutta 30 Aug, 2023 • 10 min read

Introduction

In this article, we shall discuss ChatGPT Prompt Engineering in Generative AI. ChatGPT has been one of the most discussed topics among tech and not-so-techies since November 2022. It is a type of intelligent conversation that marks the dawn of an era of intelligent conversation. One can ask almost anything ranging from science, arts, commerce, sports, etc., and can get an answer to those questions.

Prompt Engineering in Generative AI

This article was published as a part of the Data Science Blogathon.

ChatGPT

Chat Generative Pre-trained Transformer, commonly known as ChatGPT, represents the acronym for Chat Generative Pre-trained Transformer, signifying its role in generating new text based on user prompts. This conversational framework involves training on extensive datasets to create original content. Sam Altman’s OpenAI is credited with developing one of the most substantial language models, as exemplified by ChatGPT. This remarkable tool enables effortless execution of text generation, translation, and summarization tasks. It is the 3rd version of GPT. We shall not be discussing the interface, the modus operandi, etc., of ChatGPT, as most of us know how to use a chatbot. However, we shall discuss the LLMs.

What is Prompt Engineering?

Prompt Engineering in Generative AI is an advanced tool that leverages the capabilities of AI language models. It optimizes the performance of language models by developing tactical prompts, and the model is given clear and specific instructions. An illustration of giving instructions is as follows.

Prompt Engineering in Generative AI

Giving explicit instructions to the models is beneficial as this would make the answers precisely accurate.
Example – What is 99*555?Make sure that your response is accurate” is better than “What is 99*555?

Large Language Models (LLMs)

LLMs | Prompt Engineering in Generative AI

LLM is an AI-based algorithm that applies the techniques of neural networks on vast amounts of data to generate human-like texts using self-supervised learning techniques. Chat GPT of OpenAI and BERT of Google are some examples of LLM. There are two types of LLMs.

1. Base LLM – Predict the next word based on text training data.
Example – Once upon a time, a king lived in a palace with his queen and prince.
Tell me, the capital of France.
What is the largest city in France?
                   What is the population of France? 
Base LLM predicts the lines in italics.

2. Instruction-tuned LLM – follow the Instruction. It follows reinforcement learning with human feedback (RLHF).
Example – Do you know the capital of France?
Paris is the capital of France.
Instruction-tuned LLM predicts the line in italics.
Instruction-tuned LLM would be less likely to produce unwanted outputs. In this piece of work, the focus would be on instruction-tuned LL.

Guidelines for prompting

"

At the outset, we shall have to install openAI.

!pip install openai

This line of code will install openai as follows

"

Then, we shall load the API key and the relevant Python libraries. For this, we have to install python-dotenv. It reads key-value pairs from a .env file and helps develop applications incorporating the 12- factors principle.

pip install python-dotenv

This line of code will install python-dotenv as follows.

"

The openAI API uses an API key for authentication. The API key can be retrieved from the API keys page of the OpenAI website. It is a secret and don’t share. Now, we shall import openai

import openai
openai.api_key="sk-"

Then, we shall set the openai key, which is a secret key. Set it as an environment variable. In this piece of work, we have already set it in the environment.

import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

openai.api_key  = os.getenv('OPENAI_API_KEY')

OpenAI’s GPT-3.5-turbo model and the chat completion endpoints will be used here. This helper function enables more effective usage of prompts and looks at the generated outputs.

def get_completion(prompt, model="gpt-3.5-turbo"):
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0, # this is the degree of randomness of the model's output
    )
    return response.choices[0].message["content"]

Principles of Prompting

There are two basic principles of prompting – writing clear and specific instructions and giving the model time to think. Tricks to implement these principles will be discussed now. The first trick would be to use delimiters to identify specific inputs distinctly. Delimiters are clear punctuations between prompts and specific pieces of text. Triple backticks, quotes, XML tags, and section titles are delimiters, and anyone could be used. So, in the following lines of code, we are trying to summarize a text extracted from Google News.

text = f"""
Apple's shipment of the iPhone 15 to its vast customer base might encounter delays \
due to ongoing supply challenges the company is currently addressing. These developments\
surfaced just a few weeks before Apple's upcoming event. While the iPhone 15 series' anticipated \
launch date is September 12, Apple has yet to officially confirm this date.\
"""
prompt = f"""
Summarize the text delimited by triple backticks \ have
into a single sentence.
```{text}```
"""
response = get_completion(prompt)
print(response)
"

JSON and HTML Output

From the output, we can see that the text has been summarized.
The next trick is asking for a structured JSON and HTML output. In the following illustration, we are trying to generate a list of five books written by Rabindranath Tagore in JSON format and see the corresponding output.

prompt = f"""
Generate a list of five books titles along \ 
with their authors as Rabindranath Tagore. 
Provide them in JSON format with the following keys: 
book_id, title, genre.
"""
response = get_completion(prompt)
print(response)
"

JSON Output

Similarly, in the next illustration, we are trying to get output in JSON format of three medical thrillers with book ID, title, and author.

prompt = f"""
Generate a list of three medical thriller book titles along \ 
with their authors. 
Provide them in JSON format with the following keys: 
book_id, title, author.
"""
response = get_completion(prompt)
print(response)
"

HTML Format

In both cases, we received output in the required format exactly the way we prompted. Now, we will find out the books written by Rabindranath Tagore in HTML format.

prompt = f"""
Generate a list of five books titles along \ 
with their authors as Rabindranath Tagore. 
Provide them in HTML format with the following keys: 
book_id, title, genre.
"""
response = get_completion(prompt)
print(response)
"

Load Libraries

Now, we have received the output in HTML format. To view HTML, we need to load libraries with the help of the following lines of code.

from IPython.display import display, HTML
display(HTML(response))
Rabindranath Tagore's Books | Prompt Engineering in Generative AI

The exact output we wanted is now on display. Another trick is “zero-shot prompting.” Here, we will not impart specific training to the model instead, it will rely on past knowledge, reasoning, and flexibility. The task is to calculate the volume of a cone where we know the height and radius. Let us see what the model does in the output.

prompt = f"""
Calculate the volume of a cone if height = 20 cm and radius = 5 cm
"""
response = get_completion(prompt)
print(response)
Prompt Engineering in Generative AI

It can be seen that the model gives a stepwise solution to the task. First, it writes the formula, puts the values, and calculates without specific training.

Few Shot Prompting

The final trick of the first principle is “few shot prompting.” Here, we are instructing the model to answer in a consistent style. The task of the model would be to answer in a consistent style. There is a conversation between a student and a teacher. The student asks the teacher to teach me about cell theory. So, the teacher responds. Now, we ask the model to teach about germ theory. The illustration is shown below.

prompt = f"""
Your task is to answer in a consistent style.

<student>: Teach me about cell theory .

<teacher>: Cell theory, fundamental scientific theory of biology according to which \
cells are held to be the basic units of all living tissues.\
First proposed by German scientists Theodor Schwann and Matthias Jakob Schleiden in 1838, \
the theory that all plants and animals are made up of cells.

<child>: Teach me about germ theory.
"""
response = get_completion(prompt)
print(response)
Prompt Engineering in Generative AI

So, the model has responded to us as instructed. It fetched germ theory and answered unfailingly. All the tricks or techniques discussed till now follow the first principle: writing clear and specific instructions. Now, we shall look into the techniques to put the second principle, i.e., giving the model time to think. The first technique is to specify the steps required to complete a task. In the following illustration, we have taken a text from a news feed to perform the steps mentioned in the text.

text = f"""
AAP leader Arvind Kejriwal on Sunday assured various "guarantees" including \
free power, medical treatment and construction of quality schools besides \
a monthly allowance of ₹ 3,000 to unemployed youths in poll-bound Madhya Pradesh.
Addressing a party meeting here, the AAP national convener took a veiled dig at \
MP chief minister Shivraj Singh Chouhan and appealed to people to stop believing \
in "mama" who has "deceived his nephews and nieces".
"""
# example 1
prompt_1 = f"""
Perform the following actions: 
1 - Summarize the following text delimited by triple \
backticks with 1 sentence.
2 - Translate the summary into French.
3 - List each name in the French summary.
4 - Output a json object that contains the following \
keys: french_summary, num_names.

Separate your answers with line breaks.

Text:
```{text}```
"""
response = get_completion(prompt_1)
print("Completion for prompt 1:")
print(response)
"

The output indicates that the model summarized the text, translated the summary into French, listed the name, etc. Another tactic is instructing the model not to jump to conclusions and do a self-workout on the problem. Following is an illustration of this tactic

prompt = f"""
Determine if the student's solution is correct or not.

Question:
I'm building a solar power installation and I need \
 help working out the financials. 
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \ 
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations 
as a function of the number of square feet.

Student's Solution:
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
"""
response = get_completion(prompt)
print(response)
"
prompt = f"""
Your task is to determine if the student's solution \
is correct or not.
To solve the problem do the following:
- First, work out your own solution to the problem. 
- Then compare your solution to the student's solution \ 
and evaluate if the student's solution is correct or not. 
Don't decide if the student's solution is correct until 
you have done the problem yourself.

Use the following format:
Question:
```
question here
```
Student's solution:
```
student's solution here
```
Actual solution:
```
steps to work out the solution and your solution here
```
Is the student's solution the same as actual solution \
just calculated:
```
yes or no
```
Student grade:
```
correct or incorrect
```

Question:
```
I'm building a solar power installation and I need help \
working out the financials. 
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
``` 
Student's solution:
```
Let x be the size of the installation in square feet.
Costs:
1. Land cost: 100x
2. Solar panel cost: 250x
3. Maintenance cost: 100,000 + 100x
Total cost: 100x + 250x + 100,000 + 100x = 450x + 100,000
```
Actual solution:
"""
response = get_completion(prompt)
print(response)
Prompt Engineering in Generative AI

The output indicates that the model worked properly on the problem and produced the desired output.

Conclusion

Generative AI can revolutionize academics, medical science, the animation industry, the engineering sector, and many other areas. ChatGPT, with more than 100 million users, is a testimony that Generative AI has taken the world by storm. There is a high hope that we are in the dawn of an era of creativity, efficiency, and progress.

Key Takeaways

  • Generative AI can easily generate text, translate, summarization, data visualization, and model creation through ChatGPT.
  • Prompt Engineering in Generative AI is the tool that leverages various capabilities of Generative AI by developing tactical prompts and giving the model clear and specific instructions.
  • The Large Language Model is the algorithm that applies the techniques of neural networks on vast amounts of data to generate human-like texts.
  • Through principles of prompting, we carry out various tasks of data generation.
  • We can get the model to produce the desired output through proper prompts.

I hope this article could add value to your time going through it.

Frequently Asked Questions

Q1. What is ChatGPT?

A. The expansion of ChatGPT is Chat Generative Pre-trained Transformer. It is a conversational setting where new texts are generated based on prompts provided by the users by getting trained on large amounts of data.

Q2. What is LLM? Give some examples of LLM.

A. The full form of LLM is the Large Language Model. LLM is an AI-based algorithm that applies the techniques of neural networks on huge amounts of data to generate human-like texts using self-supervised learning techniques. Chat GPT of OpenAI and BERT of Google are some examples of LLM.

Q3. What are the types of LLM?

A. There are two types of LLMs: Base LLM and Instruction tuned LLM. It follows reinforcement learning with human feedback (RLHF).

Q4. What are Delimiters?

A. Delimiters are clear punctuations between prompts and specific pieces of text. Triple backticks, quotes, XML tags, and section titles are delimiters.

Q5. What is the function of few shot prompting?

A. To instruct the model to answer in a consistent style.

References

  • https://colinscotland.com/unleash-the-power-of-chatgpt-11-epic-prompt-engineering-tips/
  • Learn Prompt Engineering in 2 hours: Learn ChatGPT Prompt Engineering to Boost Efficiency and Output (GPT 4). (2023). (n.p.): Cryptoineer Inc.
  • https://etinsights.et-edge.com/leading-large-language-models-llms-shaping-real-life-applications-revealed/

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Saptarshi Dutta 30 Aug 2023

Frequently Asked Questions

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

Responses From Readers

  • [tta_listen_btn class="listen"]