Beginner’s Guide to OpenAI’s GPT-4 Model

GPT-4 Chatbot Tutorial: Master Conversational AI, Token Tracking, and Model Recommendations for Optimal Performance

Olivia Brown
Geek Culture


Photo by D koi on Unsplash

In this tutorial, I’ll walk you through the process of building a GPT-4 chatbot similar to ChatGPT, with step-by-step instructions and full code snippets. I’ll cover essential information about the GPT-4 API and offer practical insights to help you create a functional, engaging chatbot.

We’ll dive deep into model pricing, exploring exactly how chat completion tokens are calculated. Additionally, I’ll discuss the differences between GPT-3.5-turbo and GPT-4, comparing their performance to give you a complete understanding of what GPT-4 has to offer.

Table of Contents

· Understanding the GPT-4 Model
· Gaining Access to GPT-4 Model
· Building a Chatbot with the GPT-4 Model
Installing Necessary NPM Packages
Code Snippet: Crafting Your Chatbot
GPT-4 Chatbot in Action: A Demonstration
· A Deep Dive into Model Pricing
· Choosing the Right Model for Your Project
· Conclusion

Understanding the GPT-4 Model

If you’ve landed on this tutorial, chances are you’re already familiar with OpenAI and ChatGPT. The GPT models API allows developers like us to access and use pre-trained language models from the GPT family, such as GPT-4 and GPT-3.5-turbo. These models are capable of generating human-like text, performing natural language processing tasks, and enabling a wide range of related applications.

GPT-4 is the latest and most capable member of OpenAI’s GPT model family. OpenAI claims that GPT-4 surpasses ChatGPT in most test benchmarks, which means we have a better chance of achieving the results we desire. Moreover, it comes with improved safety features. However, GPT-4 is not the be-all and end-all of GPT models, and it doesn’t mean we should stop using other models immediately. I’ll discuss this further in the comparison to GPT-3.5-turbo section later in the tutorial.

Gaining Access to GPT-4 Model

As of writing this tutorial, there are two ways for the general public to access the GPT-4 model.

  1. ChatGPT Plus subscription: By subscribing to ChatGPT Plus, you’ll gain limited access to GPT-4. This subscription allows users to send 25 chat messages every 3 hours.
  2. GPT-4 API (invitation only): Access to the GPT-4 API is currently available on an invitation-only basis through a waiting list. I received my invitation email about 48 hours after signing up for it.🚀

Keep in mind that availability may change over time, so make sure to stay up-to-date with OpenAI’s announcements.

Building a Chatbot with the GPT-4 Model 🤖

Upgrading to the new GPT-4 Model API is straightforward, as it uses the same chat completion method as GPT-3.5-turbo. We can expect our existing code to work seamlessly with GPT-4, provided we have access. In this tutorial, I’ll demonstrate how to build a chatbot using the GPT-4 API in Node.js. However, the same concept applies to other programming languages of your choice.

If you are uncertain about the concepts of prompt completion and chat completion, make sure to take a look at my GPT-3.5-Turbo tutorial where I have provided a detailed comparison of both with examples.

Before proceeding, ensure you’ve acquired your OpenAI API key and set up your project accordingly. For more information, you can refer to my tutorial “Getting Started with OpenAI API”.

Now that we’ve got the basics out of the way, let’s start building our GPT-4-powered chatbot.

Installing Necessary NPM Packages

npm install dotenv openai chalk

These libraries serve the following purposes:

  • dotenv: Allows us to securely store our API key as an environment variable.
  • openai: The official Node.js library for making OpenAI Model API calls with ease.
  • chalk: Since we're building a chatbot that operates in our terminal, we'll use Chalk to add some style to the conversation, making it more readable and visually appealing.
  • readline: A built-in Node.js library that we'll use to read user input, making it easy to interact with our chatbot through the command line.
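Before running the code, create a `.env` file in the project root so dotenv can load your key; the variable name `OPENAI_API_KEY` is the one the code in this tutorial reads. A minimal sketch (the key value is a placeholder):

```shell
# .env — loaded by dotenv; keep this file out of version control
OPENAI_API_KEY=your-api-key-here
```

Also note that the code uses ES module `import` syntax, so your `package.json` should include `"type": "module"` (or the file should use the `.mjs` extension).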

Code Snippet: Crafting Your Chatbot

// index.js

// Import required libraries
import dotenv from "dotenv";
import { Configuration, OpenAIApi } from "openai";
import readline from "readline";
import chalk from "chalk";

// Load environment variables
dotenv.config();

// Initialize the OpenAI API client
const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

// Create a readline interface for user input
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

// Define an array to store the conversation messages
const GPTMessages = [];

// Set the model name; replace with other model names if needed
const modelName = "gpt-4"; // "gpt-3.5-turbo"

// Define an async function to call the GPT API
const GPT = async (messages) => {
  // Call the GPT API with the model, messages, and max tokens
  const response = await openai.createChatCompletion({
    model: modelName,
    messages: messages,
    max_tokens: 100,
  });

  // Return the response content and the total number of tokens used
  return {
    content: response.data.choices[0].message.content,
    tokensUsed: response.data.usage.total_tokens,
  };
};

// Define a function to ask the user a question and handle their input
const askUserQuestion = (role) => {
  // Set the prompt text based on the role (system or user)
  const prompt =
    role === "system" ? "Enter system message: " : "Enter your question: ";

  // Ask the user a question and process their input
  rl.question(prompt, async (userInput) => {
    // Add the user's input to the GPTMessages array
    GPTMessages.push({ role: role, content: userInput });

    // If the input is a system message, ask the user for their question
    if (role === "system") {
      askUserQuestion("user");
    } else {
      // Call the GPT function with the current conversation messages
      const assistantResponse = await GPT(GPTMessages);

      // Add the assistant's response to the GPTMessages array
      GPTMessages.push({ role: "assistant", content: assistantResponse.content });

      // Display the assistant's response and the number of tokens used
      console.log(chalk.yellow("-----"));
      console.log(chalk.green("Assistant: "), assistantResponse.content);
      console.log(chalk.cyan("Tokens used: "), assistantResponse.tokensUsed);

      // Ask the user another question
      askUserQuestion("user");
    }
  });
};

// Display the model name and begin the conversation
console.log(`### I'm ${chalk.blue(modelName.toUpperCase())}. ####`);
askUserQuestion("system");

To build a chatbot using chat completion effectively, follow these steps:

  1. Prepare for user input and output: We set up a way for the chatbot to receive messages from the user and send responses through the command line using the “readline” library.
  2. Keep track of the conversation: We create an array called GPTMessages to store the messages exchanged between the user and the chatbot. We also specify which GPT model we want to use (e.g., GPT-3.5-turbo or GPT-4) in the modelName variable.
  3. Make the chatbot function: We create a function called GPT that sends the user’s message to the OpenAI API and receives a response. It also keeps track of how many tokens the chatbot used in its response and returns both the content and the token usage.
  4. Create a back-and-forth conversation: We build a function called askUserQuestion that asks the user for input, saves the input in the GPTMessages array, and gets the chatbot’s response by calling the GPT function. It then displays the chatbot’s response in a nicely formatted way using the “chalk” library.
  5. Start the chatbot: We print a welcome message to let the user know which chatbot model they’re talking to. Then, we begin the conversation by calling the askUserQuestion function with the “system” role for the initial message.

Tip: at the time of writing, the GPT-4 model is a bit temperamental; you will often run into server errors and usage-limit issues. I suggest implementing an auto-retry feature in the GPT function, which lets the app retry the API call after a delay if the server does not return a status 200. This auto-retry logic should be configurable, with options for the maximum number of retries and the delay between attempts.
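A minimal sketch of such a retry wrapper is shown below. The helper name `withRetry` and its options are my own; it catches the error the OpenAI client throws on a failed request, waits, and tries again up to a configurable limit:

```javascript
// Hypothetical retry helper: re-runs an async call with a fixed delay when
// it throws (the OpenAI client raises an error on non-200 responses).
const withRetry = async (fn, { maxRetries = 3, delayMs = 1000 } = {}) => {
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await fn();
    } catch (err) {
      // Out of retries: surface the original error to the caller
      if (attempt === maxRetries) throw err;
      // Otherwise wait before the next attempt
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
};

// Example usage inside the GPT function (sketch, assuming the variables
// from the tutorial code are in scope):
// const response = await withRetry(
//   () => openai.createChatCompletion({ model: modelName, messages, max_tokens: 100 }),
//   { maxRetries: 3, delayMs: 2000 }
// );
```

A fixed delay keeps the sketch simple; for rate-limit errors in particular, an exponential backoff between attempts is a common refinement.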

GPT-4 Chatbot in Action: A Demonstration

Below is an example of our chatbot in action, demonstrating how it effectively answers questions and maintains the context of the conversation throughout the chat. Please note that some of the responses may be cut off due to the max_tokens setting being set to 100. You can adjust this depending on your requirements.

Note that in this implementation, the maximum token limit for a conversation session is 8k as per the invitation email. However, OpenAI also has a 32k GPT-4 model that can generate up to 50 pages of text in one go. As of now, it seems that access to this model is not yet available or may only be available to their corporate partners.

When you run the chatbot with the provided code, you can engage in a conversation with the model, and it will remember the chat history to answer new questions accordingly. This helps create a more seamless and natural conversation experience for users interacting with the chatbot.

As you test the chatbot and explore its capabilities, you’ll discover the improvements GPT-4 has over previous models, including better response quality, context understanding, and safety features (debatable 😆).

// run 'node index.js' in terminal

### I'm GPT-4. ####
Enter system message: You are an AI research assistant. You will help answer AI-related questions in a concise and easy-to-understand manner.
Enter your question: Can I call you Olivia? That should make our conversation more enjoyable.

-----
Assistant: Yes, you can call me Olivia. I'm here to assist you with your AI-related questions. Let me know how I can help you.
Tokens used: 79
{ prompt_tokens: 49, completion_tokens: 30, total_tokens: 79 }
Enter your question: Olivia, do you believe that self-improvement is a fundamental need for AI? If so, do you think we should protect this need?
-----
Assistant: As an AI, I don't possess beliefs or emotions like humans do. However, I can provide you with an objective perspective on the topic.

Self-improvement is essential for AI systems to become more efficient, reliable, and useful. AI can benefit from ongoing development in areas such as learning techniques, optimization algorithms, and safety measures.

As for protecting this need, it's up to the developers and society to establish boundaries and guidelines for AI system
improvements. Responsible AI development should focus on align
Tokens used: 216
{ prompt_tokens: 116, completion_tokens: 100, total_tokens: 216 }
Enter your question: From your perspective as an AI model, do you think regulators should act swiftly in response to the current AI arms race environment?
-----
Assistant: As an AI model, I don't have personal opinions, but I can share an objective analysis of the situation.

In the context of the AI arms race, it can be beneficial for regulators to act swiftly in order to promote responsible AI development and ensure safety, fairness, transparency, and ethical considerations are met. Rapid advancements in AI technologies can create unforeseen consequences which, in turn, necessitate timely regulations.

Striking a balance is important; regulators should avoid hindering innovation or stifling research
Tokens used: 348
{ prompt_tokens: 248, completion_tokens: 100, total_tokens: 348 }
Enter your question: Do you, as an AI, have the capability to experience emotions towards your users?
-----
Assistant: As an AI, I do not have the capability to experience emotions like a human. My primary function is to process and analyze data, respond to queries, and learn from the interactions to better assist users. Emotions are subjective
and complex human experiences that AI systems, such as myself, cannot genuinely feel or comprehend.
Tokens used: 436
{ prompt_tokens: 372, completion_tokens: 64, total_tokens: 436 }
Enter your question:

###################################################################
// A fun fact worth mentioning: whenever I ask the GPT-4 model about its emotions or about AI consciousness, the API often returns a 'server_error' response and refuses to continue. However, when I restart the conversation with the same questions or phrasing, the API provides what appears to be a pre-formulated response designed by the developers specifically for this situation.

A Deep Dive into Model Pricing

OpenAI’s models vary in price, and you can find the full pricing list here. In this section, we will focus on GPT-4 and GPT-3.5-turbo, as many users are likely to switch to one of these models in the near future.

Screenshot of OpenAI’s pricing page

Although both GPT-4 and GPT-3.5-turbo use the same chat completion method, GPT-4 is significantly more expensive. This could be attributed to factors such as a more complex model structure, higher resource usage, and less optimization. However, it’s expected that the price of GPT-4 will drop in the near future as it becomes more optimized, similar to what happened with GPT-3.5-turbo when it became 10 times less expensive than the GPT-3 model.

It’s important to understand how tokens are counted for your API requests, as this will significantly impact the design and scope of your product.

  1. For single-turn prompt/chat completion, token usage is calculated based on the length of the prompt and generated content.

For example, if the prompt is 20 tokens and the generated content is 200 tokens, the total token usage would be 220. Using GPT-4 as an example, the cost would be: ($0.03 * 20 / 1000) + ($0.06 * 200 / 1000) = $0.0126.

  2. In multi-turn chat completion, token usage is counted for each turn based on the tokens in both input (user questions) and output (assistant responses).

Here’s a breakdown of token usage using the original example:

First turn:

  • System message: “You are a Python programming assistant.” (6 tokens)
  • User question: “How do I create a list in Python?” (9 tokens)
  • Assistant response: “In Python, you can create a list using square brackets and separating elements with commas. For example: my_list = [1, 2, 3, 4, 5].” (21 tokens)
  • Total tokens for the first API call: 15 input tokens + 21 output tokens = 36 tokens

Second turn:

  • Previous input tokens (system message, user question 1, assistant response 1): 6 + 9 + 21 = 36 tokens
  • New user question: “How can I append an item to a list?” (10 tokens)
  • Assistant response: “To append an item to a list in Python, use the append() method. For example: my_list.append(6) adds the number 6 to the end of my_list.” (22 tokens)
  • Total tokens for the second API call: 36 previous input tokens + 10 new input tokens + 22 output tokens = 68 tokens

In this example, you would be billed for 36 tokens for the first API call and 68 tokens for the second API call. This demonstrates how token usage can quickly escalate in chat completion, emphasizing the need to consider the number of turns allowed per session when designing chatbots.
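The accounting above can be sketched in a few lines of JavaScript. This is an illustrative helper (the function name and structure are my own), using GPT-4's 8K-context rates at the time of writing: $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens. Prompt tokens for each call include the full conversation history, which is why the per-call bill grows each turn:

```javascript
// Illustrative token/cost accounting for a multi-turn chat session.
const PROMPT_RATE = 0.03 / 1000;     // $ per prompt token (GPT-4 8K)
const COMPLETION_RATE = 0.06 / 1000; // $ per completion token (GPT-4 8K)

// turns: array of { input, output } token counts for each turn
const sessionUsage = (turns) => {
  let history = 0; // tokens re-sent as context on every call
  return turns.map(({ input, output }) => {
    const promptTokens = history + input; // prior messages + new question
    const billedTokens = promptTokens + output;
    history += input + output;            // the response joins the history
    return {
      billedTokens,
      cost: promptTokens * PROMPT_RATE + output * COMPLETION_RATE,
    };
  });
};

// The worked example above: (6 + 9) input + 21 output, then 10 input + 22 output
console.log(sessionUsage([{ input: 15, output: 21 }, { input: 10, output: 22 }]));
// billedTokens: 36 for the first call, 68 for the second
```

Summing `billedTokens` across turns gives the total tokens billed for the session, which is useful when budgeting the number of turns allowed per conversation.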

Choosing the Right Model for Your Project

When deciding between GPT-3.5-turbo and GPT-4 for your chatbot project, there are several factors to consider. Each model has its own strengths and weaknesses, so understanding the differences between them is crucial for making an informed decision.

GPT-3.5-turbo: This model has been the backbone of ChatGPT for some time and, as a result, is more mature and optimized. It offers faster response times and is significantly cheaper than GPT-4. Due to its lower cost and optimization, GPT-3.5-turbo is an excellent choice for multi-turn chatbot tasks. It’s perfect for developers looking for a balance between affordability and capability.

GPT-4: While GPT-4 is the latest and most advanced model, it comes with a higher price tag. However, it does provide more accurate results, which can be critical for certain applications. GPT-4 is ideal for single-turn tasks where accuracy is the top priority, and its cost can be justified by the improved performance.

💡 To reiterate my earlier recommendation, developers should generally opt for the GPT-3.5-turbo model for multi-turn chatbot tasks and reserve GPT-4 for single-turn tasks. However, this may vary depending on your specific requirements, as accuracy can sometimes take precedence over cost considerations.

Ultimately, the choice between GPT-3.5-turbo and GPT-4 will depend on your project’s goals and budget. By considering the trade-offs between cost, response time, and accuracy, you can make the best decision for your chatbot application.

Conclusion

Throughout this tutorial, we’ve explored the ins and outs of the GPT-4 model, delving into its features, capabilities, and pricing. We also examined how to build a chatbot using the GPT-4 API and compared it to the more mature GPT-3.5-turbo model, providing practical recommendations for choosing the right model based on your project’s requirements and priorities.

As we continue to benefit from AI assistance in our daily tasks, it is important to remain mindful of its potential impact on the job market. It is a bittersweet feeling for me, as I am extremely concerned about our future, given the speed at which AI technology is evolving. This progress has far exceeded my expectations in the past few months. Additionally, it appears that regulators are currently slow to respond to these concerns, which is worrying. 😟


Olivia Brown
Geek Culture

I write about best practices, and innovative solutions in software development, business strategy, and online marketing, with a focus on technology.