ChatGPT is a versatile language model that developers can embed in apps, bots, and data pipelines. This guide walks you through the core concepts, how to set up the OpenAI API, typical request‑response flows, advanced patterns like function calling, and the pitfalls that trip up most newcomers.
ChatGPT is built on OpenAI's GPT family of models. It generates text by predicting the next token from the prompt you send. A token is roughly 4 characters of English text, or about three‑quarters of a word, so a 100‑word passage uses about 130 tokens.
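The four‑characters‑per‑token rule of thumb can be turned into a quick budget check. The sketch below is only a heuristic: `estimate_tokens` is a hypothetical helper, not part of the OpenAI SDK, and a real tokenizer (for example the tiktoken library) gives exact counts.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English text; an approximation only


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of English text (hypothetical helper)."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Explain HTTP status 418 in one short paragraph."
print(estimate_tokens(prompt))
```

An estimate like this is good enough for sizing prompts and setting budgets; use a tokenizer when you need to stay just under a hard limit.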
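The Chat Completions API is stateless: each request is independent, so you include the entire conversation in the messages array. A stateful design moves that bookkeeping to your backend. The client keeps a short‑lived session ID and sends only the new user message, and your server stores the history and forwards the full conversation to the API. This reduces the payload between client and server, not the number of tokens sent to the model.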
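One way to picture the stateful design is a thin server‑side store keyed by session ID. The following is a minimal sketch under stated assumptions: the in‑memory dictionary and the `build_messages` and `record_reply` helpers are hypothetical names, and a production service would likely use a database or cache with expiry instead.

```python
from typing import Dict, List

Message = Dict[str, str]

# In-memory store for illustration only; it is lost on restart and never expires.
_sessions: Dict[str, List[Message]] = {}


def build_messages(session_id: str, user_text: str,
                   system_prompt: str = "You are a helpful coding assistant.") -> List[Message]:
    """Append the new user message and return the full history to send to the API."""
    history = _sessions.setdefault(
        session_id, [{"role": "system", "content": system_prompt}]
    )
    history.append({"role": "user", "content": user_text})
    return list(history)  # copy, so callers cannot mutate the stored history


def record_reply(session_id: str, assistant_text: str) -> None:
    """Store the assistant's reply so the next turn includes it."""
    _sessions[session_id].append({"role": "assistant", "content": assistant_text})


first = build_messages("abc123", "How do I sort an array in JavaScript?")
record_reply("abc123", "Use Array.prototype.sort().")
second = build_messages("abc123", "Show me an example with numbers.")
print(len(first), len(second))  # prints: 2 4
```

The client only ever sends a session ID and one new message, while the list returned by `build_messages` is what goes into the `messages` field of the API request.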
| Model | Input ($/1k tokens) | Output ($/1k tokens) |
|---|---|---|
| gpt‑4o | 0.005 | 0.015 |
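| gpt‑4‑turbo | 0.01 | 0.03 |
| gpt‑3.5‑turbo | 0.0005 | 0.0015 |
Choosing the right model means balancing cost against capability. At these prices, gpt‑4o is cheaper than gpt‑4‑turbo on both input and output, which makes it a sensible default for most dev tools, while gpt‑3.5‑turbo suits high‑volume, low‑stakes work.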
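To compare models on cost, it helps to work the arithmetic for a typical request. The sketch below copies two rows of per‑1k‑token prices from the table above; `estimate_cost` is a hypothetical helper, and the prices are a snapshot that will drift over time.

```python
# USD per 1,000 tokens, copied from the table above; treat as a snapshot.
PRICES = {
    "gpt-4o": {"input": 0.005, "output": 0.015},
    "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    price = PRICES[model]
    return (input_tokens / 1000) * price["input"] + (output_tokens / 1000) * price["output"]


# Compare a request with 1,000 prompt tokens and 500 completion tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 1000, 500):.5f}")
```

Multiplying the per‑request figure by expected daily traffic gives a first‑order budget before you commit to a model.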
Log in to platform.openai.com, generate a key, and store it in .env as OPENAI_API_KEY. Never commit the key to source control.
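Reading the key at startup and failing fast when it is missing avoids confusing authentication errors later. This is a small sketch using only the standard library; `load_api_key` is a hypothetical helper. Python does not read .env files on its own, so it assumes the variable has already been exported or loaded (the python-dotenv package is one common way to do that).

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the API key from the environment, failing fast if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or load your .env file first"
        )
    return key
```

Passing the environment as a parameter keeps the function easy to test without touching real credentials.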
# Python
pip install openai
# Node.js
npm install openai
Run a quick test to ensure the key works.
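# Python (openai SDK v1 or later, which is what pip installs today)
import os
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
)
# Print only the assistant's reply text
print(response.choices[0].message.content)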
Send a single user message and receive a reply.
// Node.js (run as an ES module so top-level await works)
import OpenAI from "openai";
const client = new OpenAI({apiKey: process.env.OPENAI_API_KEY});
const response = await client.chat.completions.create({
model: "gpt-4-turbo",
messages: [{role:"user",content:"Explain HTTP status 418"}],
max_tokens: 120
});
console.log(response.choices[0].message.content);
Maintain context by appending prior messages.
let history = [
{role:"system",content:"You are a helpful coding assistant."},
{role:"user",content:"How do I sort an array in JavaScript?"}
];
history.push({role:"assistant",content:"Use Array.prototype.sort()."});
history.push({role:"user",content:"Show me an example with numbers."});
const result = await client.chat.completions.create({
model:"gpt-4-turbo",
messages:history,
max_tokens:150
});
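Enable real‑time UI updates by setting stream:true. The awaited call then resolves to a stream you can consume with for await.
// Await the call first: it resolves to an async-iterable stream.
const stream = await client.chat.completions.create({
  model: "gpt-4-turbo",
  messages: [{role: "user", content: "Write a haiku about code."}],
  stream: true
});
for await (const chunk of stream) {
  // Some chunks carry no text (for example the final one), so default to "".
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}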
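The same pattern in Python looks like the sketch below. In the v1 Python SDK, `client.chat.completions.create(..., stream=True)` returns an iterable of chunks with the same `choices[0].delta.content` shape. To keep the example runnable without a network call, `collect_stream` (a hypothetical helper) accepts any iterable, and `fake_chunk` builds stand‑in objects for demonstration only.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each text delta as it arrives and return the full reply."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:      # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:                  # content is None on role-only or final chunks
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute shape (demonstration only)."""
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


reply = collect_stream([fake_chunk("Code "), fake_chunk(None), fake_chunk("flows.")])
```

With a real client, pass the object returned by the streaming call straight to `collect_stream`.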
Define a JSON schema and let the model return data that matches it.
const functions = [{
name:"get_weather",
description:"Fetch current weather for a city",
parameters:{
type:"object",
properties:{
city:{type:"string",description:"City name"},
unit:{type:"string",enum:["celsius","fahrenheit"]}
},
required:["city"]
}
}];
const resp = await client.chat.completions.create({
model:"gpt-4o",
messages:[{role:"user",content:"What’s the weather in Berlin?"}],
functions,
function_call:"auto"
});
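When the model decides to call the function, the assistant message in the response includes a function_call object with the function name and an arguments field. The arguments arrive as a JSON string, so parse and validate them before passing them to your backend. Newer API versions deprecate functions and function_call in favor of tools and tool_choice, which follow the same pattern.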
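On the backend, handling a function call comes down to parsing the arguments string and routing to a local function. The sketch below assumes the function_call has already been extracted from the response as a plain dictionary; `REGISTRY`, `dispatch_function_call`, and the placeholder `get_weather` body are hypothetical.

```python
import json


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Placeholder backend function; a real one would query a weather service."""
    return {"city": city, "unit": unit, "temperature": None}


REGISTRY = {"get_weather": get_weather}


def dispatch_function_call(function_call: dict) -> dict:
    """Parse the model's function_call and invoke the matching local function."""
    name = function_call["name"]
    if name not in REGISTRY:
        raise ValueError(f"Model requested unknown function: {name}")
    # `arguments` is a JSON string and may be malformed, so parse defensively.
    try:
        args = json.loads(function_call["arguments"])
    except json.JSONDecodeError as exc:
        raise ValueError(f"Invalid JSON arguments for {name}") from exc
    if not isinstance(args, dict) or "city" not in args:
        raise ValueError("Missing required argument: city")
    return REGISTRY[name](**args)


result = dispatch_function_call(
    {"name": "get_weather", "arguments": '{"city": "Berlin"}'}
)
print(result)
```

An explicit registry means the model can only trigger functions you have chosen to expose, whatever name it returns.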
Combine vector search (e.g., Pinecone) with ChatGPT to answer domain‑specific questions.
Embed your documents and queries with text-embedding-3-large.
When generating many short completions (e.g., bulk summarization), use Promise.all in Node or asyncio.gather in Python to send requests concurrently. Cap the number in flight (for example, at 20) so that bursts stay under your account's rate limits.
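Concurrent batching with a cap can be sketched with asyncio.gather and a semaphore. The cap of 20 is illustrative, not a guaranteed safe value, and `fake_summarize` is a hypothetical stand‑in for a real asynchronous API call.

```python
import asyncio
from typing import Awaitable, Callable, List

MAX_CONCURRENCY = 20  # illustrative cap; tune it to your account's rate limits


async def run_batch(items: List[str],
                    worker: Callable[[str], Awaitable[str]],
                    limit: int = MAX_CONCURRENCY) -> List[str]:
    """Run worker(item) for every item with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item: str) -> str:
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_summarize(text: str) -> str:
    """Stand-in for an API call (hypothetical); swap in a real async client call."""
    await asyncio.sleep(0.01)
    return text[:10]


summaries = asyncio.run(
    run_batch(["first long document", "second long document"], fake_summarize)
)
print(summaries)
```

The semaphore bounds concurrency without splitting the input into fixed chunks, so a slow request never holds up a whole batch.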
- Set max_tokens based on UI constraints.
- Use gpt-3.5-turbo for drafts, and upgrade to gpt‑4‑turbo only for final polishing.
- Review logprobs once per day to monitor token efficiency.

Embedding the key in source files leads to accidental leaks. Always read it from environment variables or a secret manager.
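Requests that exceed the model’s context window (e.g., 128k tokens for gpt‑4o) are rejected with an error rather than silently truncated, and if your own code drops messages blindly to make a request fit, important context is lost. Trim older messages deliberately or replace them with a summary.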
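Trimming older messages can be sketched as a budget walk from the newest message backwards. This is a minimal illustration: `rough_tokens` uses a characters‑divided‑by‑four approximation rather than a real tokenizer, and `trim_history` is a hypothetical helper.

```python
from typing import Dict, List

Message = Dict[str, str]


def rough_tokens(message: Message) -> int:
    """Approximate tokens as characters / 4; a real tokenizer is more accurate."""
    return len(message["content"]) // 4 + 1


def trim_history(history: List[Message], budget: int) -> List[Message]:
    """Keep system messages plus the most recent turns that fit within `budget`."""
    system = [m for m in history if m["role"] == "system"]
    others = [m for m in history if m["role"] != "system"]
    used = sum(rough_tokens(m) for m in system)
    kept: List[Message] = []
    for message in reversed(others):   # walk from newest to oldest
        cost = rough_tokens(message)
        if used + cost > budget:
            break                      # everything older is dropped too
        kept.append(message)
        used += cost
    return system + list(reversed(kept))
```

System messages are always kept because they carry the instructions that shape every reply. The dropped turns could instead be condensed into a single summary message.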
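Higher temperature yields more creative output but reduces consistency. For near‑deterministic answers, set temperature: 0 and leave top_p at its default of 1. Even then, identical prompts can occasionally produce different completions.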
The API returns HTTP 429 for rate limits and 500 for internal errors. Implement exponential backoff and retry logic.
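Exponential backoff with retries can be sketched as follows. `RetryableError` is a hypothetical marker exception that your own code would raise when it sees a 429 or 500 response, and the delays and attempt count are illustrative defaults rather than recommended values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RetryableError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. HTTP 429 or 500)."""


def with_backoff(call: Callable[[], T], max_attempts: int = 5,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `call`, retrying on RetryableError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise                                    # out of attempts
            delay = base_delay * (2 ** attempt)          # 1s, 2s, 4s, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads retries
    raise AssertionError("unreachable")
```

The jitter keeps many clients from retrying in lockstep after a shared outage. The official Python SDK also retries some failures on its own, configurable through the client's `max_retries` option, so a wrapper like this is mainly useful around custom HTTP calls or when you need tighter control.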
A vague system prompt leads to unpredictable tone. Example of a good prompt: “You are a concise, friendly developer assistant. Answer in plain JavaScript unless otherwise requested.”
You need an API key to use the API: it authenticates every request. Create one in the OpenAI dashboard and store it securely as an environment variable.
Python, Node.js, Java, .NET, Go, and Ruby have first‑party libraries. Community SDKs exist for Rust, PHP, and Swift.
To limit response length, set the max_tokens parameter in the request payload. For example, max_tokens: 150 stops generation after at most 150 tokens, even mid‑sentence.
Responses can be streamed: include stream: true in the request body, and the API returns a Server‑Sent Events (SSE) stream you can read line by line.
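Reading the SSE stream line by line can be sketched as below, for cases where you consume the raw HTTP response instead of using an SDK. Each event is a line starting with `data:` followed by a JSON chunk, and the stream ends with `data: [DONE]`. The sample lines are trimmed‑down illustrations of that shape, and `iter_sse_text` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from the lines of a chat-completions SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue                    # skip blank separators and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":         # sentinel that ends the stream
            break
        event = json.loads(payload)
        choices = event.get("choices") or []
        if choices:
            text = choices[0].get("delta", {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample)))  # prints: Hello
```

The official SDKs do this parsing for you, so a hand‑rolled reader is mainly relevant in languages or runtimes without one.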
Rate‑limit errors usually come from exceeding the per‑minute request quota, sending too many tokens per minute, or sharing one API key across many clients on a low usage tier.
With this guide you can start building with ChatGPT today, avoid typical pitfalls, and scale your AI features responsibly.