The OpenAI API is a recurring feature on our website, tilburg.ai, and we know that seeing Python code can feel overwhelming at first. That’s why, in this article, we lend a hand and provide ready-made templates. All you need to do is copy the code into your favorite IDE, such as VSCode, and with a few small adjustments, you’ll be ready to go. That’s it, we promise!
We will start with the most basic example discussed in our beginner’s tutorial. We begin here because we want to build from the ground up and guide you on how to use the application as quickly and efficiently as possible. Additionally, we will provide some examples of how to use this shortcut. After that, we will explore the multimodal functions of the API. Multimodal means that we move beyond just using text models. This is something we have already touched upon in this tutorial, but now we will present it in a template format.
In short, we have a lot to cover, so let’s get started quickly!
Requirements:
- Python is installed on your computer. Check out this article from Tilburg Science Hub for a tutorial on this.
- An IDE, such as VSCode, is installed on your computer. Take a look at this tutorial from Tilburg Science Hub on how to install it.
- Familiarity with the OpenAI API at a beginner level. Everything you need to know and set up can be found in this tutorial on Tilburg.ai.
Wrapping the API inside a Function
Manually typing out the full API call whenever you want to use it during your studies is quite time-consuming. Fortunately, there is a simple solution: wrapping the code inside a function. This makes API requests efficient and allows for easy reuse throughout your projects.
from openai import OpenAI

# Set your API key
client = OpenAI(api_key="YOUR API KEY")

def ChatGPT_API(prompt):
    # Create a request to the chat completions endpoint
    response = client.chat.completions.create(
        model="gpt-4o",
        # Assign the role and content for the message
        messages=[{"role": "user", "content": prompt}],
        temperature=0)  # Temperature: controls the randomness of the response (0 is deterministic, 2 is highly random)
    return response.choices[0].message.content

# Call the function with your prompt
response = ChatGPT_API("Specify your prompt here")  # Example: "What is Artificial Intelligence?"
print(response)
So now you only need to specify your model once; from then on, you simply call the function, in this case ChatGPT_API, with your specific prompt.
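For example, once the function is defined, you can reuse it for any number of prompts without repeating the setup (the prompts below are purely illustrative):
# Reuse the same function for different prompts; the model and settings
# stay fixed inside ChatGPT_API, only the prompt changes
questions = [
    "What is Artificial Intelligence?",
    "Explain the difference between supervised and unsupervised learning.",
]

for question in questions:
    print(ChatGPT_API(question))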
Have Your Text’s Grammar, Spelling, and Punctuation Checked in One Go
Here we make use of our earlier defined function. Instead of just sending specific prompts to the function, we can also build our prompt and add files to it using something called an f-string. Inside an f-string, we can insert variables, such as the contents of a file, by placing them between curly braces {}. Okay, enough technical jargon. Let’s take a look at an example.
The only thing you are required to do is to add a Word document, and the model will assist you with spelling, grammar, and punctuation checks. Perfect!
import docx2txt  # pip install docx2txt (if not yet installed)

# Read in the Word file
text = docx2txt.process("Path/to/your/file.docx")

# Craft a prompt to transform the text for writing improvement
prompt = f"""Proofread the text delimited by triple backticks without changing its structure.
```{text}```"""

# Use the function we defined earlier
response = ChatGPT_API(prompt)

print("Before transformation:\n", text)
print("After transformation:\n", response)
Have a Coding Snippet Explained in One Go
This method is not limited to just text; it can also be used to explain code. Here, we use the f-string in combination with a text prompt again, but this time we add code instead of text. We also use a chain-of-thought prompt, which asks the model to explain everything step by step, providing a structured guide to understanding the code.
A chain-of-thought prompt breaks down complex concepts into manageable steps, making the explanation more digestible and more elaborate at each step. Each step logically follows the previous one, creating a coherent and comprehensive explanation that guides you through the entire process.
code = """
income_split <- initial_split(income_frac, prop = 0.80, strata = income)
income_train <- training(income_split)
income_test <- testing(income_split)
"""
# Use a chain-of-thought prompt that asks the model to explain what the code does
prompt = f"""Explain what the code delimited by triple backticks does. Let's think step by step.```{code}```"""
response = ChatGPT_API(prompt)
print(response)
The model’s response looks something like this:
Sure, let’s break down the code step by step to understand what it does.
### Step 1: `initial_split`
```R
income_split <- initial_split(income_frac, prop = 0.80, strata = income)
```
- **Function**: `initial_split`
- **Arguments**:
- `income_frac`: This is the dataset that you want to split.
- `prop = 0.80`: This specifies that 80% of the data should be used for training, and the remaining 20% will be used for testing.
- `strata = income`: This argument indicates that the split should be stratified based on the `income` variable. Stratified sampling ensures that the proportion of different levels of the `income` variable is maintained in both the training and testing sets.
....
### Summary
1. **Splitting the Data**: The `initial_split` function splits the `income_frac` dataset into training and testing sets, with 80% of the data allocated to training and 20% to testing. The split is stratified based on the `income` variable to ensure proportional representation.
2. **Extracting Training Set**: The `training` function extracts the training set from the split object and assigns it to `income_train`.
3. **Extracting Testing Set**: The `testing` function extracts the testing set from the split object and assigns it to `income_test`.
By the end of this code, you have two datasets: `income_train` (the training set) and `income_test` (the testing set), both derived from the original `income_frac` dataset with an 80-20 split, stratified by the `income` variable.
Using Multimodality in the API
Speech-to-Text Transcription with Whisper
Audio recordings or videos can be transformed into text using OpenAI’s model Whisper. Using Whisper for transcription can be beneficial for both students and teachers: it saves time by automating the transcription process instead of manually transcribing lengthy audio recordings. It can be used in practice in the following ways:
- Note-taking during lectures: Transcribe online lectures, allowing you to focus on understanding the content instead of taking extensive notes. Teachers can also benefit from this feature when recording their lectures or presentations.
- Improved documentation and follow-up during meetings: Transcribing meetings gives you a detailed and accurate written record, making it easier to track action items, decisions, and key points discussed during the meeting. This leads to efficient follow-up, ensuring that tasks and responsibilities are properly documented and assigned. It is also helpful if you missed a meeting, as you can easily read back the transcript.
The template supports audio files up to 25 MB. For larger files, check the following tutorial on our website!
from openai import OpenAI

# Set your API key
client = OpenAI(api_key="YOUR API KEY")

# Open the audio file
audio_file = open("PATH/TO/Your/FILE.mp3", "rb")  # "rb" stands for read binary, the file mode used for non-text files

# Create a transcript from the audio file
response = client.audio.transcriptions.create(model="whisper-1", file=audio_file)

# Extract and print the transcript text
print(response.text)
Again, just copy the code; there are only two steps you need to take to get your transcription. First, enter your api_key, and second, load in your file as audio_file. And that’s it! Those transcribing days are finally over ; )
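And if your recording exceeds the 25 MB limit mentioned above, the rough idea is to split the audio into smaller chunks and transcribe each chunk separately. Below is a minimal sketch of that idea, assuming the pydub package (and ffmpeg) is installed; the chunk length and file names are illustrative, and client is the OpenAI client from the template above:
from pydub import AudioSegment

# Sketch for files over 25 MB: split the audio into 10-minute chunks
# (pydub and ffmpeg are assumed to be installed; 10 minutes is arbitrary)
audio = AudioSegment.from_file("PATH/TO/Your/FILE.mp3")
chunk_length_ms = 10 * 60 * 1000  # pydub works in milliseconds

transcript_parts = []
for start in range(0, len(audio), chunk_length_ms):
    chunk = audio[start:start + chunk_length_ms]
    chunk.export("chunk.mp3", format="mp3")  # temporary file for the API call
    with open("chunk.mp3", "rb") as chunk_file:
        part = client.audio.transcriptions.create(model="whisper-1", file=chunk_file)
    transcript_parts.append(part.text)

# Glue the partial transcripts back together
print(" ".join(transcript_parts))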
From Audio to Transcript to Summary to PowerPoint in one Go
So far, when we’ve been using the OpenAI API, we have specified an input file or prompt to the model and received an output. This opens up a lot of opportunities, from asking a question and getting an answer to transcribing an audio or video file. But what if we could do more with the model’s output?
Enter model chaining. Chaining is when models are combined by feeding the output from one model directly into another model as input. We can chain multiple calls to the same model together or use different models. If we chain two text models together, we can ask the model to perform a task in one call to the API and send the result back with an additional instruction, as in the sketch below. We can also combine two different types of models, like the Whisper model and a text model, to perform tasks like summarizing lectures and videos and converting them into a PowerPoint slide deck.
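To make chaining concrete before the full template, here is a minimal sketch that chains two calls to the ChatGPT_API function we defined earlier (the prompts are purely illustrative):
# First call: perform a task
summary = ChatGPT_API("Summarize the concept of stratified sampling in three sentences.")

# Second call: feed the first output back in with an additional instruction
follow_up = ChatGPT_API(f"Translate the text delimited by triple backticks to Dutch. ```{summary}```")
print(follow_up)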
For the code below, all you need to do is the same two-step procedure: specify your API key and your file, and that’s it!
In this instance, we have used a specific example of prompt engineering: multi-step prompting. This approach breaks down a complex task into smaller, manageable steps, with each step incrementally contributing to the final desired outcome. By doing so, we can tackle complex problems that require several transformations, like this one, more efficiently and systematically. Naturally, you are encouraged to adjust the prompt to your specific needs!
from openai import OpenAI

# Set your API key
client = OpenAI(api_key="YOUR-API-KEY")

# Open the audio/video file
audio_file = open("PATH/TO/Your/FILE.mp4", "rb")

# Create a transcription request using audio_file
audio_response = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file
)

# Save the transcript
transcript = audio_response.text

# Your prompt is now chained with the transcript
## Here we have engineered our prompt: multi-step prompting
prompt = """Transform the uploaded transcript with the following four steps:
Step 1 - Proofread it without changing its structure
Step 2 - If it is not in English, translate it to English
Step 3 - Summarize it, at a depth level that is appropriate for a university exam
Step 4 - Generate a PowerPoint slide deck about the material
""" + transcript

# Send the chained prompt to the chat completions endpoint
chat_response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "Act as a helpful assistant"},
        {"role": "user",
         "content": prompt}
    ],
    temperature=0)

print(chat_response.choices[0].message.content)
Wow, in exactly 36.58 seconds:
- We have transcribed a video file and extracted the text.
- This text has been proofread, checked for spelling errors, and corrected.
- If the video had been in a language other than English, it would have been translated.
- After that, the video has been summarized.
- Finally, a PowerPoint slide deck has been created.
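One caveat: the API returns the slide deck as text, not as an actual .pptx file. If you want a real PowerPoint file, a minimal sketch could look like this, assuming the python-pptx package is installed and that you parse the model’s output into title/content pairs yourself (the slides list below is a placeholder):
from pptx import Presentation

# Placeholder slides; in practice, parse these from the model's text output
slides = [
    ("Lecture Overview", "Main themes covered in the recording."),
    ("Key Takeaways", "Summary points at exam-level depth."),
]

prs = Presentation()
for title, content in slides:
    slide = prs.slides.add_slide(prs.slide_layouts[1])  # title + content layout
    slide.shapes.title.text = title
    slide.placeholders[1].text = content
prs.save("lecture_slides.pptx")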
Conclusion
In this article, we take a careful but deep dive into the unknown waters of the OpenAI API: careful, by providing templates that non-coders can use right away, and deep, because the templates highlight how to use the API effectively. Along the way, we also touch on prompt engineering with chain-of-thought and multi-step prompting. Our advice:
- Copy and paste the code, add your API key, and save it as a file.
- If you ever need it, grab the template code, add your prompt and your file, and run the code.