Find helpful tutorials or share inspiring use cases on AI technology for higher education.

Shared content may not reflect the policies of Tilburg University on the use of AI. 

How to Use the ChatGPT Vision Feature: 3 Study Hacks for Students

Ever come across a graph in your lecture slides or handwritten notes that was simply too difficult to understand? ChatGPT’s ability to interpret images can help. In this article, we explore three study hacks that show how to use the ChatGPT Vision feature effectively. Using ChatGPT for your studies is no longer limited to text-based interactions.

In our previous post, we mentioned the release of ChatGPT’s new feature: the ability to upload images and have ChatGPT recognize and describe their contents. This article expands on the relatively simple example from that post and explores the study options that ChatGPT’s vision provides. It shows that ChatGPT can not only see images but also (i) make connections between multiple uploaded images, which lets it give more contextually relevant responses, and (ii) understand the meaning of specific parts of an image, such as highlighted notes. This lets it pick up on nuances within visual data.

Showing ChatGPT your lecture materials

Before ChatGPT gained vision, using it as a study guide was limited to text: you could only give it snippets of the written material from your lectures. Now, with ChatGPT’s vision capabilities, you can also provide visual information from your lectures. ChatGPT can analyze images, diagrams, and other visuals, which allows it to give more comprehensive and accurate guidance based on both text and visual content.

Example

Imagine you are studying a complex graph that comes with an equally complex formula in your lecture slides, like the one shown in the prompt below. You can simply upload an image of the graph and formula and ask specific questions about it.

Uploaded image(s)

Prompt

Output

As you can see, ChatGPT gives a very detailed explanation of both the graph and the formula in the picture. It also recognizes how the two relate, stating: ‘The graph is a representation of this utility function’.

Showing ChatGPT your handwritten notes

It does not stop at lecture slides. ChatGPT can also read photos of the quick handwritten notes you made during a lecture in which the teacher talked way too fast, leaving you with a jumble of half-written sentences and hastily sketched diagrams. ChatGPT’s vision can not only read your hurried handwriting but also understand the significance of highlighted portions. It can discern the meaning behind marked sections, whether they signify crucial concepts or are highlighted for another reason (as in the example below).

Example

Uploaded image(s)

Prompt
Can you explain these economic games to me? Please explain all of them one by one

Output

As you can see, all games are explained one by one, and ChatGPT even understands that the highlighted parts are the choices the players make.

Making code from your sketches

ChatGPT’s vision can also be used to transform your sketches into actual code, so you can turn design ideas into a working starting point, as the example below shows. ChatGPT provides a complete directory structure and some basic code to build on.

Uploaded image(s)

Prompt
These are two pages I want to create with the help of Flask in Python. Can you write the code for me?

Output
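
As a rough illustration of the kind of starting point to expect (this is not ChatGPT’s actual output), a minimal two-page Flask app could look like the sketch below. The page names, route paths, and page text are assumptions made for this example and are not taken from the sketches. To keep everything in one file, the sketch uses an inline template instead of the separate templates folder that a generated directory structure would normally contain.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Hypothetical shared layout, kept inline so the example fits in one file.
# In a larger project this would usually live in templates/*.html.
LAYOUT = """<!doctype html>
<title>{{ title }}</title>
<nav>
  <a href="{{ url_for('home') }}">Home</a> |
  <a href="{{ url_for('about') }}">About</a>
</nav>
<h1>{{ title }}</h1>
<p>{{ body }}</p>
"""


@app.route("/")
def home():
    # First page (assumed name: "Home").
    return render_template_string(
        LAYOUT, title="Home", body="Welcome to the first page."
    )


@app.route("/about")
def about():
    # Second page (assumed name: "About").
    return render_template_string(
        LAYOUT, title="About", body="This is the second page."
    )
```

Assuming the file is saved as app.py, running `flask --app app run` starts a local development server where both pages can be opened in the browser.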

Limitations

Of course, ChatGPT’s vision also has its limitations:

  • The first is that it still makes mistakes, even on seemingly very simple tasks. The example below, where I asked ChatGPT to read the time on a clock, shows this.
  • The second limitation is the recognition of real people. This is not a limitation per se: it is good practice for ChatGPT not to talk about public figures and to make sure privacy rules are followed. It can, however, limit the questions you can ask. For example, when shown a picture of Obama’s inauguration, ChatGPT will not say anything specific about the person in the picture, in this case Obama.


Overall, ChatGPT’s vision makes it easier to visualize, share, and understand complex concepts. Visual representations of the course material can now be combined seamlessly with textual explanations.