Grok, the artificial intelligence chatbot built into X.com, has quietly gained the ability to analyze images. I’ve been testing it, and it does a pretty good job until you hit the usage limit on the free account, which at the time of writing is set at a rather low three uploads.
To use Grok’s new image analysis capabilities on your mobile device, open the X app, tap the Grok icon (the square with a line through it) at the bottom of the screen, then tap the + button to upload an image. In your browser, go to X.com and click Grok in the left menu, then use the paperclip button to attach the image you want to upload. Once it has uploaded, you can ask Grok questions about it.
Analyze images
First, I uploaded a cartoon of Odysseus, the king from Greek mythology who appears in Homer’s The Odyssey (I’ve just watched The Return, so bear with me), to see if Grok would recognize him. Grok did a good job of recognizing that this was a historical figure drawn in a cartoon style, and I could even get it to generate more images of a similar nature by entering prompts like “Redo the image but make it a cartoon of a woman instead”.
Being able to analyze the content of an image so that it can be reproduced with changes is a useful capability, but it is nothing that competitors like ChatGPT can’t also do. So what about understanding the text in an image?
Analyze text in images
I uploaded a picture of a flyer for a local fitness class and asked Grok to tell me what text it found in the picture. It extracted all the text perfectly and even provided clickable links for the URLs it found. It did not, however, turn the Instagram account name into a link, although ChatGPT didn’t do that in my testing either.
Being able to extract text from an image is one thing, but Grok also needs to be able to analyze that text. To test this, I uploaded the schedule of a local martial arts gym and asked if there were any BJJ classes I could go to every Thursday. It gave the perfect answer: “Yes, there are BJJ classes on Thursdays at 7:00 AM (Adult and Youth BJJ Gi) and 8:00 PM (Adult and Youth BJJ No-Gi).” A feature like this is really useful for people who have difficulty processing visual information.
To push Grok’s image analysis further, I tried uploading an academic text as a PDF to see what Grok would make of it, but it turns out that PDF uploads are not supported on Grok unless you upgrade to Premium. Undeterred, I took a screenshot of the first page of the document and asked Grok to summarize the text. Once again it did an excellent job, breaking its answer down under subheadings such as “Research Results,” “Scholarly Contributions,” and “Historical Background,” whereas ChatGPT simply generated a few paragraphs of summary.
Grok and ChatGPT
The biggest problem with Grok right now is that you quickly hit the free usage limit for uploaded images, although to be fair, you hit it pretty quickly on ChatGPT’s free tier too. Three uploads a day is not a lot. Beyond that, Grok is very good at analyzing images, even beating ChatGPT in some areas, and it’s well worth a look if this feature sounds useful to you.