OpenAI has released a new model, GPT-4o with Vision capabilities built right into its API. It is advertised as more accurate, faster and half the cost of the vision capabilities in the previous model. In this video we put that to the test, and try out using a python script to extract text off invoices (even handwritten ones). Also, I will show some tricks to get consistent output from the API for different types of images.
GitHub Link to Starter code:
https://github.com/AI-Unleashed/GPT4o_Vision
Download
0 formats
No download links available.
GPT-4o Vision API: How to Copy Text from Image (OCR in Python) | NatokHD