The most insightful computer vision project

10 Feb 2022

I am a tech enthusiast who loves exploring new technologies, so I was looking forward to learning new ideas and innovations in computer vision and machine learning.

Computer vision is a field traditionally reserved for researchers and engineers with advanced degrees, and because of that I felt very intimidated at the beginning. As I researched, I became a lot more comfortable with the topic. Uncovering some of the mystery behind the “magic” gave me a lot of insights, and I think many people could benefit from knowing a bit more about how computer vision works and, especially, from knowing that there is no reason to be intimidated. That’s when I decided to kick off this little project: an open-source collection of solutions to different real problems. The project has one major goal: to share insights and empower developers to leverage computer vision in their own projects.

## Computer vision

The core application of computer vision is image understanding. This also covers video, which is technically a sequence of images (frames). Understanding an image is a complex, multi-step problem.

Image understanding involves several tasks; some are low-level tasks that serve as building blocks for others, while some are high-level tasks. Some of the low-level tasks are:

- Image cleaning
- Image segmentation
- Histogram analysis
- Image colour space translation
- Image transformation
- Image edge detection and approximation of contours and lines
- Image convolutions
- Etc.
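To make two of the low-level tasks above more concrete (convolutions and edge detection), here is a minimal sketch in plain Python. The function name `convolve2d`, the toy 5×5 image, and the choice of a Sobel kernel are illustrative assumptions, not code taken from the project; in practice a library such as NumPy or OpenCV would usually do this work.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the filtered grid.

    Strictly speaking this is a cross-correlation (the kernel is not flipped),
    which is the convention most deep learning libraries call "convolution".
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for row in range(out_h):
        out_row = []
        for col in range(out_w):
            total = 0
            # Multiply the kernel with the image patch under it and sum up.
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[row + i][col + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


# Sobel kernel that responds to horizontal changes in brightness,
# i.e. it highlights vertical edges.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# Toy grayscale image (assumed for illustration): dark on the left,
# bright on the right, so there is one vertical edge.
image = [[0, 0, 10, 10, 10] for _ in range(5)]

edges = convolve2d(image, SOBEL_X)
for row in edges:
    print(row)  # each row prints [40, 40, 0]
```

The large values mark the positions where the window straddles the dark-to-bright boundary, while the flat bright region produces 0.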

Some of the high-level tasks (which usually build on the low-level ones) are:

- Object detection
- Object recognition
- Object segmentation and localization
- Object tracking
- Feature extraction
- Feature, colour correction
- Feature reconstruction, approximation
- Etc.

Applications of CV are usually built on high-level tasks. Some examples are:

- Face detection in cameras
- Pedestrian, car, and road detection in smart (self-driving) cars
- Terrain detection in drones and airplanes
- Vehicle license plate scanners at security checkpoints
- Etc.

However, thanks to advances in machine learning, most CV applications now use deep learning to get better accuracy. Even then, CV’s low-level tasks are still used for image pre-processing, that is, preparing images before feeding them into deep learning networks.
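As a rough illustration of what such pre-processing can look like, the sketch below converts a tiny RGB image to grayscale and scales the values into the range 0 to 1, two steps that often precede a neural network. The helper names (`to_grayscale`, `normalize`, `preprocess`) and the toy image are assumptions made for this example; the grayscale weights are the common BT.601 luma coefficients, and a real pipeline may use different steps entirely.

```python
# BT.601 luma weights: a common (but not the only) way to weigh R, G, B.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples (0-255) into rows of gray intensities."""
    r_w, g_w, b_w = LUMA_WEIGHTS
    return [
        [r_w * r + g_w * g + b_w * b for (r, g, b) in row]
        for row in rgb_image
    ]


def normalize(gray_image, max_value=255.0):
    """Scale intensities from [0, max_value] into [0.0, 1.0]."""
    return [[pixel / max_value for pixel in row] for row in gray_image]


def preprocess(rgb_image):
    """Hypothetical pipeline: grayscale first, then normalization."""
    return normalize(to_grayscale(rgb_image))


# Toy 2x2 image (assumed for illustration): black, white, red, blue.
rgb_image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]

model_input = preprocess(rgb_image)
for row in model_input:
    print([round(value, 3) for value in row])
```

Keeping each step as its own small function makes it easy to swap one out, for example replacing the grayscale conversion when a model expects colour input.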

## The open-source project

This project addresses all the points mentioned above, and it is structured to help you learn the potential of computer vision and how to leverage it in your own projects. It begins by solving real problems using image processing and then moves on to tasks that require machine learning models.

During our journey, we will also have projects exploring some critical concepts of computer vision and ML, such as: what is an image; what are convolutions; how to implement a vanilla neural network; how back-propagation works; how to use transfer learning; and more.
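To give a flavour of the back-propagation topic, here is a minimal sketch of a deliberately tiny network: one input, one `tanh` hidden unit, and one linear output. The network shape, the starting parameters, and all function names are assumptions chosen for illustration, not the project's own implementation. The gradients from back-propagation are compared with a numerical estimate, and a few steps of gradient descent show the loss going down.

```python
import math


def forward(params, x):
    """One input -> one tanh hidden unit -> one linear output."""
    w1, b1, w2, b2 = params
    hidden = math.tanh(w1 * x + b1)
    output = w2 * hidden + b2
    return hidden, output


def loss_fn(params, x, target):
    """Squared error for a single training example."""
    _, output = forward(params, x)
    return 0.5 * (output - target) ** 2


def backward(params, x, target):
    """Back-propagation: apply the chain rule from the loss to each parameter."""
    w1, b1, w2, b2 = params
    hidden, output = forward(params, x)
    d_output = output - target               # dLoss/dOutput
    d_w2 = d_output * hidden
    d_b2 = d_output
    d_hidden = d_output * w2                 # error flowing back to the hidden unit
    d_pre = d_hidden * (1.0 - hidden ** 2)   # tanh'(z) = 1 - tanh(z)^2
    d_w1 = d_pre * x
    d_b1 = d_pre
    return [d_w1, d_b1, d_w2, d_b2]


def numerical_gradient(params, x, target, eps=1e-6):
    """Central-difference estimate, used only to sanity-check `backward`."""
    grads = []
    for i in range(len(params)):
        plus, minus = list(params), list(params)
        plus[i] += eps
        minus[i] -= eps
        grads.append(
            (loss_fn(plus, x, target) - loss_fn(minus, x, target)) / (2 * eps)
        )
    return grads


# Arbitrary starting parameters [w1, b1, w2, b2] and one training example.
params = [0.5, 0.0, -0.3, 0.1]
x, target = 1.0, 1.0

analytic = backward(params, x, target)
numeric = numerical_gradient(params, x, target)
print("back-propagation:", [round(g, 6) for g in analytic])
print("numerical check: ", [round(g, 6) for g in numeric])

# Plain gradient descent on the single example.
initial_loss = loss_fn(params, x, target)
learning_rate = 0.1
for _ in range(100):
    grads = backward(params, x, target)
    params = [p - learning_rate * g for p, g in zip(params, grads)]
final_loss = loss_fn(params, x, target)
print("loss before:", initial_loss, "loss after:", final_loss)
```

Checking analytic gradients against a numerical estimate is a handy habit when writing back-propagation by hand, because a sign or indexing mistake shows up immediately.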

All examples are written in Python as Jupyter notebooks, with plenty of comments to help you follow the implementation. Even if you don’t know Python well, you will be able to follow the code and learn from the examples.

The advanced part of this project requires a GPU, but don’t worry: those examples are ready to run on Google Colab with just one click, and it is free! You will only need a Google account. Because some examples require access to your camera (video), we cannot use Colab for all examples; for those, you will need to set up a Python environment on your own machine.

In the notebooks, you will find links to some articles, and I have also prepared some videos providing an overview of the project. Everything to make your life easier!

Project overview

## Conclusion

Computer vision is one of the most powerful types of AI, and you have almost surely experienced it in any number of ways without even knowing it. Thanks to advances in artificial intelligence and innovations in deep learning and neural networks, the field has been able to take great leaps. My mission with this project is to make AI accessible and to raise digital awareness and competence. Following this project to the end will give you insights and empower you to leverage recent innovations in this field to improve the experience of your projects.

Source code

GitHub project