Getting started with the basic tasks of Computer Vision – News Couple

If you plan to work with photos or videos, computer vision is worth learning. Computer Vision (CV) is a branch of artificial intelligence (AI) that enables computers to extract useful information from images, videos, and other visual inputs, and to act on that information. Examples include self-driving cars, automatic traffic management, surveillance, and image-based quality inspection.

What is OpenCV?

OpenCV is a library aimed primarily at computer vision. It provides most of the tools you will need when working with Computer Vision (CV). “Open” stands for Open Source and “CV” stands for Computer Vision.

What will I learn?

This article covers what you need to get started with computer vision using the OpenCV library. By the end, you should feel more confident working with images in code. All code and data are available here.

Reading and displaying images

Let’s first understand how to read and display an image, the basics of CV.

Read the image:

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
img=cv.imread('../input/images-for-computer-vision/tiger1.jpg') # cv2 is imported as cv above

“img” contains the image in the form of a NumPy array. Let’s print its type and shape:

print(type(img))
print(img.shape)

The image is a NumPy array of shape (667, 1200, 3), where:

667 is the image height, 1200 is the image width, and 3 is the number of channels.

The image has three color channels, so the last dimension is 3. The original file is in RGB format, but OpenCV reads images in BGR order by default, so we have to convert the array back to RGB before displaying it with Matplotlib.
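
As a minimal sketch of what the BGR-to-RGB conversion does, the snippet below builds a tiny synthetic image (the 2x2 array is made up for illustration and is not the tiger photo) and reverses the channel axis with plain NumPy slicing. For a 3-channel image this reorders the channels the same way cv.COLOR_BGR2RGB does.

```python
import numpy as np

# Hypothetical 2x2 image stored in BGR order, as OpenCV would load it.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR, so every pixel is pure blue

# Reversing the last axis swaps the first and third channels: BGR -> RGB.
rgb = bgr[..., ::-1]

height, width, channels = bgr.shape
print(height, width, channels)  # 2 2 3
print(bgr[0, 0].tolist())       # [255, 0, 0]  (blue comes first)
print(rgb[0, 0].tolist())       # [0, 0, 255]  (blue comes last)
```

If the conversion is skipped, Matplotlib interprets the first channel as red, which is why unconverted images look blue-tinted.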

Display the image:

# Converting image from BGR to RGB for displaying
img_convert=cv.cvtColor(img, cv.COLOR_BGR2RGB)
plt.imshow(img_convert)

Drawing on the image

We can draw lines, shapes, and text on an image.

# Rectangle
color=(240,150,240) # Color of the rectangle
cv.rectangle(img, (100,100),(300,300),color,thickness=10, lineType=8) ## For filled rectangle, use thickness = -1
## (100,100) are (x,y) coordinates for the top left point of the rectangle and (300, 300) are (x,y) coordinates for the bottom right point

# Circle
color=(150,255,50) # Each channel value must be in the range 0-255
cv.circle(img, (650,350),100, color,thickness=10) ## For filled circle, use thickness = -1
## (650,350) are (x,y) coordinates for the center of the circle and 100 is the radius

# Text
color=(50,200,100)
font=cv.FONT_HERSHEY_SCRIPT_COMPLEX
cv.putText(img, 'Save Tigers',(200,150), font, 5, color,thickness=5, lineType=cv.LINE_AA) ## (200,150) is the bottom-left corner of the text

# Converting BGR to RGB
img_convert=cv.cvtColor(img, cv.COLOR_BGR2RGB)
plt.imshow(img_convert)

Blending images

We can also combine two or more images using OpenCV. An image is just an array of numbers, so you can add, subtract, multiply, and divide images just as you would numbers. One thing to note is that the images must have the same size.

# For plotting multiple images at once
def myplot(images,titles):
    fig, axs=plt.subplots(1,len(images),sharey=True)
    fig.set_figwidth(15)
    for img,ax,title in zip(images,axs,titles):
        if img.ndim==3:
            img=cv.cvtColor(img, cv.COLOR_BGR2RGB) # OpenCV reads images as BGR, so convert them back to RGB
        else:
            img=cv.cvtColor(img, cv.COLOR_GRAY2RGB) # Grayscale image: expand to 3 channels for display
        ax.imshow(img)
        ax.set_title(title)

img1 = cv.imread('../input/images-for-computer-vision/tiger1.jpg')
img2 = cv.imread('../input/images-for-computer-vision/horse.jpg')

# Resizing the img1
img1_resize = cv.resize(img1, (img2.shape[1], img2.shape[0]))

# Adding, Subtracting, Multiplying and Dividing Images
img_add = cv.add(img1_resize, img2)
img_subtract = cv.subtract(img1_resize, img2)
img_multiply = cv.multiply(img1_resize, img2)
img_divide = cv.divide(img1_resize, img2)

# Blending Images
img_blend = cv.addWeighted(img1_resize, 0.3, img2, 0.7, 0) ## 30% tiger and 70% horse
myplot([img1_resize, img2], ['Tiger','Horse'])
myplot([img_add, img_subtract, img_multiply, img_divide, img_blend], ['Addition', 'Subtraction', 'Multiplication', 'Division', 'Blending'])
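
To make the blending step more concrete, here is a rough NumPy sketch of a weighted sum of two images, which is the idea behind cv.addWeighted: each output pixel is alpha times the first image plus beta times the second, plus an optional offset, clipped to the 0-255 range. The helper name blend and the two flat gray test images are hypothetical and only serve as an illustration.

```python
import numpy as np

def blend(img_a, img_b, alpha, beta, gamma=0.0):
    """Weighted sum of two same-sized uint8 images, clipped to 0-255."""
    mixed = img_a.astype(np.float64) * alpha + img_b.astype(np.float64) * beta + gamma
    return np.clip(np.round(mixed), 0, 255).astype(np.uint8)

# Two hypothetical flat gray images of the same size.
a = np.full((2, 2, 3), 200, dtype=np.uint8)
b = np.full((2, 2, 3), 100, dtype=np.uint8)

blended = blend(a, b, 0.3, 0.7)  # 30% of a and 70% of b
print(blended[0, 0].tolist())    # [130, 130, 130]
```

With weights that sum to 1, the result stays within the brightness range of the inputs, which is why a blend looks natural while a plain addition tends to wash out.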

The multiplied image is almost white and the divided image is almost black, because a pixel value of 255 is white and 0 is black. Multiplying two pixel values usually gives a number far above 255, which is capped at 255, so the result is white or close to white. Dividing two pixel values gives a small number close to 0, so the divided image is black or close to black.
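
The small NumPy sketch below illustrates this behavior on three made-up pixel values. It contrasts plain uint8 addition in NumPy, which wraps around, with the saturating arithmetic that cv.add and cv.multiply apply to 8-bit images. The integer division is a simplification (OpenCV rounds the quotient), but the magnitude of the result is the same.

```python
import numpy as np

a = np.array([200, 100, 10], dtype=np.uint8)
b = np.array([100, 100, 10], dtype=np.uint8)

# Plain uint8 addition in NumPy wraps around modulo 256.
wrapped = a + b

# Saturating versions: compute in a wider type, then clip to 0-255.
wide_a, wide_b = a.astype(np.int64), b.astype(np.int64)
added = np.clip(wide_a + wide_b, 0, 255).astype(np.uint8)
multiplied = np.clip(wide_a * wide_b, 0, 255).astype(np.uint8)
divided = (wide_a // wide_b).astype(np.uint8)  # b contains no zeros here

print(wrapped.tolist())     # [44, 200, 20]
print(added.tolist())       # [255, 200, 20]
print(multiplied.tolist())  # [255, 255, 100]
print(divided.tolist())     # [2, 1, 1]
```

Only very dark pixels survive multiplication without hitting 255, and almost every quotient lands in the lowest few gray levels.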

Transform the image

Image transformation includes image translation, rotation, scaling, cropping and flipping.
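
Cropping needs no dedicated function, because an image is a NumPy array and a crop is just a slice of it. The sketch below uses a small synthetic array (hypothetical, not one of the article's images) to show that rows (y) are indexed first and columns (x) second.

```python
import numpy as np

# Hypothetical 6x8 grayscale image whose pixel values encode their position.
img = np.arange(48, dtype=np.uint8).reshape(6, 8)

# Crop rows 1-3 and columns 2-5: rows (y) come first, then columns (x).
y1, y2, x1, x2 = 1, 4, 2, 6
crop = img[y1:y2, x1:x2].copy()  # copy() detaches the crop from the original

print(crop.shape)       # (3, 4)
print(int(crop[0, 0]))  # 10
```

Without the copy, the slice is a view, so drawing on the crop would also modify the original image.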

img=cv.imread('../input/images-for-computer-vision/tiger1.jpg')
height, width, _=img.shape # shape is (rows, columns, channels) = (height, width, channels)

# Translating
M_translate=np.float32([[1,0,200],[0,1,100]]) # 200=> Translation along x-axis and 100=>translation along y-axis
img_translate=cv.warpAffine(img,M_translate,(width,height)) # Output size is given as (width, height)

# Rotating
center=(width/2,height/2) # (x,y) coordinates of the image center
M_rotate=cv.getRotationMatrix2D(center, angle=90, scale=1)
img_rotate=cv.warpAffine(img,M_rotate,(width,height))

# Scaling
scale_percent = 50
new_width = int(width * scale_percent / 100)
new_height = int(height * scale_percent / 100)
dim = (new_width, new_height)
img_scale = cv.resize(img, dim, interpolation = cv.INTER_AREA)

# Flipping
img_flip=cv.flip(img,1) # 0: around the x-axis (vertical flip), 1: around the y-axis (horizontal flip), -1: around both axes

# Shearing
srcTri = np.array( [[0, 0], [img.shape[1] - 1, 0], [0, img.shape[0] - 1]] ).astype(np.float32)
dstTri = np.array( [[0, img.shape[1]*0.33], [img.shape[1]*0.85, img.shape[0]*0.25], [img.shape[1]*0.15, img.shape[0]*0.7]] ).astype(np.float32)
warp_mat = cv.getAffineTransform(srcTri, dstTri)
img_warp = cv.warpAffine(img, warp_mat, (width, height))

myplot([img, img_translate, img_rotate, img_scale, img_flip, img_warp],
       ['Original Image', 'Translated Image', 'Rotated Image', 'Scaled Image', 'Flipped Image', 'Sheared Image'])
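
Translation, rotation, and shearing all use a 2x3 affine matrix, so it may help to see what such a matrix does to a single point. The sketch below is written in plain NumPy under the assumption that the rotation matrix follows the formula given in the OpenCV documentation for getRotationMatrix2D. The helper names apply_affine and rotation_matrix are hypothetical.

```python
import numpy as np

def apply_affine(M, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return M @ np.array([x, y, 1.0])

# Translation matrix from the article: shift 200 px in x and 100 px in y.
M_translate = np.float32([[1, 0, 200], [0, 1, 100]])
moved = apply_affine(M_translate, (10, 20))
print(moved.tolist())  # [210.0, 120.0]

def rotation_matrix(center, angle_deg, scale=1.0):
    """2x3 rotation matrix about `center` (formula from the OpenCV docs)."""
    cx, cy = center
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    return np.array([[a, b, (1 - a) * cx - b * cy],
                     [-b, a, b * cx + (1 - a) * cy]])

# Center of a 1200x667 image, given as (x, y).
center = (600, 333.5)
M_rotate = rotation_matrix(center, 90)
fixed = apply_affine(M_rotate, center)
print(np.allclose(fixed, center))  # True
```

The center of rotation maps to itself, which is a quick sanity check that the center was passed as (x, y) and not as (row, column).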

Image Processing

Thresholding: pixel values less than or equal to the threshold become 0 (black), and pixel values greater than the threshold become 255 (white).

I use a threshold of 150 here, but you can choose any other value.

# To visualize the filters
import plotly.graph_objects as go
from plotly.subplots import make_subplots

def plot_3d(img1, img2, titles):
    fig = make_subplots(rows=1, cols=2,
                        specs=[[{'is_3d': True}, {'is_3d': True}]],
                        subplot_titles=[titles[0], titles[1]])
    x, y = np.mgrid[0:img1.shape[0], 0:img1.shape[1]]
    fig.add_trace(go.Surface(x=x, y=y, z=img1[:,:,0]), row=1, col=1)
    fig.add_trace(go.Surface(x=x, y=y, z=img2[:,:,0]), row=1, col=2)
    fig.update_traces(contours_z=dict(show=True, usecolormap=True, highlightcolor="limegreen", project_z=True))
    fig.show()

img=cv.imread('../input/images-for-computer-vision/simple_shapes.png')

# Pixel value less than threshold becomes 0 and more than threshold becomes 255

_,img_threshold=cv.threshold(img,150,255,cv.THRESH_BINARY)

plot_3d(img, img_threshold, ['Original Image', 'Threshold Image=150'])

After thresholding, pixel values greater than 150 become 255 and all other values become 0.
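
As a minimal sketch, binary thresholding can be written in one line of NumPy. The helper name binary_threshold and the four sample pixel values are hypothetical. The comparison is strict, so a pixel exactly equal to the threshold becomes 0, which matches the documented behavior of cv.THRESH_BINARY.

```python
import numpy as np

def binary_threshold(img, thresh, maxval=255):
    """Pixels strictly greater than `thresh` become `maxval`; the rest become 0."""
    return np.where(img > thresh, maxval, 0).astype(np.uint8)

pixels = np.array([[100, 150], [151, 200]], dtype=np.uint8)
result = binary_threshold(pixels, 150)
print(result.tolist())  # [[0, 0], [255, 255]]
```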

Filtering: Image filtering changes the appearance of an image by changing its pixel values. Each type of filter computes the new pixel values with its own mathematical formula. I will not go into the math in detail here, but I will show how each filter works by visualizing it in 3D. If you are interested in the math behind the filters, you can check this.

img=cv.imread('../input/images-for-computer-vision/simple_shapes.png')

# Gaussian Filter
ksize=(11,11) # Both should be odd numbers
img_gaussian=cv.GaussianBlur(img, ksize,0)
plot_3d(img, img_gaussian, ['Original Image','Gaussian Image'])

# Median Filter
ksize=11
img_medianblur=cv.medianBlur(img,ksize)
plot_3d(img, img_medianblur, ['Original Image','Median blur'])

# Bilateral Filter
img_bilateralblur=cv.bilateralFilter(img,d=5, sigmaColor=50, sigmaSpace=5)
myplot([img, img_bilateralblur],['Original Image', 'Bilateral blur Image'])
plot_3d(img, img_bilateralblur, ['Original Image','Bilateral blur'])

Gaussian filter: Smooths the image by removing detail and noise. For more details, you can read this.

Median filter: A non-linear filter that is useful for reducing impulse noise, also called salt-and-pepper noise.

Bilateral filter: Smooths the image and reduces noise while preserving edges.

In simple words, filters help reduce or remove noise, which is a random variation of brightness or color. This is called smoothing.
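
To see why a median filter handles salt-and-pepper noise better than a plain average, the sketch below filters a tiny synthetic patch with both. It is a simplified NumPy illustration that skips the image borders (OpenCV pads them instead), and the 5x5 test patch is made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical 5x5 flat gray patch with one "salt" pixel in the middle.
img = np.full((5, 5), 10, dtype=np.uint8)
img[2, 2] = 255

# Every 3x3 neighbourhood that fits inside the image (borders are skipped).
windows = sliding_window_view(img, (3, 3))  # shape (3, 3, 3, 3)

mean_filtered = windows.mean(axis=(-2, -1))
median_filtered = np.median(windows, axis=(-2, -1))

print(round(float(mean_filtered[1, 1]), 2))  # 37.22
print(float(median_filtered[1, 1]))          # 10.0
```

The average smears the outlier over its neighbours, while the median discards it completely because eight of the nine values in each window are still 10.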

Feature detection

Feature detection is a method for computing abstractions of image information and making a local decision at every image point about whether a feature of a given type is present there. For example, for an image of a face, the features are the eyes, nose, lips, ears, and so on, and we try to identify them.

Let’s first try to find the edges in an image.

Edge detection

img=cv.imread('../input/images-for-computer-vision/simple_shapes.png')
img_canny1=cv.Canny(img,50, 200)
# Smoothing the img before feeding it to canny
filter_img=cv.GaussianBlur(img, (7,7), 0)
img_canny2=cv.Canny(filter_img,50, 200)
myplot([img, img_canny1, img_canny2],
       ['Original Image', 'Canny Edge Detector(Without Smoothing)', 'Canny Edge Detector(With Smoothing)'])

Here we are using the Canny edge detector, an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986. I won’t go into much detail on how Canny works, but the point here is that it is used to extract edges. To learn more about how it works, you can check this out.

Before detecting edges with the Canny method, we smooth the image to remove noise. As you can see from the output, after smoothing we get cleaner edges.
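
The sketch below is not the Canny algorithm itself, only an illustration of two ideas it builds on: measuring how quickly intensity changes between neighbouring pixels, and comparing that gradient against a low and a high threshold (50 and 200, the two values passed to cv.Canny above). The 4x6 test image is synthetic.

```python
import numpy as np

# Hypothetical 4x6 image: dark left half, bright right half.
img = np.zeros((4, 6), dtype=np.uint8)
img[:, 3:] = 255

# Horizontal gradient: absolute difference between neighbouring columns.
grad_x = np.abs(np.diff(img.astype(np.int32), axis=1))  # shape (4, 5)

low, high = 50, 200
strong = grad_x > high                    # accepted as edges outright
weak = (grad_x > low) & (grad_x <= high)  # Canny keeps these only if linked to a strong edge

print(np.argwhere(strong)[:, 1].tolist())  # [2, 2, 2, 2]
print(int(weak.sum()))                     # 0
```

The only strong response sits between columns 2 and 3, exactly where the dark half meets the bright half. Noise produces many small gradients of its own, which is why smoothing first gives cleaner edges.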

Contours

img=cv.imread('../input/images-for-computer-vision/simple_shapes.png')
img_copy=img.copy()
img_gray=cv.cvtColor(img,cv.COLOR_BGR2GRAY)
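_,img_binary=cv.threshold(img_gray,50,200,cv.THRESH_BINARY)
# Eroding and dilating for smooth contours
kernel=np.ones((10,10),np.uint8) # Structuring element; a plain (10,10) tuple is not a 10x10 kernel
img_binary_erode=cv.erode(img_binary,kernel, iterations=5)
img_binary_dilate=cv.dilate(img_binary_erode,kernel, iterations=5) # Dilate the eroded image, not the original
contours,hierarchy=cv.findContours(img_binary_dilate,cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE) # Use the smoothed mask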
cv.drawContours(img, contours,-1,(0,0,255),3) # Draws the contours on the original image just like draw function
myplot([img_copy, img], ['Original Image', 'Contours in the Image'])

Erosion: a morphological operation that uses a structuring element to probe and shrink the shapes in an image.

Dilation: adds pixels to the boundaries of objects in an image; it is simply the opposite of erosion.
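
For a square structuring element, erosion amounts to taking the minimum over each pixel's neighbourhood and dilation amounts to taking the maximum. The NumPy sketch below shows this on a synthetic 7x7 binary image. The helper name morph and its padding choices are assumptions made for this illustration, not OpenCV's implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def morph(img, size, op, pad_value):
    """Apply `op` (np.min or np.max) over each size x size neighbourhood."""
    pad = size // 2
    padded = np.pad(img, pad, mode='constant', constant_values=pad_value)
    windows = sliding_window_view(padded, (size, size))
    return op(windows, axis=(-2, -1))

# Hypothetical 7x7 binary image with a 3x3 white square in the middle.
img = np.zeros((7, 7), dtype=np.uint8)
img[2:5, 2:5] = 255

eroded = morph(img, 3, np.min, pad_value=255)  # square shrinks to 1 pixel
dilated = morph(img, 3, np.max, pad_value=0)   # square grows to 5x5

print(int((eroded == 255).sum()))   # 1
print(int((dilated == 255).sum()))  # 25
```

Eroding and then dilating removes specks smaller than the structuring element while roughly restoring the size of larger shapes, which is why the pair is used to smooth a mask before finding contours.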

Convex hulls

img=cv.imread('../input/images-for-computer-vision/simple_shapes.png') # Read in color so the hulls can be drawn in red
img_gray=cv.cvtColor(img,cv.COLOR_BGR2GRAY)
_,threshold=cv.threshold(img_gray,50,255,cv.THRESH_BINARY)
contours,hierarchy=cv.findContours(threshold,cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
hulls=[cv.convexHull(c) for c in contours]
img_hull=cv.drawContours(img, hulls,-1,(0,0,255),2) # Draws the convex hulls on the original image in red (BGR order)
plt.imshow(cv.cvtColor(img_hull, cv.COLOR_BGR2RGB))
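
A convex hull is the smallest convex polygon that encloses a set of points. To show the idea without OpenCV, the sketch below uses Andrew's monotone chain algorithm in pure Python. This is a swapped-in textbook method chosen for brevity, not necessarily what cv.convexHull does internally, and the sample points are made up.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); its sign gives the turn direction."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices in order around the hull (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop the last vertex while it does not make a strict left turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

# Four corners of a square plus two interior points.
points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(points)
print(hull)  # [(0, 0), (4, 0), (4, 4), (0, 4)]
```

The interior points are dropped, which is the same effect as in the image above: dents in a contour disappear and only the outermost points remain.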

Summary

We have seen how to read and display an image, draw shapes and text on an image, blend two images, transform an image (rotate, scale, translate, and so on), filter images using Gaussian blur, median blur, and bilateral filtering, and detect features using Canny edge detection and contour finding.

This article only scratches the surface of the computer vision world. The field is developing every day, but the basics remain the same, so understanding these basic concepts gives you a solid foundation for going further.

The media described in this article is not owned by Analytics Vidhya and is used at the author’s discretion.


