Geeks With Blogs
Josh Reuben


Overview

OpenCV is a Computer Vision / Image Processing library. It is used together with FFmpeg for video processing / streaming, and in ROS (Robot Operating System).

You can install it from here: http://opencv.org/downloads.html


Modules

in the lib directory:

  • opencv_core - data structures, maths

  • opencv_imgproc - image processing

  • opencv_highgui - image / video reading / writing

  • opencv_features2d - feature point detectors, descriptors, matchers

  • opencv_calib3d - camera calibration, 2-view geometry estimation, stereo

  • opencv_video - motion estimation, feature tracking, foreground extraction

  • opencv_objdetect - object detection

  • opencv_ml - machine learning

  • opencv_flann - approximate nearest-neighbour search (computational geometry)

  • opencv_contrib - contributed code

  • opencv_gpu - GPU acceleration


Creating an OpenCV project

Basic functionality - load an input image from file, process it, display it on a window region, store output image to disk.


#include <opencv2/core/core.hpp> // cv::Mat, cv::flip

#include <opencv2/highgui/highgui.hpp> // cv::imread, cv::imwrite

using namespace cv;

declare a cv::Mat variable to contain an image pixel matrix and use cv::imread to read it

Mat input = imread("myimage.png"); // read an image

image transformations – not all can be performed in-place – some ops populate an output matrix

Mat output;

flip(input, output, -1); // positive for horizontal, 0 for vertical, negative for both

write to disk:

imwrite("output.png", output);

testing – OpenCV has a rudimentary windowing API for quick feedback:

#include <opencv2/highgui/highgui.hpp>

...

namedWindow("my image"); // create a named window

imshow("my image", output); // show image on named window

waitKey(0); // wait for a key press so the window actually gets drawn

The cv::Mat class

  • behaves like a shared_ptr: reference counting, shallow copy on assignment

  • size - exposes the height and width properties

  • data - pointer to the allocated memory block for the 2D array - null when no image has been read.

  • constructor - can specify initial size, element data type and initial value

cv::Mat image(600, 800, CV_8U, cv::Scalar(100)); // 600 rows x 800 columns, every pixel set to 100

  • copyTo - create a cv::Mat independent copy with separate underlying image data storage

  • reshape - change matrix dimensions without requiring any memory copy or re-allocation.

  • setTo - assigns a value to all elements of a matrix.

image.row(0).setTo(cv::Scalar(0));

Integrating with Qt


Convert to a Qt QImage - the order of the 3 color channels needs to be inverted (from BGR in cv::Mat to RGB in QImage)

cv::cvtColor(image, image, CV_BGR2RGB); // change color channel ordering

// Qt image - wraps the cv::Mat buffer without copying, so image must outlive qimg;

// image.step is passed as bytes-per-line in case the rows are padded

auto qimg = QImage(image.data, image.cols, image.rows, static_cast<int>(image.step), QImage::Format_RGB888);

Image data types


Grayscale image pixels are unsigned 8-bit values (0 to 255). Color images have 3 primary color channels (Red, Green, Blue).

CV_8U - 1-byte pixel images. The letter U means unsigned, as opposed to S for signed.

For a color image - specify 3 channels (CV_8UC3).

Signed / unsigned integers and floats of size 8 / 16 / 32 bits can also be declared, e.g. CV_16SC3, CV_32F.

To access a multi-channel pixel, use a type derived from the template class cv::Vec<T,N>, where N is the number of elements (typically 2, 3 or 4). The postfix letter gives the element type: b for byte, s for short, i for int, f for float, and d for double - e.g. cv::Vec3b is a vector of 3 unsigned chars.

Image Iteration - Accessing Pixel Values

Row-major matrix: at<T>(j,i) takes the row j (y) first, then the column i (x)

if (image.channels() == 1) { // gray-level image

    image.at<uchar>(j,i)= 255; // white

} else if (image.channels() == 3) { // BGR color image channels

    image.at<Vec3b>(j,i)[0]= blue;

    image.at<Vec3b>(j,i)[1]= green;

    image.at<Vec3b>(j,i)[2]= red;

}
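To make the (row, column) addressing concrete, here is a minimal sketch of how a pixel position maps to a byte offset in a row-major buffer. ByteImage is a hypothetical stand-in written for this note, not an OpenCV type; it only assumes 8-bit channels and a per-row byte count (step) that may include padding.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an 8-bit cv::Mat: rows are stored one after
// another, and each row occupies `step` bytes (step >= cols * channels).
struct ByteImage {
    int rows;
    int cols;
    int channels;
    std::size_t step;                 // bytes per row, including any padding
    std::vector<unsigned char> data;  // rows * step bytes, zero-initialised

    ByteImage(int r, int c, int ch, std::size_t padding = 0)
        : rows(r), cols(c), channels(ch),
          step(static_cast<std::size_t>(c) * ch + padding),
          data(static_cast<std::size_t>(r) * step, 0) {}

    // Byte offset of channel `ch` of the pixel at (row, col).
    std::size_t offset(int row, int col, int ch = 0) const {
        return static_cast<std::size_t>(row) * step
             + static_cast<std::size_t>(col) * channels
             + static_cast<std::size_t>(ch);
    }

    // Row first, then column, mirroring the at(j,i) convention.
    unsigned char& at(int row, int col, int ch = 0) {
        return data[offset(row, col, ch)];
    }
};
```

For a 3-channel image in BGR order, channel index 0 is blue, 1 is green and 2 is red, so img.at(j, i, 2) = 255 would set the red component of the pixel in row j, column i.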

E.g.

  • simulate salt-and-pepper noise - replace random pixels with 0,0,0 or 255,255,255

  • sharpen: subtracting the Laplacian from an image amplifies its edges

iterate in 2 loops and use cv::Mat::at<T>(int y, int x) - random access to image pixel data

if the matrix type is known, use the cv::Mat_ template subclass of cv::Mat - no need to specify the type on each access, operator() provides direct access to matrix elements.

cv::Mat_<uchar> image2= image; // image2 refers to image

image2(50,100)= 0; // access to row 50 and column 100

cv::MatIterator_ - STL-style iterator over matrix elements. The underscore indicates it is a template. The iterator returns a cv::Vec3b for a color image.

in-place transforms - directly applied to the input image, don't require an output image. Use cv::Mat::clone for a deep copy when the original must be preserved. For some algos, when in-place processing is required, specify the same image as input and output.

a channel element of a pixel is accessed using cv::Vec::operator[]

cv::Mat::isContinuous - whether image is unpadded → can be accessed as a 1-D array. A shorter loop with few statements is more efficient than a longer loop over a single statement.

cv::filter2D convolution filtering - define a kernel matrix and apply it to an image to specify a computation over a pixel neighborhood. Used in signal processing. A hand-written neighborhood loop cannot be done in-place, because each output pixel needs the original values of its neighbors.
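As a rough illustration of what a kernel does, the sketch below applies a 3x3 kernel to a grayscale buffer in plain C++. It is not how cv::filter2D is implemented; filter3x3, clampToByte and kSharpen are names made up for this note, and border pixels are simply copied to keep it short. The sharpening kernel shown is the common "5 times the center minus the 4 direct neighbors" form, i.e. the image minus its Laplacian.

```cpp
#include <algorithm>
#include <vector>

// Clamp an int to the 0..255 range of an 8-bit pixel.
inline unsigned char clampToByte(int v) {
    return static_cast<unsigned char>(std::min(255, std::max(0, v)));
}

// Apply a 3x3 kernel to a rows x cols grayscale image stored row by row.
// Border pixels are copied unchanged to keep the sketch short.
std::vector<unsigned char> filter3x3(const std::vector<unsigned char>& src,
                                     int rows, int cols,
                                     const int (&kernel)[3][3]) {
    std::vector<unsigned char> dst(src);  // separate output buffer
    for (int y = 1; y + 1 < rows; ++y) {
        for (int x = 1; x + 1 < cols; ++x) {
            int sum = 0;
            for (int ky = -1; ky <= 1; ++ky) {
                for (int kx = -1; kx <= 1; ++kx) {
                    sum += kernel[ky + 1][kx + 1]
                         * src[(y + ky) * cols + (x + kx)];
                }
            }
            dst[y * cols + x] = clampToByte(sum);
        }
    }
    return dst;
}

// Sharpening kernel: 5 * center minus the four direct neighbors.
const int kSharpen[3][3] = {{0, -1, 0}, {-1, 5, -1}, {0, -1, 0}};
```

The output goes to a separate buffer because every result depends on unmodified neighbor values; writing back into src while scanning would feed already-sharpened pixels into later sums.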

cv::Rect - select a sub-matrix - define a region of interest to transform only a portion of an image. Specify x, y, width and height with cv::Rect, or select by 2 cv::Range values, or use cv::Mat::rowRange / colRange.

Color Space

OpenCV uses BGR channel order. Rows may be padded at the end for memory alignment (e.g. to a multiple of 4 or 8 bytes).

RGB is not a perceptually uniform color space. Consider CIE L*a*b* instead.

cv::cvtColor - convert to a different color space, e.g. CV_BGR2Lab

YCrCb - color space used in JPEG compression: CV_BGR2YCrCb.

The HSV and HLS color spaces decompose colors into their hue and saturation components, plus a value or luminance component, which is a more natural way for humans to describe colors.

can also convert color images to gray-level (1-channel image): CV_BGR2GRAY

An 8-bit, 3-channel image of width W and height H requires a memory block of at least WxHx3 uchars.

Splitting image channels - process the different channels of an image independently – eg perform an operation only on one channel of the image. cv::split - copy 3 channels of a color image into 3 distinct cv::Mat instances. cv::merge – reverse operation

cv::saturate_cast - clamp range of permitted pixel values
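To show why clamping matters, here is a small sketch in the spirit of cv::saturate_cast<uchar>. saturateToByte is a hypothetical helper written for this note, and its rounding of exact .5 values may differ from OpenCV's.

```cpp
#include <cmath>

// Round to the nearest integer and clamp to 0..255.
// Out-of-range inputs stick to the nearest limit instead of wrapping around.
inline unsigned char saturateToByte(double v) {
    if (v <= 0.0) return 0;
    if (v >= 255.0) return 255;
    return static_cast<unsigned char>(std::lround(v));
}
```

A plain static_cast<unsigned char>(300) wraps around to 44, which would turn an over-bright pixel dark; the saturating version keeps it at 255.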

Color Reduction

Reduce the complexity of an analysis by reducing #colors in an image. A 3-channel 8-bit pixel image == 256x256x256 ==16 million colors. Reduce by 8 --> 32x32x32== 32 thousand colors.

double iteration - for each pixel in the image and for each channel of this pixel.

  • integer division - divide the value by N (floors division result to the nearest lower integer), then multiply by N to get floor and add N/2 to get centroid: data[i]= data[i]/div*div + div/2;

  • using the modulo operator - slower - requires reading each pixel value twice: data[i]= data[i] - data[i]%div + div/2;

  • bitwise shift - reduction factor must be a power of 2. mask the first n bits of the pixel value by a bit shift – efficient but error prone: data[i]= (data[i] & (0xFF<<n)) + div/2;
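The three formulas above can be sketched as small stand-alone functions over 8-bit values. The function names are made up for this note; div is the reduction factor, and the mask variant takes n with div == 1 << n.

```cpp
#include <vector>

// Integer division: floor to a multiple of div, then move to the bin centre.
inline unsigned char reduceByDivision(unsigned char v, int div) {
    return static_cast<unsigned char>(v / div * div + div / 2);
}

// Modulo: subtract the remainder, then move to the bin centre.
inline unsigned char reduceByModulo(unsigned char v, int div) {
    return static_cast<unsigned char>(v - v % div + div / 2);
}

// Bit mask: clear the n low bits; only valid when div is a power of 2.
inline unsigned char reduceByMask(unsigned char v, int n) {
    const int div = 1 << n;
    const int mask = 0xFF << n;
    return static_cast<unsigned char>((v & mask) + div / 2);
}

// Apply the reduction to every channel value of an interleaved pixel buffer.
void reduceColors(std::vector<unsigned char>& data, int div) {
    for (unsigned char& v : data) {
        v = reduceByDivision(v, div);
    }
}
```

With div = 8 (n = 3) all three variants map 200 to 204 and 13 to 12, so the choice between them is about speed and readability rather than the result.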


Image Matrix Linear Algebra

Image matrices that have the same size and type can be combined using arithmetic operations: 2 input matrices, 1 output matrix + optional weight, optional mask (the op is performed only on pixels for which the mask value is non-zero)

  • Adding images - cv::add, cv::addWeighted (weighted sum), cv::scaleAdd (scaled image plus another image).

  • cv::subtract, cv::absdiff, cv::multiply, cv::divide

  • Bit-wise operators: cv::bitwise_and, cv::bitwise_or, cv::bitwise_xor, cv::bitwise_not.

  • cv::min / cv::max - per-element minimum / maximum of two images

  • single image functions: cv::sqrt, cv::pow, cv::abs, cv::exp, cv::log (cv::cubeRoot works on a single float value).
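As an illustration of the weighted sum, the sketch below computes alpha*a + beta*b + gamma per element with saturation, which is the formula cv::addWeighted is documented to apply. blend and saturate are hypothetical names for this note, and the buffers are plain 8-bit vectors rather than cv::Mat.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Round and clamp a value to the 0..255 range of an 8-bit pixel.
inline unsigned char saturate(double v) {
    if (v <= 0.0) return 0;
    if (v >= 255.0) return 255;
    return static_cast<unsigned char>(std::lround(v));
}

// Per-element weighted sum of two equally sized 8-bit buffers.
std::vector<unsigned char> blend(const std::vector<unsigned char>& a,
                                 const std::vector<unsigned char>& b,
                                 double alpha, double beta,
                                 double gamma = 0.0) {
    if (a.size() != b.size()) {
        throw std::invalid_argument("inputs must have the same size");
    }
    std::vector<unsigned char> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = saturate(alpha * a[i] + beta * b[i] + gamma);
    }
    return out;
}
```

With alpha = beta = 0.5 this is a cross-fade of the two images; with alpha = beta = 1 it behaves like a saturating add, so 200 + 100 gives 255 rather than wrapping.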

cv::Mat supports C++ arithmetic operator overloads - a more compact form, easier to read: arithmetic, comparison, bitwise operators, and linear algebra operations (matrix inversion cv::Mat::inv, transpose cv::Mat::t, determinant cv::determinant, vector norm cv::norm, cross-product cv::Mat::cross(v), dot product cv::Mat::dot(v)).


Camera Calibration


Use a set of scene points of known 3D positions (usually corners of each square of a chessboard pattern), then determine where on the image these points project. Take several images from different viewpoints of a scene point set, computing the position of each camera view.

cv::findChessboardCorners - show a chessboard pattern to camera from different viewpoints. automatically detects the corners, given their count

cv::drawChessboardCorners - draw detected square corners on the chessboard image with lines connecting them

cv::cornerSubPix - Get subpixel accuracy on the corners to obtain a more accurate image point location

cv::TermCriteria - defines the termination criterion for cv::cornerSubPix - a maximum number of iterations and a minimum accuracy in sub-pixel coordinates. Rule of thumb: 10 to 20 chessboard images taken from different viewpoints at different depths is sufficient. The calibration itself (cv::calibrateCamera) has 2 main outputs: 1) camera matrix 2) distortion parameters.

cv::initUndistortRectifyMap - compute the x / y mapping used to remove distortion from an image (after calibration)

cv::remap - apply that mapping: remaps all of the points of an input image to a new image.


Video Processing

Video is just a sequence of images (frames), taken at regular time intervals (frame rate).

Can read, process, and store video sequences - once the individual frames of a video sequence have been extracted, regular image processing functions can be applied to them.

Temporal analysis - compare adjacent frames to track objects, or cumulate image statistics over time in order to extract foreground objects.

cv::VideoCapture object - frame extraction from a video file via codec, or directly from camera driver output. Methods:

  • isOpened

  • get - query properties eg CV_CAP_PROP_FRAME_COUNT, CV_CAP_PROP_FPS (the frame rate), CV_CAP_PROP_FOURCC (the codec type)

  • read or operator>> – read a frame into a cv::Mat

  • release

  • set – set properties – eg go to a specific frame position number via CV_CAP_PROP_POS_FRAMES , set position in milliseconds via CV_CAP_PROP_POS_MSEC, set relative position between the beginning and end of video via CV_CAP_PROP_POS_AVI_RATIO

  • grab + retrieve – decompose read into separate ops

video processing chain: read an input video stream, process its frames using a callback function for each frame of a video sequence, and then write result frames back to a video file

cv::VideoWriter class for writing video sequences to a file – methods:

  • open - specify filename, codec, frame rate, frame size, color. codec is specified using a standard 4-char code.

  • write or operator<< – call for each frame
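The 4-char codec code is just four characters packed into one 32-bit integer. The sketch below shows the packing with the first character in the lowest byte, which is the layout I understand OpenCV's CV_FOURCC macro to use; makeFourCC and fourCCToString are hypothetical helpers for this note.

```cpp
#include <cstdint>
#include <string>

// Pack four characters into one 32-bit code, first character in the lowest byte.
inline std::uint32_t makeFourCC(char c1, char c2, char c3, char c4) {
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(c1))
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c2)) << 8)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c3)) << 16)
         | (static_cast<std::uint32_t>(static_cast<unsigned char>(c4)) << 24);
}

// Unpack a code back into its four characters, e.g. to print the codec
// reported by a capture property.
inline std::string fourCCToString(std::uint32_t code) {
    std::string s(4, ' ');
    for (int i = 0; i < 4; ++i) {
        s[i] = static_cast<char>((code >> (8 * i)) & 0xFFu);
    }
    return s;
}
```

For example, makeFourCC('X', 'V', 'I', 'D') yields 0x44495658, and passing that value to fourCCToString gives back "XVID".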


An Example of a Naive Calculation

problem: identify all image pixels that are close to a target color.

Iterate over pixels, comparing each color with the target color - check if the distance is within a maximum distance tolerance - if true then set the output image pixel to 255, else set it to 0.

a cheap measure of distance from the target color is the city-block (L1) distance between the colors of the 2 Vec3b pixels - sum the absolute differences of the 3 BGR values:

auto d = abs(pixelColor[0]-targetColor[0])

    + abs(pixelColor[1]-targetColor[1])

    + abs(pixelColor[2]-targetColor[2]);

alternatively, compute the euclidean (L2) norm of the difference vector using optimized OpenCV math functions:

auto norm = static_cast<int>(cv::norm<int,3>(cv::Vec3i(

    pixelColor[0]-targetColor[0],

    pixelColor[1]-targetColor[1],

    pixelColor[2]-targetColor[2])));
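Putting the pieces together, here is a minimal stand-alone sketch of the naive detector without OpenCV types. Bgr, cityBlockDistance, euclideanDistance and detectColor are names invented for this note, and treating a distance equal to the tolerance as a match is my assumption.

```cpp
#include <array>
#include <cmath>
#include <cstdlib>
#include <vector>

using Bgr = std::array<unsigned char, 3>;  // blue, green, red

// City-block (L1) distance: sum of absolute channel differences.
inline int cityBlockDistance(const Bgr& a, const Bgr& b) {
    int d = 0;
    for (int c = 0; c < 3; ++c) {
        d += std::abs(static_cast<int>(a[c]) - static_cast<int>(b[c]));
    }
    return d;
}

// Euclidean (L2) distance: square root of the sum of squared differences.
inline double euclideanDistance(const Bgr& a, const Bgr& b) {
    double sum = 0.0;
    for (int c = 0; c < 3; ++c) {
        const double diff = static_cast<double>(a[c]) - static_cast<double>(b[c]);
        sum += diff * diff;
    }
    return std::sqrt(sum);
}

// Binary mask: 255 where the pixel is within maxDist of target, else 0.
// Assumption: a distance equal to maxDist still counts as a match.
std::vector<unsigned char> detectColor(const std::vector<Bgr>& pixels,
                                       const Bgr& target, int maxDist) {
    std::vector<unsigned char> mask(pixels.size(), 0);
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        if (cityBlockDistance(pixels[i], target) <= maxDist) {
            mask[i] = 255;
        }
    }
    return mask;
}
```

The two distances are not interchangeable: for the colors (10, 20, 30) and (13, 24, 30) the city-block distance is 7 while the euclidean distance is 5, so a tolerance tuned for one will not mean the same thing for the other.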


Conclusion

In future blog posts I will examine more efficient mechanisms such as histograms and filters, and look at techniques such as edge detection and object recognition.

Posted on Monday, February 23, 2015 3:39 PM Graphics Programming , Artificial Intelligence , C++


Comments on this post: Getting Started with OpenCV

# re: Getting Started with OpenCV
Thanks for this information. This really helps a lot in understanding the process. - Mark Zokle
Left by George Thomas on Dec 20, 2016 6:01 PM



Copyright © JoshReuben | Powered by: GeeksWithBlogs.net