『壹』 Moving object detection — the optical flow method and an OpenCV implementation
The main goal of moving object detection is to obtain a target's motion parameters (position, velocity, acceleration, and so on) and its trajectory; further analysis of these supports a higher-level understanding of the target's behavior.
The aim of moving object detection techniques is to extract the changing regions of an image sequence from the background. They are commonly used in video surveillance, image compression, 3D reconstruction, anomaly detection, and similar applications.
The mainstream approaches are frame differencing, background subtraction, and optical flow. The optical flow method has its roots in bionics and is close to intuition: the visual mechanisms of many insects are based on optical flow.
In the 1950s the psychologist Gibson first proposed the basic concept of optical flow, grounded in psychological experiments, in his book "The Perception of the Visual World". Not until the 1980s did Horn and Schunck, and Lucas and Kanade, lay the foundations of optical flow computation by creatively relating image intensity to a two-dimensional velocity field and introducing the optical flow constraint equation.
Optical flow: the motion of image objects between two consecutive frames, caused by the movement of the object itself or of the camera.
Figure: the optical flow formed by a ball moving across five consecutive frames.
Put simply, for an image sequence, finding the motion speed and direction of every pixel of every image between consecutive frames (that is, the displacement vector of a pixel across two consecutive frames) gives the optical flow field.
If point A is at (x1, y1) in frame t and at (x2, y2) in frame t+1, then the displacement vector of pixel A is (ux, vy) = (x2, y2) - (x1, y1).
How to find A's position in frame t+1 is exactly what the various optical flow computation methods address. There are four main families: gradient-based, matching-based, energy-based, and phase-based methods.
The optical flow method relies on three assumptions: brightness constancy (a pixel's intensity does not change between adjacent frames), small motion (objects move only a short distance from one frame to the next), and spatial coherence (neighboring pixels move in a similar way).
Depending on how dense the two-dimensional vectors of the resulting flow field are, optical flow methods are divided into dense optical flow and sparse optical flow.
Figure: a dense optical flow field generated by region matching.
Sparse optical flow tracks only points with distinctive features (such as corners), so its computational cost is small.
OpenCV documentation: http://www.opencv.org.cn/opencvdoc/2.3.2/html/modules/video/doc/motion_analysis_and_object_tracking.html#calcopticalflowfarneback
(1) calcOpticalFlowPyrLK
Computes sparse optical flow for a set of points using the pyramidal Lucas-Kanade algorithm.
Reference: "Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the Algorithm"
(2) calcOpticalFlowFarneback
Computes dense optical flow using Gunnar Farneback's algorithm.
Reference: "Two-Frame Motion Estimation Based on Polynomial Expansion"
(3) CalcOpticalFlowBM
Computes optical flow by block matching.
(4) CalcOpticalFlowHS
Computes dense optical flow using the Horn-Schunck algorithm.
Reference: "Determining Optical Flow"
(5) calcOpticalFlowSF
An implementation of the paper "SimpleFlow: A Non-iterative, Sublinear Optical Flow Algorithm"
『貳』 Intelligent video for video retrieval
Intelligent video processing has become the "lifeline" of video surveillance.
Intelligent video grew out of computer vision, one of the branches of artificial intelligence research. It builds a mapping between images and descriptions of their content, so that a computer can, through digital image processing and analysis, understand to a limited extent what is happening in a video scene. With intelligent video analysis, when the system detects behavior that matches a predefined rule (such as directional movement, line crossing, loitering, or an abandoned object), it automatically raises an alarm (for example an audible and visual alert) to prompt operators to handle the suspicious event in time.
Implementation of intelligent video algorithms
At present, the main algorithms with which intelligent video realizes real-time detection, recognition, classification, and multi-target tracking of moving objects fall into the following categories: object detection, object tracking, object recognition, behavior analysis, content-based video retrieval, and data fusion. Object Detection samples pixels from the video image at a fixed time interval and analyzes the digitized pixels in software to separate moving objects from the video sequence. Moving object detection is the foundation of intelligent analysis. Common detection techniques fall into three classes: Background Subtraction, Temporal Difference, and Optic Flow.
Background subtraction detects moving regions by differencing the current image against a background image. It assumes the scene contains a background, but background and foreground are not strictly defined and in practice the background changes, so background modeling is a crucial step of the method. Common background models include temporal averaging, adaptive updating, and Gaussian models. Background subtraction provides relatively complete feature data for moving objects, but it is particularly sensitive to changes in dynamic scenes such as lighting, camera shake, and interference from irrelevant events. A rough sketch of the idea follows below.
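As a sketch of that idea (not from the original text), OpenCV's built-in Gaussian-mixture background model could be applied like this; createBackgroundSubtractorMOG2 and apply are the real OpenCV API, while the parameter values are illustrative assumptions:

#include <opencv2/opencv.hpp>

int main()
{
    cv::VideoCapture cap(0);
    // adaptive Gaussian-mixture background model (history, varThreshold, detectShadows)
    cv::Ptr<cv::BackgroundSubtractor> bg = cv::createBackgroundSubtractorMOG2(500, 16.0, true);
    cv::Mat frame, fgMask;
    while (cap.read(frame))
    {
        bg->apply(frame, fgMask);           // update the model and get the foreground mask
        cv::medianBlur(fgMask, fgMask, 5);  // suppress isolated noise pixels
        cv::imshow("foreground", fgMask);
        if (cv::waitKey(30) >= 0) break;
    }
    return 0;
}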
The temporal difference method exploits the temporal characteristics of video: it subtracts adjacent frames to extract information about moving foreground objects. The method adapts well to dynamic environments and makes no assumptions about the scene, but it generally cannot extract all the relevant feature pixels, tends to leave holes inside moving objects, and only detects the object's edges; once the target stops moving, plain temporal differencing fails (a minimal differencing sketch is given after this paragraph). The optical flow method assigns a motion vector to each pixel by comparing consecutive frames and segments moving objects from those vectors.
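For comparison, a minimal two-frame differencing sketch (illustrative only; the threshold of 25 and the 3x3 opening kernel are arbitrary assumptions):

#include <opencv2/opencv.hpp>

// Return a binary mask of pixels that changed between two consecutive grayscale frames.
cv::Mat frameDifference(const cv::Mat& prevGray, const cv::Mat& currGray, double thresh = 25.0)
{
    cv::Mat diff, mask;
    cv::absdiff(prevGray, currGray, diff);                 // per-pixel absolute difference
    cv::threshold(diff, mask, thresh, 255, cv::THRESH_BINARY);
    // morphological opening removes speckle noise; holes inside the object remain,
    // which is the "hollow object" limitation mentioned above
    cv::morphologyEx(mask, mask, cv::MORPH_OPEN,
                     cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3)));
    return mask;
}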
Optical flow can detect independently moving objects even while the camera itself is moving, but it is computationally expensive and sensitive to noise, so without dedicated hardware it is hard to apply to real-time video streams. Object Tracking algorithms can be classified in two ways: by the temporal relationship between tracking and detection, and by the tracking strategy. Classified by the temporal relationship between tracking and detection, there are three approaches:
The first is detect-before-track: detect the targets in every frame, then match the targets across two consecutive frames to achieve tracking. This approach can reuse many existing image processing and data processing techniques, but the detection step does not take advantage of the information provided by tracking.
The second is track-before-detect: first predict or hypothesize the target's position and state in the next frame, then correct the prediction with the detection result (a sketch of this predict-then-correct loop is given below); the difficulty is that the target's motion characteristics must be known in advance. The third is detect-while-track: detection and tracking are combined over the image sequence; tracking supplies the region of interest for detection, and detection supplies the observations of the target state used by tracking.
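A hedged sketch of that predict-then-correct loop using a constant-velocity Kalman filter (cv::KalmanFilter is the real OpenCV class; the state layout and noise values below are assumptions chosen only for illustration):

#include <opencv2/opencv.hpp>

// State: (x, y, vx, vy); measurement: the detected centre (x, y).
cv::KalmanFilter makeTracker()
{
    cv::KalmanFilter kf(4, 2, 0, CV_32F);
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) <<
        1, 0, 1, 0,
        0, 1, 0, 1,
        0, 0, 1, 0,
        0, 0, 0, 1);                                    // constant-velocity motion model
    cv::setIdentity(kf.measurementMatrix);              // we observe (x, y) only
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-2));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));
    return kf;
}

// Each frame: predict where the target should be, then correct with the detector's output.
cv::Point2f trackStep(cv::KalmanFilter& kf, const cv::Point2f& detection)
{
    cv::Mat pred = kf.predict();                        // prior estimate for this frame
    cv::Mat meas = (cv::Mat_<float>(2, 1) << detection.x, detection.y);
    cv::Mat post = kf.correct(meas);                    // fuse the prediction with the measurement
    return cv::Point2f(post.at<float>(0), post.at<float>(1));
}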
Classified by tracking strategy, methods are usually divided into 3D and 2D approaches. Compared with 3D methods, 2D methods are faster but handle occlusion poorly. Tracking based on motion estimation is one of the most commonly used methods. Object Recognition distinguishes people, vehicles, and other objects using cues such as color, speed, shape, and size; the most common applications are face recognition and vehicle recognition.
Video face recognition usually consists of four steps: face detection, face tracking, feature extraction, and matching. Face detection determines whether a face is present in a dynamic scene with a complex background and separates it out (a minimal cascade-based detection sketch follows below). Face tracking follows the detected face as a moving target; common approaches include model-based methods, methods combining motion and models, and skin-color models.
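A minimal face-detection sketch with OpenCV's cascade classifier (illustrative only; the cascade file name is a placeholder for whatever cascade ships with your OpenCV install, and the detectMultiScale parameters are common defaults rather than values from the original):

#include <opencv2/opencv.hpp>

// Detect frontal faces in one frame; returns their bounding boxes.
std::vector<cv::Rect> detectFaces(const cv::Mat& frame)
{
    // hypothetical path -- point this at the cascade XML shipped with your OpenCV install
    static cv::CascadeClassifier cascade("haarcascade_frontalface_default.xml");
    cv::Mat gray;
    cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
    cv::equalizeHist(gray, gray);                        // normalize lighting a little
    std::vector<cv::Rect> faces;
    cascade.detectMultiScale(gray, faces, 1.1, 3, 0, cv::Size(30, 30));
    return faces;
}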
Face feature extraction methods fall into three classes: the first is based on edges, lines, and curves; the second is based on feature templates; the third is structural matching that takes the geometric relations between features into account. Methods based purely on local features run into trouble with closed eyes, glasses, and open mouths; by comparison, methods based on holistic statistical features are more robust to image brightness and feature deformation. Face matching compares the extracted features against a face database and returns the best match.
Vehicle recognition mainly covers license plate recognition, vehicle type recognition, and vehicle color recognition; license plate recognition is the most widely deployed and most mature. Its steps are plate localization, character segmentation, character feature extraction, and character recognition.
Plate localization finds the plate region in the vehicle image and separates it out. Character segmentation extracts the Chinese characters, letters, and digits from the plate. Feature extraction selects the most effective features from many candidates; common methods include per-pixel features, skeleton features, vertical/horizontal projection statistics, feature points, and statistical features. Character recognition can use classifiers such as Bayesian classifiers, support vector machines (SVM), and neural network classifiers (NNC). Content-based image retrieval is the process in which a user submits a query sample, the system builds a feature set from the sample's low-level physical features, and similarity matching is then performed against the video library to produce the retrieval results. Existing content-based retrieval methods are mainly color-based, shape-based, and texture-based (a color-histogram sketch is given below). Data fusion integrates data from different video sources to obtain richer analysis results.
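To make the color-based retrieval idea concrete, a hedged sketch that compares HSV color histograms with cv::compareHist (the bin counts and the correlation metric are arbitrary choices, not from the original):

#include <opencv2/opencv.hpp>

// Build a normalized 2D hue-saturation histogram as the "low-level feature" of an image.
cv::Mat colorFeature(const cv::Mat& bgr)
{
    cv::Mat hsv;
    cv::cvtColor(bgr, hsv, cv::COLOR_BGR2HSV);
    int histSize[] = {30, 32};                        // hue bins, saturation bins
    float hRange[] = {0, 180}, sRange[] = {0, 256};
    const float* ranges[] = {hRange, sRange};
    int channels[] = {0, 1};
    cv::Mat hist;
    cv::calcHist(&hsv, 1, channels, cv::Mat(), hist, 2, histSize, ranges);
    cv::normalize(hist, hist, 1, 0, cv::NORM_L1);     // make histograms comparable across image sizes
    return hist;
}

// Similarity in [-1, 1]; higher means the query looks more like the candidate.
double colorSimilarity(const cv::Mat& query, const cv::Mat& candidate)
{
    return cv::compareHist(colorFeature(query), colorFeature(candidate), cv::HISTCMP_CORREL);
}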
『叄』 How to implement pyramidal Lucas-Kanade (LK) optical flow tracking with OpenCV
#include <stdio.h>
#include <windows.h>
#include "cv.h"
#include "cxcore.h"
#include "highgui.h"
#include <opencv2/opencv.hpp>
using namespace cv;
static const double pi = 3.14159265358979323846;
inline static double square(int a)
{
return a * a;
}
/* Purpose: allocate memory for img on demand and set its format (bit depth and number of channels). */
inline static void allocateOnDemand(IplImage **img, CvSize size, int depth, int channels)
{
if (*img != NULL) return;
*img = cvCreateImage(size, depth, channels);
if (*img == NULL)
{
fprintf(stderr, "Error: Couldn't allocate image. Out of memory?\n");
exit(-1);
}
}
/* Main function. The original program read and processed an AVI file; it has been changed to read directly from the camera. */
int main(int argc, char *argv[])
{
//read from the camera
VideoCapture cap(0);
//or read from a video file:
//VideoCapture cap; cap.open("optical_flow_input.avi");
if (!cap.isOpened())
{
return -1;
}
Mat frame;
/*
bool stop = false;
while (!stop)
{
cap >> frame;
// cvtColor(frame, edges, CV_RGB2GRAY);
// GaussianBlur(edges, edges, Size(7, 7), 1.5, 1.5);
// Canny(edges, edges, 0, 30, 3);
// imshow("Live video", edges);
imshow("Live video", frame);
if (waitKey(30) >= 0)
stop = true;
}
*/
//CvCapture *input_video = cvCaptureFromFile( "optical_flow_input.avi" );
//cv::VideoCapture cap = *(cv::VideoCapture *) userdata;
//if (input_video == NULL)
// {
// fprintf(stderr, "Error: Can't open video device.\n");
// return -1;
// }
/* Read one frame first so that the frame properties (width, height, etc.) are available. */
//cvQueryFrame(input_video);
/* Read the frame properties. */
CvSize frame_size;
frame_size.height = cap.get(CV_CAP_PROP_FRAME_HEIGHT);
frame_size.width = cap.get(CV_CAP_PROP_FRAME_WIDTH);
/*********************************************************/
/* Optional: write the result to a video file.
int frameW = frame_size.width;  // 744 for firewire cameras
int frameH = frame_size.height; // 480 for firewire cameras
VideoWriter writer("VideoTest.avi", -1, 25.0, cvSize(frameW, frameH), true);
*/
/* Start the optical flow loop. */
//VideoWriter writer("VideoTest.avi", CV_FOURCC('D', 'I', 'V', 'X'), 25.0, Size(640, 480), true);
while (true)
{
static IplImage *frame = NULL, *frame1 = NULL, *frame1_1C = NULL,
*frame2_1C = NULL, *eig_image = NULL, *temp_image = NULL,
*pyramid1 = NULL, *pyramid2 = NULL;
Mat framet;
/* Grab the first frame. */
// cap >> framet;
if (!cap.read(framet))
{
fprintf(stderr, "Error: Hmm. The end came sooner than we thought.\n");
return -1;
}
Mat edges;
//black-and-white edge-filter mode (disabled)
// cvtColor(framet, edges, CV_RGB2GRAY);
// GaussianBlur(edges, edges, Size(7, 7), 1.5, 1.5);
// Canny(edges, edges, 0, 30, 3);
//convert the Mat to an IplImage so the old C API can be used below
IplImage ipl1 = framet;
frame = &ipl1;
/* OpenCV's optical flow functions work on 8-bit grayscale images, so create an
IplImage of the same format. */
allocateOnDemand(&frame1_1C, frame_size, IPL_DEPTH_8U, 1);
/* Convert the camera image into the format OpenCV usually works with. */
cvConvertImage(frame, frame1_1C, 0);
/* Keep a full-color copy of the original frame so it can be shown on screen at the end. */
allocateOnDemand(&frame1, frame_size, IPL_DEPTH_8U, 3);
cvConvertImage(frame, frame1, 0);
/* Grab the second frame. */
//cap >> framet;
if (!cap.read(framet))
{
fprintf(stderr, "Error: Hmm. The end came sooner than we thought.\n");
return -1;
}
// cvtColor(framet, edges, CV_RGB2GRAY);
// GaussianBlur(edges, edges, Size(7, 7), 1.5, 1.5);
// Canny(edges, edges, 0, 30, 3);
IplImage ipl2 = framet;
frame = &ipl2;
/* Same as for the first frame. */
allocateOnDemand(&frame2_1C, frame_size, IPL_DEPTH_8U, 1);
cvConvertImage(frame, frame2_1C, 0);
/*********************************************************
Run the Shi-Tomasi algorithm. It performs feature selection, i.e. picks the
interest points in an image that are worth tracking.
input:
* "frame1_1C"  the input image.
* "eig_image" and "temp_image"  just provide scratch memory for the algorithm.
* the first ".01"  the minimum accepted quality of the eigenvalues; a threshold
  is needed so that only good feature points are kept.
* the second ".01"  the minimum distance between features, which reduces the
  computational cost at some expense of tracking accuracy.
* "NULL"  process the whole image (a region of interest could be given instead).
output:
* "frame1_features"  will hold the features of frame1.
* "number_of_features"  is filled in by the function with the actual number of
  features found, which will be <= 400.
**********************************************************/
/* Prepare the inputs the algorithm needs. */
/* Allocate eig_image and temp_image. */
allocateOnDemand(&eig_image, frame_size, IPL_DEPTH_32F, 1);
allocateOnDemand(&temp_image, frame_size, IPL_DEPTH_32F, 1);
/* Array holding frame1's features; 400 is just an upper bound. */
CvPoint2D32f frame1_features[400];
int number_of_features = 400;
/* Run the Shi-Tomasi function. */
cvGoodFeaturesToTrack(frame1_1C, eig_image, temp_image,
frame1_features, &number_of_features, .01, .01, NULL);
/**********************************************************
Run the pyramidal Lucas-Kanade optical flow algorithm. It performs the feature
tracking, i.e. computes the flow and follows the target.
input:
* "frame1_1C"  the first frame, as an 8-bit grayscale image.
* "frame2_1C"  the second frame, in which we look for the new positions of the
  features found in the first frame.
* "pyramid1" and "pyramid2"  scratch buffers for the algorithm's intermediate data.
* "frame1_features"  the features of the first frame found by Shi-Tomasi.
* "number_of_features"  the number of features in the first frame.
* "optical_flow_termination_criteria"  the iteration termination criterion;
  here the iteration stops when epsilon < 0.3, where epsilon is the squared
  intensity difference between the corresponding feature windows of the two frames.
* "0"  the flags argument; 0 means disable enhancements (for example, the second
  point array is not pre-initialized with guesses).
output:
* "frame2_features"  the positions in the second frame corresponding to the
  features of the first frame.
* "optical_flow_window"  the search window used by the Lucas-Kanade algorithm.
* "5"  the maximum number of pyramid levels; 0 would mean a single level,
  i.e. no pyramid at all.
* "optical_flow_found_feature"  non-zero for each feature that was found in
  the second frame.
* "optical_flow_feature_error"  holds the tracking error for each feature.
**********************************************************/
/* Prepare the inputs for the pyramidal Lucas-Kanade algorithm. */
CvPoint2D32f frame2_features[400];
/* The corresponding element is non-zero if the feature from frame1 was found in frame2. */
char optical_flow_found_feature[400];
/* The i-th element holds the tracking error of the corresponding point. */
float optical_flow_feature_error[400];
/* The Lucas-Kanade search window. A 5x5 window is used here; a smaller 3x3 window
costs less computation but is more prone to the aperture problem. */
CvSize optical_flow_window = cvSize(5, 5);
/* Termination criterion: stop after 20 iterations or when epsilon <= 0.3 (other values can be tried). */
CvTermCriteria optical_flow_termination_criteria = cvTermCriteria(CV_TERMCRIT_ITER | CV_TERMCRIT_EPS, 20, .3);
/* Allocate the work areas. */
allocateOnDemand(&pyramid1, frame_size, IPL_DEPTH_8U, 1);
allocateOnDemand(&pyramid2, frame_size, IPL_DEPTH_8U, 1);
/* Run the algorithm. */
cvCalcOpticalFlowPyrLK(frame1_1C, frame2_1C, pyramid1, pyramid2,frame1_features, frame2_features, number_of_features,
optical_flow_window, 5, optical_flow_found_feature,optical_flow_feature_error, optical_flow_termination_criteria, 0);
/* Draw the flow field. The drawing is based on the corresponding features of the two
frames, i.e. the interest points in the image, such as a point P(x, y) on an edge. */
for (int i = 0; i< number_of_features; i++)
{
/* Skip features that were not found in the second frame. */
if (optical_flow_found_feature[i] == 0)
continue;
int line_thickness;
line_thickness = 1;
/* CV_RGB(red, green, blue) is the red, green, and blue components
* of the color you want, each out of 255.
*/
CvScalar line_color;
line_color = CV_RGB(255, 0, 0);
/* Draw an arrow. Because the inter-frame motion is small, the vector is scaled up
(by a factor of 5 in this code), otherwise the arrows would be invisible. */
CvPoint p, q;
p.x = (int)frame1_features[i].x;
p.y = (int)frame1_features[i].y;
q.x = (int)frame2_features[i].x;
q.y = (int)frame2_features[i].y;
double angle;
angle = atan2((double)p.y - q.y, (double)p.x - q.x);
double hypotenuse;
hypotenuse = sqrt(square(p.y - q.y) + square(p.x - q.x));
/* Apply the scaling. */
q.x = (int)(p.x - 5 * hypotenuse * cos(angle));
q.y = (int)(p.y - 5 * hypotenuse * sin(angle));
/* Draw the main line of the arrow.
 * "frame1"  draw on frame1.
 * "p"  the start point of the line.
 * "q"  the end point of the line.
 * "CV_AA"  anti-aliased drawing.
 * "0"  no fractional bits in the point coordinates.
 */
cvLine(frame1, p, q, line_color, line_thickness, CV_AA, 0);
/* Draw the arrow head. */
p.x = (int)(q.x + 9 * cos(angle + pi / 4));
p.y = (int)(q.y + 9 * sin(angle + pi / 4));
cvLine(frame1, p, q, line_color, line_thickness, CV_AA, 0);
p.x = (int)(q.x + 9 * cos(angle - pi / 4));
p.y = (int)(q.y + 9 * sin(angle - pi / 4));
cvLine(frame1, p, q, line_color, line_thickness, CV_AA, 0);
}
/* Show the image. */
/* Create a resizable window named "Optical Flow". */
cvNamedWindow("Optical Flow", CV_WINDOW_NORMAL);
cvFlip(frame1, NULL, 2);
cvShowImage("Optical Flow", frame1);
/* A short delay, otherwise the video cannot be displayed. */
cvWaitKey(33);
/* Optionally write the frame to the output file. */
// cv::Mat m = cv::cvarrToMat(frame1); // convert the IplImage to a Mat
// writer << m; // OpenCV 3.0 VideoWriter
}
cap.release();
cvWaitKey(33);
system("pause");
}
『肆』 Can the optical flow method be written in C#?
This is a self-written optical flow algorithm that computes the translation between two adjacent images by pattern search and matching.
Pattern matching: choose a square pattern or an X-shaped pattern; between the two images, take the pixel-wise grayscale differences over the pattern and sum them; the position with the smallest sum is taken as the best match.
Multi-pattern matching: pick several positions in the image, search for the best match at each of them, and finally average the matching results of all positions.
In tests it performs well on rough, irregularly textured surfaces such as grass, asphalt, and carpet.
// Optical flow using a multi-pattern-match algorithm: pick the pattern with the
// largest inner difference, match it in the next image, then average the results.
// Already implemented: square pattern, X pattern.
#include <stdint.h>   // uint8_t, int32_t
#include <stdlib.h>   // abs
#include <string.h>   // memcpy

class COpticalFlow_MPM
{
public:
    COpticalFlow_MPM(){}
    virtual ~COpticalFlow_MPM(){}

    static bool AddImplementation(COpticalFlow_MPM* imp)
    {
        if(m_impNum < c_maxImpNum){
            m_impTbl[m_impNum++] = imp;
            return true;
        }
        return false;
    }

    static void SetImageDimesion(int width, int height, int lineBytes)
    {
        for(int i = 0; i < m_impNum; ++i){
            m_impTbl[i]->m_width = width;
            m_impTbl[i]->m_height = height;
            m_impTbl[i]->m_lineBytes = lineBytes;
            m_impTbl[i]->GenerateSearchTable();
            m_impTbl[i]->GeneratePatternTable();
        }
    }

    // auto choose the pattern to do optical flow
    static void AutoOpticalFlow(uint8_t* image1, uint8_t* image2)
    {
        m_impTbl[m_impCurr]->calcOpticalFlow(image1, image2);
        // check if we need to switch pattern
        static int s_goodCount = 0;
        static int s_badCount = 0;
        if(m_quality > 0){
            s_goodCount++;
        }else{
            s_badCount++;
        }
        if(s_goodCount + s_badCount > 30){
            if(s_badCount * 2 > s_goodCount){
                m_impCurr = m_impCurr < (m_impNum - 1) ? m_impCurr + 1 : 0;
            }
            s_goodCount = s_badCount = 0;
        }
    }

    // the result
    static uint8_t m_quality;  // 0 ~ 255, 0 means the optical flow is invalid.
    static float m_offset_x;   // unit is pixel
    static float m_offset_y;

protected:
    virtual const char* Name() = 0;
    virtual void GeneratePatternTable() = 0;

    // prepare the address offset tables, which make the calculation simple and fast.
    void GenerateSearchTable()
    {
        // generate the search offsets from the corresponding location out to the max distance
        int index = 0;
        int yNum, ay[2];
        for (int dist = 1; dist <= c_searchD; ++dist){
            for (int x = -dist; x <= dist; ++x){
                // for each x there are only 1 or 2 dy choices.
                ay[0] = dist - abs(x);
                if (ay[0] == 0){
                    yNum = 1;
                } else{
                    yNum = 2;
                    ay[1] = -ay[0];
                }
                for (int iy = 0; iy < yNum; ++iy){
                    m_searchOffsets[index++] = ay[iy] * m_lineBytes + x;
                }
            }
        }
        // generate the watch points.
        index = 0;
        int center = m_width * m_height / 2 + m_width / 2;
        for (int y = -c_watchN; y <= c_watchN; ++y){
            for (int x = -c_watchN; x <= c_watchN; ++x){
                m_watchPoints[index++] = center + y * c_watchG * m_lineBytes + x * c_watchG * m_width / m_height;
            }
        }
    }

    void ResetResult()
    {
        m_quality = 0;
        m_offset_x = 0;
        m_offset_y = 0;
    }

    void calcOpticalFlow(uint8_t* image1, uint8_t* image2)
    {
        ResetResult();
        int betterStart;
        int matchedOffset;
        int x1, y1, x2, y2;
        int matchedCount = 0;
        int offset_x[c_watchS];
        int offset_y[c_watchS];
        for (int i = 0; i < c_watchS; ++i){
            if (SearchMaxInnerDiff(image1, m_watchPoints[i], betterStart)){
                int32_t minDiff = SearchBestMatch(image1 + betterStart, m_patternOffsets, c_patternS, image2, betterStart, matchedOffset);
                if (minDiff < c_patternS * c_rejectDiff){
                    x1 = betterStart % m_lineBytes;
                    y1 = betterStart / m_lineBytes;
                    x2 = matchedOffset % m_lineBytes;
                    y2 = matchedOffset / m_lineBytes;
                    m_offset_x += (x2 - x1);
                    m_offset_y += (y2 - y1);
                    offset_x[matchedCount] = (x2 - x1);
                    offset_y[matchedCount] = (y2 - y1);
                    matchedCount++;
                }
            }
        }
        if (matchedCount >= 4){
            m_offset_x /= matchedCount;
            m_offset_y /= matchedCount;
            // calculate the variance, and use the variance to get the quality.
            float varX = 0, varY = 0;
            for (int i = 0; i < matchedCount; ++i){
                varX += (offset_x[i] - m_offset_x) * (offset_x[i] - m_offset_x);
                varY += (offset_y[i] - m_offset_y) * (offset_y[i] - m_offset_y);
            }
            varX /= (matchedCount - 1);
            varY /= (matchedCount - 1);
            float varMax = varX > varY ? varX : varY;
            m_quality = (uint8_t)(varMax > 2 ? 0 : (2 - varMax) * 255 / 2);
            if(m_quality == 0){
                ResetResult();
            }
        }
    }

    // get the pattern inner diff; the pattern is at the center of the area.
    inline int32_t InnerDiff(const uint8_t* center, const int* patternPoints, const int patternSize)
    {
        int32_t sum = 0;
        int32_t mean = 0;
        for (int i = 0; i < patternSize; ++i){
            sum += center[patternPoints[i]];
        }
        mean = sum / patternSize;
        int32_t sumDiff = 0;
        for (int i = 0; i < patternSize; ++i){
            sumDiff += abs(center[patternPoints[i]] - mean);
        }
        return sumDiff;
    }

    // get the sum of differences between two patterns; the pattern is at the center of the area.
    inline int32_t PatternDiff(const uint8_t* center1, const uint8_t* center2, const int* patternPoints, const int patternSize)
    {
        int32_t sumDiff = 0;
        for (int i = 0; i < patternSize; ++i){
            sumDiff += abs(center1[patternPoints[i]] - center2[patternPoints[i]]);
        }
        return sumDiff;
    }

    // search for the location with the max inner diff; image is the beginning of the full image,
    // and the returned betterStart is relative to the image beginning.
    inline bool SearchMaxInnerDiff(const uint8_t* image, int searchStart, int& betterStart)
    {
        // if the inner diff is less than this number, this pattern cannot be used for the search.
        const int c_minInnerDiff = c_patternS * 4;
        const int c_acceptInnerDiff = c_patternS * 12;
        const uint8_t* searchCenter = image + searchStart;
        int32_t currDiff = InnerDiff(searchCenter, m_patternOffsets, c_patternS);
        int32_t maxDiff = currDiff;
        betterStart = 0;
        for (int i = 0; i < c_searchS; ++i){
            currDiff = InnerDiff(searchCenter + m_searchOffsets[i], m_patternOffsets, c_patternS);
            if (currDiff > maxDiff){
                maxDiff = currDiff;
                betterStart = m_searchOffsets[i];
            }
            if (maxDiff > c_acceptInnerDiff){
                break;
            }
        }
        if (maxDiff < c_minInnerDiff){
            return false;
        }
        betterStart += searchStart;
        return true;
    }

    // get the minimum pattern diff with the 8 neighbors.
    inline int32_t MinNeighborDiff(const uint8_t* pattern)
    {
        const int32_t threshDiff = c_patternS * c_acceptDiff;
        // eight neighbors of a pattern
        const int neighborOffsets[8] = { -1, 1, -m_lineBytes, m_lineBytes,
            -m_lineBytes - 1, -m_lineBytes + 1, m_lineBytes - 1, m_lineBytes + 1 };
        int minDiff = PatternDiff(pattern, pattern + neighborOffsets[0], m_patternOffsets, c_patternS);
        if (minDiff < threshDiff){
            return minDiff;
        }
        int diff;
        for (int i = 1; i < 8; ++i){
            diff = PatternDiff(pattern, pattern + neighborOffsets[i], m_patternOffsets, c_patternS);
            if (diff < minDiff){
                minDiff = diff;
                if (minDiff < threshDiff){
                    return minDiff;
                }
            }
        }
        return minDiff;
    }

    // search for the pattern that has the max min-diff with its neighbors; image is the beginning
    // of the full image, and the returned betterStart is relative to the image beginning.
    inline bool SearchMaxNeighborDiff(const uint8_t* image, int searchStart, int& betterStart)
    {
        const uint8_t* searchCenter = image + searchStart;
        int32_t currDiff = MinNeighborDiff(searchCenter);
        int32_t maxDiff = currDiff;
        betterStart = 0;
        for (int i = 0; i < c_searchS; ++i){
            currDiff = MinNeighborDiff(searchCenter + m_searchOffsets[i]);
            if (currDiff > maxDiff){
                maxDiff = currDiff;
                betterStart = m_searchOffsets[i];
            }
        }
        if (maxDiff <= c_patternS * c_acceptDiff){
            return false;
        }
        betterStart += searchStart;
        return true;
    }

    // match the target pattern in the image and return the best match quality and matched offset;
    // target is the pattern center, image is the beginning of the full image.
    inline int32_t SearchBestMatch(const uint8_t* target, const int* patternPoints, const int patternSize, const uint8_t* image, int searchStart, int& matchedOffset)
    {
        const int thinkMatchedDiff = patternSize * c_acceptDiff;
        const uint8_t* searchCenter = image + searchStart;
        const uint8_t* matched = searchCenter;
        int32_t currDiff = PatternDiff(target, matched, patternPoints, patternSize);
        int32_t minDiff = currDiff;
        for (int i = 0; i < c_searchS; ++i){
            currDiff = PatternDiff(target, searchCenter + m_searchOffsets[i], patternPoints, patternSize);
            if (currDiff < minDiff){
                minDiff = currDiff;
                matched = searchCenter + m_searchOffsets[i];
            }
            if (minDiff < thinkMatchedDiff){
                break;
            }
        }
        matchedOffset = matched - image;
        return minDiff;
    }

    int m_width, m_height, m_lineBytes;
    static const int c_acceptDiff = 2;  // if the average pixel error is less than this, consider it matched
    static const int c_rejectDiff = 8;  // if the average pixel error is larger than this, consider it not matched

    // all address offsets to the pattern key location; the size follows the square pattern.
    static const int c_patternN = 3;
    static const int c_patternS = (2 * c_patternN + 1) * (2 * c_patternN + 1);
    int m_patternOffsets[c_patternS];

    // the offsets to the image start for each seed point; the match is done around these seed points.
    static const int c_watchN = 2;
    static const int c_watchS = (2 * c_watchN + 1) * (2 * c_watchN + 1);
    static const int c_watchG = 30;  // the gap of the watch grid in the height direction
    int m_watchPoints[c_watchS];

    // the search offsets to the search center, from the corresponding location out to the max
    // distance (distance 0 not included).
    static const int c_searchD = 10;  // search street-distance from the key location
    static const int c_searchS = 2 * c_searchD * c_searchD + 2 * c_searchD;
    int m_searchOffsets[c_searchS];

    // the implementation table for the various patterns
    static int m_impCurr;
    static int m_impNum;
    static const int c_maxImpNum = 16;
    static COpticalFlow_MPM* m_impTbl[c_maxImpNum];
};

// save the optical flow result
uint8_t COpticalFlow_MPM::m_quality;  // 0 ~ 255, 0 means the optical flow is invalid.
float COpticalFlow_MPM::m_offset_x;   // unit is pixel
float COpticalFlow_MPM::m_offset_y;
// the implementations that use different patterns
int COpticalFlow_MPM::m_impCurr = 0;
int COpticalFlow_MPM::m_impNum = 0;
COpticalFlow_MPM* COpticalFlow_MPM::m_impTbl[COpticalFlow_MPM::c_maxImpNum];

// Multi-Pattern-Match, square pattern
class COpticalFlow_MPMS : public COpticalFlow_MPM
{
public:
    COpticalFlow_MPMS(){}
    virtual ~COpticalFlow_MPMS(){}
    virtual const char* Name() { return "Square"; }
protected:
    // prepare the address offset table, which makes the calculation simple and fast.
    virtual void GeneratePatternTable()
    {
        // generate the address offsets of the match area relative to the center of the area.
        int index = 0;
        for (int y = -c_patternN; y <= c_patternN; ++y){
            for (int x = -c_patternN; x <= c_patternN; ++x){
                m_patternOffsets[index++] = y * m_lineBytes + x;
            }
        }
    }
};

// Multi-Pattern-Match, X pattern
class COpticalFlow_MPMX : public COpticalFlow_MPM
{
public:
    COpticalFlow_MPMX(){}
    virtual ~COpticalFlow_MPMX(){}
    virtual const char* Name() { return "X"; }
protected:
    // prepare the address offset table, which makes the calculation simple and fast.
    virtual void GeneratePatternTable()
    {
        // generate the address offsets of the match area relative to the center of the area.
        int index = 0;
        int armLen = (c_patternS - 1) / 4;
        for (int y = -armLen; y <= armLen; ++y){
            if(y == 0){
                m_patternOffsets[index++] = 0;
            }else{
                m_patternOffsets[index++] = y * m_lineBytes - y;
                m_patternOffsets[index++] = y * m_lineBytes + y;
            }
        }
    }
};

static COpticalFlow_MPMS of_mpms;
static COpticalFlow_MPMX of_mpmx;

void OpticalFlow::init()
{
    // set up the optical flow implementation table
    COpticalFlow_MPM::AddImplementation(&of_mpms);
    COpticalFlow_MPM::AddImplementation(&of_mpmx);
    COpticalFlow_MPM::SetImageDimesion(m_width, m_height, m_lineBytes);
}

uint32_t OpticalFlow::flow_image_in(const uint8_t *buf, int len, uint8_t *quality, int32_t *centi_pixel_x, int32_t *centi_pixel_y)
{
    static uint8_t s_imageBuff1[m_pixelNum];
    static uint8_t s_imageBuff2[m_pixelNum];
    static uint8_t* s_imagePre = NULL;
    static uint8_t* s_imageCurr = s_imageBuff1;
    *quality = 0;
    *centi_pixel_x = 0;
    *centi_pixel_y = 0;
    memcpy(s_imageCurr, buf, len);
    // first image
    if(s_imagePre == NULL){
        s_imagePre = s_imageCurr;
        s_imageCurr = s_imageCurr == s_imageBuff1 ? s_imageBuff2 : s_imageBuff1;  // switch image buffer
        return 0;
    }
    COpticalFlow_MPM::AutoOpticalFlow(s_imagePre, s_imageCurr);
    if(COpticalFlow_MPM::m_quality > 0){
        *quality = COpticalFlow_MPM::m_quality;
        *centi_pixel_x = (int32_t)(COpticalFlow_MPM::m_offset_x * 100);
        *centi_pixel_y = (int32_t)(COpticalFlow_MPM::m_offset_y * 100);
    }
    s_imagePre = s_imageCurr;
    s_imageCurr = s_imageCurr == s_imageBuff1 ? s_imageBuff2 : s_imageBuff1;  // switch image buffer
    return 0;
}