Problem description
I have a calibrated camera (intrinsic matrix and distortion coefficients) and I want to know the camera position knowing some 3d points and their corresponding points in the image (2d points).
I know that cv::solvePnP could help me, and after reading this and this I understand that the outputs of solvePnP, rvec and tvec, are the rotation and translation of the object in the camera coordinate system.
So I need to find out the camera rotation/translation in the world coordinate system.
From the links above it seems that the code is straightforward, in Python:
found, rvec, tvec = cv2.solvePnP(object_3d_points, object_2d_points, camera_matrix, dist_coefs)
rotM = cv2.Rodrigues(rvec)[0]
cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)
I don't know Python/NumPy (I'm using C++), but this does not make much sense to me:
- rvec, tvec output from solvePnP are 3x1 matrices, i.e. 3-element vectors
- cv2.Rodrigues(rvec) is a 3x3 matrix
- cv2.Rodrigues(rvec)[0] is a 3x1 matrix, a 3-element vector
- cameraPosition is a 3x1 * 1x3 matrix multiplication, which is a... 3x3 matrix. How can I use this in OpenGL with simple glTranslatef and glRotate calls?
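The shape confusion above can be checked directly. In fact, cv2.Rodrigues returns a tuple (rotation matrix, Jacobian), so the [0] selects the 3x3 matrix, and -rotM.T * tvec is a (3x3)·(3x1) product, i.e. a 3x1 camera position. A minimal NumPy sketch, using a hand-rolled Rodrigues formula so it runs without OpenCV (values are arbitrary):

```python
import numpy as np

def rodrigues(rvec):
    """Rodrigues formula: axis-angle 3-vector -> 3x3 rotation matrix.
    Stand-in for cv2.Rodrigues(rvec)[0]."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = (rvec / theta).reshape(3)
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

rvec = np.array([[0.1], [0.2], [0.3]])   # 3x1, as returned by solvePnP
tvec = np.array([[1.0], [2.0], [3.0]])   # 3x1

rotM = rodrigues(rvec)                   # 3x3, not 3x1
cameraPosition = -rotM.T @ tvec          # (3x3) @ (3x1) -> 3x1, not 3x3

print(rotM.shape)            # (3, 3)
print(cameraPosition.shape)  # (3, 1)
```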
Recommended answer
If with "world coordinates" you mean "object coordinates", you have to get the inverse transformation of the result given by the PnP algorithm.
There is a trick to invert transformation matrices that lets you avoid the general matrix inversion, which is usually expensive, and it explains the Python code above. Given a transformation [R|t], we have inv([R|t]) = [R'|-R'*t], where R' is the transpose of R. So, you can code (not tested):
cv::Mat rvec, tvec;
solvePnP(..., rvec, tvec, ...);
// rvec is 3x1, tvec is 3x1
cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3
R = R.t(); // rotation of inverse
tvec = -R * tvec; // translation of inverse
cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4
R.copyTo( T( cv::Range(0,3), cv::Range(0,3) ) );    // copies R into the top-left 3x3 block of T
tvec.copyTo( T( cv::Range(0,3), cv::Range(3,4) ) ); // copies tvec into the last column of T
// T is a 4x4 matrix with the pose of the camera in the object frame
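The inversion trick above can be sanity-checked numerically. A small NumPy sketch (independent of OpenCV; the rotation and translation values are arbitrary) comparing [R'|-R'*t] against a full 4x4 inverse:

```python
import numpy as np

# An arbitrary rotation (about Z by 30 degrees) and translation.
a = np.deg2rad(30)
R = np.array([[np.cos(a), -np.sin(a), 0],
              [np.sin(a),  np.cos(a), 0],
              [0,          0,         1]])
t = np.array([[1.0], [2.0], [3.0]])

# Build the 4x4 transform [R|t].
T = np.eye(4)
T[:3, :3] = R
T[:3, 3:] = t

# Inversion trick: inv([R|t]) = [R'|-R'*t] with R' = R^T.
T_inv = np.eye(4)
T_inv[:3, :3] = R.T
T_inv[:3, 3:] = -R.T @ t

print(np.allclose(T_inv, np.linalg.inv(T)))  # True
```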
Update: later, to use T with OpenGL you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.
OpenCV uses the reference frame usually used in computer vision: X points right, Y down, Z forward (as in this image). The camera frame in OpenGL is: X points right, Y up, Z backward (as in the left-hand side of this image). So you need to apply a rotation of 180 degrees around the X axis. The formula for this rotation matrix is on Wikipedia.
// T is your 4x4 matrix in the OpenCV frame
// 4x4 matrix with a 180 deg rotation around X: diag(1, -1, -1, 1)
cv::Mat RotX = (cv::Mat_<double>(4, 4) <<
    1,  0,  0, 0,
    0, -1,  0, 0,
    0,  0, -1, 0,
    0,  0,  0, 1);
cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.
Finally, take into account that matrices in OpenCV are stored in row-major order in memory, while OpenGL expects column-major order.
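In practice this means transposing the matrix (or flattening it in column-major order) before handing it to OpenGL. A short NumPy sketch; the glLoadMatrixd call is only indicated in a comment, since it depends on your OpenGL binding:

```python
import numpy as np

# A row-major 4x4 pose, as it would come out of OpenCV (values arbitrary).
Tgl = np.arange(16, dtype=np.float64).reshape(4, 4)

# OpenGL's glLoadMatrixd expects the 16 values in column-major order,
# which is the same as flattening the transpose in row-major order.
gl_matrix = Tgl.flatten(order='F')
# glLoadMatrixd(gl_matrix)  # e.g. via PyOpenGL

print(np.array_equal(gl_matrix, Tgl.T.flatten()))  # True
```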