Camera Calibration: Direct Linear Transform

Credits go to the following video. The rest is my note.

The Direct Linear Transform (DLT) maps world coordinates to image coordinates:

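$$x_{3\times 1} = K_{3\times 3} R_{3\times 3} [I_{3\times 3} \mid -X_{0,\,3\times 1}]\, X_{4\times 1} = P_{3\times 4} X_{4\times 1} \tag{1}$$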
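To make equation (1) concrete, here is a minimal NumPy sketch of the forward projection. The function names and the numbers for K, R and X0 are made up for illustration; they are not from the video.

```python
import numpy as np

def projection_matrix(K, R, X0):
    """Build P = K R [I | -X0] from intrinsics K, rotation R and projection center X0."""
    return K @ R @ np.hstack([np.eye(3), -X0.reshape(3, 1)])

def project(P, X_world):
    """Project (N, 3) world points to (N, 2) image points."""
    X_h = np.hstack([X_world, np.ones((len(X_world), 1))])  # homogeneous, (N, 4)
    x_h = X_h @ P.T                                          # homogeneous, (N, 3)
    return x_h[:, :2] / x_h[:, 2:3]                          # divide by the third coordinate

# Illustrative values only (assumed, not from the source)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
X0 = np.array([0.0, 0.0, -5.0])

P = projection_matrix(K, R, X0)
pixels = project(P, np.array([[0.0, 0.0, 0.0],
                              [1.0, 0.0, 0.0]]))
```

With these values the world origin sits on the optical axis, so it lands on the principal point (320, 240).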

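Given $M \ge 6$ pairs of world coordinates $X_i$ and image coordinates $x_i$, figure out the intrinsic parameter matrix $K$, the rotation matrix $R$, and the projection center $X_0$. At least six pairs are needed because $P$ has 12 entries but is only defined up to scale (11 degrees of freedom), and each pair contributes two equations. Write the rows of $P$ as $A^T$, $B^T$, $C^T$, so that $x_i = P X_i = [A^T; B^T; C^T]\, X_i$ in homogeneous coordinates. The image point $(x_i, y_i)$ is then $x_i = A^T X_i / C^T X_i$ and $y_i = B^T X_i / C^T X_i$, which can be rewritten as: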

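$$\begin{bmatrix} -X_i^T & 0^T & x_i X_i^T \\ 0^T & -X_i^T & y_i X_i^T \end{bmatrix} \begin{bmatrix} A \\ B \\ C \end{bmatrix} = 0 \tag{2}$$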

Stacking the equations from all $M$ pairs gives

$$M_{2M\times 12}\, p_{12\times 1} = 0_{2M\times 1} \tag{3}$$
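A rough sketch of how the stacked matrix could be assembled with NumPy. The function name `build_design_matrix` and the array layout (one point per row) are my own choices, not something from the video.

```python
import numpy as np

def build_design_matrix(X_world, x_image):
    """Stack the two rows of equation (2) for every pair into a (2M, 12) matrix.

    X_world: (M, 3) world points, x_image: (M, 2) image points.
    The unknown is p = [A; B; C], i.e. the rows of P stacked into a 12-vector.
    """
    n = len(X_world)
    X_h = np.hstack([X_world, np.ones((n, 1))])  # homogeneous world points, (M, 4)
    zeros = np.zeros_like(X_h)
    x = x_image[:, 0:1]                           # (M, 1) column of x_i
    y = x_image[:, 1:2]                           # (M, 1) column of y_i

    rows_x = np.hstack([-X_h, zeros, x * X_h])    # [-X_i^T, 0^T, x_i X_i^T]
    rows_y = np.hstack([zeros, -X_h, y * X_h])    # [0^T, -X_i^T, y_i X_i^T]

    M = np.empty((2 * n, 12))
    M[0::2] = rows_x                              # interleave so each pair keeps its two rows together
    M[1::2] = rows_y
    return M
```

For noise-free correspondences, `M @ P.reshape(-1)` should be numerically zero when `P` is the true projection matrix.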

Now the problem becomes:

Given $M$, find $\hat{p} = \arg\min_p\, p^T M^T M p$ such that $|p| = 1$. The unit-norm constraint rules out the trivial solution $p = 0$.

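The solution is to take the SVD $M = U S V^T$ and set $p = v_{12}$, where $v_{12}$ is the right singular vector corresponding to the smallest singular value $s_{12}$ (all singular values are non-negative). Equivalently, $v_{12}$ is the eigenvector of $M^T M$ with the smallest eigenvalue $s_{12}^2$. Try to prove this yourself. Hint: decompose $p$ in the basis defined by the $v_i$.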
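In NumPy the SVD step is only a couple of lines. This is a sketch; `solve_dlt` is a name I made up. It relies on `np.linalg.svd` returning the singular values in descending order, so the last row of `Vt` is $v_{12}$.

```python
import numpy as np

def solve_dlt(M):
    """Return the unit-norm minimizer of |M p| reshaped to 3x4, plus the singular values."""
    _, s, Vt = np.linalg.svd(M)   # s is sorted from largest to smallest
    p = Vt[-1]                    # right singular vector for the smallest singular value
    return p.reshape(3, 4), s     # rows of the result are A^T, B^T, C^T
```

The sign of the result is arbitrary (both $v_{12}$ and $-v_{12}$ are valid), which is fine because $P$ is only defined up to scale.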

There are two degenerate cases in which there is no unique solution: 1. All world points lie on the same plane. 2. All world points and the projection center lie on a twisted cubic curve.

Reshaping $\hat{p}$ back into a $3\times 4$ matrix gives the estimate $\hat{P}$.

$$\hat{P} = [\hat{H} \mid \hat{h}] = [\hat{K}\hat{R} \mid -\hat{K}\hat{R}\hat{X}_0] \tag{4}$$

Therefore, $\hat{X}_0 = -\hat{H}^{-1}\hat{h}$. QR decomposition gives an orthogonal matrix times an upper triangular matrix, while what we need is an upper triangular matrix times an orthogonal matrix. To achieve that, we QR decompose $\hat{H}^{-1} = \hat{R}^{-1}\hat{K}^{-1}$ to get $\hat{R}^{-1} = \hat{R}^T$ and $\hat{K}^{-1}$.

Some final touch-ups: multiply $\hat{K}$ by $1/\hat{K}_{33}$ to make $\hat{K}_{33} = 1$. $\hat{K}_{11}$ and $\hat{K}_{22}$ need to be positive, so push any negative signs into the rotation matrix $\hat{R}$.
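Putting the decomposition and touch-ups together, a possible NumPy sketch looks like this. The function name is hypothetical, and the final determinant check is an extra step of mine that the notes above do not mention: since $P$ is only known up to a (possibly negative) scale, the recovered $R$ can come out as a reflection, and negating it fixes that.

```python
import numpy as np

def decompose_projection(P):
    """Split P = [H | h] into K (upper triangular), R (rotation) and X0 (projection center)."""
    H, h = P[:, :3], P[:, 3]

    X0 = -np.linalg.solve(H, h)             # X0 = -H^{-1} h

    # H^{-1} = R^T K^{-1}: orthogonal times upper triangular, which is what QR returns
    Q, U = np.linalg.qr(np.linalg.inv(H))
    R = Q.T                                  # R = (R^T)^T
    K = np.linalg.inv(U)                     # K = (K^{-1})^{-1}

    K = K / K[2, 2]                          # normalize so that K33 = 1

    # Make the diagonal of K positive and push the signs into R (S @ S = I)
    S = np.diag(np.sign(np.diag(K)))
    K = K @ S
    R = S @ R

    # Extra step (my assumption, not in the notes): force det(R) = +1
    if np.linalg.det(R) < 0:
        R = -R
    return K, R, X0
```

This assumes $\hat{H}$ is invertible and that no diagonal entry of $\hat{K}$ is zero, which holds for a non-degenerate camera.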

Date: 2019-12-24 Tue 00:00