Hemant Vishwakarma: How to estimate the extrinsic matrix from a chessboard image and project it to a bird's eye view such that each pixel has a known size in meters?

I want to generate an Occupancy Grid (OG)-like image with a Bird's Eye View (BEV), i.e., each image pixel represents a constant metric size and everything on the final grid is floor (height=0).

I don't know what I'm missing. I'm new to the subject and I'm trying to follow a pragmatic step-by-step approach to get to the final result. I have spent a lot of time on this and I'm still getting poor results. I'd appreciate any help. Thanks.

To get to my desired result, I follow this pipeline:

Estimate the extrinsic matrix with cv2.solvePnP and a chessboard image.
Generate the OG grid XYZ world coordinates (X=right, Y=height, Z=forward).
Transform the OG grid XYZ world coordinates into camera coordinates with the extrinsic matrix.
Match the uv image coordinates for the OG grid camera coordinates.
Populate the OG image with the uv pixels.

I have the following intrinsic matrix and distortion coefficients, which I previously estimated from another 10 chessboard images like the one below:

1. Estimate the extrinsic matrix

import numpy as np
import cv2
import matplotlib.pyplot as plt


mtx = np.array([[2029,    0, 2029],
                [   0, 1904, 1485],
                [   0,    0,    1]]).astype(float)

dist = np.array([[-0.01564965,  0.03250585,  0.00142366,  0.00429703, -0.01636045]])

impath = '....'
img = cv2.imread(impath)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
CHECKERBOARD = (5, 8)
ret, corners = cv2.findChessboardCorners(gray, CHECKERBOARD, None)
corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)

objp = np.concatenate(
            np.meshgrid(np.arange(-4, 4, 1),
                        0,
                        np.arange(0, 5, 1), 
                        )
        ).astype(float)

objp = np.moveaxis(objp, 0, 2).reshape(-1, 3)

square_size = 0.029
objp *= square_size

ret, rvec, tvec = cv2.solvePnP(objp, corners[::-1], mtx, dist)
print('rvec:', rvec.T)
print('tvec:', tvec.T)

# img_withaxes = cv2.drawFrameAxes(img.copy(), mtx, dist, rvec, tvec, square_size, 3)
# plt.imshow(cv2.resize(img_withaxes[..., ::-1], (800, 600)))


# rvec: [[ 0.15550242 -0.03452503 -0.028686  ]]
# tvec: [[0.03587237 0.44082329 0.62490573]]

R = cv2.Rodrigues(rvec)[0]
RT = np.eye(4)
RT[:3, :3] = R
RT[:3, 3] = tvec.ravel()
RT.round(2)

# array([[-1.  ,  0.03,  0.04,  0.01],
#        [ 0.03,  0.99,  0.15, -0.44],
#        [-0.03,  0.16, -0.99,  0.62],
#        [ 0.  ,  0.  ,  0.  ,  1.  ]])
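
Before building the grid, it may be worth checking that the estimated pose actually reprojects the board points onto the detected corners. If the ordering of objp and the corners passed to solvePnP do not correspond (the corners[::-1] reversal is one place where this could happen), the solver still returns a pose, just a wrong one. A minimal sketch using only NumPy and ignoring lens distortion; the helper name reprojection_rmse and the synthetic intrinsics and pose are hypothetical and are not the values from this post:

```python
import numpy as np

def reprojection_rmse(obj_pts, img_pts, K, RT):
    """RMS pixel distance between detected corners and reprojected board points.

    obj_pts: (N, 3) world points, img_pts: (N, 2) pixels,
    K: (3, 3) intrinsics, RT: (4, 4) world-to-camera transform.
    Lens distortion is ignored here (an assumption of this sketch).
    """
    obj_h = np.hstack([obj_pts, np.ones((len(obj_pts), 1))])  # (N, 4)
    cam = (RT @ obj_h.T)[:3]             # (3, N) camera coordinates
    uv = K @ (cam / cam[2])              # perspective divide, then intrinsics
    err = uv[:2].T - img_pts
    return float(np.sqrt((err ** 2).sum(axis=1).mean()))

# Synthetic check with a made-up camera and pose.
K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])
RT = np.eye(4)
RT[:3, 3] = [0.0, 0.1, 1.0]
xs, zs = np.meshgrid(np.arange(4) * 0.03, np.arange(3) * 0.03)
board = np.stack([xs.ravel(), np.zeros(xs.size), zs.ravel()], axis=1)

# Ideal detections: project the board with the same pose.
cam = RT[:3, :3] @ board.T + RT[:3, [3]]
pixels = (K @ (cam / cam[2]))[:2].T

rmse_ok = reprojection_rmse(board, pixels, K, RT)
rmse_reversed = reprojection_rmse(board, pixels[::-1], K, RT)
print(rmse_ok, rmse_reversed)
```

With real data the call would presumably look like reprojection_rmse(objp, corners[::-1].reshape(-1, 2), mtx, RT), using the same corner ordering that was given to solvePnP. An error of more than a few pixels would suggest that the correspondence between object points and corners is off.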

2. Generate the OG grid XYZ world coordinates (X=right, Y=height, Z=forward).

uv_dims = img.shape[:2] # h, w
grid_dims = (500, 500) # h, w

og_grid = np.concatenate(
                np.meshgrid(
                    np.arange(- grid_dims[0] // 2, (grid_dims[0] + 1) // 2, 1),
                    0, # I want only the floor information, such that height = 0
                    np.arange(grid_dims[1]),
                    1
                    )
                )
og_grid = np.moveaxis(og_grid, 0, 2)

edge_size = .1
og_grid_3dcoords = og_grid * edge_size
print(og_grid_3dcoords.shape)

# (500, 500, 4, 1)
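
One thing to watch in the snippet above: multiplying the whole array by edge_size also scales the homogeneous coordinate, so the 1 becomes 0.1. Because the extrinsic transform is linear, that common factor cancels in the later division by depth, and the grid effectively ends up with 1 m cells. A minimal sketch that scales only X and Z; the helper name make_floor_grid and the 2 mm cell size are illustrative assumptions, chosen because the board sits roughly 0.6 m from the camera:

```python
import numpy as np

def make_floor_grid(n_rows, n_cols, cell_size):
    """Homogeneous world points (n_rows, n_cols, 4) on the floor plane Y=0.

    Rows run forward (Z), columns run right (X), centred on X=0.
    Only X and Z are scaled to meters; the homogeneous 1 is left untouched.
    """
    x = (np.arange(n_cols) - n_cols // 2) * cell_size
    z = np.arange(n_rows) * cell_size
    X, Z = np.meshgrid(x, z)             # both (n_rows, n_cols)
    return np.stack([X, np.zeros_like(X), Z, np.ones_like(X)], axis=-1)

grid = make_floor_grid(500, 500, 0.002)  # 2 mm cells, a 1 m x 1 m floor patch
print(grid.shape)
```

Stacking on the last axis gives a plain (rows, cols, 4) array, which avoids the trailing singleton dimension and makes the later reshape(-1, 4) unambiguous.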

3. Transform the OG grid XYZ world coordinates into camera coordinates with the extrinsic matrix.

og_grid_camcoords = (RT @ og_grid_3dcoords.reshape(-1, 4).T)
og_grid_camcoords = og_grid_camcoords.T.reshape(grid_dims + (4,))
og_grid_camcoords /= og_grid_camcoords[..., [2]]
og_grid_camcoords = og_grid_camcoords[..., :3]

# Print for debugging issues
for i in range(og_grid_camcoords.shape[-1]):
    print(np.quantile(og_grid_camcoords[..., i].clip(-10, 10), np.linspace(0, 1, 11)).round(1))

# [-10.   -1.3  -0.7  -0.4  -0.2  -0.    0.2   0.4   0.6   1.2  10. ]
# [-10.   -0.2  -0.2  -0.2  -0.2  -0.2  -0.1  -0.1  -0.1  -0.1  10. ]
# [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
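
Dividing by the camera-space depth is only meaningful for points in front of the camera. Floor cells with depth at or below zero project to mirrored or extreme coordinates, which is one plausible source of the values saturating at the ±10 clip in the quantiles above. A minimal sketch that keeps a validity mask alongside the normalized coordinates; world_to_normalized, min_depth, and the pose are hypothetical names and values for illustration:

```python
import numpy as np

def world_to_normalized(points_h, RT, min_depth=1e-6):
    """Map homogeneous world points (..., 4) to normalized image coords (..., 2).

    Returns the coordinates and a boolean mask of points in front of the camera.
    Entries where the mask is False are set to NaN.
    """
    cam = points_h @ RT.T                        # (..., 4) camera coordinates
    depth = cam[..., 2]
    in_front = depth > min_depth
    xy = np.full(points_h.shape[:-1] + (2,), np.nan)
    xy[in_front] = cam[in_front][:, :2] / depth[in_front][:, None]
    return xy, in_front

RT = np.eye(4)
RT[:3, 3] = [0.0, 0.5, 1.0]                      # made-up pose for the demo
pts = np.array([[[0.2, 0.0, 1.0, 1.0],           # in front of the camera
                 [0.2, 0.0, -3.0, 1.0]]])        # behind the camera
xy, in_front = world_to_normalized(pts, RT)
print(xy, in_front)
```

The mask can then be carried through to the final sampling step so that invalid cells stay empty instead of picking up arbitrary image pixels.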

4. Match the uv image coordinates for the OG grid coordinates.

og_grid_uvcoords = (mtx @ og_grid_camcoords.reshape(-1, 3).T)
og_grid_uvcoords = og_grid_uvcoords.T.reshape(grid_dims + (3,))
og_grid_uvcoords = og_grid_uvcoords.clip(0, max(uv_dims)).round().astype(int)
og_grid_uvcoords = og_grid_uvcoords[..., :2]

# Print for debugging issues
for i in range(og_grid_uvcoords.shape[-1]):
    print(np.quantile(og_grid_uvcoords[..., i], np.linspace(0, 1, 11)).round(1))

# [   0.    0.  665. 1134. 1553. 1966. 2374. 2777. 3232. 4000. 4000.]
# [   0. 1134. 1161. 1171. 1181. 1191. 1201. 1212. 1225. 1262. 4000.]
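
Since every grid point lies on the plane Y=0, steps 3 and 4 can be collapsed into a single 3x3 homography: (u, v, 1) is proportional to K [r1 r3 t] (X, Z, 1), where r1 and r3 are the first and third columns of R. The sketch below builds that matrix and checks it against the full projection. The intrinsics and pose are made up, lens distortion is ignored, and floor_homography is a hypothetical helper name:

```python
import numpy as np

def floor_homography(K, RT):
    """Homography mapping floor coordinates (X, Z, 1) on Y=0 to pixels (u, v, 1)."""
    R, t = RT[:3, :3], RT[:3, 3]
    return K @ np.column_stack([R[:, 0], R[:, 2], t])

# Made-up intrinsics and pose: camera pitched 30 degrees about its x axis.
K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])
a = np.deg2rad(30)
RT = np.eye(4)
RT[:3, :3] = [[1, 0, 0],
              [0, np.cos(a), -np.sin(a)],
              [0, np.sin(a), np.cos(a)]]
RT[:3, 3] = [0.0, 0.4, 0.6]

H = floor_homography(K, RT)

# Compare the homography with the full 3D projection for one floor point.
X, Z = 0.1, 0.5
uvw_h = H @ np.array([X, Z, 1.0])
uvw_full = K @ (RT @ np.array([X, 0.0, Z, 1.0]))[:3]
uv_h = uvw_h[:2] / uvw_h[2]
uv_full = uvw_full[:2] / uvw_full[2]
print(uv_h, uv_full)
```

A matrix of this kind, composed with the scaling from grid indices to meters, is the sort of transform that cv2.warpPerspective accepts, ideally applied to an image that has first been undistorted with cv2.undistort. Whether that fits this setup better than the per-point projection is a judgment call.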

Clip the uv values to the image boundaries.

mask_clip_height = (og_grid_uvcoords[..., 1] >= uv_dims[0])
og_grid_uvcoords[mask_clip_height, 1] = uv_dims[0] - 1

mask_clip_width = (og_grid_uvcoords[..., 0] >= uv_dims[1])
og_grid_uvcoords[mask_clip_width, 0] = uv_dims[1] - 1

5. Populate the OG image with the uv pixels.

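# uint8 matches the dtype of the source image
og = np.zeros(grid_dims + (3,), dtype=np.uint8)

# Copy the sampled image pixel into each grid cell
for i, (u, v) in enumerate(og_grid_uvcoords.reshape(-1, 2)):
    og[i % grid_dims[1], i // grid_dims[1], :] = img[v, u]

# cv2.imread returns BGR; reverse the channels for matplotlib
plt.imshow(og[..., ::-1])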
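
The per-pixel Python loop can be replaced by a single fancy-indexing lookup. Cells whose uv coordinates fall outside the image can also be left black rather than clamped to the border, since clamping smears edge pixels across the grid. A minimal sketch with nearest-neighbour sampling; sample_image is a hypothetical helper, and u, v are assumed to be float arrays with the shape of the output grid:

```python
import numpy as np

def sample_image(img, u, v):
    """Nearest-neighbour lookup of img at pixel coordinates (u, v).

    u, v: float arrays of identical shape (H_out, W_out); NaN or
    out-of-bounds entries produce black output pixels.
    """
    h, w = img.shape[:2]
    finite = np.isfinite(u) & np.isfinite(v)
    # Clip before the integer cast so huge values cannot overflow.
    ui = np.where(finite, np.clip(np.rint(u), -1, w), -1).astype(int)
    vi = np.where(finite, np.clip(np.rint(v), -1, h), -1).astype(int)
    valid = (ui >= 0) & (ui < w) & (vi >= 0) & (vi < h)
    out = np.zeros(u.shape + img.shape[2:], dtype=img.dtype)
    out[valid] = img[vi[valid], ui[valid]]
    return out

# Tiny synthetic image (2 rows, 3 columns, 3 channels) for the demo.
img_demo = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
u = np.array([[0.2, 2.0], [5.0, np.nan]])
v = np.array([[0.0, 1.0], [0.0, 0.0]])
out = sample_image(img_demo, u, v)
print(out.shape)
```

For smoother output, cv2.remap with float32 coordinate maps performs interpolated sampling and would be a natural replacement for the nearest-neighbour lookup.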

I was expecting a top-down view of the test image.

Hemant Vishwakarma

Monday, 2 January 2023
