Thursday, 15 November 2018

3d point reconstruction from 2d Images

I am trying understand basics of 3d point reconstruction from 2d stereo images. What I have understood so far can be summarized as below:

For 3d point (depth map) reconstruction, we need 2 images of the same object from 2 different view, given such image pair we also need Camera matrix (say P1, P2)

  • We find the corresponding points in the two images using methods like SIFT or SURF etc.

  • After getting corresponding key point, we find find the essential matrix (say K) using minimum 8 key points (used in 8-point algorithm)

  • Given we are at camera 1, calculate the parameters for camera 2 Using the essential matrix returns 4 possible camera parameters

  • Eventually we use corresponding points and both camera parameters for 3d point estimation using triangulation method.

After going through theory section, as my first experiment I tried to run the code available here, Which worked as expected. With a few modification in the example.py code I tried to run this example on all the consecutive image pairs and merge the 3-d point clouds for 3d reconstruction of object (dino) as below:

points3d = np.empty((0,0))
files = glob.glob("imgs/dinos/*.ppm")
len = len(files)

for item in range(len-1):
    print(files[item], files[(item+1)%len])
    points1, points2, intrinsic = dino(files[item], files[(item+1)%len])
    #print(('Length', len(points1))
    # Calculate essential matrix with 2d points.
    # Result will be up to a scale
    # First, normalize points
    points1n = np.dot(np.linalg.inv(intrinsic), points1)
    points2n = np.dot(np.linalg.inv(intrinsic), points2)
    E = structure.compute_essential_normalized(points1n, points2n)
    print('Computed essential matrix:', (-E / E[0][1]))

    # Given we are at camera 1, calculate the parameters for camera 2
    # Using the essential matrix returns 4 possible camera paramters
    P1 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]])
    P2s = structure.compute_P_from_essential(E)

    ind = -1
    for i, P2 in enumerate(P2s):
        # Find the correct camera parameters
        d1 = structure.reconstruct_one_point(
            points1n[:, 0], points2n[:, 0], P1, P2)

        # Convert P2 from camera view to world view
        P2_homogenous = np.linalg.inv(np.vstack([P2, [0, 0, 0, 1]]))
        d2 = np.dot(P2_homogenous[:3, :4], d1)

        if d1[2] > 0 and d2[2] > 0:
            ind = i

    P2 = np.linalg.inv(np.vstack([P2s[ind], [0, 0, 0, 1]]))[:3, :4]
    #tripoints3d = structure.reconstruct_points(points1n, points2n, P1, P2)
    tripoints3d = structure.linear_triangulation(points1n, points2n, P1, P2)

    if not points3d.size:
        points3d = tripoints3d
    else:
        points3d = np.concatenate((points3d, tripoints3d), 1)


fig = plt.figure()
fig.suptitle('3D reconstructed', fontsize=16)
ax = fig.gca(projection='3d')
ax.plot(points3d[0], points3d[1], points3d[2], 'b.')
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
ax.set_zlabel('z axis')
ax.view_init(elev=135, azim=90)
plt.show()

But I am getting very unexpected result. Please suggest me if above method is correct or how can i merge multiple 3d point clouds to construct a single 3-d structure.



from 3d point reconstruction from 2d Images

No comments:

Post a Comment