Tuesday, 20 December 2022

How to convert screen x,y (cartesian coordinates) to 3D world space crosshair movement angles (screenToWorld)?

Recently I've been playing around with computer vision and neural networks, and came across experimental object detection within a 3D application.
Surprisingly, I've run into an issue of converting one coordinate system to another (AFAIK Cartesian to polar/spherical).

Let me explain.
For example, we have a screenshot of a 3D application window (some 3D game): [screenshot of the game window]

Now, using OpenCV or a neural network, I'm able to detect the round spheres (in-game targets), as well as their X, Y coordinates within the game window (x, y offsets): [screenshot with the detected targets marked]
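
For reference, here is roughly the kind of detection I'm doing (a minimal OpenCV sketch; the file name and HoughCircles parameters are just placeholders that I tune for the actual game):

import cv2
import numpy as np

# load a screenshot of the game window and convert to grayscale
frame = cv2.imread("game_screenshot.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)

# detect the round targets; parameters need tuning per game
circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1, minDist=30,
                           param1=100, param2=30, minRadius=5, maxRadius=60)

if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        print("target at", x, y)  # x, y offsets within the game window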

If I programmatically move the mouse cursor to the given X, Y coordinates in order to aim at one of the targets, it only works in the desktop environment (moving the cursor on the desktop).
But when I switch to the 3D game, so that the mouse cursor is now inside the 3D game world, it does not work and does not aim at the target.

So I did some research on the topic.
What I found is that the mouse cursor is locked inside the 3D game.
Because of this, we cannot move the cursor using the MOUSEEVENTF_MOVE (0x0001) + MOUSEEVENTF_ABSOLUTE (0x8000) flags in the mouse_event win32 call.

We are only able to move the mouse programmatically using relative movement.
Theoretically, to get these relative mouse movement offsets, we can calculate the offset of a detection from the middle of the 3D game window.
In that case, the relative movement vector would be something like (x=-100, y=0) if the target point is 100px to the left of the middle of the screen.
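
For example, a relative move with the win32 mouse_event call looks roughly like this (Windows only; the 1920x1080 window size and target position are just example values):

import ctypes

MOUSEEVENTF_MOVE = 0x0001  # relative movement, no MOUSEEVENTF_ABSOLUTE

def move_mouse_relative(dx, dy):
    # moves the cursor by (dx, dy) pixels relative to its current position
    ctypes.windll.user32.mouse_event(MOUSEEVENTF_MOVE, int(dx), int(dy), 0, 0)

# target detected at x=860 in a 1920x1080 window, window center is at x=960
move_mouse_relative(860 - 960, 0)  # i.e. (x=-100, y=0)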

The thing is, the crosshair inside the 3D game will not move 100px to the left as expected and will not aim at the given target.
It will only move a bit in the given direction.

After that, I did more research on the topic.
As I understand it, the crosshair inside a 3D game moves using angles in 3D space.
Specifically, there are only two of them: a horizontal movement angle and a vertical movement angle.

So the game engine takes our mouse movement and converts it into movement angles within the 3D world space, and that's how crosshair movement is done inside a 3D game.
But we don't have access to that; all we can do is move the mouse with win32 calls externally.

Then I decided to calculate pixels per degree (the number of pixels of win32 relative mouse movement needed to rotate the crosshair by 1 degree inside the game).
To do this, I wrote down a simple calculation algorithm.
Here it is: [screenshot of the pixels-per-degree calculation]

As you can see, we need to move our mouse relatively with win32 by 16400 pixels horizontally in order to rotate the crosshair inside the game by 360 degrees.
And indeed, it works: 16400/2 rotates the crosshair by 180 degrees, and so on.
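
In code, that calibration boils down to a single constant (the 16400 value is specific to my game and mouse sensitivity):

pixels_per_360 = 16400                    # relative pixels for a full 360-degree turn
pixels_per_degree = pixels_per_360 / 360  # ~45.6 pixels of relative movement per degree

# so, to turn the crosshair 90 degrees to the right:
# move_mouse_relative(int(90 * pixels_per_degree), 0)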

What I did next was try to convert the screen X, Y target offset coordinates to percentages (from the middle of the screen) and then convert those to degrees.

The overall formula looked like (example for horizontal movement only):

w = 100  # screen width
x_offset = 10  # target x offset
hor_fov = 106.26

degs = (hor_fov/2) * (x_offset/w)  # 5.313 degrees

And indeed, it worked!
But not quite as expected: the aiming precision varied depending on how far the target was from the middle of the screen.

I'm not that great with trigonometry, but as far as I can tell, it has something to do with polar/spherical coordinates, because we can see only a part of the game world both horizontally and vertically.
This is also called the FOV (field of view).

Because of this, in the given 3D game we are only able to view 106.26 degrees horizontally and 73.74 degrees vertically.
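
(Those two numbers are at least consistent with each other at a 16:9 aspect ratio, assuming a standard perspective projection:)

from math import atan, degrees, radians, tan

hor_fov = 106.26
ver_fov = 2 * degrees(atan(tan(radians(hor_fov / 2)) * (1080 / 1920)))
print(ver_fov)  # ~73.74 degrees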

My guess is that I'm trying to convert coordinates from a linear system to something non-linear, and as a result the overall accuracy is not good enough.
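
To make the non-linearity concrete: if the game uses a standard pinhole-style perspective projection (which is my assumption, I haven't verified it for this engine), a horizontal pixel offset x from the screen center corresponds to an angle of atan(x / f), where f = (w/2) / tan(hor_fov/2) is the focal length in pixels. The angle is not proportional to the pixel offset:

from math import atan, degrees, radians, tan

w = 1920
hor_fov = 106.26
f = (w / 2) / tan(radians(hor_fov / 2))  # focal length in pixels, ~720 here

print(degrees(atan(100 / f)))  # ~7.9 degrees for a 100 px offset
print(degrees(atan(800 / f)))  # ~48.0 degrees for an 800 px offset (not 8 * 7.9)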

I've also tried using math.atan in Python.
It works, but still isn't accurate.

Here is the code:

from math import atan, degrees

def point_get_difference(source_point, dest_point):
    # offset of dest_point from source_point,
    # e.g. source_point = (960, 540), dest_point = (833, 645) -> (-127, 105)
    x = dest_point[0] - source_point[0]
    y = dest_point[1] - source_point[1]

    return x, y

def get_move_angle__new(aim_target, gwr, pixels_per_degree, fov):
    # gwr is the game window rect (left, top, width, height)
    game_window_rect__center = (gwr[2]/2, gwr[3]/2)
    rel_diff = list(point_get_difference(game_window_rect__center, aim_target))

    # convert pixel offsets to angles, scaled by half the FOV relative to 45 degrees
    x_degs = degrees(atan(rel_diff[0]/game_window_rect__center[0])) * ((fov[0]/2)/45)
    # vertical angle, computed the same way (note this also divides by the half-width)
    y_degs = degrees(atan(rel_diff[1]/game_window_rect__center[0])) * ((fov[1]/2)/45)

    # convert the angles back to relative mouse movement in pixels
    rel_diff[0] = pixels_per_degree * x_degs
    rel_diff[1] = pixels_per_degree * y_degs

    return rel_diff, (x_degs+y_degs)

get_move_angle__new((900, 540), (0, 0, 1920, 1080), 16364/360, (106.26, 73.74))
# Output will be: ([-191.93420990140876, 0.0], -4.222458785413539)
# But it's not accurate; x_degs should be more or less than -4.22...

Is there a way to precisely convert 2D screen X, Y coordinates into 3D game crosshair movement degrees?
There must be a way, I just can't figure it out ...



from How to convert screen x,y (cartesian coordinates) to 3D world space crosshair movement angles (screenToWorld)?
