I have a scatter plot that gets sorted into 4 Bins. These are separated by two arcs and a line in the middle (see figure below).
There's a slight problem with the two arcs. If the X-Coordinate of a point is greater than the x-value of ang2 (60), the point doesn't get attributed to the correct Bin. (Please see figure below)
import math
import matplotlib.pyplot as plt
import matplotlib as mpl
X = [24,15,71,72,6,13,77,52,52,62,46,43,31,35,41]
Y = [94,61,76,83,69,86,78,57,45,94,82,74,56,70,94]
fig, ax = plt.subplots()
ax.set_xlim(-100,100)
ax.set_ylim(-40,140)
ax.grid(False)
plt.scatter(X,Y)
#middle line
BIN_23_X = 0
#two arcs
ang1 = -60, 60
ang2 = 60, 60
angle = math.degrees(math.acos(2/9.15))
E_xy = 0,60
Halfway = mpl.lines.Line2D((BIN_23_X,BIN_23_X), (0,125), color = 'white', lw = 1.5, alpha = 0.8, zorder = 1)
arc1 = mpl.patches.Arc(ang1, 70, 110, angle = 0, theta2 = angle, theta1 = 360-angle, color = 'white', lw = 2)
arc2 = mpl.patches.Arc(ang2, 70, 110, angle = 0, theta2 = 180+angle, theta1 = 180-angle, color = 'white', lw = 2)
Oval = mpl.patches.Ellipse(E_xy, 160, 130, lw = 3, edgecolor = 'black', color = 'white', alpha = 0.2)
ax.add_line(Halfway)
ax.add_patch(arc1)
ax.add_patch(arc2)
ax.add_patch(Oval)
#Sorting the coordinates into bins
def get_nearest_arc_vert(x, y, arc_vertices):
    # squared distance from (x, y) to every vertex of the arc
    err = (arc_vertices[:,0] - x)**2 + (arc_vertices[:,1] - y)**2
    nearest = (arc_vertices[err == min(err)])[0]
    return nearest
arc1v = ax.transData.inverted().transform(arc1.get_verts())
arc2v = ax.transData.inverted().transform(arc2.get_verts())
def classify_pointset(vx, vy):
    bins = {(k+1):[] for k in range(4)}
    for (x,y) in zip(vx, vy):
        nx1, ny1 = get_nearest_arc_vert(x, y, arc1v)
        nx2, ny2 = get_nearest_arc_vert(x, y, arc2v)
        if x < nx1:
            bins[1].append((x,y))
        elif x > nx2:
            bins[4].append((x,y))
        else:
            if x < BIN_23_X:
                bins[2].append((x,y))
            else:
                bins[3].append((x,y))
    return bins
#Bins Output
bins_red = classify_pointset(X,Y)
all_points = [None] * 5
for bin_key in [1,2,3,4]:
    all_points[bin_key] = bins_red[bin_key]
Output:
[[], [], [(24, 94), (15, 61), (71, 76), (72, 83), (6, 69), (13, 86), (77, 78), (62, 94)], [(52, 57), (52, 45), (46, 82), (43, 74), (31, 56), (35, 70), (41, 94)]]
This isn't quite right. Looking at the figure output below, 4 coordinates should be in Bin 3 and 11 in Bin 4, but the code attributes 8 to Bin 3 and 7 to Bin 4.
I think the problem is the blue coordinates, specifically those whose X-Coordinate is greater than the x-value of ang2, which is 60. If I alter these to be less than 60 they will be corrected into Bin 3.
I'm not sure if I should extend the arcs to be greater than 60 or if the code can be improved?
Please note this example only shows the problem for Bin 4 and ang2. The same issue will occur for Bin 1 and ang1. That is, if the X-Coordinate is less than -60 it won't get attributed to Bin 1.
Intended Output:
[[], [], [(24, 94), (15, 61), (6, 69), (13, 86)], [(71, 76), (72, 83), (52, 57), (52, 45), (46, 82), (43, 74), (31, 56), (35, 70), (41, 94), (77, 78), (62, 94)]]
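One direction that might be worth trying, offered as a rough sketch and not a confirmed fix: skip the nearest-vertex lookup and test each point against the full ellipse that each arc is cut from. The sketch assumes a point belongs to Bin 4 if it lies inside the arc2 ellipse or anywhere to the right of its centre, and mirrors that rule for Bin 1 and arc1. It ignores the outer Oval and the theta1/theta2 extent of the arcs. The names inside_ellipse, classify_point and classify_pointset_geometric are made up for this illustration. For the sample data it puts the same points in each bin as the intended output above, although the order within a bin may differ.

```python
# Sketch only: classify by ellipse membership, not by nearest arc vertex.
# Assumption: each arc is treated as its full ellipse, and everything beyond
# the ellipse centre also counts as the outer bin (Bin 1 or Bin 4).

ANG1 = (-60, 60)                 # centre of arc1, as in the question
ANG2 = (60, 60)                  # centre of arc2, as in the question
ARC_WIDTH, ARC_HEIGHT = 70, 110  # full width and height passed to Arc
BIN_23_X = 0                     # middle line between Bin 2 and Bin 3


def inside_ellipse(x, y, centre, width, height):
    """True if (x, y) lies inside or on an axis-aligned ellipse."""
    cx, cy = centre
    return ((x - cx) / (width / 2)) ** 2 + ((y - cy) / (height / 2)) ** 2 <= 1


def classify_point(x, y):
    """Return the bin number (1-4) for a single point."""
    if x <= ANG1[0] or inside_ellipse(x, y, ANG1, ARC_WIDTH, ARC_HEIGHT):
        return 1
    if x >= ANG2[0] or inside_ellipse(x, y, ANG2, ARC_WIDTH, ARC_HEIGHT):
        return 4
    return 2 if x < BIN_23_X else 3


def classify_pointset_geometric(vx, vy):
    """Group the points into a dict keyed by bin number."""
    bins = {k: [] for k in (1, 2, 3, 4)}
    for x, y in zip(vx, vy):
        bins[classify_point(x, y)].append((x, y))
    return bins


X = [24, 15, 71, 72, 6, 13, 77, 52, 52, 62, 46, 43, 31, 35, 41]
Y = [94, 61, 76, 83, 69, 86, 78, 57, 45, 94, 82, 74, 56, 70, 94]

bins = classify_pointset_geometric(X, Y)
print([bins[k] for k in (1, 2, 3, 4)])
```

Because the test is purely geometric, it does not depend on the figure having been drawn or on ax.transData, which may or may not matter for the real use case.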
Note: The intended output is preferred. The example uses a single row of input data, but my dataset is much larger. With numerous rows, the output should be produced row by row, e.g.
#Numerous rows
import numpy as np
X = np.random.randint(50, size=(100, 10))
Y = np.random.randint(80, size=(100, 10))
Out:
Row 0 = [(x,y)],[(x,y)],[(x,y)],[(x,y)]
Row 1 = [(x,y)],[(x,y)],[(x,y)],[(x,y)]
Row 2 = [(x,y)],[(x,y)],[(x,y)],[(x,y)]
etc
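For the row-by-row case, something along these lines might work. It is a sketch under the same assumed rule: a point is in Bin 4 if it is inside the arc2 ellipse or to the right of its centre, mirrored for Bin 1. It uses NumPy to label every point at once and then splits each row into four lists. The names bin_index and classify_rows, and the seeded random data, are hypothetical and only there for illustration.

```python
import numpy as np

# Assumed geometry, taken from the arcs in the question:
ANG1_X, ANG2_X, CENTRE_Y = -60, 60, 60  # arc centres (-60, 60) and (60, 60)
SEMI_W, SEMI_H = 35, 55                 # half of the Arc width 70 and height 110
BIN_23_X = 0                            # middle line between Bin 2 and Bin 3


def bin_index(X, Y):
    """Return an array of bin numbers (1-4) with the same shape as X."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    dy = ((Y - CENTRE_Y) / SEMI_H) ** 2
    in1 = (X <= ANG1_X) | (((X - ANG1_X) / SEMI_W) ** 2 + dy <= 1)
    in4 = (X >= ANG2_X) | (((X - ANG2_X) / SEMI_W) ** 2 + dy <= 1)
    # Bin 1 is checked first, then Bin 4, then the middle line decides 2 or 3.
    return np.where(in1, 1, np.where(in4, 4, np.where(X < BIN_23_X, 2, 3)))


def classify_rows(X, Y):
    """Return one entry per row, each a list of four lists of (x, y) tuples."""
    idx = np.atleast_2d(bin_index(X, Y))
    rows = []
    for xr, yr, br in zip(np.atleast_2d(X), np.atleast_2d(Y), idx):
        rows.append([
            [(x.item(), y.item()) for x, y, b in zip(xr, yr, br) if b == k]
            for k in (1, 2, 3, 4)
        ])
    return rows


# Hypothetical data shaped like the example above (100 rows of 10 points).
rng = np.random.default_rng(0)
X = rng.integers(0, 50, size=(100, 10))
Y = rng.integers(0, 80, size=(100, 10))

for i, row in enumerate(classify_rows(X, Y)[:3]):
    print(f"Row {i} =", row)
```

A single row (a plain list of X and Y values) is handled as a 1 x N array, so the same function covers both the one-row example and the larger dataset.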
from Allocate scatter plot into specific bins
