I'm looking for some advice. I have a df that contains xy points, and I want to remove those points when they are located within a polygon, shown as area below. The points move in and out of this area, so I only want to remove them when they are definitively placed within it.
The central dilemma is that I don't want to apply a strict rule here. Because the points are fluid, I'm hoping to build in some flexibility. For instance, some points pass through the area only temporarily and shouldn't be removed, while other points stay within the area long enough that they should be removed.
The obvious approach is to apply some kind of threshold. Using df1 below, A is located within the area for 3 frames, while B is located within the area for 7 frames. If I apply a threshold of >5 frames, B should be removed for all frames within this area, while A shouldn't be affected.
The issue is that the frames have to be consecutive. The points will come and go, so I only want to remove a point once it has been inside the area for more than 5 consecutive frames.
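To make the consecutive-frame idea concrete, here is a minimal sketch of what I have in mind. The function name drop_long_stays and its threshold parameter are my own placeholders, not an existing API. It assumes that rows for one Label, sorted by Time, are consecutive frames (no gaps in Time), and it builds the polygon directly with matplotlib.path.Path from the same vertices as the designated area.

```python
import numpy as np
import pandas as pd
from matplotlib.path import Path


def drop_long_stays(df, polygon, threshold=5):
    """Drop rows that belong to a run of more than `threshold` consecutive
    frames inside `polygon`, evaluated separately for each Label.

    Assumption: rows of one Label sorted by Time are consecutive frames.
    """
    out = df.sort_values(['Label', 'Time']).copy()

    # True where the point lies inside the polygon
    out['inside'] = polygon.contains_points(out[['X', 'Y']].to_numpy())

    # A new run starts whenever `inside` flips or the Label changes
    change = (
        (out['inside'] != out['inside'].shift())
        | (out['Label'] != out['Label'].shift())
    )
    out['run'] = change.cumsum()

    # Number of rows in the run each row belongs to
    run_len = out.groupby('run')['inside'].transform('size')

    # Remove only rows that are inside AND part of a long enough run
    remove = out['inside'] & (run_len > threshold)
    return out.loc[~remove, list(df.columns)].sort_index()


# Same vertices as the designated area in the question
x = [1.5, -0.5, -1.25, -0.5, 1.5, -11, -11, 1.5]
y = [75, 62.5, 50, 37.5, 25, 25, 75, 75]
polygon = Path(np.column_stack([x, y]))

# Same edge cases as df1 in the question
df1 = pd.DataFrame({
    'X': [20, 10, 0, -5, -5, -5, 0, 10, 20, 30,
          20, 10, 0, -5, -5, -5, -5, -5, -5, -5],
    'Y': [50] * 20,
    'Label': ['A'] * 10 + ['B'] * 10,
    'Time': list(range(501, 511)) * 2,
})

result = drop_long_stays(df1, polygon, threshold=5)
print(result)
```

With df1, this should keep every row for A (only 3 consecutive frames inside) and drop the 7 rows where B sits inside the area. If Time can have gaps for a given Label, the run detection would need an extra check on the Time difference.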
Are there any other approaches I should consider?
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
import random
# general dataframe
df = pd.DataFrame(np.random.randint(-25,125,size=(500, 2)), columns=list('XY'))
labels = df['X'].apply(lambda x: random.choice(['A', 'B']) )
df['Label'] = labels
df['Time'] = range(1, len(df) + 1)
# edge cases
df1 = pd.DataFrame({
'X' : [20,10,0,-5,-5,-5,0,10,20,30,20,10,0,-5,-5,-5,-5,-5,-5,-5],
'Y' : [50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50,50],
'Label' : ['A','A','A','A','A','A','A','A','A','A','B','B','B','B','B','B','B','B','B','B'],
'Time' : [501,502,503,504,505,506,507,508,509,510,501,502,503,504,505,506,507,508,509,510],
})
fig, ax = plt.subplots()
ax.set_xlim(-50, 150)
ax.set_ylim(-50, 150)
# designated area
x = [1.5, -0.5, -1.25, -0.5, 1.5, -11, -11, 1.5]
y = [75, 62.5, 50, 37.5, 25, 25, 75, 75]
line = ax.plot(x, y)
Oval_patch = mpl.patches.Ellipse((50,50), 100, 150, color = 'k', fill = False)
ax.add_patch(Oval_patch)
pts = df[['X','Y']]
mask = line[0].get_path().contains_points(pts)
df = df[~mask]
A_X = np.array(df1.groupby(['Time'])['X'].apply(list))
A_Y = np.array(df1.groupby(['Time'])['Y'].apply(list))
ax.scatter(A_X[0], A_Y[0], c = 'purple', alpha = 0.2)
from
Classify if point is within specified area using threshold - python