Hemant Vishwakarma: Measure structure of xy points

Wednesday 2 December 2020

Measure structure of xy points - python

I'm trying to measure the overall structure of xy points. Rather than take an average of the raw Cartesian coordinates, I'm hoping to determine the structure by the positioning relative to neighbouring points.

To achieve this, I want to calculate the vector between each point and the neighbouring points at every timestamp. The average of these vectors between each pair of points should then provide the overall structure.
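As a rough illustration of the per-timestamp vectors described above, here is a minimal sketch using NumPy broadcasting. The helper name `pairwise_vectors` and the dictionary `vectors_by_time` are made up for this example, and the data is the same sample frame used further down.

```python
import numpy as np
import pandas as pd

# Same sample data as the example further down in this post
df = pd.DataFrame({
    "Time": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "id": ["A", "B", "C", "D", "E", "B", "A", "C", "D", "E"],
    "X": [1.0, 3.0, 4.0, 2.0, 2.0, 1.0, 3.0, 5.0, 3.0, 2.0],
    "Y": [1.0, 1.0, 0.0, 0.0, 2.0, 1.0, 1.0, 0.0, 0.0, 2.0],
})

def pairwise_vectors(group):
    """Hypothetical helper: (n, n, 2) array where [i, j] is the vector from point i to point j."""
    xy = group[["X", "Y"]].to_numpy()
    # Broadcasting: xy[None, :, :] varies over j, xy[:, None, :] varies over i
    return xy[None, :, :] - xy[:, None, :]

# One array of vectors per timestamp, in the row order of that timestamp
vectors_by_time = {time: pairwise_vectors(group) for time, group in df.groupby("Time")}

# Vector from A (row 0) to B (row 1) in frame 1
print(vectors_by_time[1][0, 1])
```

Rows follow the order in which points appear within each timestamp, so row 0 is A in frame 1 but B in frame 2. Averaging these arrays directly would therefore still depend on the labels.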

Note: the structure won't be identified correctly if the vectors are hard-coded between specific points. If the points swap positions, or different points get replaced but retain the same structure, the end result won't be accurate. I'm hoping the function will be able to determine the overall structure based solely on the neighbouring points.
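To make the note above concrete, the sketch below checks one label-independent quantity: the sorted list of pairwise distances. This is only an assumption about what "same structure" could mean; the sorted distances stay the same when labels are swapped, but they do not by themselves recover the positions. The name `distance_signature` is hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist

def distance_signature(xy):
    """Hypothetical helper: sorted pairwise distances, independent of row order."""
    return np.sort(pdist(xy))

# Frame 1 coordinates in the order A, B, C, D, E
frame = np.array([[1.0, 1.0], [3.0, 1.0], [4.0, 0.0], [2.0, 0.0], [2.0, 2.0]])

# The same coordinates with the first two rows (A and B) listed in swapped order
swapped = frame[[1, 0, 2, 3, 4]]

# Relabelling the rows does not change the sorted distances
same_structure = np.allclose(distance_signature(frame), distance_signature(swapped))
print(same_structure)
```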

The final structure should therefore be built pairwise: 1) set the centroid of the structure to the position of the point in the densest part of the structure, as determined by the average distance to the third-nearest neighbour; 2) identify the relative position of that point's nearest neighbour, then the relative position of that neighbour's nearest neighbour, and so on, until the positions of all points have been determined.
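The two steps above could be sketched for a single frame roughly as follows. Two assumptions are made here that the description does not spell out: density is measured per frame rather than averaged over time, and "nearest neighbour" means the nearest point that has not been visited yet. The function names `densest_point` and `neighbour_chain` are invented for this example.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def densest_point(xy, k=3):
    """Index of the point with the smallest distance to its k-th nearest neighbour."""
    distances = squareform(pdist(xy))
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbour
    kth_distance = np.sort(distances, axis=1)[:, k]
    return int(np.argmin(kth_distance))

def neighbour_chain(xy, start):
    """Visit points by repeatedly stepping to the nearest unvisited point (assumed reading)."""
    distances = squareform(pdist(xy))
    order = [start]
    remaining = set(range(len(xy))) - {start}
    while remaining:
        current = order[-1]
        # Ties are broken by the lower row index
        nearest = min(remaining, key=lambda j: (distances[current, j], j))
        order.append(nearest)
        remaining.remove(nearest)
    return order

# Frame 1 coordinates in the order A, B, C, D, E
frame = np.array([[1.0, 1.0], [3.0, 1.0], [4.0, 0.0], [2.0, 0.0], [2.0, 2.0]])

start = densest_point(frame)
order = neighbour_chain(frame, start)
print(start, order)
```

For frame 1, the densest point under this definition is row 1 (point B). The relative position of each visited point is then the difference between its coordinates and those of the previous point in `order`.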

Using the example below, frame 1 displays the vectors between points. Frame 2 does the same, with new positions for some points and swapped positions for others (points A and B swap positions between frames). The final frame displays every vector from all frames, while the points display the average structure.

import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

df = pd.DataFrame({   
    'Time' : [1,1,1,1,1,2,2,2,2,2],             
    'id' : ['A','B','C','D','E','B','A','C','D','E'],                 
    'X' : [1.0,3.0,4.0,2.0,2.0,1.0,3.0,5.0,3.0,2.0],
    'Y' : [1.0,1.0,0.0,0.0,2.0,1.0,1.0,0.0,0.0,2.0],
    })

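def calculate_distances(group):
    # Sort by id so rows and columns are in the same order in every timeframe;
    # otherwise the element-wise mean below mixes up different pairs of points
    group = group.sort_values("id")
    group_distances = pd.DataFrame(
        squareform(pdist(group[["X", "Y"]].to_numpy())),  # Default is Euclidean distance
        columns=group["id"],
        index=group["id"],
    )

    return group_distances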

# Calculate the distances between the points, per timeframe
df_distances = df.groupby("Time").apply(calculate_distances)

# Take the mean distance across timeframes
df_distances_mean = pd.DataFrame(
    np.mean([group.to_numpy() for _, group in df_distances.groupby("Time")], axis=0),
    columns=df_distances.columns,
    index=df_distances.columns,
)

point structure frame 1: