Saturday 27 August 2022

Apply function to dataframe row use result for next row input

I am trying to create a rudimentary scheduling system. Here is what I have so far:

I have a pandas dataframe job_data that looks like this:

wc job start duration
1 J1 2022-08-16 07:30:00 17
1 J2 2022-08-16 07:30:00 5
2 J3 2022-08-16 07:30:00 21
2 J4 2022-08-16 07:30:00 12

It contains a wc (work center), job, a start date and duration for the job in hours.

I have created a function add_hours that takes the following arguments: start (datetime), hours (int).

It calculates the when the job will be complete based on the start time and duration.

The code for add_hours is:

def is_in_open_hours(dt):
    return (
        dt.weekday() in business_hours["weekdays"]
        and dt.date() not in holidays
        and business_hours["from"].hour <= dt.time().hour < business_hours["to"].hour
    )


def get_next_open_datetime(dt):
    while True:
        dt = dt + timedelta(days=1)
        if dt.weekday() in business_hours["weekdays"] and dt.date() not in holidays:
            dt = datetime.combine(dt.date(), business_hours["from"])
            return dt


def add_hours(dt, hours):
    while hours != 0:
        if is_in_open_hours(dt):
            dt = dt + timedelta(hours=1)
            hours = hours - 1
        else:
            dt = get_next_open_datetime(dt)
    return dt

The code to calculate the end column is:

df["end"] = df.apply(lambda x: add_hours(x.start, x.duration), axis=1)

The result of function is the end column:

wc job start duration end
1 J1 2022-08-16 07:30:00 17 2022-08-17 14:00:00
1 J2 2022-08-16 07:30:00 5 2022-08-17 10:00:00
2 J3 2022-08-16 07:30:00 21 2022-08-18 08:00:00
2 J4 2022-08-16 07:30:00 12 2022-08-18 08:00:00

Problem is, I need the start datetime in the second row to be the end datetime from the previous row instead of them all using the same start date. I also need to start this process over for each wc.

So the desired output would be:

wc job start duration end
1 J1 2022-08-16 07:30:00 17 2022-08-17 14:00:00
1 J2 2022-08-17 14:00:00 5 2022-08-17 19:00:00
2 J3 2022-08-16 07:30:00 21 2022-08-18 08:00:00
2 J4 2022-08-18 08:00:00 10 2022-08-18 18:00:00


from Apply function to dataframe row use result for next row input

No comments:

Post a Comment