I'm trying to append continually updated data to an existing DataFrame. A timer re-imports the same dataset every minute.
I want to subset this dataset using a condition, but I'm getting an error.
Using the code below, I import data from Yahoo, pulling the same data every minute. I then want to subset specific rows from the updated DataFrame and return them for future use.
The data is downloaded inside a while loop, but subsetting the resulting df raises an error.
My two attempts are outlined in Edit 1 and Edit 2.
import pandas as pd
import yfinance as yf
import datetime
import pytz
from threading import Thread
from time import sleep

# end date
my_date = datetime.datetime.now(pytz.timezone('Etc/GMT-5'))

# start date
prev_24hrs = my_date - datetime.timedelta(hours = 25, minutes = 0)

# import data
data = yf.download(tickers = 'EURUSD=X',
                   start = prev_24hrs,
                   end = my_date,
                   interval = '1m'
                   ).reset_index()
Edit 1:
# updated data
upd_data = []

def scheduled_update():
    while datetime.datetime.now().minute % 1 != 0:
        sleep(1)
    data
    while True:
        sleep(60)
        data
        upd_data.append(data)
        upd_data = upd_data[upd_data['High'] > 0.97000]
        print(upd_data)
        return upd_data

thread = Thread(target = scheduled_update)
thread.start()
Output:
Exception in thread Thread-12 (scheduled_update):
Traceback (most recent call last):
  File "/opt/anaconda3/envs/gpd/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/opt/anaconda3/envs/gpd/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/xxx/xxx/xxx/xxx/untitled5.py", line 43, in scheduled_update
    upd_data.append(data)
UnboundLocalError: local variable 'upd_data' referenced before assignment
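This traceback is consistent with Python's scoping rule: because `upd_data` is assigned somewhere inside `scheduled_update`, the name is treated as local for the whole function, so the earlier `upd_data.append(data)` reads a local that has no value yet. The following is a minimal sketch that reproduces the same error type without yfinance or threads; the names `items`, `broken` and `fixed` are made up for illustration and are not part of the original code.

```python
items = []  # module-level list, standing in for upd_data


def broken():
    # Assigning to `items` further down makes the name local to this
    # function, so this append fails before the assignment is reached.
    items.append(1)
    items = [x for x in items if x > 0]
    return items


def fixed():
    # Only mutate the module-level list; bind the filtered result
    # to a different name so `items` is never assigned locally.
    items.append(1)
    kept = [x for x in items if x > 0]
    return kept


try:
    broken()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

result = fixed()
print(error_name)
print(result)
```

Binding the filtered result to a new name, as Edit 2 does with `df_out`, avoids this particular error; declaring `global upd_data` inside the function would be another option.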
Edit 2:
# updated data
upd_data = []

def scheduled_update():
    while datetime.datetime.now().minute % 1 != 0:
        sleep(1)
    data
    while True:
        sleep(60)
        data
        upd_data.append(data)
        df_out = upd_data[upd_data['High'] > 0.97000]
        print(df_out)
        return df_out

thread = Thread(target = scheduled_update)
thread.start()
Output:
Exception in thread Thread-16 (scheduled_update):
Traceback (most recent call last):
  File "/opt/anaconda3/envs/gpd/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/opt/anaconda3/envs/gpd/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/xxx/xxx/xxx/xxx/untitled5.py", line 44, in scheduled_update
    df_out = upd_data[upd_data['High'] > 0.97000]
TypeError: list indices must be integers or slices, not str
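This second error fits the fact that `upd_data` is a plain Python list, and a list only accepts integers or slices inside `[...]`, not a column label such as `'High'`. A small sketch of the difference, assuming pandas is installed; the rows are invented and `frames`, `combined` and `df_out` are illustrative names only.

```python
import pandas as pd

# Two made-up snapshots standing in for successive downloads.
first = pd.DataFrame({"High": [0.9650, 0.9880]})
second = pd.DataFrame({"High": [0.9891, 0.9600]})

frames = [first, second]  # a list of DataFrames, like upd_data

try:
    frames["High"]  # a list cannot be indexed by a column name
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Concatenate the list into one DataFrame first, then filter on the column.
combined = pd.concat(frames, ignore_index=True)
df_out = combined[combined["High"] > 0.97]
print(error_name)
print(df_out)
```

The point of the sketch is only that the boolean filter has to be applied to a DataFrame (here the result of `pd.concat`), not to the list that holds the DataFrames.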
Intended output:
Using data obtained on 6/10/22, these outputs are 10 minutes apart:
1st execution:
                     Datetime      Open      High       Low     Close  Adj Close  Volume
0   2022-10-05 18:28:00+01:00  0.988045  0.988045  0.988045  0.988045   0.988045       0
1   2022-10-05 18:29:00+01:00  0.988142  0.988142  0.988142  0.988142   0.988142       0
2   2022-10-05 18:30:00+01:00  0.988142  0.988142  0.988142  0.988142   0.988142       0
3   2022-10-05 18:31:00+01:00  0.987947  0.987947  0.987947  0.987947   0.987947       0
4   2022-10-05 18:32:00+01:00  0.988240  0.988240  0.988240  0.988240   0.988240       0
..                        ...       ...       ...       ...       ...        ...     ...
280 2022-10-05 23:23:00+01:00  0.989022  0.989022  0.989022  0.989022   0.989022       0
281 2022-10-05 23:24:00+01:00  0.989120  0.989120  0.989120  0.989120   0.989120       0
282 2022-10-05 23:25:00+01:00  0.989022  0.989022  0.989022  0.989022   0.989022       0
283 2022-10-05 23:26:00+01:00  0.989120  0.989120  0.989120  0.989120   0.989120       0
284 2022-10-05 23:27:00+01:00  0.989022  0.989022  0.989022  0.989022   0.989022       0
If the code keeps running every minute, new rows should be appended when they meet the subset condition, e.g. 10 minutes later:
                     Datetime      Open      High       Low     Close  Adj Close  Volume
0   2022-10-05 18:38:00+01:00  0.987947  0.987947  0.987947  0.987947   0.987947       0
1   2022-10-05 18:39:00+01:00  0.987849  0.987849  0.987849  0.987849   0.987849       0
2   2022-10-05 18:40:00+01:00  0.988045  0.988045  0.988045  0.988045   0.988045       0
3   2022-10-05 18:41:00+01:00  0.987947  0.987947  0.987947  0.987947   0.987947       0
4   2022-10-05 18:42:00+01:00  0.987849  0.987849  0.987849  0.987849   0.987849       0
..                        ...       ...       ...       ...       ...        ...     ...
278 2022-10-05 23:32:00+01:00  0.989022  0.989022  0.989022  0.989022   0.989022       0
279 2022-10-05 23:33:00+01:00  0.989120  0.989120  0.989120  0.989120   0.989120       0
280 2022-10-05 23:34:00+01:00  0.989218  0.989218  0.989218  0.989218   0.989218       0
281 2022-10-05 23:35:00+01:00  0.989218  0.989218  0.989218  0.989218   0.989218       0
282 2022-10-05 23:36:00+01:00  0.989511  0.989511  0.989511  0.989511   0.989511       0
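One possible way to get this behaviour is sketched below, under stated assumptions. Note that the bare `data` statement inside the loops does not download anything again; a real loop would presumably need to call `yf.download` on each pass. So that the sketch runs offline, two invented snapshots stand in for successive downloads, and the helper `append_filtered` is a hypothetical name, not something from yfinance or pandas.

```python
import pandas as pd


def append_filtered(existing, fresh, threshold=0.97):
    """Return `existing` plus the rows of `fresh` whose High exceeds `threshold`."""
    keep = fresh[fresh["High"] > threshold]
    if existing is None:
        combined = keep
    else:
        combined = pd.concat([existing, keep], ignore_index=True)
    # Successive downloads overlap, so keep a single row per timestamp.
    deduped = combined.drop_duplicates(subset="Datetime", keep="last")
    return deduped.reset_index(drop=True)


# Invented snapshots standing in for two successive downloads.
snap1 = pd.DataFrame({
    "Datetime": pd.to_datetime(["2022-10-05 18:28", "2022-10-05 18:29"]),
    "High": [0.988045, 0.960000],
})
snap2 = pd.DataFrame({
    "Datetime": pd.to_datetime(
        ["2022-10-05 18:28", "2022-10-05 18:29", "2022-10-05 18:30"]
    ),
    "High": [0.988045, 0.960000, 0.988142],
})

result = None
for snap in (snap1, snap2):  # in real use: one fresh download per minute
    result = append_filtered(result, snap)
print(result)
```

Dropping duplicates on `Datetime` is a design choice made for this sketch: since each download covers roughly the same window, most rows repeat from one minute to the next, and without deduplication the accumulated frame would grow with copies of the same bars.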
from Append data within while loop to dataframe - python