I am working on code that looks like this:
import pandas as pd

df = pd.read_csv("file.csv")
# Share of each tag's count in that year's total, rounded to 4 decimals
df['fraction'] = (df['number'] / df['year_total']).round(4)
df
This displays the full table with the new fraction column. I then keep only the rows for a list of languages:
programming_lang = ["r", "python", "c#", "java", "JavaScript", "php", "c++", "ruby", "Selenium"]
yearly_top = df[df['tag'].isin(programming_lang)]
yearly_top
This gives the following output:
year, tag, number, year_total, fraction
2008, java, 7473, 58390, 0.1280
2008, php, 3111, 58390, 0.0533
2008, python, 2080, 58390, 0.0356
......
2019, java, 83841, 1085170, 0.0773
2019, php, 61257, 1085170, 0.0564
2019, python, 107348, 1085170, 0.0989
It contains data for the top programming languages from 2008 to 2019. I want to use a time series model to predict the fraction
value for these programming languages for the years 2020, 2021, and 2022. I am very new to this area, so any leads would be helpful.
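One possible starting point, offered as a minimal sketch rather than a definitive answer: with only twelve yearly points per language, a straight-line trend fitted separately to each tag is a simple baseline to compare any real time series model against. This is a linear regression on the year, not a full time series model. The function name forecast_fraction, the output column fraction_pred, and the small demo table are all hypothetical and made up for illustration; only the column names year, tag, and fraction come from the data above.

```python
import numpy as np
import pandas as pd


def forecast_fraction(yearly_top, future_years=(2020, 2021, 2022)):
    """Fit a straight line to each tag's yearly fraction and extrapolate it.

    Hypothetical helper: expects the columns 'year', 'tag' and 'fraction'.
    """
    rows = []
    for tag, group in yearly_top.groupby("tag"):
        group = group.sort_values("year")
        base_year = group["year"].min()
        # Center the years on the first one to keep the fit well conditioned
        x = (group["year"] - base_year).to_numpy(dtype=float)
        y = group["fraction"].to_numpy(dtype=float)
        # Least-squares linear trend: fraction ~ slope * x + intercept
        slope, intercept = np.polyfit(x, y, deg=1)
        for year in future_years:
            predicted = slope * (year - base_year) + intercept
            # A fraction must stay between 0 and 1, so clip the extrapolation
            rows.append({
                "year": year,
                "tag": tag,
                "fraction_pred": float(np.clip(predicted, 0.0, 1.0)),
            })
    return pd.DataFrame(rows)


# Made-up demo data (NOT the real file.csv), just to show the call
demo = pd.DataFrame({
    "year": [2016, 2017, 2018, 2019] * 2,
    "tag": ["python"] * 4 + ["php"] * 4,
    "fraction": [0.07, 0.08, 0.09, 0.10, 0.08, 0.07, 0.06, 0.05],
})
forecast = forecast_fraction(demo)
print(forecast)
```

On the real data the call would be forecast_fraction(yearly_top). Once this baseline is in place, the Holt and ARIMA classes in statsmodels are options worth trying on each tag's series, keeping in mind that twelve observations leave very little room to estimate or validate a more complex model.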
from Time Series prediction for python dataframe