Sunday 8 November 2020

Time Series prediction for python dataframe

I am working on code that looks like this:

import pandas as pd

df = pd.read_csv("file.csv")
# Each tag's share of the yearly total, rounded to 4 decimal places
df['fraction'] = (df['number'] / df['year_total']).round(4)
df

This displays the DataFrame with the new fraction column.

programming_lang = ["r", "python", "c#", "java", "JavaScript", "php", "c++", "ruby", "Selenium"]

yearly_top = df[df['tag'].isin(programming_lang)]
yearly_top

This gives the following output:

year, tag, number, year_total, fraction
2008, java, 7473, 58390, 0.1280
2008, php, 3111, 58390, 0.0533
2008, python, 2080, 58390, 0.0356
......
2019, java, 83841, 1085170, 0.0773
2019, php, 61257, 1085170, 0.0564
2019, python, 107348, 1085170, 0.0989

It contains data for the top programming languages from 2008 to 2019. I want to use a time series model to predict the fraction value for each of these languages for the years 2020, 2021, and 2022. I am very new to this area, so any pointers would be helpful.
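To make the goal concrete, here is a minimal sketch of the simplest kind of forecast: a straight-line trend fitted separately for each tag and extrapolated to 2020-2022. It is only a baseline under stated assumptions, not a recommended model. The `sample` frame, its values, and the helper name `forecast_linear` are made up for illustration; in practice `yearly_top` would take the place of `sample`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for yearly_top; these numbers are invented for illustration.
sample = pd.DataFrame({
    "year": [2016, 2017, 2018, 2019] * 2,
    "tag": ["python"] * 4 + ["java"] * 4,
    "fraction": [0.070, 0.080, 0.090, 0.100, 0.090, 0.085, 0.080, 0.075],
})

def forecast_linear(frame, future_years):
    """Fit fraction ~ year with a degree-1 polynomial per tag and extrapolate."""
    base_year = frame["year"].min()  # shift years toward zero for a well-conditioned fit
    rows = []
    for tag, group in frame.groupby("tag"):
        x = (group["year"] - base_year).to_numpy(dtype=float)
        y = group["fraction"].to_numpy(dtype=float)
        slope, intercept = np.polyfit(x, y, deg=1)
        for year in future_years:
            predicted = slope * (year - base_year) + intercept
            rows.append({"year": year, "tag": tag, "fraction": round(float(predicted), 4)})
    return pd.DataFrame(rows)

forecast = forecast_linear(sample, [2020, 2021, 2022])
print(forecast)
```

With only twelve yearly points per language, a linear trend like this is about as much structure as the data can reliably support; a dedicated time series model could be swapped in at the `np.polyfit` step if one turns out to fit better.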


