Friday, 26 September 2025

Least Squares Analysis for Time Series Data

Least Squares

Least squares is one method of time series analysis. It works like ordinary least squares regression in other settings, except that the explanatory variable is time: we regress the series on time to estimate its trend. The result is a linear equation y = ax + b, where y is the value of the series, x is time, a is the slope of the trend, and b is the intercept.
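To make the equation concrete, here is a minimal sketch of how a and b can be computed from the closed-form least squares formulas and checked against lm. The values of x and y are made up for illustration only; they are not the lynx data used later.

```r
# Toy data, made up for illustration (not from the lynx dataset)
x <- 1:6                      # time index
y <- c(3, 5, 4, 7, 8, 8)      # observed series

# Closed-form least squares estimates for y = a*x + b
a <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b <- mean(y) - a * mean(x)                                      # intercept

# The same estimates from lm(); coef() returns the intercept first, then the slope
fit <- lm(y ~ x)
print(c(a = a, b = b))
print(coef(fit))
```

The two printed results should agree, since lm solves the same minimization problem.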

Analysis of the data

Prepare the time series data.

First, we collect the time series data and record it in a spreadsheet. Then we import the spreadsheet file into RStudio, which gives us a data frame that we can run the analysis on directly. If your data is already a time series object, however, you have to adjust it first so that time is available as a separate numeric variable.

# Example of preparing time series data
# lynx is a dataset built into R
data_lynx <- ts(lynx, start = 1821, frequency = 1)
# Create a numeric time variable (the years) to use as the regressor
timel <- as.numeric(time(data_lynx))
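For the spreadsheet route described earlier, the steps could look roughly like the sketch below. It assumes the spreadsheet was exported as a CSV file; the file, the column names year and value, and the numbers are all hypothetical, and the file is written to a temporary location only so the example runs on its own.

```r
# Hypothetical CSV export of a spreadsheet (made-up columns and values)
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(year = 2001:2006, value = c(12, 15, 14, 18, 21, 19)),
  csv_path,
  row.names = FALSE
)

# Import the file as a data frame
df <- read.csv(csv_path)

# Convert the value column to a yearly time series starting at the first year
series <- ts(df$value, start = df$year[1], frequency = 1)

# Numeric time variable for use as the regressor in lm()
time_index <- as.numeric(time(series))
print(series)
```

With your own file, replace csv_path with its path and adjust the column names.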

After preparing the data, we apply the least squares method to the modified lynx data. The command is lm. After fitting the model, use summary to examine it; you get the following output.

# Fit and examine the regression model
tslynx <- lm(data_lynx ~ timel)
summary(tslynx)

Call:
lm(formula = data_lynx ~ timel)

Residuals:
   Min     1Q Median     3Q    Max 
 -1594  -1211   -755   1032   5366 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -4630.034   8493.112  -0.545    0.587
timel           3.285      4.523   0.726    0.469

Residual standard error: 1589 on 112 degrees of freedom
Multiple R-squared:  0.004689,  Adjusted R-squared:  -0.004198 
F-statistic: 0.5276 on 1 and 112 DF,  p-value: 0.4691

The result is not satisfying. The F-test is not significant (p-value = 0.4691), meaning the linear trend model does not explain the relationship between the lynx data and time; the R-squared of 0.004689 says the same thing. The coefficient of the independent variable (time) is also not significant (p = 0.469), so we find no evidence of a linear effect of time on the dependent variable (lynx). As we guessed at first, least squares may not be suitable for analyzing a long time series like this one, so we should also consider other methods for forecasting.
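The numbers discussed here can also be pulled out of the summary object instead of being read off the printout. The sketch below refits the same model so that it runs on its own; the names s, r_squared, f_pvalue and slope_pvalue are introduced only for this illustration.

```r
# Refit the model from the post so this block is self-contained
data_lynx <- ts(lynx, start = 1821, frequency = 1)
timel <- as.numeric(time(data_lynx))
tslynx <- lm(data_lynx ~ timel)
s <- summary(tslynx)

# Share of variance explained by the linear trend
r_squared <- s$r.squared

# Overall F-test: fstatistic holds the value and its two degrees of freedom
f <- s$fstatistic
f_pvalue <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# p-value of the t-test on the time coefficient
slope_pvalue <- coef(s)["timel", "Pr(>|t|)"]

print(c(r_squared = r_squared, f_pvalue = f_pvalue, slope_pvalue = slope_pvalue))
```

With a single regressor, the F-test and the t-test on the slope are the same test, which is why the two p-values coincide.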

Other tests

After seeing the result, we also need to run some tests to make sure the model is good. Admittedly, I do not have to continue with the tests, since the F value is so poor, and I could find other data to illustrate the least squares method.

As is common for least squares, we also run a normality test. We can use the Kolmogorov-Smirnov test on the residuals of the regression model. The call also passes the mean and standard deviation of the residuals, which define the normal distribution the residuals are compared against.

# Run the normality test on the residuals
ks.test(residuals(tslynx), "pnorm", mean = mean(residuals(tslynx)), sd = sd(residuals(tslynx)))

    Asymptotic one-sample Kolmogorov-Smirnov test

data:  residuals(tslynx)
D = 0.20038, p-value = 0.0002115
alternative hypothesis: two-sided

The result of the test is not good. The p-value is far below 0.05, so we reject the null hypothesis that the residuals are normally distributed; the distribution is not normal. Although the result is not good, we continue with the other tests, for autocorrelation and heteroskedasticity. We use the Durbin-Watson test (dwtest) to detect autocorrelation and the Breusch-Pagan test (bptest) to detect heteroskedasticity. Before using these commands, we have to load the zoo and lmtest packages with library. We then run both tests on the tslynx model.

library(zoo)

Attaching package: 'zoo'
The following objects are masked from 'package:base':

    as.Date, as.Date.numeric
library(lmtest)
dwtest(tslynx)

    Durbin-Watson test

data:  tslynx
DW = 0.56312, p-value = 2.308e-15
alternative hypothesis: true autocorrelation is greater than 0
bptest(tslynx)

    studentized Breusch-Pagan test

data:  tslynx
BP = 0.0009146, df = 1, p-value = 0.9759

The result of dwtest indicates autocorrelation: the p-value is below 0.05, so we reject the null hypothesis of no autocorrelation in favor of positive autocorrelation in the residuals. The Breusch-Pagan test, on the other hand, has a p-value of 0.9759, so we cannot reject the null hypothesis of homoskedasticity; the model shows no sign of heteroskedasticity.
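To see what the two statistics measure, they can be sketched by hand in base R. This is an illustration under assumptions, not the lmtest implementation: the Durbin-Watson statistic is the usual ratio of squared successive residual differences to squared residuals, and the studentized Breusch-Pagan statistic is taken as n times the R-squared of an auxiliary regression of the squared residuals on time. Both should correspond to the DW and BP values reported by dwtest and bptest.

```r
# Refit the model from the post so this block is self-contained
data_lynx <- ts(lynx, start = 1821, frequency = 1)
timel <- as.numeric(time(data_lynx))
tslynx <- lm(data_lynx ~ timel)

e <- as.numeric(residuals(tslynx))  # residuals as a plain numeric vector
n <- length(e)

# Durbin-Watson statistic: near 2 means no first-order autocorrelation,
# well below 2 points to positive autocorrelation
dw <- sum(diff(e)^2) / sum(e^2)

# Studentized Breusch-Pagan statistic via an auxiliary regression
e2 <- e^2
aux <- lm(e2 ~ timel)
bp <- n * summary(aux)$r.squared
bp_pvalue <- pchisq(bp, df = 1, lower.tail = FALSE)  # one regressor, so df = 1

print(c(dw = dw, bp = bp, bp_pvalue = bp_pvalue))
```

The p-value of the Durbin-Watson test is not reproduced here, since it depends on the distribution of the statistic that dwtest computes internally.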

I am thinking of shortening the series. The lynx data is yearly, from 1821 to 1934, which gives 114 observations. Perhaps I will use only the data from 1900 to 1934.
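A minimal sketch of that idea uses window to cut the series and then refits the same kind of model. The names lynx_short, time_short and fit_short are introduced only for this illustration, and no result is claimed for the shorter series.

```r
# Keep only the observations from 1900 to 1934
lynx_short <- window(lynx, start = 1900, end = 1934)

# Numeric time variable for the shortened series
time_short <- as.numeric(time(lynx_short))

# Refit the linear trend on the shorter series and inspect it
fit_short <- lm(lynx_short ~ time_short)
print(summary(fit_short))
```

The same diagnostic tests can then be rerun on fit_short to see whether the shorter period behaves any better.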

Picture by Angela from Pixabay
