Managing panel data in RStudio to estimate
regression equations and determine the effect of independent variables on
dependent variables.
One method for determining the influence of
variables is to use panel data. A regression on panel data lets us estimate how
the independent variables affect the dependent variable. Panel data combines
observations on several individuals (here, companies) across several time periods.
Creating a panel data structure
RStudio differs from other statistical
software. Before any analysis can run, the data must be stored in a data
structure that R recognizes. Other software lets you copy and paste spreadsheet
data, whether from Excel or Google Sheets, and start working immediately;
RStudio requires converting it first.
The steps are to import a spreadsheet
file and make a few relatively simple adjustments so the data is easier to
process. For panel analysis specifically, the required data structure is
pdata.frame, short for panel data frame, which comes from the plm package. It
differs from a regular data frame because it records both the individual and
the time dimension of each observation.
In the Environment pane on the right, click "Import
Dataset" and select Excel. There are several other options, such as SPSS,
SAS, Stata, and text files. If you have a spreadsheet, select Excel.
After that, choose the sheet to import.
If the file contains multiple sheets, you must select one of them; the option
is in the lower part of the import dialog. Pay attention to how tidy the
spreadsheet is. For example, if there is a gap between the header row and the
data, the empty cells will be marked "NA" (Not Available), meaning the data is
missing.
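The effect of a gap row can be reproduced outside the import dialog. The following is a minimal sketch, assuming a semicolon-separated text export rather than an Excel file so that it runs on its own; the inline data and the names csv_text, raw, and clean are made up for illustration.

```r
# Hypothetical export: an empty row sits between the header and the data.
csv_text <- "Comp;Tahun;DAR
;;
Adaro;2014;0,49
Adaro;2015;0,44"

# Treat empty cells as NA in every column, including text columns.
raw <- read.csv2(text = csv_text, na.strings = c("NA", ""))
print(raw)

# Keep only the rows that have at least one non-missing value.
clean <- raw[rowSums(is.na(raw)) < ncol(raw), ]
print(clean)
```

Removing the gap in the spreadsheet itself is usually simpler, but a filter like this can serve as a safety net after import.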
Preparing Excel as Data
Because editing is easier in a spreadsheet,
we can organize the data there first: one row per company and year, with a
column for the company, a column for the year, and one column per variable.
I imported the data into RStudio in CSV
format. Note that read.csv2 expects semicolon-separated fields and a comma as
the decimal mark.
tobinq3 <- read.csv2("~/jurnal/tobinq3.csv")
Then I can view the data like this:
View(tobinq3)
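Before converting, it can help to confirm that the imported table has the shape a panel needs. The checks below are a sketch on a small made-up data frame (toy) that borrows the column names Comp, Tahun, DAR, and DER from the output shown later in this post; the same calls can be pointed at tobinq3.

```r
# Toy stand-in for the imported data: 2 companies x 3 years (illustrative values).
toy <- data.frame(
  Comp  = rep(c("Adaro", "ATPK"), each = 3),
  Tahun = rep(2014:2016, times = 2),
  DAR   = c(0.49, 0.44, 0.42, 0.49, 0.44, 0.42),
  DER   = c(0.97, 0.78, 0.72, 0.97, 0.78, 0.72)
)

# Column types: the measured variables should be numeric, not text.
str(toy)

# Each company-year pair should occur once; 0 means no duplicates.
dup <- anyDuplicated(toy[, c("Comp", "Tahun")])
print(dup)

# Rows per company; equal counts suggest a balanced panel.
obs_per_company <- table(toy$Comp)
print(obs_per_company)
```

If a variable shows up as text here, the usual cause is a decimal mark or separator that does not match the import function.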
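The data isn't in pdata.frame format yet,
so we convert it like this:
library(plm)  # provides pdata.frame()
# index = company column and year column (named Tahun in this data set)
ptobinq <- pdata.frame(tobinq3, index = c("Comp", "Tahun"),
                       drop.index = TRUE, row.names = TRUE)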
ptobinq is a name I chose to
distinguish it from other objects. With this step, the data structure has been
converted into a panel data frame. Its structure, as printed by str(ptobinq),
looks like this:
Classes ‘pdata.frame’ and 'data.frame': 40 obs. of 3 variables:
 $ DAR    : 'pseries' Named num 0.49 0.44 0.42 0.4 0.4 0.49 0.44 0.42 0.4 0.4 ...
  ..- attr(*, "names")= chr [1:40] "Adaro-2014" "Adaro-2015" "Adaro-2016" "Adaro-2017" ...
  ..- attr(*, "index")=Classes ‘pindex’ and 'data.frame': 40 obs. of 2 variables:
  .. ..$ Comp : Factor w/ 8 levels "Adaro","ATPK",..: 1 1 1 1 1 2 2 2 2 2 ...
  .. ..$ Tahun: Factor w/ 5 levels "2014","2015",..: 1 2 3 4 5 1 2 3 4 5 ...
 $ DER    : 'pseries' Named num 0.97 0.78 0.72 0.67 0.66 0.97 0.78 0.72 0.67 0.66 ...
  ..- attr(*, "names")= chr [1:40] "Adaro-2014" "Adaro-2015" "Adaro-2016" "Adaro-2017" ...
  ..- attr(*, "index")=Classes ‘pindex’ and 'data.frame': 40 obs. of 2 variables:
  .. ..$ Comp : Factor w/ 8 levels "Adaro","ATPK",..: 1 1 1 1 1 2 2 2 2 2 ...
  .. ..$ Tahun: Factor w/ 5 levels "2014","2015",..: 1 2 3 4 5 1 2 3 4 5 ...
 $ Tobin.Q: 'pseries' Named num -0.2702 0.2346 0.2706 0.034 0.0336 ...
  ..- attr(*, "names")= chr [1:40] "Adaro-2014" "Adaro-2015" "Adaro-2016" "Adaro-2017" ...
  ..- attr(*, "index")=Classes ‘pindex’ and 'data.frame': 40 obs. of 2 variables:
  .. ..$ Comp : Factor w/ 8 levels "Adaro","ATPK",..: 1 1 1 1 1 2 2 2 2 2 ...
  .. ..$ Tahun: Factor w/ 5 levels "2014","2015",..: 1 2 3 4 5 1 2 3 4 5 ...
 - attr(*, "index")=Classes ‘pindex’ and 'data.frame': 40 obs. of 2 variables:
  ..$ Comp : Factor w/ 8 levels "Adaro","ATPK",..: 1 1 1 1 1 2 2 2 2 2 ...
  ..$ Tahun: Factor w/ 5 levels "2014","2015",..: 1 2 3 4 5 1 2 3 4 5 ...
The output shows the class
"pdata.frame" at the top, followed by the index, which consists of the company
(Comp) and the year (Tahun). This data is now ready for panel data analysis.
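A quick way to confirm the conversion is to inspect the class and the panel dimensions. This is a sketch that assumes the plm package is installed and uses a small made-up data frame (toy, p_toy) instead of the author's file; pdim() reports the number of individuals, time periods, and observations.

```r
library(plm)  # assumed to be installed

# Toy panel: 2 companies x 3 years (illustrative values).
toy <- data.frame(
  Comp  = rep(c("Adaro", "ATPK"), each = 3),
  Tahun = rep(2014:2016, times = 2),
  DAR   = c(0.49, 0.44, 0.42, 0.49, 0.44, 0.42)
)
p_toy <- pdata.frame(toy, index = c("Comp", "Tahun"), drop.index = TRUE)

# The object carries both the pdata.frame and the data.frame class.
print(class(p_toy))

# Panel dimensions: n individuals, T periods, N observations.
dims <- pdim(p_toy)
print(dims)
print(dims$balanced)
```

Running pdim(ptobinq) on the real data should report whether every company has an observation for every year.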
We can view the first rows of the data with the
head command.
> head(ptobinq)
            DAR  DER     Tobin.Q
Adaro-2014 0.49 0.97 -0.27020301
Adaro-2015 0.44 0.78  0.23455470
Adaro-2016 0.42 0.72  0.27061008
Adaro-2017 0.40 0.67  0.03397098
Adaro-2018 0.40 0.66  0.03363631
ATPK-2014  0.49 0.97  0.30736531
The data now looks different: company
and year are no longer ordinary columns as they were in Excel. Instead, they
form the row labels. With the pdata.frame structure, both the time dimension
and the company dimension are taken into account.
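The company and year have not been lost; they are stored in the index attribute and in the row names. The sketch below, again assuming plm is installed and using made-up toy data (toy, p_drop, p_keep are illustrative names), shows how index() recovers them and how drop.index = FALSE keeps them as regular columns.

```r
library(plm)  # assumed to be installed

# Toy panel: 2 companies x 3 years (illustrative values).
toy <- data.frame(
  Comp  = rep(c("Adaro", "ATPK"), each = 3),
  Tahun = rep(2014:2016, times = 2),
  DAR   = c(0.49, 0.44, 0.42, 0.49, 0.44, 0.42)
)

p_drop <- pdata.frame(toy, index = c("Comp", "Tahun"), drop.index = TRUE)
p_keep <- pdata.frame(toy, index = c("Comp", "Tahun"), drop.index = FALSE)

print(names(p_drop))  # only the measured variable remains
print(names(p_keep))  # Comp and Tahun stay as columns

# Recover the company and year of every row from the index.
idx <- index(p_drop)
print(head(idx))

# Row names combine company and year.
print(rownames(p_drop)[1:3])
```

Keeping the index columns can be convenient when the data is later merged or exported, while dropping them gives a cleaner view of the variables.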

