cSEM/README.rmd at master · FloSchuberth/cSEM · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
---
output: github_document
---

<!-- README.md is generated from README.Rmd. Please edit that file -->

# cSEM: Composite-based SEM <img src='man/figures/cSEMsticker.svg' align="right" height="200" /></a>

[![CRAN status](https://www.r-pkg.org/badges/version/cSEM)](https://cran.r-project.org/package=cSEM)
[![R-CMD-check](https://github.com/FloSchuberth/cSEM/workflows/R-CMD-check/badge.svg)](https://github.com/FloSchuberth/cSEM/actions)
<!-- [![Build Status](https://travis-ci.com/M-E-Rademaker/cSEM.svg?branch=master)](https://travis-ci.com/M-E-Rademaker/cSEM) -->
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/github/M-E-Rademaker/cSEM?branch=master&svg=true)](https://ci.appveyor.com/project/M-E-Rademaker/csem)
![Lifecycle Status](https://img.shields.io/badge/lifecycle-maturing-blue.svg)
[![CRAN downloads](https://cranlogs.r-pkg.org/badges/cSEM)](https://cran.r-project.org/package=cSEM)

## Purpose

Estimate, analyse, test, and study linear, nonlinear, hierarchical and
multi-group structural equation models using composite-based approaches
and procedures, including estimation techniques such as partial least
squares path modeling (PLS-PM) and its derivatives (PLSc, OrdPLSc,
robustPLSc), generalized structured component analysis (GSCA),
generalized structured component analysis with uniqueness terms (GSCAm),
generalized canonical correlation analysis (GCCA), principal component
analysis (PCA), factor score regression (FSR) using sum score,
regression or Bartlett scores (including bias correction using Croon’s
approach), as well as several tests and typical post-estimation
procedures (e.g., verify admissibility of the estimates, assess the
model fit, test the model fit, compute confidence intervals, compare
groups, etc.).

## News (2025-05-15):
- Implementation of doModelSearch to perform AGAS-PLS. Thanks to Gloria.

- Release of cSEM version 0.6.1

- Release of cSEM Version 0.6.0

- Implementation of a `plot()` function to visualize cSEM models. Thanks to Nguyen.

- Enhancement of the `predict()` function

## Installation
The package is available on [CRAN](https://cran.r-project.org/):
```{r, eval = FALSE}
install.packages("cSEM")
```

To install the development version, which is recommended, use:
```{r, eval = FALSE}
# install.packages("pak")
pak::pak("FloSchuberth/cSEM")
```

## Getting started

The best place to get started is the [cSEM-website](https://floschuberth.github.io/cSEM/).

## Basic usage

The basic usage is illustrated below.


```{r out.width = "80%", fig.align = "center", echo=FALSE}
knitr::include_graphics("man/figures/api.png")
```


Usually, using `cSEM` is the same 3 step procedure:

> 1. Pick a dataset and specify a model using [lavaan syntax](https://lavaan.ugent.be/tutorial/syntax1.html)
> 2. Use `csem()`
> 3. Apply one of the post-estimation functions listed below on the resulting object.

## Post-Estimation Functions

There are five major post-estimation verbs, three test family functions and
three do-family of function:

- `assess()` : assess the model using common quality criteria
- `infer()` : calculate common inferential quantities (e.g., standard
  errors, confidence intervals)
- `predict()` : predict endogenous indicator values
- `plot()` : Plot the cSEM model
- `summarize()` : summarize the results
- `verify()` : verify admissibility of the estimates

Tests are performed by using the test family of functions. Currently, the following
tests are implemented:

- `testCVPAT()` performs a cross-validated predictive ability test
- `testOMF()` : performs a test for overall model fit
- `testMICOM()` : performs a test for composite measurement invariance
- `testMGD()` : performs several tests to assess multi-group differences
- `testHausman()` : performs the regression-based Hausman test to test for endogeneity

Other miscellaneous post-estimation functions belong do the do-family of functions.
Currently, three do functions are implemented:

- `doIPMA()`: performs an importance-performance matrix analysis
- `doNonlinearEffectsAnalysis()`: performs a nonlinear effects analysis such as
   floodlight and surface analysis
- `doRedundancyAnalysis()`: performs a redundancy analysis

All functions require a `cSEMResults` object.

## Example
Models are defined using [lavaan syntax](https://lavaan.ugent.be/tutorial/syntax1.html)
with some slight modifications (see the [Specifying a model](https://floschuberth.github.io/cSEM/articles/cSEM.html#using-csem) section on the [cSEM-website](https://floschuberth.github.io/cSEM/)). For illustration we use the build-in and well-known
`satisfaction` dataset.

```{r warning=FALSE, error=FALSE, message=FALSE}
require(cSEM)

## Note: The operator "<~" tells cSEM that the construct to its left is modeled
##       as a composite.
##       The operator "=~" tells cSEM that the construct to its left is modeled
##       as a common factor.
##       The operator "~" tells cSEM which are the dependent (left-hand side) and
##       independent variables (right-hand side).

model <- "
# Structural model
EXPE ~ IMAG
QUAL ~ EXPE
VAL  ~ EXPE + QUAL
SAT  ~ IMAG + EXPE + QUAL + VAL
LOY  ~ IMAG + SAT

# Composite model
IMAG <~ imag1 + imag2 + imag3
EXPE <~ expe1 + expe2 + expe3
QUAL <~ qual1 + qual2 + qual3 + qual4 + qual5
VAL  <~ val1  + val2  + val3

# Reflective measurement model
SAT  =~ sat1  + sat2  + sat3  + sat4
LOY  =~ loy1  + loy2  + loy3  + loy4
"
```
The estimation is conducted using the `csem()` function.
```{r }
# Estimate using defaults
res <- csem(.data = satisfaction, .model = model)
res
```
This is equal to:
```{r eval=FALSE}
csem(
   .data                        = satisfaction,
   .model                       = model,
   .approach_cor_robust         = "none",
   .approach_nl                 = "sequential",
   .approach_paths              = "OLS",
   .approach_weights            = "PLS-PM",
   .conv_criterion              = "diff_absolute",
   .disattenuate                = TRUE,
   .dominant_indicators         = NULL,
   .estimate_structural         = TRUE,
   .id                          = NULL,
   .iter_max                    = 100,
   .normality                   = FALSE,
   .PLS_approach_cf             = "dist_squared_euclid",
   .PLS_ignore_structural_model = FALSE,
   .PLS_modes                   = NULL,
   .PLS_weight_scheme_inner     = "path",
   .reliabilities               = NULL,
   .starting_values             = NULL,
   .tolerance                   = 1e-05,
   .resample_method             = "none",
   .resample_method2            = "none",
   .R                           = 499,
   .R2                          = 199,
   .handle_inadmissibles        = "drop",
   .user_funs                   = NULL,
   .eval_plan                   = "sequential",
   .seed                        = NULL,
   .sign_change_option          = "none"
    )
```

The result is always a named list of class `cSEMResults`.

To access list elements use `$`:

```{r eval=FALSE}
res$Estimates$Loading_estimates
res$Information$Model
```

A useful tool to examine a list is the [listviewer package](https://github.com/timelyportfolio/listviewer/).
If you are new to `cSEM` this might be a good way to familiarize yourself with the structure
of a `cSEMResults` object.

```{r eval=FALSE}
listviewer::jsonedit(res, mode = "view") # requires the listviewer package.
```

Apply post-estimation functions:
```{r echo=FALSE, results='hide'}
rnorm(1) # necessary to initialize a .Random.seed object
```

```{r, message=FALSE, warning=FALSE}
## Get a summary
summarize(res)

## Verify admissibility of the results
verify(res)

## Test overall model fit
testOMF(res)

## Assess the model
assess(res)

## Predict indicator scores of endogenous constructs
predict(res)
```

#### Resampling and Inference

By default no inferential statistics are calculated since most composite-based estimators
have no closed-form expressions for standard errors. Resampling is used instead.
`cSEM` mostly relies on the `bootstrap` procedure (although `jackknife` is implemented as well)
to estimate standard errors, test statistics, and critical quantiles.

`cSEM` offers two ways for resampling:

1. Setting `.resample_method` in `csem()` to `"jackknife"` or `"bootstrap"` and subsequently using
   post-estimation functions `summarize()` or `infer()`.
2. The same result is achieved by passing a `cSEMResults` object to `resamplecSEMResults()`
   and subsequently using post-estimation functions `summarize()` or `infer()`.

```{r eval=FALSE}
# Setting `.resample_method`
b1 <- csem(.data = satisfaction, .model = model, .resample_method = "bootstrap")
# Using resamplecSEMResults()
b2 <- resamplecSEMResults(res)
```

```{r echo=FALSE, results='hide'}
b1 <- csem(.data = satisfaction, .model = model, .resample_method = "bootstrap")
```

The `summarize()` function reports the inferential statistics:

```{r}
summarize(b1)
```

Several bootstrap-based confidence intervals are implemented, see `?infer()`:

```{r eval=FALSE}
infer(b1, .quantity = c("CI_standard_z", "CI_percentile")) # no print method yet
```

Both bootstrap and jackknife resampling support platform-independent multiprocessing
as well as setting random seeds via the [future framework](https://github.com/futureverse/future/).
For multiprocessing simply set `.eval_plan = "multisession"` in which case the maximum number of available cores
is used if not on Windows. On Windows as many separate R instances are opened in the
background as there are cores available instead. Note
that this naturally has some overhead so for a small number of resamples multiprocessing
will not always be faster compared to sequential (single core) processing (the default).
Seeds are set via the `.seed` argument.

```{r eval=FALSE}
b <- csem(
  .data            = satisfaction,
  .model           = model,
  .resample_method = "bootstrap",
  .R               = 999,
  .seed            = 98234,
  .eval_plan       = "multisession")
```