
Network Replicability & Generalizability

Background:

In recent years, there has been debate as to whether network properties can be expected to replicate across samples [1-5]. A core challenge in comparing networks is accounting for, and distinguishing between, the factors that introduce noise into network estimation (e.g., sampling variability). To date, certain methodological practices may have contributed to the observed inconsistencies, including the use of single-item indicators and of non-identical measurement tools.

Research Question:

Under which conditions are network characteristics more likely to replicate across samples? How can we increase the stability of observed network properties, and the trustworthiness of interpretations drawn?

Current Study:

Using a resampling approach, we systematically disentangled the effects of sampling variability from those of scale variability when assessing network replicability. Additionally, we explored whether the consistency of network characteristics improved when more items were aggregated to estimate node scores, which we hypothesized should yield more representative measures of the latent node constructs.

Publication Preprint:

OSF Project Page:

Research Approach

Design:

  • Case study using real-world, open-source personality data [6].
     

  • Using a resampling technique, we sampled 50 pairs of data frames in three different ways:
     

    • Sampling variability condition: independent samples, identical scale items

    • Scale variability condition: same sample, non-identical scale items

    • Sampling & scale variability condition: independent samples, non-identical scale items
       

  • This allowed us to systematically vary our parameters of interest (i.e. sampling variability and scale variability) while mirroring the properties of empirical data.
     

  • Additionally, across all conditions, we examined the effects of higher levels of node aggregation (i.e. sum scores derived from 1, 2, 3, 5, or 8 items), to assess the merits of using multi-item vs. single-item indicators when estimating the nodes in a network (a resampling sketch follows below).
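To make the design concrete, below is a minimal sketch in R of how the paired data frames could be drawn under each condition. The data object (dat), the item-column vectors (neo_items, ipip_items), and the helper names are illustrative placeholders, not the study's actual code.

```r
# Minimal sketch of the paired-resampling scheme; object names are placeholders.
set.seed(2022)

draw_pair <- function(dat, n, condition = c("sampling", "scale", "both"),
                      neo_items, ipip_items) {
  condition <- match.arg(condition)
  if (condition == "sampling") {
    # Independent samples, identical scale items (e.g., NEO items only)
    rows <- sample(nrow(dat), 2 * n)
    list(A = dat[rows[1:n], neo_items, drop = FALSE],
         B = dat[rows[(n + 1):(2 * n)], neo_items, drop = FALSE])
  } else if (condition == "scale") {
    # Same sample, non-identical scale items (NEO vs. IPIP)
    rows <- sample(nrow(dat), n)
    list(A = dat[rows, neo_items, drop = FALSE],
         B = dat[rows, ipip_items, drop = FALSE])
  } else {
    # Independent samples AND non-identical scale items
    rows <- sample(nrow(dat), 2 * n)
    list(A = dat[rows[1:n], neo_items, drop = FALSE],
         B = dat[rows[(n + 1):(2 * n)], ipip_items, drop = FALSE])
  }
}

# Node aggregation: sum score over the first k items of a facet (k = 1, 2, 3, 5, or 8)
agg_node <- function(df, item_cols, k) {
  rowSums(df[, item_cols[seq_len(k)], drop = FALSE])
}

# e.g., 50 replications of the joint-variability condition at n = 212:
# pairs <- replicate(50, draw_pair(dat, 212, "both", neo_items, ipip_items),
#                    simplify = FALSE)
```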

Network Estimation:

  • For each pair of data frames, we estimated two networks, using a non-regularized approach (ggmModSelect) [7].
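As a rough illustration of this step, the sketch below uses the bootnet wrapper around qgraph's ggmModSelect estimator. The specific settings (e.g. stepwise model search) are assumptions rather than the study's exact configuration, and pair$A / pair$B refer to the node-score data frames from the resampling sketch above.

```r
library(bootnet)   # estimateNetwork() wrapper around qgraph estimators
library(qgraph)

# Non-regularized GGM via ggmModSelect for each data frame in a pair
net_A <- estimateNetwork(pair$A, default = "ggmModSelect", stepwise = TRUE)
net_B <- estimateNetwork(pair$B, default = "ggmModSelect", stepwise = TRUE)

W_A <- net_A$graph   # weighted adjacency matrix (partial correlations)
W_B <- net_B$graph

plot(net_A, layout = "spring")   # qgraph visualization of network A
```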

Network Comparisons:

  • For each pair of networks, we assessed the consistency between network properties using a set of descriptive metrics (a code sketch of these computations follows the list below). Taken together, these metrics should provide an overview of the variability in network properties that can be expected, given the individual or joint presence of sampling variability and scale variability, and at varying degrees of node aggregation.
     

    • Adjacency matrices (i.e. correspondence in present vs. absent edges)

    • Global strength (i.e. similarity in overall network connectivity)

    • Edge weight correlations (i.e. similarity in edge strengths)

    • Centrality correlations (i.e. correspondence in the rank ordering of centrality scores)

    • Tallied frequencies (i.e. proportion of replications sharing the same node and edge properties)
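The sketch below shows how these descriptive metrics could be computed from one pair of weighted adjacency matrices (W_A, W_B, as obtained in the estimation sketch). The particular choices here (strength centrality, Spearman correlation) are illustrative assumptions; the indices reported in the study may differ.

```r
# Descriptive comparison metrics for one pair of estimated networks
ut <- upper.tri(W_A)   # index each undirected edge once

# 1. Adjacency: edges absent in both, present in both, or differing
present_A <- W_A[ut] != 0
present_B <- W_B[ut] != 0
adjacency_tab <- table(A = present_A, B = present_B)

# 2. Global strength: absolute difference in overall connectivity
global_strength_diff <- abs(sum(abs(W_A[ut])) - sum(abs(W_B[ut])))

# 3. Edge weight correlation: similarity of edge strengths
edge_cor <- cor(W_A[ut], W_B[ut])

# 4. Centrality correlation: rank-order correspondence of strength centrality
cent_A <- qgraph::centrality_auto(W_A)$node.centrality$Strength
cent_B <- qgraph::centrality_auto(W_B)$node.centrality$Strength
centrality_cor <- cor(cent_A, cent_B, method = "spearman")
```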

Results


Fig 1. Whole-sample networks: Structure & Centrality Rank. 
Networks above were estimated on the full sample (N = 424) and represent the Neuroticism personality dimension and its corresponding subfacets. LEFT: Plots highlight differences in the resulting network structures as a function of the measurement tool used (i.e. NEO vs. IPIP scales). RIGHT: Centrality plot compares the rank order of centrality scores. Use of the NEO scale (blue line) identifies Depression as the most central node, whereas use of the IPIP scale (red line) identifies Anxiety as the most influential node in the network.


Fig 2. Whole-sample networks: Edge Weights & Centrality Stability. 
Plots above correspond to the whole-sample networks (N = 424), with the top and bottom rows representing the NEO and IPIP networks, respectively.
LEFT: Plots display the accuracy of edge weights; grey horizontal bars represent 95% confidence intervals. In the majority of cases (i.e. 12 out of 15 edges), differences in edge strengths were not statistically significant; exceptions were the edges linking Anxiety and Anger, Depression and Impulsiveness, and Anger and Depression. RIGHT: Plots display centrality stability (x-axis: proportion of sampled cases; y-axis: correlation between original-sample and subsample centrality estimates). CS-coefficients of 0.75 demonstrate high stability for both scales.
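Edge-weight accuracy intervals and CS-coefficients of this kind are typically obtained with bootstrapping routines such as those in the bootnet package. A minimal sketch follows; the number of bootstraps and the use of net_A (from the estimation sketch above) are chosen purely for illustration and are not the study's settings.

```r
library(bootnet)

# Nonparametric bootstrap for edge-weight accuracy (LEFT panels)
boot_edges <- bootnet(net_A, nBoots = 1000, type = "nonparametric")
plot(boot_edges, labels = FALSE, order = "sample")   # bootstrapped 95% CIs

# Case-dropping bootstrap for centrality stability (RIGHT panels)
boot_cases <- bootnet(net_A, nBoots = 1000, type = "case")
plot(boot_cases, statistics = "strength")
corStability(boot_cases)   # CS-coefficient
```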

Fig 3. Adjacency Matrices. 
Stacked barplots display the number of edges (max. 15) estimated as absent in both networks (red), present in both networks (blue), or differing between the two (green). Trends show that as sample size and node aggregation increased, more edges were estimated as non-zero (i.e. denser networks), indicating potentially greater network sensitivity.


Fig 4. Global Strength. 
Barplots display average differences in network connectivity (i.e. global strength) across conditions. Differences were more pronounced at smaller sample sizes and at lower levels of node aggregation. Discrepancies were also marginally greater when both sampling variability and scale variability were present (RIGHT plot), as compared to sampling variability (LEFT plot) or scale variability (MIDDLE plot) alone.

Fig 5. Edge Correlations. 
Barplots display the similarity in edge strengths across pairs of networks. When networks were estimated using different scales (i.e. MIDDLE and RIGHT plots), edge correlations were markedly weak, especially in 1-item networks. Disparities, however, were attenuated at higher sample sizes and at greater levels of node aggregation.

Fig4_N.png

Fig 6. Centrality Correlations. 
Barplots display the similarity in centrality scores across pairs of networks. Once again, the presence of scale variability (i.e. MIDDLE and RIGHT plots) produced weaker correlations than sampling variability alone (LEFT plot). Across all conditions, the strength of correlations was comparable when networks were estimated with 1 item (at n = 212) or with 2 items (at n = 84).

Key Insights

  • More stable measurement conditions led to improvements in network replicability and generalizability.

  • Findings emphasized the benefits of aggregating over multiple items to estimate node scores.

  • Multi-item indicators are likely to yield more representative reflections of broader facet-level constructs, rather than of item-specific behaviors.

  • Improvements in replicability and generalizability of networks can therefore be interpreted in light of increased levels of validity and/or reliability of node measures.
     

  • Practical recommendation: Use multi-item indicators to estimate network nodes. The addition of even 1 item can produce meaningful effects (comparable to increasing the sample size by 2.5 times). 


References

  1. Borsboom, D., Fried, E. I., Epskamp, S., Waldorp, L. J., van Borkulo, C. D., van der Maas, H. L. J., & Cramer, A. O. J. (2017). False alarm? A comprehensive reanalysis of “Evidence that psychopathology symptom networks have limited replicability” by Forbes, Wright, Markon, and Krueger (2017). Journal of Abnormal Psychology, 126(7), 989–999. https://doi.org/10.1037/abn0000306

  2. Forbes, M. K., Wright, A. G. C., Markon, K. E., & Krueger, R. F. (2017a). Evidence that psychopathology symptom networks have limited replicability. Journal of Abnormal Psychology, 126(7), 969–988. https://doi.org/10.1037/abn0000276

  3. Forbes, M. K., Wright, A. G. C., Markon, K. E., & Krueger, R. F. (2017b). Further evidence that psychopathology networks have limited replicability and utility: Response to Borsboom et al. and Steinley et al. Journal of Abnormal Psychology, 126(7), 1011–1016. https://doi.org/10.1037/abn0000313

  4. Forbes, M. K., Wright, A. G. C., Markon, K. E., & Krueger, R. F. (2019). Quantifying the reliability and replicability of psychopathology network characteristics. Multivariate Behavioral Research, 1–19. https://doi.org/10.1080/00273171.2019.1616526

  5. Fried, E. I., Eidhof, M. B., Palic, S., Costantini, G., Huisman-van Dijk, H. M., Bockting, C. L. H., Engelhard, I., Armour, C., Nielsen, A. B. S., & Karstoft, K.-I. (2018). Replicability and generalizability of posttraumatic stress disorder (PTSD) networks: A cross-cultural multisite study of PTSD symptoms in four trauma patient samples. Clinical Psychological Science, 6(3), 335–351. https://doi.org/10.1177/2167702617745092

  6. Goldberg, L. R., & Saucier, G. (2016). The Eugene-Springfield community sample: Information available from the research participants (Tech. Rep. No. 56-1). Eugene, Oregon: Oregon Research Institute.

  7. Epskamp, S., Cramer, A. O. J., Waldorp, L. J., Schmittmann, V. D., & Borsboom, D. (2012). qgraph: Network visualizations of relationships in psychometric data. Journal of Statistical Software, 48(4). https://doi.org/10.18637/jss.v048.i04

© 2022 by Arianne Herrera-Bennett, PhD
