How to analyze alter-level attributes within egocentric network data using SPSS

A step-by-step guide.

[See original presentation here]

Key Takeaways

  1. Prep: Create a simple, sequential naming scheme for variable groups (compiled alter-types from IM generators; e.g., s1-5, t1-5). “Save as” before restructuring.

  2. Transpose: Select Data>Restructure and choose to restructure variables into cases. Move the ordered variables from each var group into “variables to be transposed,” then rename each target variable (e.g., s1-5 within “trans1,” renamed “sex1”).

  3. Categorize: Denote key ego-level variables as “fixed variable(s).” Create cluster ID (index) variables to identify each alter’s IM topic type.

  4. Restructure: Execute or save the command to a syntax file (latter is recommended). Verify the dataset is correctly restructured — alters at level 1 and ego-level values duplicated for level 2.




The following is my attempt at a much-needed update to Müller et al.’s (1999) “How to use SPSS to study ego-centered networks” [10].

Background & Review of Terms

Egocentric or “core” social networks

An individual’s “core” network consists of their closest relationships (close ties, i.e., relationships with their closest friends and kin; see [1][2]). It is composed of the select few individuals who provide the most support to the ego, often no more than five others.

Gathering core network data with surveys

Personal networks are most often measured by asking participants whom they discuss “important matters” (IM) with (i.e., the “IM generator”; see [3][4]). In this approach, respondents (“egos”) provide the name of someone else based on this question (ostensibly their closest tie), who is then sought out themselves to eventually generate a broader social network.

Some survey instruments go deeper, including the R5D [5], which utilizes multiple topic-specific IM generators. Basically, it prompts a variation of the “important matters” question, asking each participant to name not just one person that provides them “core” support, but five different names for the five varying and most important types of support. The R5D consists of one to five boxes for names to be provided corresponding to the following topics: family-based support, career support, financial support, general welfare and happiness support, and support for health and well-being.

Compiling these topic-specific alters and the information provided for each of them, and then considering them alongside the respondent, creates an analyzable “core” network.

Egocentric network structure

Networks are inherently nested structures [6]: nodes (alters) are nested within ties to other nodes, which are nested within clusters of ties to other nodes, and so on. That said, the usual approach to analyzing these structures is as follows:

EGOs (the people who take the survey; participants, the center of each network) are at the individual, personal, or ‘ego’ level (level 1).

Through surveys, egos/respondents provide information about:

ALTERs (those connected/tied to ego; kin, classmates, close ties), who are gathered via a name generator within the survey. Responses are composited to generate the ‘network’ level factors for each ego (level 2).


However, this is not the only way one could analyze the composition of personal networks.

Although the ego took the survey and is therefore usually treated as the individual in a multilevel model, their alters can instead be treated as the smallest unit in the structure, since they are where the ties conclude. Doing so allows analysis of tie-level dependent variables [6][7] (e.g., social capital of a personal network, social support, and/or homophily) and makes it possible to observe and control for other network characteristics and how they interact within and between each level [8].

Case Study

To explain this methodology, I use an example ‘case study’: a personal network survey dataset of incoming (freshman) undergraduate students and their families. The sample consists of roughly 900 students and 400 of their parents. Each participant provided up to five ties (alters) via the R5D IM generators, along with information about themselves (e.g., political ideology, attitudes and opinions, digital media use) and about those in their network (e.g., perceived attitudes, on- and offline connectedness, closeness).

This survey dataset is structured with egos at level 1, each having numerous variables with information about themselves, but also for each alter (those named in the R5D generators). This structure does not allow analysis of any alter-level dependent variable(s), and instead, only those at the ego-level, i.e., items which the respondent answered only about themselves. Alter-level characteristics would have to be compiled and averaged for any analysis. Therefore, we must restructure the dataset.
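To see what that limitation looks like in practice, here is a minimal Python sketch (not SPSS). The closeness variables c1-c5, the ego-level ideology variable, and all values are hypothetical. With one row per ego, the only way to bring alter information into a model is to collapse it into a single ego-level summary, such as a mean:

```python
from statistics import mean

# One row per ego (wide layout). c1..c5 are hypothetical closeness ratings
# for the five topic-specific alters; None marks an alter that was not named.
egos = [
    {"id": 101, "ideology": 3, "c1": 4, "c2": 5, "c3": 2, "c4": None, "c5": 3},
    {"id": 102, "ideology": 6, "c1": 1, "c2": 2, "c3": None, "c4": None, "c5": 3},
]


def mean_closeness(ego):
    """Collapse the five alter-level ratings into one ego-level average."""
    ratings = [ego[f"c{k}"] for k in range(1, 6) if ego[f"c{k}"] is not None]
    return mean(ratings) if ratings else None


for ego in egos:
    ego["mean_close"] = mean_closeness(ego)
    print(ego["id"], ego["mean_close"])
```

The per-alter variation (which alter is close and which is not) disappears into mean_close, which is the information the restructuring is meant to preserve.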

Data Management and Preparation

The case study survey data is originally structured with egos at “level 1,” each having numerous variables corresponding to questions asked about each alter via the R5D name generator. To begin the management and restructuring process, I’ve created a simple naming scheme to help me keep track of which of the five alters each alter-specific variable pertains to (i.e., family, career, finance, happiness, and health correspond to 1-5, respectively) and assigned a letter to denote each “variable group” (i.e., s1-5, t1-5, etc.; see below).

The data (before restructuring) can be visualized as follows:

After this simple data management, count the number of variable groups you’ve created (e.g., a1-a5, b1-b5, and c1-c5 would be three var groupings). This is the total number of new alter-specific variables you will be creating.
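If you have many variables, the tally can be checked programmatically. The following is a rough Python sketch, assuming the naming scheme described above (a letter prefix followed by the alter number 1-5); the column list is made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical column names from a wide (one row per ego) file.
columns = ["id", "age", "s2e", "s1", "s2", "s3", "s4", "s5",
           "t1", "t2", "t3", "t4", "t5", "m1", "m2", "m3", "m4", "m5"]


def variable_groups(names, n_alters=5):
    """Group names shaped like <letters><1..n_alters> by their prefix."""
    groups = defaultdict(list)
    for name in names:
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", name)
        if match and 1 <= int(match.group(2)) <= n_alters:
            groups[match.group(1)].append(name)
    # Keep only complete groups: one variable for every alter slot.
    return {prefix: sorted(found) for prefix, found in groups.items()
            if len(found) == n_alters}


groups = variable_groups(columns)
print(len(groups), sorted(groups))  # 3 ['m', 's', 't']
```

Names that do not end in a number (id, age, s2e) are ignored, so ego-level variables do not inflate the count.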


Restructuring in SPSS

I cannot stress this enough before moving forward:

SAVE YOUR FILE AS A NEW DATASET.

Trust me. If you don’t already do this before each new command, especially those which rearrange your file structure, you surely will learn to do so in the near future.

Select the “Restructure…” command in the “Data” dropdown in SPSS. Ensure that you restructure variables into cases.

Enter the tallied number of new var groups in the bottom box, “more than one […]” (if creating more than one new alter-specific variable). Next, we specify the variable transposition.

From the dataset list on the left (see below), select all number-specified variables in a grouping (e.g., s1-s5) and drag or click the blue arrow to move them into the “variables to be transposed” box.

Rename “trans1” in the “target variable” box to align with your naming scheme (e.g., “trans1” renamed to “sex1” for s1-s5).

Repeat this process for each variable group you tallied above until all target variables in the dropdown are accounted for.

The “case group identification” box refers to an auto-generated ID variable with case numbers for each newly transposed alter-level case (in this example, generated simply by adding another digit onto the original case/ID number). Feel free to rename and label it.
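The “adding another digit” idea can be illustrated in a few lines of Python. This only mirrors the description above and is not how SPSS builds the ID internally; the function names are hypothetical.

```python
def alter_case_id(ego_id, alter_index):
    """Append the alter's position (1-5 here) as an extra digit on the ego's ID."""
    if not 1 <= alter_index <= 9:
        raise ValueError("a single appended digit only covers indexes 1-9")
    return ego_id * 10 + alter_index


def split_case_id(case_id):
    """Recover (ego_id, alter_index) from an alter-level case ID."""
    return divmod(case_id, 10)


print(alter_case_id(1042, 3))  # 10423
print(split_case_id(10423))    # (1042, 3)
```

Being able to split the ID back apart is handy later, when alter-level cases need to be traced to their ego.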

To denote which variables will be on the second level once transposed (i.e., vars with information provided by the original respondent, the ego, about themselves), move any ego-level var into the “fixed variable(s)” box on the bottom right. Doing so ensures they are not transposed and that their values are instead duplicated across each new alter-level case.

Note: if you have too many “fixed” variables and don’t want to sort through and select them in this tiny window (understandable), ignore this box and see below.

Click “next” and create an index variable, which will identify the TYPE of TOPIC corresponding to each new case once restructured, by sequentially assigning values from 1-5 to said cases (hence why an ordered naming scheme was recommended above).

With the next step, if you’d like to drop all non-categorized vars in your original dataset (i.e., all those not included as fixed or target variables), select “drop variable(s) from new file.” Or, select “keep and treat as fixed variable(s)” to keep all remaining variables in the transposed dataset.

Lastly, either select to restructure the dataset immediately or have SPSS paste the command into a syntax file (I’d recommend the latter so you can easily go back and re-run/check when something inevitably goes awry down the line) and click “finish.”
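For readers who want to see the logic of the wizard steps above outside of SPSS, here is a minimal Python sketch of the same variables-to-cases idea. It imitates the resulting layout only; it is not the VARSTOCASES command that SPSS pastes, and the variable names (s1-s5 for alter sex, t1-t5 for alter closeness, ego_sex) and values are hypothetical.

```python
def vars_to_cases(rows, groups, fixed, id_var="id", index_name="topic"):
    """Restructure one row per ego into one row per alter slot.

    groups: new alter-level name -> ordered wide variables, e.g.
            {"sex1": ["s1", "s2", "s3", "s4", "s5"]}
    fixed:  ego-level variables whose values are copied onto every new row
    """
    lengths = {len(sources) for sources in groups.values()}
    if len(lengths) != 1:
        raise ValueError("every variable group needs the same number of alters")
    n_alters = lengths.pop()

    long_rows = []
    for row in rows:
        for k in range(n_alters):
            # The index variable (1..n_alters) records the topic type.
            new_row = {id_var: row[id_var], index_name: k + 1}
            for target, sources in groups.items():
                new_row[target] = row[sources[k]]  # k-th variable of the group
            for name in fixed:
                new_row[name] = row[name]  # duplicated ego-level value
            long_rows.append(new_row)
    return long_rows


# Hypothetical wide data: s1-s5 = alter sex, t1-t5 = alter closeness.
wide = [
    {"id": 101, "ego_sex": 1, "s1": 1, "s2": 2, "s3": 2, "s4": 1, "s5": None,
     "t1": 5, "t2": 3, "t3": 4, "t4": 2, "t5": None},
    {"id": 102, "ego_sex": 2, "s1": 2, "s2": 2, "s3": 1, "s4": 1, "s5": 2,
     "t1": 4, "t2": 4, "t3": 1, "t4": 3, "t5": 5},
]
groups = {
    "sex1": [f"s{k}" for k in range(1, 6)],
    "close1": [f"t{k}" for k in range(1, 6)],
}
long_rows = vars_to_cases(wide, groups, fixed=["ego_sex"])
print(len(long_rows))  # 10 rows: 2 egos x 5 alter slots
print(long_rows[1])
```

Note that the ordering within each group is what ties s2 and t2 to the same alter, which is why the ordered naming scheme matters.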


The dataset is now restructured, and the previously separate, topic-specific IM variables are transposed into composite variables typifying alter-level characteristics for each individual alter. Alters are at the first level with newly specified cases (the new sample size is roughly 6,500), and the values for ego characteristics (now at level 2) are duplicated across each of their alters (see below).
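The sample size follows from simple arithmetic: roughly 900 + 400 = 1,300 egos, each with five alter slots, gives about 6,500 alter-level cases. A quick sanity check along these lines can be sketched in Python, assuming the restructured file has been exported as a list of rows; the function and variable names are hypothetical.

```python
from collections import defaultdict

expected_rows = (900 + 400) * 5  # roughly 6,500 alter-level cases


def check_restructure(long_rows, fixed, id_var="id", n_alters=5):
    """Return a list of problems found in a long (one row per alter) file."""
    by_ego = defaultdict(list)
    for row in long_rows:
        by_ego[row[id_var]].append(row)

    problems = []
    for ego_id, rows in by_ego.items():
        if len(rows) != n_alters:
            problems.append(f"ego {ego_id}: {len(rows)} rows, expected {n_alters}")
        for name in fixed:
            # Ego-level values should be identical on all of an ego's rows.
            if len({row[name] for row in rows}) > 1:
                problems.append(f"ego {ego_id}: {name} varies across alters")
    return problems


good = [{"id": 101, "topic": k, "ego_sex": 1} for k in range(1, 6)]
bad = good[:4] + [{"id": 102, "topic": 1, "ego_sex": 2},
                  {"id": 102, "topic": 2, "ego_sex": 1}]
print(check_restructure(good, fixed=["ego_sex"]))  # []
print(check_restructure(bad, fixed=["ego_sex"]))
```

An empty list means every ego has the expected number of alter rows and constant ego-level values.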

Note: names w/ “1” denote a level 1 var (i.e., s1, t1, m1, etc.); “2e” denotes level 2, ego (s2e, p2e, etc.).

 

Analyzing Restructured Data in Mplus

We are now ready to utilize your statistical software of choice to analyze. Here, I use Mplus — the best choice (which may or may not be biased by my lack of skills in HLM and R).

If you have not previously used Mplus with data cleaned and organized in SPSS, I’ve written a quick n’ easy how-to that reviews all the best practices to get you there. Similarly, even if you are ahead of the curve, or you just came back from that page (no judgement), I will indeed be doing a relatively brief review here, but will soon be posting a full write-up detailing multilevel modeling with core network data.

From here, the process is much like any other HLM analysis regarding model building. For instance, the Mplus syntax shown below is following Hox’s [9] Model 3, i.e., testing level 1 and 2 fixed effects [and, if relevant, interactions], and Model 4 [addition of random effects]. However, because of the restructuring and transposing method described above, many new capabilities that are often unavailable in core network research are included. For instance, and most significantly, the outcome is a level 1 variable, measuring how the alter and ego-level factors may relate to characteristics specific to each alter. This enables testing of between-level random effects to understand how changes in the ego, as well as those characteristics of their other network alters, may influence levels of alter-level homophily.
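For readers who think in equations, the two models can be written in generic multilevel notation. This is a sketch, not the Mplus specification itself: alter i is nested in ego j, y is an alter-level outcome, x an alter-level predictor, and z an ego-level predictor (all placeholder symbols), and it reads “random effects” in Model 4 as random slopes, which is an assumption.

```latex
\begin{align*}
\text{Level 1 (alter):}\quad
  & y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij} \\
\text{Level 2, Model 3:}\quad
  & \beta_{0j} = \gamma_{00} + \gamma_{01} z_{j} + u_{0j},
    \qquad \beta_{1j} = \gamma_{10} \\
\text{Level 2, Model 4:}\quad
  & \beta_{0j} = \gamma_{00} + \gamma_{01} z_{j} + u_{0j},
    \qquad \beta_{1j} = \gamma_{10} + u_{1j}
\end{align*}
```

In Model 3 only the intercept varies across egos; Model 4 adds the term u_{1j} so that the alter-level slope can vary across egos as well.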

The MLM approach to core network analysis controls for the within and between level variances and biases of each individual alter in relation to a respective ego, and vice versa. That is, the nuances of core networks, including many other characteristics, past relationships, external stimuli (e.g., frequency of contact, shared experiences), and so on that are not explicitly measured here are controlled for — to the greatest extent that MLM is able — by the included multilevel interrelationships. When conceptualized appropriately, it is indeed very much like classic MLM examples where students are nested within classrooms, thus controlling for the variances and biases within and between such classrooms due to their teachers, relationships, and so on.

 

Footnote

In the example dataset used here, not every participant provided names for all five topic-specific IM generators. This is readily apparent when looking at descriptive statistics for the network size of all egos (a fairly normal distribution from one to five). Unless a response is enforced (which is not recommended), most personal network surveys will show similar results. However, SPSS does not account for this: when restructuring as described above, SPSS will generate the same number of alters for every participant (one per IM generator the survey included; in this case, five), whether they provided names for all of them or not. As such, further data cleaning is absolutely essential to ensure that the total number of alter-level cases, and the characteristics considered when analyzing, are correct.
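Until that write-up exists, here is one possible way to think about the cleaning step, sketched in Python rather than SPSS. It rests on a loud assumption: an alter who was never named has missing values (None here) on every alter-level variable. Real data may use other missing codes, and this is not necessarily the procedure used for the case study. All names and values are hypothetical.

```python
def drop_unnamed_alters(long_rows, alter_vars):
    """Keep rows where at least one alter-level variable is non-missing."""
    return [row for row in long_rows
            if any(row.get(var) is not None for var in alter_vars)]


def network_sizes(long_rows, id_var="id"):
    """Count the remaining alter rows per ego."""
    sizes = {}
    for row in long_rows:
        sizes[row[id_var]] = sizes.get(row[id_var], 0) + 1
    return sizes


long_rows = [
    {"id": 101, "topic": 1, "sex1": 1, "close1": 5},
    {"id": 101, "topic": 2, "sex1": 2, "close1": 3},
    {"id": 101, "topic": 3, "sex1": None, "close1": None},  # no name given
    {"id": 102, "topic": 1, "sex1": 2, "close1": None},     # partial answer
    {"id": 102, "topic": 2, "sex1": None, "close1": None},  # no name given
]
cleaned = drop_unnamed_alters(long_rows, alter_vars=["sex1", "close1"])
print(len(cleaned))            # 3
print(network_sizes(cleaned))  # {101: 2, 102: 1}
```

Comparing the resulting network sizes against the ego-level descriptive statistics is a reasonable check that no real alters were dropped. An ego who named no one at all would vanish from the alter-level file entirely, so decide beforehand how such cases should be handled.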

I will be writing up another post ASAP that will address this issue as it was a cruel and time-consuming one to figure out on my own. If you’re somehow reading this in between the time of writing and when I get it out there — god speed, traveller.

 

References

  1. Hampton, K. N., Sessions, L. F., & Her, E. J. (2011). Core Networks, Social Isolation, and New Media. Information, Communication & Society, 14(1), 130–155. doi.org/10.1080/1369118X.2010.513417

  2. Fisher, D. (2005). Using egocentric networks to understand communication. IEEE Internet Computing, 9(5), 20–28. doi.org/10.1109/MIC.2005.114

  3. Hampton, K., & Chen, W. (2021). Studying social media from an ego-centric perspective. In Personal networks: Classic readings and new directions in ego-centric analysis (pp. 718–733).

  4. Marsden, P. V. (1987). Core Discussion Networks of Americans. American Sociological Review, 52(1), 122–131. doi.org/10.2307/2095397

  5. Hampton, K. N. (2022). A restricted multiple generator approach to enumerate personal support networks: An alternative to global important matters and satisficing in web surveys. Social Networks, 68, 48-59. doi.org/10.1016/j.socnet.2021.04.006

  6. Frank, K. A., Muller, C., & Mueller, A. S. (2013). The Embeddedness of Adolescent Friendship Nominations: The Formation of Social Capital in Emergent Network Structures. American Journal of Sociology, 119(1), 216–253. doi.org/10.1086/672081

  7. van Duijn, M. A. J., van Busschbach, J. T., & Snijders, T. A. B. (1999). Multilevel analysis of personal networks as dependent variables. Social Networks, 21, 187–209. doi.org/10.1016/S0378-8733(99)00009-X

  8. Wellman, B., & Frank, K. (2000). Network Capital in a Multi-Level World: Getting Support in Personal Communities. Social Capital, 233–273.

  9. Hox, J. (2002). Multilevel analysis: Techniques and applications. Lawrence Erlbaum Associates Publishers.

  10. Müller, C., Wellman, B., & Marin, A. (1999). How to use SPSS to study ego-centered networks. Bulletin de Méthodologies Sociologiques, 64. doi.org/10.1177/075910639906400106