Robin Shields
Reading the Social Networks of Universities on Twitter
Today on the show: social network analysis in educational research.
My guest is Robin Shields. Robin is an Associate Professor at the University of Bath in the United Kingdom. His research broadly investigates the globalization of education, examining patterns of convergence and differentiation in educational policy and practice. He particularly focuses on the innovative application of research methods such as social network analysis and multilevel modeling to address key theoretical debates in the field. He has applied these methods to the study of international higher education and international development education.
On today’s show we discuss some of his work looking at Twitter feeds of world class universities, which can be found in the February 2016 issue of Higher Education.
Citation: Shields, Robin, interview with Will Brehm, FreshEd, 26, podcast audio, May 2, 2016. https://freshedpodcast.com/robinshields
Will Brehm 2:19
Robin Shields, welcome to FreshEd.
Robin Shields 2:21
Thanks, Will. I’m very happy to be on the show.
Will Brehm 2:24
You’ve done a lot of work on network analysis or social network analysis, and it’s become pretty popular in educational studies of late. What is social network analysis in basic terms?
Robin Shields 2:39
Okay, Will. Well, in the most basic sense, you could say educational studies, and by extension its related disciplines like sociology, have tried to understand education and society by understanding individuals as a unit of analysis. These individuals could be students, teachers. It could be institutions. But social network analysis shifts the focus from the individual to relationships between individuals. So, our unit of analysis is no longer just the individual. It’s really the relationship between individuals. So, we’re looking at a network, which is a set of relationships between individuals. And what this entails is a far greater level of complexity than you would find in a lot of, you could say, actor-based studies. So, on one hand, we have much larger and more complex data sets because as the size of our sample, as the group of individuals we’re studying grows, the number of possible connections between them grows exponentially. So, for instance, if you have a regular-sized classroom of 30 students you’d have over 400 possible relationships, friendships, or collaboration relationships that could exist between the students. And we also have complexity in the way relationships work. Kind of more like chaos theory type of complexity, meaning that small changes can have very big effects. So, for example, if you take a classroom, if you increase, just slightly, the propensity or the likelihood for students to form relationships, like friendships or collaboration, then that could have a very big influence on how the overall network of the classroom looks. So, small changes can have very big effects.
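As a rough illustration of the classroom figure above, the count of possible ties can be checked with a few lines of Python. The function name possible_ties and the directed flag are hypothetical helpers introduced only for this sketch; the underlying fact is that n actors form n(n-1)/2 unordered pairs, so the count climbs quickly as the group grows.

```python
def possible_ties(n: int, directed: bool = False) -> int:
    """Count the ties that could exist among n actors.

    Undirected ties (e.g. friendships) are unordered pairs: n * (n - 1) / 2.
    Directed ties (e.g. "A follows B") count each ordered pair separately.
    """
    pairs = n * (n - 1) // 2
    return 2 * pairs if directed else pairs


# A classroom of 30 students, as in the interview.
print(possible_ties(30))                 # undirected pairs
print(possible_ties(30, directed=True))  # ordered pairs
```

For 30 students this gives 435 undirected pairs, consistent with the "over 400" mentioned in the conversation.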
Will Brehm 4:22
So, what sort of information could be found by looking at relationships rather than individuals?
Robin Shields 4:32
That’s a good question. You can look at, for instance, inequality, which is a relationship you can see very well in network structures and which is much harder to find in individual relationships. Centrality and power, too. So, if you’re trying to establish relationships of power in a given context, then it’s quite hard to do that in actor-based research. You can kind of use proxy measures, but you can often see through ties, through relationships in the network, who are the most important people. You can also look at flows of information. So, how particular individuals might mediate flows of information between groups who are otherwise unconnected. So, if you have one student who connects two other groups of students that otherwise have no connections between them, then you know that person is pretty important in terms of mediating any potential flows of information between the two groups.
Will Brehm 5:29
So, it seems like inequality and power and flows of knowledge, these are very kind of intangible, but they are the processes that make up the network. And that’s what network analysis tries to understand.
Robin Shields 5:46
Yeah. I’d say so. But you’re right that they’re intangible and that it’s very difficult to actually measure flows of information or flows of knowledge. So, we rely on proxy measures. We rely on things like, I don’t know, citations is a common link in a network. So, who cites you is kind of taken as a proxy that they regard your work as important or influential, or at least have read it. In my study, I use social network ties. So, relationships on social network websites as another proxy that two organizations know of each other, perhaps respect each other, or are aware of each other’s existence.
Will Brehm 6:29
In the 1990s, there was a term, the networked society. Is the network analysis in educational research, is there a connection to this notion of the networked society?
Robin Shields 6:42
Well, I think Manuel Castells’ work on the networked society is very seminal in shaping how people understand contemporary society and contemporary social organization. So, there are a few key concepts that he puts forward. The first is that society is kind of characterized by ties formed very rapidly, and that are not centrally organized but are rather self-organized. So, individuals form ties out of common self-interest, and they’re not really mediated by institutions. So, that’s an important concept which translates very well to social network research. The networks are basically self-organizing and there’s no kind of macroscopic power usually, or a force organizing how networks are organized but they arise from, kind of, self-interest or other types of relationships between actors. He also talks a lot about how global networks are, kind of, deterritorialized. So, space matters much less because there’s electronic media. So, people can connect around the world very easily. So, there are definitely some analogies and especially this idea of borderless networks, you hear a lot in contemporary higher education discourse. We talk about borderless higher education quite a bit. And there’s often an assumption that universities around the world can interact with distance posing a very minimal barrier. However, it’s also worth noting that social network analysis predates Manuel Castells’ work by several decades. So, people were working on social network analysis in sociology, in organizational psychology since the mid-20th century, really. So, this idea of taking the relationship as a unit of analysis does predate Manuel Castells’ notion of the networked society by several decades.
Will Brehm 8:34
So, we will get to some of your work in a bit that looks at world class universities using a large data set from Twitter. But I’m just curious, can social network analyses be done both quantitatively and qualitatively?
Robin Shields 8:51
Yes, it’s definitely possible to do both quantitative and qualitative social network analysis research. I’m definitely more of an expert in quantitative social network analysis. And I do think probably that accounts for the majority of papers that are published on social network analysis nowadays. But a great source to check out if you’re interested in qualitative approaches to social network analysis, is some of Stephen Ball’s more recent work and he uses the term network ethnography to describe his kind of ethnographic approach to understanding relationships between large non-state, often corporate interests and how they shape educational policy. So, he’s using an ethnographic approach. But rather than a traditional ethnography, I guess you could say, he’s really interested in how these networks of power form and understanding the links between organizations. I think it’s also important to note that, even without going into the qualitative and quantitative distinction, it’s possible to do both inductive and deductive work in social network analysis. So, a lot of social network analysis, even if it’s quantitative, it’s actually exploratory in nature. It’s exploring the network structure and trying to develop possible models about how the network might work. And finally, for qualitative researchers, it’s often important to keep in mind the possibility of data transformation by which I mean you may have a set of interviews with 10 or 20 different individuals. It may be possible when you’re coding your data to come up with ties between the individuals. So, if you ask them, who do they look to for advice, you might be able to transform your qualitative data, in addition to using it as a qualitative data set, into a quantitative network where you look at who’s mentioned who as a source of advice. And so, you can kind of use a mixed method approach that way.
Will Brehm 10:57
Have you used such an approach in your work before?
Robin Shields 11:00
No, I never have. I think it would be great though. So, hopefully we’ll see more studies like that in the field soon.
Will Brehm 11:08
So, let’s focus in on this purely quantitative side. And when I read your work, I come across all sorts of terms like nodes and ties and reciprocity and neighborhoods. Could you give someone like me who doesn’t know much about network analysis a quick overview of some of the conceptual vocabulary required to understand what it is that’s going on in your analyses?
Robin Shields 11:34
Yeah, I can try. I mean, it’s no substitute for even a brief primer on social network analysis. I’ve recently published one in a handbook of higher education policy research that just introduced some of this vocabulary. And even if you look at the Wikipedia page on social network, for example, you can really quickly get up to speed. But to establish a basic kind of lexicon, we can say that we consider a network as a group of connected entities, right? And these could be individual people, or organizations, or even nation states. But we usually call these actors or nodes. So, actors -I’ll use the term actors in this conversation- these are the individuals who are connected in the network. And they are connected by we would say ties or links, right? So, a link is a bridge between two individuals, two actors, which is usually either absent or present. So, you can say that person A cites person B, and that link is either there like person A does cite person B or it’s not there.
Will Brehm 12:41
And this would be the relationship that you talked about earlier?
Robin Shields 12:44
Yes, exactly. So, the links or ties are the relationships between the actors. So, if we have a set of actors, and we have a set of ties between them, the first question we might ask about our network is, how many ties are there? Is everyone in the network connected to everyone else? Conversely, does almost no one connect to anyone else, with very sparse connections? And the term we use for this measure of how connected the network is, is the density of the network. This ranges from zero to one, usually. So, if the density is one, it means everybody’s connected to everybody else. And if no one’s connected to anyone else, the density is zero. And of course, in real life networks, we find that the density falls somewhere in between these two bounds. So, some ties are present, some aren’t, and our interest is in why are the ties that are present there? I’ll just introduce two more concepts that will be useful in our conversation. The next thing we might look at is whether or not ties work in both directions. So, if we have two actors called Mary and John, if Mary knows John, is it also the case that John knows Mary? And how often is it the case that ties work in both directions? So, we call this property reciprocity. And we can measure reciprocity as a proportion of ties that work in two directions, as a proportion of all the ties in the network.
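The two measures just described, density and reciprocity, can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function names, the choice to store directed ties as (from, to) pairs, and the tiny Mary/John/Sue network are all assumptions made for the example.

```python
def density(nodes, ties):
    """Share of possible directed ties that are present (0 = none, 1 = all)."""
    n = len(nodes)
    possible = n * (n - 1)  # every ordered pair of distinct actors
    return len(ties) / possible if possible else 0.0


def reciprocity(ties):
    """Share of directed ties whose reverse tie is also present."""
    if not ties:
        return 0.0
    mutual = sum(1 for (a, b) in ties if (b, a) in ties)
    return mutual / len(ties)


# Hypothetical network: Mary and John know each other; Mary also knows Sue.
nodes = {"Mary", "John", "Sue"}
ties = {("Mary", "John"), ("John", "Mary"), ("Mary", "Sue")}

print(density(nodes, ties))   # 3 of 6 possible ties are present
print(reciprocity(ties))      # 2 of 3 ties are returned
```

Here the density is 0.5 and the reciprocity is two thirds, because the Mary to Sue tie is the only one that is not returned.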
Will Brehm 14:10
So, just one example here is Stephen Ball, whom we mentioned earlier. I know Stephen Ball through his work, but he surely does not know me. So, that would be a one directional sort of relationship. Me knowing him.
Robin Shields 14:25
Yeah. And so, you often find that reciprocity is a good way to study power through asymmetries. Another example, if you know the social network Twitter, which we’ll talk about later: celebrities often have many thousands of followers, right? So, actually, millions of people follow celebrities, but they follow relatively few people themselves.
Will Brehm 14:50
So, it’s asymmetrical.
Robin Shields 14:52
Yeah. Most of their ties are asymmetrical. That’s kind of a sign of their influence. So, we’ve established nodes and ties or actors and ties and then reciprocity. The last thing I think is important to understand is looking at groups of connected actors. And the most basic way to do this is through the concept of transitivity which is sometimes called clustering. Although I like to use the term the friend of a friend characteristics of a network. So, if we use Mary and John as an example. If Mary knows John, and Mary also knows someone called Sue, how often is it the case that John also knows Sue? So, there’s mutual friendships in operation in the network. And that’s another property of the network that we can measure. Networks with high levels of transitivity, or high clustering, are often called small world networks because it feels like a small world. So, when John meets Sue, he says, “Oh, you also know Mary! What a small world, you know, we have all these connections in common.” So, those are a few basic terms. You’ve got your actors, you’ve got your ties, reciprocity, and then transitivity. That’s a way to start to understand the network and start to discuss the structure.
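The friend-of-a-friend idea can also be made concrete. The sketch below treats ties as undirected for simplicity (an assumption; the Twitter data discussed later is directed) and computes transitivity as the share of "two friends of the same person" situations in which those two friends are themselves connected. The function name and the example networks are hypothetical.

```python
from itertools import combinations


def transitivity(ties):
    """Share of connected triples (x - center - y) that are closed by an x - y tie."""
    neighbors = {}
    for a, b in ties:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    triples = 0
    closed = 0
    for friends in neighbors.values():
        # Every pair of friends of the same actor forms one triple.
        for x, y in combinations(sorted(friends), 2):
            triples += 1
            if y in neighbors[x]:
                closed += 1
    return closed / triples if triples else 0.0


# Mary knows John and Sue, but John and Sue do not know each other.
open_triad = [("Mary", "John"), ("Mary", "Sue")]
# The same network once John and Sue have met.
closed_triad = open_triad + [("John", "Sue")]

print(transitivity(open_triad))    # no triple is closed
print(transitivity(closed_triad))  # every triple is closed
```

The first network scores 0.0 and the second 1.0, which are the two extremes between which real networks fall.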
Will Brehm 16:09
So, reciprocity or asymmetry tells us a little bit about power in a network. What does transitivity tell us? If we have a small world network where everyone knows everyone else and the third person also knows the other two, what does this tell us about a network?
Robin Shields 16:32
It depends on the context to some extent but in most contexts, it would say that there’s a very good flow of information. So, there’s many paths through which information could travel. You know, if people are citing one another, they’ll be very familiar with the same set of work, if they’re spreading rumors, then the rumors would spread very quickly and efficiently. So, there’s not really key individuals who mediate the connections between the rest of the network. I guess the flip side to this is centralization, which is the extent to which the network has kind of a core-periphery structure. So, if there’s one person, let’s say, Mary knows everyone in the network but no one else knows anyone else, right? So, all the information flows through Mary, then she can mediate how information flows, she can stop information she doesn’t want from flying around. So, that’s a very central network. And that’s kind of the centralized network we could say. That’s kind of the opposite of a small world.
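One way to put a number on the "everything flows through Mary" picture is a degree-based centralization score. The interview does not name a specific index, so the choice here of Freeman's degree centralization for undirected networks is an assumption, as are the function name and the four-person example (Ann is an invented fourth actor). The score is 1 for a pure star and 0 when every actor has the same number of ties.

```python
def degree_centralization(nodes, ties):
    """Freeman degree centralization for an undirected network without self-ties.

    Sums how far each actor's degree falls below the maximum degree and divides
    by the largest possible value of that sum, (n - 1) * (n - 2), which is
    reached by a star network.
    """
    n = len(nodes)
    if n < 3:
        return 0.0  # the index is not defined for fewer than three actors
    degree = {node: 0 for node in nodes}
    for a, b in ties:
        degree[a] += 1
        degree[b] += 1
    top = max(degree.values())
    return sum(top - d for d in degree.values()) / ((n - 1) * (n - 2))


people = ["Mary", "John", "Sue", "Ann"]

# Star: Mary knows everyone, and nobody else knows anyone else.
star = [("Mary", "John"), ("Mary", "Sue"), ("Mary", "Ann")]
# Fully connected: everyone knows everyone.
everyone = star + [("John", "Sue"), ("John", "Ann"), ("Sue", "Ann")]

print(degree_centralization(people, star))      # fully centralized
print(degree_centralization(people, everyone))  # not centralized at all
```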
Will Brehm 17:34
Right. And usually in terms of citations in academic work it’s somewhere in the middle. There’s some people that have a lot of power, or centrality, I guess the word is, because everyone is citing them. But everyone else isn’t necessarily citing each other.
Robin Shields 17:52
Yeah, that’s right. But at least those of us who aren’t the mega superstars, we’re all citing each other a bit. So, there’s some flow of information.
Will Brehm 17:59
Right, exactly. Right. So, it’s kind of in the middle somewhere.
Robin Shields 18:01
Yeah. I think that’s important. I mean, that’s a good point about social networks, in general, is they’re only meaningful if they’re somewhere in the middle. Like if everyone cites everyone else, or no one cites anyone else then there’s nothing to study. So, we need things to be in the middle in the balance. And that’s where it gets interesting.
Will Brehm 18:21
And then you can use this analysis to then develop particular theories to explain why it’s in the middle or not.
Robin Shields 18:31
Yeah, exactly. Yeah.
Will Brehm 18:33
So, let’s turn to some of your own work. You have done some work lately using Twitter. Now, tell us about this work and why Twitter became such a powerful source of data.
Robin Shields 18:51
Okay. I mean, that’s a really good question. I find Twitter data very interesting. It’s a great source of data, I think, for people doing either quantitative or qualitative research. And I’m going to use the answer that mountain climbers often use to describe why they climb a mountain: because it’s there. Many people don’t know this but most Twitter data is by default public. So, unless you choose to make your Twitter account private, who you follow, what you tweet is public. And not only is it public but it can be downloaded quite easily through software. Twitter provides, kind of, an interface for software to download Twitter data. So, it’s very easy to get a very complex data set, a very large data set, potentially, that would otherwise take you weeks or months, or maybe even years to assemble. So, I was interested in how universities relate to one another in kind of a public sphere. There’s a lot of bibliometric data about who cites who but I thought it’d be interesting to see in a public sphere, how do universities relate to one another, kind of in public discourse, you could say. And so, Twitter data seemed like the ideal way to start to look into this.
Will Brehm 20:05
Before we jump into the data, why did you think it would be interesting to look at how universities relate to each other in the public sphere?
Robin Shields 20:13
Yeah. Well, I was trying to understand, I think, the way that status and particularly rankings of universities organize relationships between them. And I had found a lot of literature that raised expectations, kind of, that status and ranking were very, very important, and that these really organized relationships between universities. I had even talked to managers who talk about strategically how they approach universities with different rankings and different statuses. How they first establish a link with a friend of a friend if they want to make a key relationship, they make relationships with mutual friends. And so, I saw, kind of, a lot of, you could say network-based behavior. People unconsciously applying these concepts of social network analysis. And I have to say, my expectation going into the study was really that we would see, in social media data, relationships of power that kind of fell along the lines of rankings and status of universities.
Will Brehm 21:24
So, what were some of the findings, I guess, in your study, looking at these world class universities and ranking and status through a lot of Twitter data?
Robin Shields 21:38
Yeah. So, I’ll just briefly touch on the data that I collected and then I’ll show you what I found. So, I took the top 200 universities in the world as, kind of, measured in four global rankings. So, any university that appeared in two of the four was in my sample. And then I found the central Twitter account for these universities, usually, it just has the name of the university. And then I looked at how they were connected to one another. And I was interested to see, are the higher-ranked universities more central? Are they less likely to reciprocate ties to lower-ranked universities? And what I found is, there is some effect of ranking. So, higher-ranked institutions are more likely to be followed than lower-ranked institutions but actually the size of this effect, you could say, is quite minimal. And other factors such as geography -so being in the same location, being in the same geographic region- were more important. And even more important than that were the existing relationships within the network. So, if there’s a friend of a friend, you know, if there’s a mutual connection between universities, that made them much more likely to follow each other on Twitter than anything like rankings or location.
Will Brehm 23:04
And so, to make this abstract or generalization, like Harvard would be this high-ranking university or Oxford, and they would be more likely or geography where they’re based in say, Massachusetts, or Oxford, England is a better predictor of the relationships they form on Twitter rather than them connecting to other high-ranking universities, like Harvard to Princeton, or Oxford to Cambridge?
Robin Shields 23:38
Yeah. That’s very well put. So, Oxford would be -I mean, there is some effect of ranking. So, they might be more likely to make ties with Stanford and Cambridge and all of that. But being in the same nation state, so Cambridge would be a very likely tie with Oxford. Another high-ranked institution which is very far away would be much less probable. University of Tokyo might be an example there. The distance would decrease the probability of a tie quite a bit. But the most important thing would actually be whether or not University of Tokyo followed Oxford. And if they did then the reciprocal tie would be very likely.
Will Brehm 24:20
So, all of this data and this analysis that you’ve done. So, what does it tell us about rankings and status for world class universities?
Robin Shields 24:33
That’s a good question. I think what it shows is that it’s possible to overstate the importance of rankings when we look at how universities actually interact with one another. And a lot of the emphasis on rankings, the great importance which is attached to it may be kind of talked into being. So, people have seen these new rankings. They’ve seen that they’re important to managers and such. And so, they’ve assumed that they organize the field of universities, if you like. And I think there’s perhaps a tendency to overstate that.
Will Brehm 25:08
Now, do you think you could have uncovered this same finding using actor-centered research? Looking at individuals rather than the relationships between individuals?
Robin Shields 25:21
So, I could have perhaps done a survey with universities. I could have sent a survey to these 200 universities and said, “How important is ranking to the relationships you form?” And then I would have got back a bunch of responses, and then I could have said, well, their mean response was such and such. It would have provided another way to answer the question but I don’t think it would have been as empirically accurate, I suppose, because it would have been their own self-assessment whereas I’ve looked at who they actually formed relationships with. So, I guess the advantage of the network approach is that I actually have data on how they connect to one another rather than their own assessment of how they connect to one another.
Will Brehm 26:07
Right. It’s less subjective.
Robin Shields 26:09
Yeah. Because the ties are kind of tangible. They’re very concrete.
Will Brehm 26:15
Right. And you also use this approach called the exponential random graph models to explain the probability of an observed network as a function of both endogenous and exogenous variables. Now, can you explain this to someone like myself, who doesn’t know much about social network analysis?
Robin Shields 26:39
Yeah. I can do my best. And I’ve got to say the term exponential random graph models (ERGMs) is enough to turn off anyone who doesn’t like statistics because it’s such a mouthful. But this is a really important method in social network analysis. I think it will become more important in education research in coming years. If you look at journals that comparative education often draws upon, like the American Journal of Sociology, there have been a few ERGM papers published dating back to about 2010. So, I think we may find more in education research soon. But to put it in very basic terms, what we are trying to do with an ERGM is to explain the overall network structure, the big picture, through local selection forces, the propensity of individuals to form ties with one another. And as you said, we kind of disaggregate that into endogenous and exogenous forces. So, let’s say we’re talking about two actors, John and Mary, and whether John cites Mary. So, some of the things we might look at are exogenous to the network. Okay, they’re outside of our network. So, are John and Mary in the same department, or working in the same university? That may increase the propensity for John to cite Mary. Do they attend the same conferences? Are they in the same field? Do they have an office on the same floor? Right? These are all characteristics of John and Mary which are outside the network and may influence the probability for John to cite Mary. Conversely, we can look at other ties of the network to try to explain whether or not John might cite Mary. So, is Mary the most highly-cited person in her field? Is she the Stephen Ball of her field? If so, John might be more likely to cite Mary. Do they have common ties? So, do they cite a lot of other literature in common? That’s transitivity, right? There’s a friend of a friend effect there. So, John may be more likely to cite Mary if they’re already citing a lot of the same literature.
And if we have enough data, what we can do is disaggregate these endogenous and exogenous forces. So, we can say the effect of reciprocity, independent of all these other variables, endogenous and exogenous, is estimated to be such and such a number, and we can test whether or not that’s likely to be different from zero. So, do we have confidence? Is there a significance to that effect of reciprocity?
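To give a feel for how such estimates are read, the sketch below turns a set of ERGM-style coefficients into the probability of a single tie, holding the rest of the network fixed. In this family of models the log-odds of a tie equal the coefficients multiplied by how much each network statistic would change if the tie were added. Everything specific here is an assumption made for illustration: the three statistics chosen, the coefficient values in theta (invented, not the study's estimates), and the small university example. Estimating the coefficients from data requires dedicated software and is not attempted here.

```python
import math


def tie_probability(i, j, ties, country, theta):
    """Conditional probability that i follows j, given all other ties.

    ties    : set of existing directed ties as (follower, followed) pairs
    country : dict mapping each actor to a country (an exogenous attribute)
    theta   : dict of coefficients, one per statistic (hypothetical values)
    """
    # Change statistics: how each count would change if the tie i -> j were added.
    change = {
        "edges": 1.0,                                             # one more tie
        "mutual": 1.0 if (j, i) in ties else 0.0,                 # endogenous: reciprocity
        "same_country": 1.0 if country[i] == country[j] else 0.0, # exogenous: geography
    }
    log_odds = sum(theta[name] * value for name, value in change.items())
    return 1.0 / (1.0 + math.exp(-log_odds))


# Invented coefficients: ties are rare by default, but being followed back
# and sharing a country both raise the odds.
theta = {"edges": -3.0, "mutual": 2.0, "same_country": 1.0}

country = {"Oxford": "UK", "Cambridge": "UK", "Tokyo": "Japan", "Stanford": "US"}
ties = {("Tokyo", "Oxford")}  # Tokyo already follows Oxford

for target in ["Tokyo", "Cambridge", "Stanford"]:
    p = tie_probability("Oxford", target, ties, country, theta)
    print(f"P(Oxford follows {target}) = {p:.3f}")
```

With these made-up numbers, an existing tie in the other direction raises the probability more than a shared country does, which mirrors the kind of comparison between endogenous and exogenous effects described in the conversation.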
Will Brehm 29:18
So, how did you use this modeling in your research using Twitter?
Robin Shields 29:24
So, what I looked at was the probability of a university following another university on Twitter. And I looked at some endogenous variables. So, structures of the network, I looked at whether or not the tie was reciprocal, how would that influence tie formation? And I looked at transitivity, whether or not there were third parties that kind of were friends of friends between the two universities. And then I looked at a lot of exogenous variables. So, I looked at ranking. That was kind of the key variable I wanted to test. But I also looked at geographic variables. So, the distance between universities and whether or not they were in the same country, and whether or not they were in English-speaking countries were all variables that I used to further control the analysis and kind of explore this idea of the borderless university. Yeah. So, what I found is that the effect of ranking was significant. So, we could say it’s meaningful. It’s different from zero but it’s really not very large. It’s not as large as the effect of distance and it’s not as large as the effect of the endogenous variables.
Will Brehm 30:34
Right. So, the factors outside of the network are more meaningful than the rankings and the ties that exist within the network.
Robin Shields 30:45
Well, the ties that exist within the network are actually the most influential. So, the biggest predictor of the network structure is the rest of the network, which sounds kind of odd. But that’s the most likely thing to influence a tie formation based on my data. But after that, the exogenous factors that are not ranking were more important than ranking.
Will Brehm 31:09
Right. Okay. So, in comparative education, how else might researchers utilize this approach that you’ve used looking at Twitter data and trying to explain the effect of rankings on prestige and status?
Robin Shields 31:30
You know, there’s so many possibilities for this. I wish I had time to kind of explore –
Will Brehm 31:34
Well, where are you going? What are the directions you’re moving in?
Robin Shields 31:39
Well, I’ll give you a few kinds of ideas I have that I know other people in the field are working on. But I think the possibilities are almost endless. So, I think a good example would be inter-organizational relationships. Looking at how organizations like the World Bank, different branches of UNESCO, use Twitter, who they follow, who they mention, would be a very interesting way to use very similar approaches. There’s no ranking in that sense but you can look at whether or not multilaterals follow bilaterals, you might look at kind of North to South relationships. So, you could look at ministries of education and the World Bank and how national agencies relate to international organizations. That would be really interesting. Aid to education would be a great network analysis. There is great data from the OECD on which countries and which donors fund which other countries. And that would be a really interesting network to study in more depth. Bibliometric analyses, I know these are going on already. People are working on bibliometric analyses of key policy documents, seeing who these documents cite and who the cited documents cite -they’re kind of snowballing their sample. And even a little bit of critical self-reflection on the field. So, looking at who cites who in comparative education? What are, kind of, the cliques? What are the neighborhoods, the different groups of actors in comparative education?
Will Brehm 33:11
Well, we really look forward to your future papers using network analysis. I think they’re just really fascinating. And you’re such a clear writer and you make these very complex ideas and statistical methods quite easy to follow and understand, and it’s very persuasive. So, thank you very much for joining FreshEd, Robin Shields, and we look forward to speaking with you again.
Robin Shields 33:36
Thanks, Will. It was a pleasure.