Social Networks
What is Social Network Analysis?
Social Network Analysis (SNA) is the study of the relationships (links) between people (nodes) in a network. SNA combines elements of mathematical graph theory, sociology, and matrix algebra to product quantitative measures that describe patterns of association.
- Related Items
- Small world problem
- Small world experiment
- Six degrees of separation
- Scale free problem
- Power law distribution
- Rich get richer
- Robust yet fragile
- The strength of weak ties
- Guanxi in Chinese society
- The rule of 150
Models
- Random graph on Wikipedia
- A random graph is obtained by starting with a set of n vertices and adding edges between them at random. Most commonly studied is the Erdős-Rényi model, called G(n,p), in which each of the possible edges occurs independently with probability p. A closely related model, G(n,M) assigns equal probability to all graphs with exactly M edges.
- Properties of G(n, p):
- mean degree <k> = p(n-1);
- average path length L ~ ln(n)/ln(<k>);
- clustering coefficient C = p = <k>/n « 1, which means that random graphs usually have a small clustering coefficient.
- possion degree distribution
- Despite their simplicity and power, the ER graphs fail to explain two important properties observed in real-world networks:
- By assuming a constant and independent probability of two nodes being connected, they do not account for local clustering and triadic closures. This fact is formally expressed by the ER graphs having a low clustering coefficient.
- They do not account for the formation of hubs. Formally, the degree distribution of ER graphs converges to a Poisson distribution, rather than a power law observed in most real-world, scale-free networks.
* Small-world network on Wikipedia
- Watts and Strogatz (WS) model on Wikipedia
- Many empirical graphs are well modeled by small-world networks. Social networks, the connectivity of the Internet, and gene networks all exhibit small-world network characteristics.
- A certain category of small-world networks were identified as a class of random graphs by Duncan Watts and Steven Strogatz in 1998. They noted that graphs could be classified according to their clustering coefficient and their mean-shortest path length. While many random graphs exhibit a small mean-shortest path (varying typically as the logarithm of the number of nodes) they also usually have a small clustering coefficient.
- Watts and Strogatz measured that in fact many real-world networks have a small shortest path but also a clustering coefficient significantly higher than expected by random chance. Watts and Strogatz proposed a simple model of random graphs with (i) a small average shortest path and (ii) a large clustering coefficient.
- Limitations
- The major limitation of the model is that it produces graphs that are homogeneous in degree. In contrast, real networks are often scale-free networks inhomogeneous in degree, having hubs and a scale-free degree distribution. Such networks are better described by the preferential attachment family of models, such as the Barabási–Albert (BA) model.
- The model also implies a fixed number of nodes and thus cannot be used to model network growth.
- Scale-free network on Wikipedia
- Barabási–Albert (BA) model on Wikipedia | Generalized scale-free model on Wikipedia
- Scale-free networks show a power law degree distribution like most real networks.
- Network growth and preferential attachment have been shown to create networks with power law degree distributions.
- Social-circles network model on Wikipedia
Issues in Social Networks
- Advertising for social networks Ad Chap
- Social search on Wikipedia
- Visualizing online social networks pdf
Resources
Research People & Organization
- Jon Kleinberg - Cornell
-
- the author of Six Degrees: The Science of a Connected Age (W.W. Norton, 2003) and Small Worlds: The Dynamics of Networks between Order and Randomness (Princeton University Press, 1999).
- Albert-László Barabási - University of Notre Dame
- Scale fee network in Science @ 1999
- Christos Faloutsos | Jure Leskovec - CMU
- Lada Adamic - University of Michigan
- Joshua O'Madadhain - UCI PhD, now in Google, developer of JUNG
- His interests include machine learning, data mining, information retrieval, social network analysis, combinatorial optimization, and the design of data structures and algorithms. He is currently investigating the use of machine learning techniques to create predictive and descriptive models for event-based network data.
- Academic Research Centers
- Center for Complex Network Research - Albert-László Barabási's lab in Univ. of Notre Dame
- Social Network Analysis - Program on Networked Governance, Harvard University
- CASOS Institute at Carnegie Mellon *NEW*
- UCI DataLab | Padhraic Smyth - UCI (JUNG developer)
- Corporate Research Centers
- Social Computing Group - Microsoft Research
Toolbox & Library
- JUNG - the Java Universal Network/Graph Framework – is a software library that provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network. Developed by Scott White, Joshua O'Madadhain, Danyel Fisher, Yan-Biao Boey at UCI. [Open source, free, and has a wide variety of algorithms available. Recommended by Hongbo]
- libSNA is the premier open source library for conducting social network analysis (SNA) research. [Open source, free, few basic network statistics, in Python]
- UCINET software package. [Commercial, slow, and cannot be embedded into applications: you can't call UCINET in an end-user display]
- Pajek Pajek Wiki - Analysis and Visualization of Large Scale Networks, Program for Large Network Analysis
- Computer Programs for Social Network Analysis in INSNA.org
- Social network analysis software - on Wikipedia
- Computational Models and Social Network Tools in CASOS@CMU
Data Sets
- Data Sets in INSNA.org
- Network Analysis Data in CASOS@CMU
Metrics in social network analysis
- Centrality on Wikipeida - There are various measures of the centrality of a vertex within a graph that determine the relative importance of a vertex within the graph. There are four measures of centrality that are widely used in network analysis: degree centrality, betweenness, closeness, and eigenvector centrality.
- Degree centrality: The count of the number of ties to other actors in the network. If the network is directed (meaning that ties have direction), then we usually define two separate measures of degree centrality, namely indegree and outdegree. Indegree is a count of the number of ties directed to the node, and outdegree is the number of ties that the node directs to others. For positive relations such as friendship or advice, we normally interpret indegree as a form of popularity, and outdegree as gregariousness.
- Betweenness: Degree an individual lies between other individuals in the network; the extent to which a node is directly connected only to those other nodes that are not directly connected to each other; an intermediary; liaisons; bridges. Therefore, it's the number of people who a person is connected to indirectly through their direct links.
- Closeness: The degree an individual is near all other individuals in a network (directly or indirectly). It reflects the ability to access information through the “grapevine” of network members. Thus, closeness is the inverse of the sum of the shortest distances between each individual and every other person in the network.
- Eigenvector centrality: a measure of the importance of a node in a network. It assigns relative scores to all nodes in the network based on the principle that connections to nodes having a high score contribute more to the score of the node in question. Google's PageRank is a variant of the Eigenvector centrality measure.
- Flow betweenness centrality: The degree that a node contributes to sum of maximum flow between all pairs of nodes (not that node).
- Centralization on Wikipedia - The difference between the n of links for each node divided by maximum possible sum of differences. A centralized network will have many of its links dispersed around one or a few nodes, while a decentralized network is one in which there is little variation between the n of links each node possesses.
- Clustering coefficient on Wikipeida - A measure of the likelihood that two associates of a node are associates themselves. A higher clustering coefficient indicates a greater 'cliquishness'. The clustering coefficient for a vertex is the proportion of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them.
- Cohesion on Wikipedia - The degree to which actors are connected directly to each other by cohesive bonds. Groups are identified as ‘cliques’ if every actor is directly tied to every other actor, ‘social circles’ if there is less stringency of direct contact, which is imprecise, or as structurally cohesive blocks if precision is wanted.
- Structural cohesion - The minimum number of members who, if removed from a group, would disconnect the group. It is also useful to know that k-cohesive graphs (or k-components) are always a subgraph of a k-core, although a k-core is not always k-cohesive. A k-core is simply a subgraph in which all nodes have at least k neighbors but it need not even be connected.
^
- Density on Wikipedia - The degree a respondent's ties know one another/ proportion of ties among an individual's nominees. Network or global-level density is the proportion of ties in a network relative to the total number possible (sparse versus dense networks).
- , denotes the number of edges, the number of vertices, the maximal density is 1, which is similar to the
- Path Length - The distances between pairs of nodes in the network.
- Average path length is the average of these distances between all pairs of nodes.
- Diameter - The maximum shortest path length
- Radiality - Degree an individual’s network reaches out into the network and provides novel information and influence
- Reach - The degree any member of a network can reach other members of the network.
- Structural equivalence on Wikipedia - Refers to the extent to which actors have a common set of linkages to other actors in the system. The actors don’t need to have any ties to each other to be structurally equivalent.
- Structural hole - Static holes that can be strategically filled by connecting one or more links to link together other points. Linked to ideas of social capital: if you link to two people who are not linked you can control their communication.
- Degree distribution on Wikipedia - Degree distribution gives the probability distribution of degrees in a network. Its use originates from the study of random graph by Paul Erdős and Alfréd Rényi, and it has become an important concept which describes the topology of complex networks. Power-law distribution by Albert-László Barabási.
- Size - the number of nodes
- Triadic Census - TriadicCensus is a standard social network tool that counts, for each of the different possible configurations of three vertices, the number of times that that configuration occurs in the given graph. 16 possible triads
- Structural Holes - Calculates some of the measures from Burt's text "Structural Holes: The Social Structure of Competition". In Structural Holes, Ron Burt (1992; 1995) describes a set of new measures based on ego networks. One key set of measures is concerned with the notion of redundancy. The general meaning of redundancy is clear: a person's ego network has redundancy to the extent that her contacts are connected to each other as well. Unpacking Burt's Redundancy Measures
References
Papers
Model of Small-World Networks and Search
- D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature, 393:440-442 (1998).
- Networks must exist on a sliding scale between clustered and random, producing tightly knit groups of friends but also short paths that reach throughout the whole network. Proposed two concepts to measure the small world, which are HIGH clustering coefficient and SHORT average path length.
- J. Kleinberg. Navigation in a Small World. Nature 406(2000), 845.
- J. Kleinberg. The small-world phenomenon: An algorithmic perspective. Proc. 32nd ACM Symposium on Theory of Computing, 2000.
- J. Kleinberg. Small-World Phenomena and the Dynamics of Information. Advances in Neural Information Processing Systems (NIPS) 14, 2001.
- D. J. Watts, P. S. Dodds, and M. E. J. Newman. Identity and search in social networks. Science, 296, 1302-1305 (2002).
- Oskar Sandberg and Ian Clarke. The Evolution of Navigable Small-World Networks. arxiv cs.DS/0607025, July 2006.
Scale-free network / Degree distribution
- R. Albert, H. Jeong, A.-L. Barabási. Diameter of the world wide web. Nature 401, 130-131 (1999).
- A.-L. Barabási, R. Albert. Emergence of scaling in random networks. Science 286, 509-512 (1999).
- A.-L. Barabási, R. Albert, H. Jeong, G. Bianconi. Power-law distribution of the World Wide Web. Science 287, 2115 (2000).
Search a social network
- Lada A. Adamic, Rajan M. Lukose, Amit R. Puniyani, Bernardo A. Huberman. Search in Power-Law Networks. Phys. Rev. E, 64 46135 (2001).
- P. S. Dodds, R. Muhamad, and D. J. Watts. An experimental study of search in global social networks. Science, 301, 827-829 (2003).
- Lada A. Adamic and Eytan Adar. How to search a social network. Social Networks, 27(3):187-203, July 2005.
- G. Kossinets and D. J. Watts. Empirical Analysis of Evolving Social Networks. Science, 311, 88-90 (2006).
Good Survey papers on complex social networks
- R. Albert, A.LBarabási. Statistical mechanics of complex networks. Rev. Modern Phys. 74(1), 47–97 (2002).
- M. E. J. Newman. The structure and function of complex networks. SIAM Review, 2003, 45, 167-256.
Good Related Courses
- The Structure of Information Networks by Kleinberg @ Cornell
- “The past decade has seen a convergence of social and technological networks, with systems such as the World Wide Web characterized by the interplay between rich information content, the millions of individuals and organizations who create it, and the technology that supports it. This course covers recent research on the structure and analysis of such networks, and on models that abstract their basic properties. Topics include combinatorial and probabilistic techniques for link analysis, centralized and decentralized search algorithms, network models based on random graphs, and connections with work in the social sciences.”
Good Related Tutorials
- Tools for large networks: Structure and diffusion in WWW2008 by Jure Leskovec and Christos Faloutsos
Books
- Robert A. Hanneman and Mark Riddle. Introduction to Social Network Methods. UC Riverside, 2005.
- Peter J. Carrington, John Scott, Stanley Wasserman. Models and Methods in Social Network Analysis. Cambridge University Press, 2005. on Amazon
- John P Scott. Social Network Analysis: A Handbook (Paperback). Sage Publications Ltd, 2000.
Articles
- Could it be a Big World After All? The `Six Degrees of Separation' Myth, J. Kleinfeld. Society, April 2002.
- Thoughts on the Social Graph, August, 2007.
- Google’s Marissa Mayer: Social search is the future, Interviewed by VentureBest
- Is Social Search Coming from Google?, February 1, 2008
- Social Search is Coming, February 1, 2008
- The Evolution of Social Networks by By Levy Cohen, Search Engine Watch, Mar 27, 2008
- Evolution of the social network by By Marc Cieslak, BBC Click, 29 March 2008
Videos & PPTs
- Duncan Watts and Kleinberg discuss the small-world problem on NPR's Science Friday (8 August 2003).
- Challenges in Social Network Data: Processes, Privacy and Paradoxes by Jon Kleinberg on Videolectures
- Social networks and trust : NetTrust on Youtube
- Social Network Analysis: A Research Perspective by Danyel Fisher (Microsoft)ppt
Others
- Social Networks notes of Prof. Cosma Rohilla Shalizi
- Social Network Analysis Page of Prof. Tom Snijders