This thesis is a contribution to a deeper understanding of how information propagates and what this process entails. At its very core is the concept of the network: a collection of nodes and links, which describes the structure of the systems under investigation. The network is a mathematical model which allows to focus on a very fundamental property: the mutual relations (links) between information exchanging agents (nodes). This simplicity makes networks elegant, as no specifics of any supporting hardware are needed to reason on this high level of abstraction. The developing field of network science led to countless applications of the network model to all sorts of complex systems in nature and technology. Naturally, it became an essential part of many multi-disciplinary research projects. Therefore, understanding how information propagates in networks enables us to learn and conceivably control the intricate processes, which we observe in complex systems. Since complex systems are the driver for this research, the first three chapters of this thesis are studies based on data collected from vastly different application domains, after more fundamental research is addressed in the later parts.
Chapter 2 deals with the interaction of players of a popular multiplayer online game. Due to the competitive design of the game, teams are formed ad-hoc and compete with each other for victory. Some of the players exhibit anti-social behavior towards their teammates, which is known as toxicity. We analyze how toxicity in player networks emerges by developing a toxicity detector, highlighting possible triggers and analyze the disposition of players towards toxic teammates. Furthermore, we show how toxicity is linked to game success.
Chapter 3 continues with a study of the human brain as a functional network. Information processing in the brain is measurable with technologies like magnetoencephalography. From such measurements that were collected from a group of subjects, the phase transfer entropy is computed as a quantity that reflects information exchange. When associated with the links between brain regions, unusual high numbers of certain substructures are observed in this network. We find that one of these substructures, the bi-directional two-hop path, to be highly abundant and robust within different frequencies bands, which highlights its importance for the propagation of brain activity. A clustering of the network based on these frequent substructures reveals a spatially coherent organization of important brain regions.
A common symbol of propagation is the virus, which is at the center of the third data-driven analysis of this thesis in Chapter 4. More precisely, we research the digital version of the virus, the computer worm, and analyze its propagation by epidemic network models. With epidemic models, the state of the nodes in a network can be described as susceptible or infected. An infection process and a curing process determine how the nodes are changing between those states. We extend on the standard epidemic models, the SIS model, by a time-dependent curing rate function to reflect the changes in the effectiveness of the active worm removal. Once we set the curing rate function, the empirical worm data are fitted and analyzed on multiple scales from the global over the country down to the autonomous system level. The fitted model explains how computer worms or similar self-replicating pieces of information might change in their effectiveness over long periods of time.
The SIS model returns as a central piece in Chapter 5 again. Although spreading processes are frequently modeled in isolation, the dynamics of many real-world applications are often driven by the interaction of multiple of such processes. These interactions can range from viruses that compete for susceptible nodes to viruses that mutually reinforce their propagation. We study the special case of superinfection, in which one dominant virus spreads within the infected population of a weaker virus. We highlight the conditions for which a co-existence of both viruses is stable and show that extinction cycles become possible if the infection rate of the dominant virus becomes too strong. Furthermore, we show that some of the possible outcomes of a superinfection are difficult to approximate with common mean-field techniques. However, the second largest eigenvalue of the infinitesimal generator of the underlying Markov process is potentially linked to co-existence and thus stability.
Chapter 6 is a study on the capabilities of symbolic regression for network properties. We develop an automated system based on Genetic Programming which is able to be trained by families of networks to learn the relations between several of their properties. These properties can be features of the networks like the eigenvalues of their adjacency or Laplacian matrices or network metrics like the network diameter or the isoperimetric number. We show that the system can generate approximate formulas for those metrics that often give better results than previously known analytic bounds. The evolved formulas for the network diameter are evaluated on a selection of real-world networks of different origins. The network diameter bounds hop-based information propagation and is thus of high importance for designing network algorithms. A careful selection of training networks and network features is crucial for evolving good approximate formulas for the network diameter and similar properties.
Finally, the thesis concludes with Chapter 7 which revisits the concepts that were developed and provides some critical assessment on their potential and limitations.