Month: August 2017

A network approach to topic models

One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are a popular machine-learning approach that infers the latent topical structure of a collection of documents. Despite their success, in particular that of the most widely used variant, Latent Dirichlet Allocation (LDA), and their numerous applications in sociology, history, and linguistics, topic models are known to suffer from severe conceptual and practical problems, e.g. a lack of justification for the Bayesian priors, discrepancies with statistical properties of real texts, and the inability to properly choose the number of topics. Here, we approach the problem of identifying topical structures by representing text corpora as bipartite networks of documents and words and using methods from community detection in complex networks, in particular stochastic block models (SBM). We show that our SBM-based approach constitutes a more principled and versatile framework for topic modeling, overcoming the intrinsic limitations of Dirichlet-based models through a more general choice of nonparametric priors. It automatically detects the number of topics and hierarchically clusters both the words and the documents. In practice, we demonstrate through the analysis of artificial and real corpora that our approach outperforms LDA in terms of statistical model selection.

Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann

Source: arxiv.org
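
The bipartite document-word representation mentioned in the abstract above can be illustrated with a minimal sketch. The function name `bipartite_edges`, the whitespace tokenizer, and the toy corpus are assumptions made for illustration; they are not taken from the paper, and no stochastic block model is fitted here.

```python
from collections import Counter


def bipartite_edges(documents):
    """Return a multiset of (document index, word) edges.

    Each word occurrence in a document contributes one edge between the
    document node and the word node, so edge weights are word counts.
    """
    edges = Counter()
    for doc_id, text in enumerate(documents):
        # Naive tokenizer (an assumption): lowercase and split on whitespace.
        for word in text.lower().split():
            edges[(doc_id, word)] += 1
    return edges


# Toy corpus, purely illustrative.
docs = [
    "networks reveal community structure",
    "topic models infer topics",
    "community detection in networks",
]

edges = bipartite_edges(docs)

# Degree of each word node: total number of occurrences across documents.
word_degree = Counter()
for (doc_id, word), count in edges.items():
    word_degree[word] += count

print(word_degree.most_common(3))
```

The resulting edge list is the kind of input a community-detection method would take; inferring the block structure itself is outside this sketch.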

Diffusion Dynamics and Optimal Coupling in Directed Multiplex Networks

We study the dynamics of diffusion processes acting on directed multiplex networks, i.e., coupled multilayer networks where at least one layer consists of a directed graph. We reveal that directed multiplex networks may exhibit a faster diffusion at an intermediate degree of coupling than when the two layers are fully coupled. We use three simple multiplex examples and a real-world topology to illustrate the characteristics of the directed dynamics that give rise to a regime in which an optimal coupling exists. Given the ubiquity of both directed and multilayer networks in nature, our results could have important implications for the dynamics of multilevel complex systems towards optimality.

Alejandro Tejedor, Anthony Longjas, Efi Foufoula-Georgiou, Tryphon Georgiou, Yamir Moreno

Source: arxiv.org
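
For readers unfamiliar with diffusion on a two-layer multiplex, the following is a small sketch of one common formulation. It assumes the out-degree Laplacian L = D_out - A for each layer, a single inter-layer coupling strength, and explicit Euler time stepping; the paper may use different conventions, and all function names and the toy layers are hypothetical.

```python
def laplacian(adj):
    """Out-degree Laplacian L = D_out - A (convention assumed here)."""
    n = len(adj)
    return [
        [(sum(adj[i]) if i == j else 0.0) - adj[i][j] for j in range(n)]
        for i in range(n)
    ]


def supra_laplacian(adj1, adj2, coupling):
    """Supra-Laplacian of two layers whose replica nodes are coupled."""
    n = len(adj1)
    l1, l2 = laplacian(adj1), laplacian(adj2)
    supra = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            supra[i][j] = l1[i][j]            # intra-layer block, layer 1
            supra[n + i][n + j] = l2[i][j]    # intra-layer block, layer 2
        # Inter-layer coupling between node i and its replica.
        supra[i][i] += coupling
        supra[n + i][n + i] += coupling
        supra[i][n + i] -= coupling
        supra[n + i][i] -= coupling
    return supra


def euler_step(x, supra, dt):
    """One explicit Euler step of dx/dt = -L x."""
    return [
        xi - dt * sum(row[j] * x[j] for j in range(len(x)))
        for xi, row in zip(x, supra)
    ]


# Toy layers: a directed 3-cycle and an undirected 3-node path.
layer1 = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
layer2 = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

supra = supra_laplacian(layer1, layer2, coupling=0.5)

# Start with all of the quantity on one node and let it spread.
state = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
for _ in range(200):
    state = euler_step(state, supra, dt=0.05)
print([round(v, 3) for v in state])
```

Varying `coupling` and comparing how quickly `state` flattens gives a rough feel for the dependence on coupling strength that the abstract describes; a careful analysis would look at the spectrum of the supra-Laplacian instead.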

The Approach Towards Equilibrium in a Reversible Ising Dynamics Model: An Information-Theoretic Analysis Based on an Exact Solution

We study the approach towards equilibrium in a dynamic Ising model, the Q2R cellular automaton, with microscopic reversibility and conserved energy for an infinite one-dimensional system. Starting from a low-entropy state with positive magnetisation, we investigate how the system approaches equilibrium characteristics given by statistical mechanics. We show that the magnetisation converges to zero exponentially. The reversibility of the dynamics implies that the entropy density of the microstates is conserved in the time evolution. Still, it appears as if equilibrium, with a higher entropy density, is approached. In order to understand this process, we solve the dynamics by formally proving how the information-theoretic characteristics of the microstates develop over time. With this approach we can show that an estimate of the entropy density based on finite-length statistics within microstates converges to the equilibrium entropy density. The process behind this apparent entropy increase is a dissipation of correlation information over increasing distances. It is shown that the average information-theoretic correlation length increases linearly in time, being equivalent to a corresponding increase in excess entropy.

Kristian Lindgren and Eckehard Olbrich

Journal of Statistical Physics 168(4), 919-935 (2017).

Source: link.springer.com
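
The reversibility and energy conservation of the Q2R automaton described in the abstract above can be checked numerically with a short sketch. It assumes a finite ring with an even number of sites (the paper treats an infinite system), the usual alternating update of even and odd sites, and a random initial state biased towards positive magnetisation; the function names and parameters are illustrative only.

```python
import random


def q2r_half_step(spins, parity):
    """Update the sites with the given parity on a periodic ring.

    A spin flips only if its two neighbours sum to zero, which leaves the
    energy unchanged. The ring length must be even.
    """
    n = len(spins)
    new = list(spins)
    for i in range(parity, n, 2):
        if spins[i - 1] + spins[(i + 1) % n] == 0:
            new[i] = -spins[i]
    return new


def energy(spins):
    """Nearest-neighbour Ising energy on the ring, with unit coupling."""
    n = len(spins)
    return -sum(spins[i] * spins[(i + 1) % n] for i in range(n))


def magnetisation(spins):
    return sum(spins) / len(spins)


def run(spins, steps):
    """Apply `steps` half-steps, alternating even and odd sites."""
    for t in range(steps):
        spins = q2r_half_step(spins, t % 2)
    return spins


rng = random.Random(0)
# Biased initial state: each spin is +1 with probability 0.8 (an assumption).
spins = [1 if rng.random() < 0.8 else -1 for _ in range(200)]

steps = 50
final = run(spins, steps)
print(magnetisation(spins), magnetisation(final))
print(energy(spins), energy(final))
```

Because each half-step is its own inverse, applying the same half-steps in reverse order recovers the initial configuration exactly, which is the microscopic reversibility the abstract refers to.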

How Do People Differ? A Social Media Approach

Research from a variety of fields, including psychology and linguistics, has found correlations and patterns in personal attributes and behavior, but efforts to understand the broader heterogeneity in human behavior have not yet integrated these approaches and perspectives with a cohesive methodology. Here we extract patterns in behavior and relate those patterns together in a high-dimensional picture. We use dimension reduction to analyze word usage in text data from the online discussion platform Reddit. We find that pronouns can be used to characterize the space of the two most prominent dimensions that capture the greatest differences in word usage, even though pronouns were not included in the determination of those dimensions. These patterns overlap with patterns of topics of discussion to reveal relationships between pronouns and topics that can describe the user population. This analysis corroborates findings from past research that has identified word use differences across populations and synthesizes them relative to one another. We believe this is a step toward understanding how differences between people are related to each other.

Vincent Wong, Yaneer Bar-Yam, How do people differ? A social media approach, New England Complex Systems Institute (July 1, 2017).

Source: www.necsi.edu

An Immune System Inspired Theory for Crime and Violence in Cities

Crime is ubiquitous and has been around for millennia. Crime is analogous to a pathogenic infection and police response to it is similar to an immune response. The biological immune system is also engaged in an arms race with pathogens. We propose an immune system inspired theory of crime and violence in human societies, especially in large agglomerations like cities. In this work we suggest that an immune system inspired theory of crime can provide a new perspective on the dynamics of violence in societies. The competitive dynamics between police and criminals have similarities to how the immune system is involved in an arms race with invading pathogens. Cities have properties similar to biological organisms and in this theory the police and military forces would be the immune system that protects against detrimental internal and external forces. Our theory has implications for public policy, ranging from how much financial resource to invest in crime fighting to optimal policing strategies, pre-placement of police, and the number of police to be allocated to different cities. Our work can also be applied to other forms of violence in human societies (like terrorism) and violence in other primate societies and eusocial insects. We hope this will be the first step towards a quantitative theory of violence and conflict in human societies. Ultimately we hope that this will help in designing smart and efficient cities that can scale and be sustainable despite population increase.

Soumya Banerjee
INDECS 15(2), 133-143, 2017
DOI 10.7906/indecs.15.2.2

Source: indecs.eu