Soheil Feizi, Oct 2015
- Visitor: Soheil Feizi (MIT)
- Date: October 19th (Mon)
- 9:15 am - Yutaka Hori (218 ANB; ex 3552)
- 10:15 am - Enrique Mallada (ANB 338)
- 11 am - Vipul Singhal (Keck 139)
- 11:45 am - seminar set up
- 12-1:15 pm - seminar (121 ANB)
- 1:30 pm -
- 2:15 pm - Anandh Swaminathan (Steele 103)
- 3 pm - Enoch Yeung (Keck 139)
- 4 pm - Richard Murray (Steele 109)
- 4:45 pm - Done
Title: Learning (from) networks: fundamental limits, algorithms, and applications
Abstract: Network models provide a unifying framework for understanding dependencies among variables in medical, biological, and other sciences. Networks can be used to reveal underlying data structures, infer functional modules, and facilitate experiment design. In practice, however, size, uncertainty and complexity of the underlying associations render these applications challenging.
In this talk, we illustrate the use of spectral, combinatorial, and statistical inference techniques in several significant network science problems. First, we consider the problem of network alignment where the goal is to find a bijective mapping between nodes of two networks to maximize their overlapping edges while minimizing mismatches. To solve this combinatorial problem, we present a new scalable spectral algorithm, and establish its efficiency theoretically and experimentally over several synthetic and real networks. Next, we introduce network maximal correlation (NMC) as an essential measure to capture nonlinear associations in networks. We characterize NMC using geometric properties of Hilbert spaces and illustrate its application in learning network topology when variables have unknown nonlinear dependencies. Finally, we discuss the problem of learning low dimensional structures (such as clusters) in large networks, where we introduce logistic Random Dot Product Graphs, a new class of networks which includes most stochastic block models as well as other low dimensional structures. Using this model, we propose a spectral network clustering algorithm that possesses robust performance under different clustering setups. In all of these problems, we examine underlying fundamental limits and present efficient algorithms for solving them. We also highlight applications of the proposed algorithms to data-driven problems such as functional and regulatory genomics of human diseases, and cancer.