Interpretable machine learning
The impact of machine learning prediction models has created a growing need for us to understand why they make their predictions. Interpreting these models is important for revealing their fundamental behavior, obtaining scientific insights into data, and helping us trust automatic predictions. In our recent work, we advance these directions through the problem of feature interaction discovery. We develop a way to interpret the feature interactions in feedforward neural networks by tracing their learned weights. We follow up on this method and develop a way of learning transparent neural networks. Lastly, we investigate applications of this work to interpreting black-box models beyond feedforward neural networks, such as image and text classifiers and recommender systems. The physical meaning and practical importance of our interpretations are demonstrated throughout.
ICLR 2018; NeurIPS 2018
Deep learning for healthcare
Deep learning is reshaping medicine and healthcare. AI researchers are making steady progress in helping to combat deadly diseases and make assistive care more affordable. In our recent work, we surveyed applications of deep learning for the early prediction of prostate cancer, as well as for efficient and confident estimates of time to recovery. To bridge the gap between potential and actual efficient, high-quality care, we demonstrate our algorithms on real robot-assisted radical prostatectomy data. We have also developed models that distill knowledge from black-box deep learning models into interpretable tree-based models. Working at the frontier of deep learning for healthcare, we have also patented technology that sets our work apart.
BJUI 2019; J UROL 2019; Sci Rep 2018; J. Biomed. Inform. 2018; J Endourol 2018; ICDMW 2017
Social network analysis
We focus on information diffusion on social media and its impact on misinformation propagation, polarization, and the discovery of topics and communities. We developed latent variable models that uncover topics and author communities from blog data, as well as transfer learning algorithms for sentiment prediction. We also analyze social networks and topic flows in posts from a corporate discussion forum to examine the factors that lead to innovative ideas. We developed methods for early detection of fake news by learning generative models of social media posts. We are further investigating how diffusion analysis can facilitate fake news mitigation efforts.
ICML 2003; ECML 2003; CIKM 2002; IJCAI 2007; ICML 2009; CIKM 2009; KDD SNA Workshop 2007; WSDM 2013; VLDB 2013; IJCAI 2018; ACM TIST 2019; KDD 2021; ICWSM 2021; NeurIPS 2021
Physics-informed machine learning
Machine learning methods have had great success in learning complex representations of data, enabling novel modeling and data processing approaches in many scientific disciplines. In particular, physics-related knowledge can be integrated with data-driven models for efficient learning in physical domains. Under this view, we develop several methods that use prior physical knowledge to learn spatiotemporal dynamics in settings such as climate modeling, traffic prediction, and physically simulated data. First, we propose a novel solution that automatically infers the data quality levels of different sources from local variations of spatiotemporal signals, without explicit labels. Second, we propose an architecture that incorporates implicit physics knowledge provided by domain experts by injecting it into the latent space. Third, we exploit neighboring information to learn finite differences on graph-based observations, inspired by physics equations.
NeurIPS Workshop 2017; ICLR 2018; ICLR Workshop 2019; ICLR 2020
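To make the idea of finite differences on graph-based observations concrete, the sketch below applies fixed (not learned) difference operators to values stored on graph nodes. It is only an illustration of the underlying discrete operators, not the learned architecture from the cited work; the function names `edge_differences` and `graph_laplacian_apply` and the path-graph example are assumptions introduced here.

```python
import numpy as np

def edge_differences(values, edges):
    """First-order difference along each edge: f[dst] - f[src]."""
    src, dst = edges[:, 0], edges[:, 1]
    return values[dst] - values[src]

def graph_laplacian_apply(values, edges):
    """For each node i, sum (f[j] - f[i]) over its neighbors j.

    Edges are undirected and listed once as (src, dst) pairs.
    """
    diffs = edge_differences(values, edges)
    out = np.zeros_like(values, dtype=float)
    np.add.at(out, edges[:, 0], diffs)    # src node receives f[dst] - f[src]
    np.add.at(out, edges[:, 1], -diffs)   # dst node receives f[src] - f[dst]
    return out

# Hypothetical example: a path graph 0-1-2-3-4 carrying f(x) = x^2.
path_edges = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])
node_values = np.array([0.0, 1.0, 4.0, 9.0, 16.0])

first_diff = edge_differences(node_values, path_edges)
second_diff = graph_laplacian_apply(node_values, path_edges)
```

On a path graph, the interior entries of `second_diff` coincide with the classical second-order stencil f[i-1] - 2 f[i] + f[i+1], which is why graph operators of this kind are a natural starting point for physics-inspired models on irregular sensor networks.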
Learning in multiplayer games
Modeling a domain as a multi-player game is useful in many fields, such as surveillance, security, finance, and, of course, multi-player video games. Recent work has demonstrated beyond-human-level play in poker, Go, chess, DotA, and StarCraft using multi-agent reinforcement learning techniques. A key game-theoretic concept that allows computing Nash equilibrium strategies in multi-player games is fictitious play. In our recent work, we have introduced both on-policy and off-policy algorithms based on fictitious play and reinforcement learning to effectively compute Nash equilibrium strategies for spatial coverage and security domains, e.g., to protect precious resources such as national parks and conserved forests. We continue to work on modeling actions and probability densities in continuous action spaces and on developing efficient techniques to compute optimal strategies for players in multi-player games.
IJCAI 2017; AAAI 2018; AAMAS 2019; GameSec 2019
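As a minimal illustration of fictitious play, the sketch below runs the classical version on a two-player zero-sum matrix game: each player repeatedly best-responds to the opponent's empirical action frequencies. This is the textbook procedure, not the on-policy or off-policy reinforcement learning algorithms from the cited papers; the function name `fictitious_play` and the rock-paper-scissors payoff matrix are assumptions chosen for the example.

```python
import numpy as np

def fictitious_play(payoff, n_rounds=20000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player gets -payoff[i, j].
    Returns the empirical action frequencies of the row and column players.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    # Arbitrary first actions so that both empirical mixtures are defined.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(n_rounds):
        # Best response to the opponent's empirical mixture so far
        # (unnormalized counts give the same argmax/argmin).
        row_action = np.argmax(payoff @ col_counts)
        col_action = np.argmin(row_counts @ payoff)
        row_counts[row_action] += 1
        col_counts[col_action] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Rock-paper-scissors from the row player's point of view.
rps = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])
row_strategy, col_strategy = fictitious_play(rps)
```

In zero-sum games the empirical frequencies are known to converge to a Nash equilibrium, so both returned strategies should approach the uniform mixture here, although convergence is slow.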
Generative models for time series
Generative models of time series aim to reveal the underlying data-generating mechanism from observations. Such models power many applications, e.g., prediction, anomaly detection, and simulation, and are widely used in many domains, including healthcare, wearables, finance, and manufacturing. We build large-scale generative models for time series and address the challenges of non-stationarity, mixed sampling rates, real-time streaming analysis, and many other real-world scenarios.
ICML 2018; AAAI 2020
Temporal causal modeling
We have introduced and continue to develop a collection of techniques to discover and model causal relationships in multivariate time series data, based on Granger causality. Granger causality is an operational definition of causality created by the Nobel Prize winner Clive Granger and used extensively in econometrics. We have enhanced it with sparse structure learning algorithms for graphical models via penalized regression and developed several other models to capture the dynamic properties, spatial constraints, non-linearity properties, and relational information in time-series data.
KDD 2007; KDD 2009(a); KDD 2009(b); KDD 2009(c); ICML 2010; AAAI 2010; SDM 2012; ICML 2012; SDM 2013; KDD 2013
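The combination of Granger causality with penalized regression described above can be sketched as follows: each variable is regressed on the lagged values of all variables with an L1 penalty, and a nonzero coefficient group is read as a candidate causal edge. This is a simplified illustration under stated assumptions, not the implementation from the cited papers; the helper names (`lagged_design`, `lasso_ista`, `granger_graph`), the proximal-gradient solver, the penalty value, and the synthetic data are all introduced here for the example.

```python
import numpy as np

def lagged_design(series, max_lag):
    """Build lagged features. series has shape (T, d).

    Returns X of shape (T - max_lag, max_lag * d), ordered lag by lag,
    and the aligned targets Y of shape (T - max_lag, d).
    """
    T = series.shape[0]
    X = np.hstack([series[max_lag - k : T - k] for k in range(1, max_lag + 1)])
    return X, series[max_lag:]

def lasso_ista(X, y, alpha, n_iter=500):
    """Minimize (1 / 2n) * ||y - X w||^2 + alpha * ||w||_1 by proximal gradient."""
    n = X.shape[0]
    step = n / np.linalg.norm(X, 2) ** 2  # inverse Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = w - step * (X.T @ (X @ w - y)) / n
        w = np.sign(z) * np.maximum(np.abs(z) - step * alpha, 0.0)  # soft threshold
    return w

def granger_graph(series, max_lag=2, alpha=0.2):
    """adjacency[j, i] is True when lags of variable j help predict variable i."""
    d = series.shape[1]
    X, Y = lagged_design(series, max_lag)
    adjacency = np.zeros((d, d), dtype=bool)
    for target in range(d):
        w = lasso_ista(X, Y[:, target], alpha)
        # Sum absolute coefficients over lags for each source variable.
        adjacency[:, target] = np.abs(w.reshape(max_lag, d)).sum(axis=0) > 0
    return adjacency

# Synthetic data (an assumption for illustration): variable 0 drives variable 1
# with a one-step delay; variable 2 is independent noise.
rng = np.random.default_rng(0)
noise = rng.normal(size=(500, 3))
data = noise.copy()
data[1:, 1] = 0.8 * data[:-1, 0] + 0.3 * noise[1:, 1]
graph = granger_graph(data)
```

The L1 penalty is what produces a sparse graph: weakly predictive lagged variables receive exactly zero coefficients, so only edges such as 0 → 1 in the synthetic example survive.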
Autonomous self-driving vehicles
Understanding human driving behavior is a key ingredient of intelligent transportation systems. Whether we are developing self-driving cars that drive like humans or building V2X systems to improve the human driving experience, we need to understand how humans drive and interact with their environments. The massive human driving data collected by top ride-sharing platforms and fleet management companies offers the potential for an in-depth understanding of human driving behavior. We build real-time driving behavior understanding systems that work with front-view videos and GPS/IMU signals collected from daily driving scenarios. The analysis procedure is designed to mimic human intelligence for driving, powered by the representation capability of deep neural networks as well as recent advances in visual perception, video temporal segmentation, attention mechanisms, etc.
ICCV Workshop 2019
Learning with structured data
We have developed graphical models and graph-based algorithms to capture the dependencies or constraints between input/output variables in structured data. We introduced segmentation conditional random field models (SCRF), which infer the segmentation of sequences by modeling the dependency information between segments, and kernel conditional random fields (kernel CRF), which permit the use of implicit feature spaces through Mercer kernels in conditional random fields. We also relaxed the common assumption that the output space has to be fixed beforehand and developed reversible jump MCMC algorithms for fast inference.
RECOMB 2005; ICML 2005; IJCAI 2007; NIPS 2011
Learning with less supervision
We have developed transfer learning algorithms to solve target tasks by leveraging abundant data from related source tasks. We are especially interested in the most challenging cases, where observations in the target domain are short, unstructured texts with very limited information. We have also explored the potent combination of transfer learning and active learning, where learners query labels of target examples to improve the quality of the transfer.
CIKM 2009; KDD 2011(b,c); AAAI 2011; ICDM 2012; ICDM 2013
Anomaly detection
We are investigating graph-based models to capture the dependency between features or instances for more effective anomaly detection. We developed a graph-based algorithm that makes use of a global similarity matrix motivated by manifold ranking, which results in more compact clusters for the minority classes.
ICDM 2008; SDM 2009; KDD 2009; ICDM 2012
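For readers unfamiliar with manifold ranking, the sketch below builds the kind of global similarity matrix it relies on: a Gaussian affinity matrix W is symmetrically normalized to S = D^(-1/2) W D^(-1/2), and pairwise similarities are propagated through the graph via (I - alpha S)^(-1). This shows only the standard manifold-ranking construction, not the anomaly detection algorithm from the cited papers; the function name `global_similarity`, the parameter values, and the toy points are assumptions for illustration.

```python
import numpy as np

def global_similarity(points, sigma=1.0, alpha=0.9):
    """Manifold-ranking style global similarity matrix (I - alpha * S)^(-1)."""
    sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))   # Gaussian affinities
    np.fill_diagonal(W, 0.0)                     # no self-loops
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    # With 0 < alpha < 1 the matrix is invertible because S has eigenvalues in [-1, 1].
    return np.linalg.inv(np.eye(len(points)) - alpha * S)

# Two small, well-separated groups of 2-D points (hypothetical data).
toy_points = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
                       [5.0, 5.0], [5.5, 5.0], [5.0, 5.5]])
similarity = global_similarity(toy_points)

# Ranking scores of every point with respect to point 0.
scores_for_first = similarity[:, 0]
```

Because similarity is propagated along paths in the graph rather than computed from direct distances alone, points in the same group as the query receive much higher scores than points in the other group, which is the property that encourages compact clusters.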
Latent models for text and multimedia analysis
We have developed discriminative latent graphical models to uncover the hidden semantics of text and video data by leveraging labels or dependency information between instances. We developed topic-link LDA models to uncover both the topics and community structure by leveraging link information between documents. We also developed the discriminative harmonium model to better uncover the latent topics in video data by leveraging label information.
ICML 2009; SDM 2007; KDD 2012; ICML 2012
Climate science
Global warming is one of the most critical socio-technological issues facing humanity in the 21st century. We are working toward solutions by better understanding and quantifying the causal effects of climate and climate-forcing agents. We have developed a data-centric approach, namely temporal causal models and extensions of these models (group lasso, graph Laplacian, and hidden Markov random fields), that captures the spatial and temporal information in climate data.
KDD 2007; KDD 2009(a); KDD 2009(b); ICML 2010; AAAI 2010; ICML 2012; KDD 2013