University of Otago, Computer and Information Science Seminars


Genevera Allen, Rice University


Two talks: Convex Biclustering with Applications to Cancer Subtype Discovery
Within Group Variable Selection via the Exclusive Lasso


Owheo G34 - 2:00 pm, Wednesday 23 March


Genevera will present two short talks of about 30 minutes each, with a Q&A session to follow.

1: Convex Biclustering with Applications to Cancer Subtype Discovery

Discovering molecular cancer subtypes is a critical step towards developing personalized therapies for cancer treatment. Cancer subtypes consist of groups of patients that share distinct genomic signatures and also exhibit differing clinical outcomes. Finding these subtypes from high-throughput genomics data can be framed as a biclustering problem in which we simultaneously seek clusters of patients and clusters of genomic biomarkers indicative of these patient groups. In this talk, we present a convex formulation of the biclustering problem that possesses a unique global minimizer, together with an iterative algorithm, COBRA, that is guaranteed to identify it. Our approach generates an entire solution path of possible biclusters as a single tuning parameter is varied. We also show how to select this tuning parameter via validation, which reduces to solving a trivial modification of our convex biclustering problem. The key contributions of our work are its simplicity, interpretability, and algorithmic guarantees, features that are arguably lacking in current alternative algorithms. We apply this method to discover ovarian cancer subtypes and also discuss extensions of this approach to subtype discovery by integrating data from multiple genomic platforms.
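
As a rough illustration of what a convex biclustering objective can look like, the sketch below evaluates a squared-error fit term plus a penalty on pairwise differences between rows (biomarkers) and between columns (patients) of an estimate U. The function name, the uniform pairwise weights, and the exact form of the penalty are assumptions made for illustration; they are not stated in the abstract, and the formulation and COBRA algorithm presented in the talk may differ in detail.

```python
import itertools

import numpy as np


def biclustering_objective(X, U, gamma):
    """Hypothetical objective (an assumption, not taken from the abstract):
    0.5 * ||X - U||_F^2 + gamma * (row fusion penalty + column fusion penalty),
    with uniform weights on every pair of rows and every pair of columns."""
    X = np.asarray(X, dtype=float)
    U = np.asarray(U, dtype=float)
    n_rows, n_cols = U.shape

    # Data-fit term: squared Frobenius distance between the data and the estimate.
    fit = 0.5 * np.sum((X - U) ** 2)

    # Sum of Euclidean distances between every pair of rows of U.
    row_pen = sum(
        np.linalg.norm(U[i] - U[j])
        for i, j in itertools.combinations(range(n_rows), 2)
    )

    # Sum of Euclidean distances between every pair of columns of U.
    col_pen = sum(
        np.linalg.norm(U[:, i] - U[:, j])
        for i, j in itertools.combinations(range(n_cols), 2)
    )

    return fit + gamma * (row_pen + col_pen)
```

Under these assumptions, a larger value of the single tuning parameter gamma makes it cheaper to fuse rows and columns of U than to fit X exactly, which is one way to picture the solution path of biclusters mentioned above.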

2: Within Group Variable Selection via the Exclusive Lasso

Many data sets consist of variables with an inherent group structure. The problem of group selection has been well studied, but in this paper, we seek to do the opposite: our goal is to select at least one variable from each pre-defined group in the context of predictive regression modeling. This problem is NP-hard, but we propose the tightest convex relaxation: a composite penalty that is a combination of the L1 and L2 norms. Our so-called Exclusive Lasso method performs structured variable selection by ensuring that at least one variable is selected from each group. We study our method's statistical properties and develop computationally scalable algorithms for fitting the Exclusive Lasso. We study the effectiveness of our method via simulations using NMR spectroscopy data. Here, we use the Exclusive Lasso to select the appropriate chemical shift from a dictionary of possible chemical shifts for each molecule in the biological sample.
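
One common way to write a composite L1/L2 penalty of this kind is to take the L1 norm of the coefficients within each group and then sum the squares of those group-wise norms. The sketch below computes that quantity; the function name, the factor of 0.5, and the integer group labels are assumptions for illustration and are not taken from the abstract.

```python
import numpy as np


def exclusive_lasso_penalty(beta, groups):
    """Hypothetical penalty (an assumption, not taken from the abstract):
    0.5 * sum over groups g of (sum of |beta_j| for j in g)^2."""
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)

    total = 0.0
    for g in np.unique(groups):
        # L1 norm of the coefficients that belong to group g.
        group_l1 = np.sum(np.abs(beta[groups == g]))
        # Squaring the group-wise L1 norm gives an L2-type combination across groups.
        total += group_l1 ** 2
    return 0.5 * total


# Example: two groups of two coefficients each.
penalty = exclusive_lasso_penalty([1.0, -2.0, 0.0, 3.0], [0, 0, 1, 1])
```

In this form the L1 norm inside each group encourages sparsity among that group's variables, while the squared sum across groups discourages setting an entire group to zero, which matches the goal of keeping at least one variable per group.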

Last modified: Wednesday, 23-Mar-2016 13:31:09 NZDT
