Posts | Kemal Eren

Sep 12, 2013

The Bimax Algorithm

Introduction Continuing the series on biclustering algorithms, this post is about BiMax, a method developed to serve as a baseline in a comparison of biclustering algorithms [1]. Because BiMax was developed as a reference method, its objective is intentionally simple: it finds all biclusters consisting entirely of 1s in a binary matrix. Specifically, BiMax enumerates all inclusion-maximal biclusters, which are biclusters of all 1s to which no row or column can be added without introducing 0s.

Aug 18, 2013

Cheng and Church

Introduction The subject of today’s post is a biclustering algorithm commonly referred to by the names of its authors, Yizong Cheng and George Church [1]. It is one of the best-known biclustering algorithms, with over 1,400 citations, because it was the first to apply biclustering to gene microarray data. This algorithm remains popular as a benchmark: almost every new published biclustering algorithm includes a comparison with it. Cheng and Church were interested in finding biclusters with a small mean squared residue, which is a measure of bicluster homogeneity.

Jul 12, 2013

Spectral Biclustering, Part 2

Introduction Continuing the discussion of spectral biclustering methods, this post covers the Spectral Biclustering algorithm (Kluger, et. al., 2003) [1]. This method was originally formulated for clustering tumor profiles collected via RNA microarrays. This section introduces the checkerboard bicluster structure that the algorithm fits. The next section describes the algorithm in detail. Finally, in the last section we will see how it can be used for clustering real microarray data.

Jul 4, 2013

Spectral Biclustering, Part 1

Introduction This is the second entry in the series on biclustering algorithms, this time covering spectral biclustering. This is the first part, focusing on the Spectral Co-Clustering algorithm (Dhillon, 2001) [1]. The next part will focus on a related algorithm, Spectral Biclustering (Kluger et. al., 2003) [2]. To motivate the spectral biclustering problem, let us return to our friend Bob, who hosted the party in the previous article. Bob is thinking about music again, but this time he is writing song lyrics.

Jun 25, 2013

An Introduction to Biclustering

Introduction This is the first in my series of biclustering posts. It serves as an introduction to the concept. The later posts will cover the individual algorithms that I am implementing for scikit-learn. Before talking about biclustering, it is necessary to cover the basics of clustering. Clustering is a fundamental problem in data mining: given a set of objects, group them according to some measure of similarity. This deceptively simple concept has given rise to a wide variety of problems and the algorithms to solve them.

Jun 16, 2013

GSOC 2013

This summer I will be participating in the Google Summer of Code by implementing biclustering algorithms for scikit-learn. As required by the Python Foundation, I will be reporting my progress on this blog.