Sunday, April 19, 2009

Compact representation for "large-scale" computations

Last Friday, I gave a presentation on some really interesting recent work by John Langford et al. in the advanced machine learning seminar at UMass Amherst. I got the high-level idea of this work from John's talk at a recent Machine learning friends lunch seminar, so I went on to explore its details. Although the work addresses issues tangential to the focus of the seminar, it worked out fine, I guess.

I posted my slides at http://vis-www.cs.umass.edu/~vidit/presentations/repLargeScaleComp.pdf.

Please let me know if I misinterpreted/missed something important.

Sunday, March 22, 2009

Infinitely Imbalanced Logistic Regression

Art Owen had an interesting, if not surprising, paper titled Infinitely Imbalanced Logistic Regression in JMLR 8 (2007). In this paper, Owen investigates logistic regression for binary classification tasks where one of the classes has a finite number of samples and the number of samples for the other class approaches infinity, thus creating an infinite imbalance between the two. He shows that although the intercept of the learned logistic regression approaches -infinity, the remaining coefficient vector approaches a non-trivial and useful limit. Furthermore, the minority class affects the coefficient vector ONLY through its empirical mean, so it suffices (within sampling uncertainty, that is) to replace all the samples of the minority class with a single sample at their mean.
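A quick numerical check makes the "only through the mean" claim concrete. This is a minimal sketch of my own, not code from the paper: the standard-normal majority class, the ten minority points, the sample sizes, and the plain Newton solver are all assumptions chosen for illustration. It fits logistic regression twice, once with all minority samples and once with a single sample at their mean, and compares the results.

```python
import numpy as np


def fit_logistic(X, y, n_iter=50):
    """Fit logistic regression by Newton's method; returns (intercept, beta)."""
    A = np.hstack([np.ones((X.shape[0], 1)), X])   # prepend an intercept column
    theta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ theta)))     # predicted P(y = 1)
        grad = A.T @ (y - p)                       # gradient of the log-likelihood
        hess = (A * (p * (1.0 - p))[:, None]).T @ A  # negative Hessian
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < 1e-10:           # converged
            break
    return theta[0], theta[1:]


def demo(n_minority=10, n_majority=50_000, seed=0):
    """Compare the fit on all minority samples with the fit on their mean."""
    rng = np.random.default_rng(seed)
    # Assumed toy data: majority (y = 0) is standard normal in 2-D,
    # minority (y = 1) is a small cluster around (1.0, 0.5).
    X0 = rng.normal(size=(n_majority, 2))
    X1 = rng.normal(loc=[1.0, 0.5], scale=0.5, size=(n_minority, 2))
    x_bar = X1.mean(axis=0, keepdims=True)

    # Fit 1: all minority samples.
    y_full = np.concatenate([np.zeros(n_majority), np.ones(n_minority)])
    full = fit_logistic(np.vstack([X0, X1]), y_full)

    # Fit 2: the minority class replaced by one sample at its mean.
    y_mean = np.append(np.zeros(n_majority), 1.0)
    mean_only = fit_logistic(np.vstack([X0, x_bar]), y_mean)
    return full, mean_only


if __name__ == "__main__":
    (a_full, b_full), (a_mean, b_mean) = demo()
    print("all minority samples:", a_full, b_full)
    print("single mean sample:  ", a_mean, b_mean)
```

With this much imbalance, the two coefficient vectors should nearly coincide while both intercepts are strongly negative. The intercepts themselves differ, since the second fit sees only one positive example.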

Monday, March 16, 2009

Junction tree notes

I needed a brief and precise refresher on junction tree algorithms to review a paper. I found the following lecture notes from Mark Paskin very useful:
http://ai.stanford.edu/~paskin/gm-short-course/lec3.pdf

Tuesday, March 10, 2009

getting cgi to work on Mac

Sorry for this non-vision-ey, non-ML-ey post, but I need to make a note of this to save some time and energy in the future. Who knows, you might find it useful someday.

Steps to follow:
1) httpd -V (to find out SERVER_CONFIG_FILE)

2) edit this file as follows (requires root/super-user access):

(a) insert (somewhere)

<Directory "/Users/*/Sites/cgi-bin">
AllowOverride None
Options ExecCGI
Order allow,deny
Allow from all
</Directory>


(b) Uncomment
AddHandler cgi-script .cgi


You need to turn Web Sharing OFF and then ON again for the changes to take effect.
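To check that the setup works, a tiny test script helps. Here is a minimal sketch in Python; the file name hello.cgi is just an example, and it assumes a python3 interpreter is available on the PATH.

```python
#!/usr/bin/env python3
"""Minimal CGI test script (hypothetical file name: hello.cgi)."""
import os
import sys


def render_response(environ):
    """Build the full CGI response: header, a blank line, then the body."""
    server = environ.get("SERVER_SOFTWARE", "unknown")
    body = "CGI is working.\nSERVER_SOFTWARE = {}\n".format(server)
    # A CGI script must emit at least a Content-Type header,
    # followed by an empty line, before any body text.
    return "Content-Type: text/plain\r\n\r\n" + body


if __name__ == "__main__":
    sys.stdout.write(render_response(os.environ))
```

Save it under ~/Sites/cgi-bin/ with a .cgi extension (to match the AddHandler line), make it executable with chmod +x, and request it through the web server. With the default Web Sharing layout the URL should look like http://localhost/~yourusername/cgi-bin/hello.cgi, but that path is an assumption about your setup.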

Monday, February 02, 2009

SUnS'09

I attended the scene understanding symposium in Boston. There were some very good talks (I missed two sessions that I definitely did not want to miss). I caught only the second half of Jeremy Wolfe's talk on "Search in real scenes: The latest mysteries, the latest clues." It was interesting.

One interesting thing I learned from Michael Paradiso's talk was about some experiments (--Need link--) showing that the regions of the brain involved in the early stages of processing remain active even in the later stages. This suggests that modeling this processing as a strict pipeline may not be useful.

Uri Hasson gave an entertaining talk on estimating temporal responses in the human brain while subjects watched modified clips (sequential, reverse sequential, etc.) of Charlie Chaplin's videos.

I missed David Forsyth's talk. Ce Liu's talk on dense scene alignment was very impressive. He has code and results on the project webpage.

Tuesday, December 30, 2008

NIPS' paper list -- revisited/filtered with one-line detail.

# Modeling the effects of memory on human online sentence processing with particle filters
R. Levy, F. Reali, T. Griffiths
-- incremental, limited-memory model for understanding sentences using particle filters.

# An ideal observer model of infant object perception
C. Kemp, F. Xu
-- perception is guided by the principle of persistence, i.e. things tend to remain the same and mostly follow rigid motion.

# A rational model of preference learning and choice prediction by children
C. Lucas, T. Griffiths, F. Xu, C. Fawcett
-- econometric model for explaining a young child's use of statistical information to infer preferences. very interesting.

----------------------------------------
# Bounds on marginal probability distributions
J. Mooij, H. Kappen
-- bounds on single-variable marginal probability distributions in factor graphs, obtained by propagating bounds (convex sets of probability distributions) over a subtree of the factor graph rooted at the variable of interest. The method bounds the approximate Belief Propagation marginal (the belief) as well.

# Domain Adaptation with Multiple Sources
Y. Mansour, M. Mohri, A. Rostamizadeh
-- a convex combination of multiple source hypotheses can perform poorly. there exists a distribution-weighted combination whose error is no worse than the maximum error over all the sources. (this paper could offer some insight for distributed IR .. maybe that is exactly what this paper is about).

# Beyond Novelty Detection: Incongruent Events, when General and Specific Classifiers Disagree
D. Weinshall, H. Hermansky, A. Zweig, J. Luo, H. Jimison, F. Ohl, M. Pavel
-- HAVE to see. experiments include face recognition for audio+video data.

# Accelerating Bayesian Inference over Nonlinear Differential Equations with Gaussian Processes
B. Calderhead, M. Girolami, N. Lawrence
-- GP regression to accelerate inference.. should look at this in more detail.

# Generative and Discriminative Learning with Unknown Labeling Bias
M. Dudik, S. Phillips
-- entropy-based weighting offers an improvement over constant estimates of class proportions, consistently reducing log loss on unbiased test data... not sure about the generalizability here.. need to look at the full paper.

Saturday, December 20, 2008

NIPS papers by title-- my selection

My selection (solely based on the title) grouped according to my inference about their content. I will look at the abstracts soon (hopefully), and then look at some of these more carefully.

----------
# Modeling the effects of memory on human online sentence processing with particle filters
R. Levy, F. Reali, T. Griffiths
# Analyzing human feature learning as nonparametric Bayesian inference
J. Austerweil, T. Griffiths
# An ideal observer model of infant object perception
C. Kemp, F. Xu
# A rational model of preference learning and choice prediction by children
C. Lucas, T. Griffiths, F. Xu, C. Fawcett

---------------
# One sketch for all: Theory and Application of Conditional Random Sampling
P. Li, K. Church, T. Hastie
# Bounds on marginal probability distributions
J. Mooij, H. Kappen
# Rademacher Complexity Bounds for Non-I.I.D. Processes
M. Mohri, A. Rostamizadeh
# Domain Adaptation with Multiple Sources
Y. Mansour, M. Mohri, A. Rostamizadeh
# Comparing model predictions of response bias and variance in cue combination
R. Natarajan, I. Murray, L. Shams, R. Zemel
# Beyond Novelty Detection: Incongruent Events, when General and Specific Classifiers Disagree
D. Weinshall, H. Hermansky, A. Zweig, J. Luo, H. Jimison, F. Ohl, M. Pavel
# Accelerating Bayesian Inference over Nonlinear Differential Equations with Gaussian Processes
B. Calderhead, M. Girolami, N. Lawrence
# Generative and Discriminative Learning with Unknown Labeling Bias
M. Dudik, S. Phillips

-----------
# Relative Performance Guarantees for Approximate Inference in Latent Dirichlet Allocation
I. Mukherjee, D. Blei
# Learning Taxonomies by Dependence Maximization
M. Blaschko, A. Gretton
# DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification
S. Lacoste-Julien, F. Sha, M. Jordan
# Deflation Methods for Sparse PCA
L. Mackey
# Bayesian Exponential Family PCA
S. Mohamed, K. Heller, Z. Ghahramani

----------
# SDL: Supervised Dictionary Learning
J. Mairal, F. Bach, J. Ponce, G. Sapiro, A. Zisserman
# Cascaded Classification Models: Combining Models for Holistic Scene Understanding
G. Heitz, S. Gould, A. Saxena, D. Koller
# A "Shape Aware" Model for semi-supervised Learning of Objects and its Context
A. Gupta, J. Shi, L. Davis

Richard Hamming's talk on doing top quality research

I stumbled on a transcript of Hamming's talk from 1986, where he talks about doing "Nobel-Prize research" as opposed to "good research". While his concerns about the role of computing appear to be unequivocally addressed in the current state of science, the points he makes about how a good researcher could be more productive and successful will always remain valid. Although the transcript is pretty long and incoherent in parts (IMHO), I found it to be one of the most inspirational, yet practical, talks about doing good research. Lots of big names (Shannon, Feynman, etc.) and related anecdotes/events are mentioned.

My favorite points: open doors, knowledge/productivity (or the lack of either) as compound interest, reading papers to learn about the problems (and not the solutions), appearance of conforming, and (for pretty much the first time by a researcher) acknowledgment of management as a reasonable choice.
 
Learning in Vision