The Field

The field of statistical machine learning is far-reaching in its effect on both other scientific communities and our everyday lives. As a field formed from the intersection of computer science and statistics, it draws heavily upon concepts from algorithms, optimization, probability, and information theory. Through this interdisciplinary pedigree, its science alone is capable of solving the deceptively simple problems that humans untangle without pause. Statistical machine learning seeks classification and inferential engines grounded in solid theoretical understanding. Everyday problems, such as internet search and object recognition, can then be solved through the creative application of these notions. Researchers in fields as diverse as bioinformatics, finance, communications, and artificial intelligence also rely directly on these techniques.

However, this power does not come freely. To achieve high levels of success, learning algorithms may require multiple phases of training on enormous corpora occupying terabytes of storage. Optimization techniques may cast learning problems as linear algebraic operations on matrices with dimensions in the millions. Rigorous mathematical frameworks may require extensive parameter experimentation and adjustment. These problems, and more, require researchers to allocate available resources intelligently and to exploit any structure inherent in the relevant data.

My Focus

My research interests and long-term goals center on statistical machine learning as applied to language modeling and analysis. I find several advantages in exploring machine learning through this application. These include:
  • Its high-dimensional nature makes for exciting theory.
  • There is generally an abundance of accessible training data.
  • Sophisticated language models could revolutionize the manner in which we access data.
Recently I have derived great satisfaction from helping to form a stochastic model under which information retrieval and document classification decisions are possible. It is my hope to explore these notions more deeply, at ever-increasing levels of model sophistication and theoretical understanding.

In the long term, I would like to explore topic extraction at varying levels of resolution. Ideally this effort would be carried out within the framework of graphical models. I would also like to explore and apply data mining techniques.

More generally, I am interested in algorithms, statistics, information theory, and programming languages.