Research in Algorithms

Dimension reduction for scientific applications: Reducing the number of dimensions, that is, the number of features representing a data point, is important in scientific applications to minimize the effect of irrelevant or redundant features in any subsequent analysis. Often many different types of features are extracted for each data point using a range of techniques, and domain information alone may not be sufficient to prune the features to keep only the relevant ones. We investigated filters, wrappers, and several non-linear dimension reduction techniques for their effectiveness in scientific applications ranging from remote sensing to astronomy and plasma physics.
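As a rough illustration of the filter approach mentioned above, the sketch below scores each feature independently of any classifier and keeps the top-ranked ones. The scoring criterion (absolute Pearson correlation with the target) and the names `pearson` and `filter_rank` are assumptions chosen for this example; the text does not say which filter criteria were actually studied.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def filter_rank(rows, target, k):
    """Return the indices of the k features most correlated with the target.

    rows   -- list of data points, each a list of feature values
    target -- one numeric target value per data point
    k      -- number of features to keep
    """
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Hypothetical toy data: feature 0 tracks the target, feature 1 is constant,
# feature 2 is only weakly related.
rows = [[1, 5, 0], [2, 5, 1], [3, 5, 0], [4, 5, 1]]
target = [1, 2, 3, 4]
selected = filter_rank(rows, target, k=2)
```

A filter like this is cheap because it never trains a model. A wrapper would instead evaluate candidate feature subsets by the accuracy of the downstream classifier, at a higher computational cost.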


ASPEN – Approximate splitting for ensembles of decision trees: Ensembles of classifiers, where different classifiers are created from the same data set through randomization, can improve classification accuracy. To reduce the cost of creating multiple classifiers, we considered two ways to randomize the split decision at each node of the tree: use a sub-sample of the instances at the node to identify the best split, or create a histogram, evaluate splits at the mid-point of each bin, and select the split randomly within the bin that contains the best split. A combination of both ideas can further reduce the cost of building the ensemble.
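The two randomization ideas described above can be sketched for a single numeric feature as follows. This is a minimal sketch under stated assumptions, not the ASPEN implementation: the Gini impurity criterion, the equal-width bins, and the names `approximate_split`, `n_bins`, and `sample_size` are all choices made for this example.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity after splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def approximate_split(values, labels, n_bins=8, sample_size=None, rng=None):
    """Pick a randomized split threshold for one feature at a tree node.

    Optionally sub-samples the instances at the node, then evaluates the
    mid-point of each equal-width histogram bin and returns a threshold drawn
    uniformly from the bin whose mid-point scored best.
    Returns None if the feature is constant on the instances considered.
    """
    rng = rng or random.Random()
    indices = list(range(len(values)))
    # Idea 1: use only a sub-sample of the instances at the node.
    if sample_size is not None and sample_size < len(indices):
        indices = rng.sample(indices, sample_size)
    xs = [values[i] for i in indices]
    ys = [labels[i] for i in indices]

    lo, hi = min(xs), max(xs)
    if lo == hi:
        return None
    width = (hi - lo) / n_bins

    # Idea 2: evaluate candidate splits only at the bin mid-points.
    best_bin, best_score = 0, float("inf")
    for b in range(n_bins):
        midpoint = lo + (b + 0.5) * width
        score = split_impurity(xs, ys, midpoint)
        if score < best_score:
            best_bin, best_score = b, score

    # Randomize: draw the final threshold from within the best bin.
    left_edge = lo + best_bin * width
    return rng.uniform(left_edge, left_edge + width)


# Hypothetical toy node: class 0 below 3, class 1 above 6.
values = [0.0, 1.0, 2.0, 3.0, 6.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold = approximate_split(values, labels, n_bins=3, rng=random.Random(0))
```

Calling `approximate_split` with different random generators gives different thresholds within the same best bin, which is what makes the trees in the ensemble differ from one another. Setting `sample_size` combines the two ideas, so the per-node cost depends on the sample size and the number of bins rather than on the full number of instances.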


Tracking moving objects in simulations and video: Detection and tracking of moving objects are important tasks in problems such as activity detection and identification in video sequences. We explored a range of techniques, focusing on how to make them more robust and computationally efficient so we could detect and track a moderate number of vehicles in video from traffic sequences, as well as a large number of non-rigid, coherent structures in spatio-temporal data from simulations.