Newswise — In data analysis, outliers are usually the most interesting information, yet the most common evaluation methods sometimes fail to recognize them because those methods rest on inaccurate assumptions. Now Michael Houle, a senior university lecturer at New Jersey Institute of Technology’s Ying Wu College of Computing, along with collaborators in Australia, Denmark and Serbia, has become something of an outlier himself by developing the math to prove that breaking those assumptions can work better than conventional methods. “Outlier detection, one of the most fundamental tasks in data mining, aims to identify observations that deviate from the general distribution of the data.
Such observations often deserve special attention as they may reveal phenomena of extreme importance, such as network intrusions, sensor failures or disease,” they wrote in an award-winning paper about their new proof, Dimensionality-Aware Outlier Detection, presented at the recent SIAM International Conference on Data Mining (SDM24) in Houston. “Dimensionality is the number of features that you use to describe your data. If you have a 100 pixel by 100 pixel image, which is three colors per pixel, that’s 30,000 features,” Houle explained.
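The arithmetic behind Houle's image example can be checked with a short sketch. It assumes Python with NumPy and uses a blank placeholder image; the variable names are illustrative and do not come from the paper.

```python
import numpy as np

# A hypothetical 100 x 100 image with three color values per pixel
# (height, width, color channels). The pixel values are placeholders.
image = np.zeros((100, 100, 3), dtype=np.uint8)

# Flatten the image into one long vector: one feature per color value.
features = image.reshape(-1)

# The dimensionality is the length of that feature vector.
num_features = features.size
print(num_features)  # 100 * 100 * 3 = 30000
```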
Dimensionality-Aware Outlier Detection was tested on 800 datasets. It uses a mathematical concept called local intrinsic dimensionality. “It is awareness of local variations in dimensionality that makes our method unique,” Houle noted.
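To give a rough sense of what local intrinsic dimensionality measures, the sketch below applies a standard maximum-likelihood estimator from the research literature to synthetic data. It is not the detection method described in the paper. The function name `lid_mle`, the neighborhood size `k` and the test data are all assumptions chosen for illustration.

```python
import numpy as np

def lid_mle(query, data, k=100):
    """Estimate local intrinsic dimensionality around `query`.

    Uses the maximum-likelihood estimator based on the distances
    to the k nearest neighbors. Illustrative only; assumes those
    distances are not all identical.
    """
    # Distances from the query to every data point.
    dists = np.linalg.norm(data - query, axis=1)
    # Keep the k smallest nonzero distances, in ascending order.
    dists = np.sort(dists[dists > 0])[:k]
    r_k = dists[-1]  # distance to the k-th nearest neighbor
    # Negative reciprocal of the mean log distance ratio.
    return -1.0 / np.mean(np.log(dists / r_k))

# Synthetic data: points on a flat 2-D sheet described with 10 features.
rng = np.random.default_rng(0)
sheet = np.zeros((20000, 10))
sheet[:, :2] = rng.uniform(size=(20000, 2))

# Query point in the middle of the sheet.
center = np.zeros(10)
center[:2] = 0.5

print(lid_mle(center, sheet, k=500))
```

Although each point here is described by 10 features, the data only varies along two of them near the query point, so the estimate should come out near 2 rather than 10.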
