An Outlier Analysis on Multi-Dimensional and Time-Series Data
DOI:
https://doi.org/10.5281/zenodo.8071460
Keywords:
Outlier detection, multivariate, time-series, dataset
Abstract
Outlier detection is the identification of unexpected observations in data. Outliers can indicate fraud, hacking, mislabeled data, or unusual system behavior, so it is important to identify them. This study calculates and compares the outlier detection performance of several algorithms on different types of data sets. The algorithms were found to perform adequately, with the histogram-based outlier detection algorithm achieving the highest accuracy at 99%.
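To make the histogram-based approach concrete, the following is a minimal sketch of histogram-based outlier scoring and an accuracy check. It is not the study's implementation: the function names (hbos_scores, accuracy), the equal-width binning, the synthetic data, and the assumption that the number of outliers is known in advance are all illustrative choices.

```python
import math
import random


def hbos_scores(rows, n_bins=10):
    """Histogram-based outlier scores; a higher score means more anomalous.

    For each feature, build an equal-width histogram and add
    log(tallest_bin_count / own_bin_count) to every point's score.
    """
    n_features = len(rows[0])
    scores = [0.0] * len(rows)
    for j in range(n_features):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        width = (hi - lo) / n_bins or 1.0  # guard against constant features
        # Map each value to a bin index; the maximum goes into the last bin.
        bins = [min(int((v - lo) / width), n_bins - 1) for v in column]
        counts = [0] * n_bins
        for b in bins:
            counts[b] += 1
        tallest = max(counts)
        for i, b in enumerate(bins):
            scores[i] += math.log(tallest / counts[b])
    return scores


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic, illustrative data: 200 inliers near the origin, 10 distant outliers.
rng = random.Random(0)
inliers = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
outliers = [[rng.uniform(6, 8), rng.uniform(6, 8)] for _ in range(10)]
data = inliers + outliers
labels = [0] * len(inliers) + [1] * len(outliers)

scores = hbos_scores(data)
# Assumption: the contamination level is known, so the threshold is the
# score of the 10th highest-scoring point (ties may flag a few more).
threshold = sorted(scores, reverse=True)[len(outliers) - 1]
predicted = [1 if s >= threshold else 0 for s in scores]
acc = accuracy(labels, predicted)
print(f"accuracy on synthetic data: {acc:.3f}")
```

Because each feature is scored independently, this approach is cheap to compute but ignores dependencies between features. Accuracy can also look high simply because outliers are rare, so it is usually worth checking against a metric that accounts for class imbalance.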
License
Copyright (c) 2022 AINTELIA Science Notes Journal

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.