Comparative Evaluation of Academic Performance in Waikato Environment for Knowledge Analysis Using Multiple Classification Algorithms

Abstract

Data mining (DM) is the process of applying algorithms to large databases with the aim of discovering knowledge that helps the management of academic institutions make informed decisions (Chalurapruek, S., et al., 2018). This paper seeks to identify the best classifiers for educational data when using the Waikato Environment for Knowledge Analysis (WEKA). The variables of importance, namely carry-over, marital status, age range, entry mode and accommodation location, were selected by the J48 classifier. Four sampled datasets from four schools/faculties of Modibbo Adama University of Technology (MAUTECH), Yola, Nigeria, were used for the DM task: School of Physical Sciences (SPS), School of Environmental Studies (SES), School of Technology and Science Education (STSE), and School of Agriculture and Agricultural Technology (SAAT). All the classifiers available in the WEKA suite were applied independently to the four datasets. Notably, J48, NaiveBayes, Logistics and Regression performed better than the rest. In the comparative analysis, the Regression model had the overall best performance, with 98.366 %, 99.3197 %, 96.3964 % and 96.875 % on the four datasets respectively. The average performance of these four classifiers across the four datasets was 97.5446 %, 94.6954 %, 95.670125 %, and 97.739275 % respectively.
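The reported averages can be checked by hand. The minimal sketch below recomputes the mean for the Regression model from the four per-dataset figures quoted in the abstract. The pairing of each figure with a school (SPS, SES, STSE, SAAT) is an assumption based on the order in which the schools are listed, and the names `regression_accuracy` and `mean_accuracy` are illustrative only.

```python
# Regression accuracies (%) on the four datasets, as quoted in the abstract.
# Assumption: values follow the order in which the schools are listed
# (SPS, SES, STSE, SAAT); the abstract does not state the pairing explicitly.
regression_accuracy = {
    "SPS": 98.366,
    "SES": 99.3197,
    "STSE": 96.3964,
    "SAAT": 96.875,
}


def mean_accuracy(scores):
    """Return the arithmetic mean of the accuracy values in a dict."""
    return sum(scores.values()) / len(scores)


average = mean_accuracy(regression_accuracy)
print(f"{average:.6f} %")
```

Rounded to six decimal places, the result agrees with the 97.739275 % given as the last of the four averages, which confirms that the final average belongs to the Regression model.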

Country: Nigeria

1 Asabe Sandra Ahmadu PhD, 2 Etemi Joshua Garba PhD, 3 Ally Dauda Ahmadu

  1. Department of Computer Science, Modibbo Adama University Yola, Nigeria
  2. Department of Computer Science, Modibbo Adama University Yola, Nigeria
  3. ICT Centre, Federal University Wukari, Nigeria

IRJIET, Volume 5, Issue 7, July 2021 pp. 73-80

doi.org/10.47001/IRJIET/2021.507013

References

  1. Aftarczuk, K. (2007). Evaluation of selected data mining algorithms implemented in Medical Decision Support Systems. Blekinge: Blekinge Institute of Technology School of Engineering.
  2. Al-Radaideh, Q. A. and Nagi, E. A. (2012). Using Data Mining Techniques to Build a Classification Model for Predicting Employees Performance. (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 3, No. 2. www.ijacsa.thesai.org
  3. Aziz, A. A., Ismail, N. H., & Ahmad, F. (2014). Proceeding of the International Conference on Artificial Intelligence and Computer Science (AICS 2014), Bandung, Indonesia (e-ISBN 978-967-11768-8-7). Organized by http://WorldConferences.net Retrieved on 15/03/2016.
  4. Cessie, S. L. and Houwelingen, J. C. V. (1992). Ridge Estimators in Logistic Regression. University of Leiden, Netherlands. Royal Statistical Society, 41, No. 1, pp. 191-201, www.inf.unibz.it/dis/teaching/DWDM/project2010 Retrieved on 07/09/2017.
  5. El-Halees, A (2009) “Mining Students Data to Analyze E-Learning Behavior: A Case Study”.
  6. Kotsiantis, S., Pierrakeas, C. and Pintelas, P. (2004). Prediction of Student’s Performance in Distance Learning Using Machine Learning Techniques, Applied Artificial Intelligence, 18(5), 411-426.
  7. Kumar, V. and Chadha, A. (2011). An Empirical Study of the Applications of Data Mining Techniques in Higher Education. (IJACSA) International Journal of Advanced Computer Science and Applications, http://ijacsa.thesai.org Retrieved: 10/06/2014.
  8. Landwehr, N. (2003). Logistic Model Trees. www.cs.waikato.ac.nz/ml/publications/2003/landwehr-etal.ps
  9. Michael J. A. B. and Gordon S. L. (2004), Data Mining Techniques, 2nd ed., Wiley Publishing Inc., USA, www.cs.waikato.ac.nz/ml/weka Retrieved on 15/03/2016.
  10. Osmanbegović, E., & Suljić, M. (2012). Data Mining Approach For Predicting Student Performance. Economic Review – Journal of Economics and Business, X(1), 3-12.
  11. Pal, A. K. and Pal, S. (2013). Classification Model of Prediction for Placement of Students, I.J. Modern Education and Computer Science, pp. 49-56. Published Online November 2013 in MECS http://www.mecs-press.org Retrieved on 22/09/2017.
  12. Philip, C. and Pedro, M. (1999). On the use of support vector machines for phonetic classification. Acoustics, Speech, and Signal Processing. 1999 IEEE International Conference. (2).
  13. Romero, C. and Ventura, S. (2007) ‘Educational data Mining: A Survey from 1995 to 2005’, Expert Systems with Applications (33), pp. 135-146.
  14. Romero, C., Ventura, S. and Garcia, E. (2008) ‘Data mining in course management systems: Moodle case study and tutorial’, Computers & Education, vol. 51, no. 1, pp. 368-384. Software.ucv.ro/~cmihaescu/ro/teaching/AIR/docs/Lab4-NaiveBaye Retrieved on 07/09/2017.
  15. Sudha, M. and Kumaravel, A. (2017). Comparative Analysis between Rough set theory and Data mining algorithms on their prediction. Global Journal of Pure and Applied Mathematics. ISSN 0973-1768 Volume 13, Number 7 (2017), pp. 3249-3260. © Research India Publications http://www.ripublication.com Retrieved: 11/09/2017.
  16. uros@krcadinac.com. URL: http://krcadinac.com... (2015) Naive Bayes classifier. ai.fon.bg.ac.rs/wp-content/uploads/2015/04/Classification-Naive-Bayes-2015.pdf Retrieved on 07/09/2017.
  17. Witten, I. H. and Frank, E. (2005). “Data Mining: Practical Machine Learning Tools and Techniques”, Second Edition, Morgan Kaufmann Publishers, an imprint of Elsevier, 500 Sansome Street, Suite 400, San Francisco, CA 94111, pp. 267-320, www.cs.waikato.ac.nz/ml/weka Retrieved on 15/03/2016.
  18. Yadav, S. K. and Pal, S. (2012). Data Mining: A Prediction for Performance Improvement of Engineering Students using Classification. World of Computer Science and Information Technology Journal (WCSIT) ISSN: 2221-0741 Vol. 2, No. 2, 51-56, 2012, https://www.researchgate Retrieved on 15/03/2016.
  19. Zou, K. H., Tuncali, K., and Silverman, S. G. (2003) Statistical Concepts Series in Radiology, Published online 10.1148/radiol.2273011499 Radiology 2003; 227:617–628 Vol.(3) pp 617-622, https://www.coursehero Retrieved on 07/09/2017.
  20. Chalurapruek S., Dee T. S., Johari R., Kizilcec R. F. and Steven M. L. (2018). How a Data Driven Course Planning Tool Affects College Students' GPA: Evidence from Two Field Experiments. In Proceedings of the Fifth Annual ACM Conference on Learning at Scale, Association for Computing Machinery, https://doi.org/10.1145/3231644.3231668
  21. Naïve Bayes Classifier is an approach that adopts the Bayes theorem by combining previous knowledge with new knowledge. iJES ‒ Vol. 7, No. 2, 2019.
  22. The Naive Bayes algorithm is a simple probabilistic classifier. It calculates a set of probabilities by counting the frequency and combination of values in a particular data set (Patil, T. R. (2013) “Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification,” Int. J. Comput. Sci. Appl. ISSN 0974-1011, vol. 6, no. 2, pp. 256–261).