Lifelong Learning in Neural Networks: Techniques, Challenges, and Applications

Abstract

Lifelong Machine Learning (LML) extends neural networks with the ability to learn from sequential data incrementally, in small portions, while continually transferring previously acquired knowledge to new tasks. A persistent problem in LML is catastrophic forgetting, whereby a network loses what it learned on prior tasks once it is trained on new ones. This review examines these challenges in detail and surveys the fundamental neural-network-based approaches for addressing them in lifelong learning systems. Regularization methods such as Elastic Weight Consolidation (EWC) and Learning without Forgetting (LwF) protect prior knowledge by restricting updates to the connectionist parameters that matter most for earlier tasks. Although useful, such strategies must be applied with care, since they demand as much emphasis on preserving previous tasks as on acquiring new ones. Rehearsal methods, including Partitioning Reservoir Sampling (PRS) and Optimizing Class Distribution in Memory (OCDM), instead retain a portion of past data for retraining, which can prove memory-intensive in large-scale applications. Architectural approaches, such as Compacting, Picking, and Growing (CPG), let the network structure grow with each new task, extending existing neurons or layers without disturbing previously stored information; however, they limit scalability because computational complexity increases with the size of the network. Beyond forgetting, handling data imbalance and label shift remains an open problem, particularly when the data distribution changes over time. Lifelong learning in neural networks thus still faces challenges in catastrophic forgetting, scalability, and efficient knowledge transfer, and further research is needed. Progress on these fronts will be crucial for applying neural networks in settings that must learn over time without forgetting what was learned earlier.
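
To make the regularization idea concrete, the following is a minimal sketch of the quadratic EWC penalty in PyTorch. It assumes per-parameter Fisher-information estimates and a snapshot of the parameters taken after the previous task; the helper name `ewc_penalty` and the weighting `lam` are illustrative choices, not part of the original formulation.

```python
# Minimal sketch of the EWC penalty (after Kirkpatrick et al., 2017), assuming
# a PyTorch model. `fisher`, `old_params`, and `lam` are illustrative names.
import torch
import torch.nn as nn

def ewc_penalty(model: nn.Module, fisher: dict, old_params: dict,
                lam: float = 1000.0) -> torch.Tensor:
    """Quadratic penalty discouraging changes to parameters that carried
    high Fisher information (i.e., were important) for earlier tasks."""
    loss = torch.zeros(())
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

if __name__ == "__main__":
    model = nn.Linear(4, 2)
    # Snapshot taken after the previous task; the Fisher estimate here is a
    # dummy placeholder (in practice it is estimated from that task's data).
    old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
    fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}
    x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))
    total = nn.functional.cross_entropy(model(x), y) \
            + ewc_penalty(model, fisher, old_params)
    total.backward()  # gradients trade the new task off against the old one
```

The penalty is simply added to the new task's loss, so gradient descent balances new-task accuracy against movement along parameter directions that were important for earlier tasks.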
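
Likewise, rehearsal methods rest on a bounded memory of past examples. Below is a minimal sketch of such a buffer using classic reservoir sampling (Algorithm R); PRS and OCDM refine this by allocating memory slots per label to counter imbalance, a refinement the sketch does not reproduce. The class and method names are illustrative.

```python
# Minimal rehearsal buffer using classic reservoir sampling (Algorithm R).
# PRS/OCDM partition this memory by label; that refinement is omitted here.
import random

class ReplayBuffer:
    """Fixed-size memory that remains a uniform sample of the stream:
    the n-th item seen is kept with probability capacity / n."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, example) -> None:
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            j = random.randrange(self.seen)  # uniform over all items seen
            if j < self.capacity:
                self.data[j] = example       # evict a stored item at random

    def sample(self, k: int):
        """Draw a mini-batch of stored examples to interleave with new data."""
        return random.sample(self.data, min(k, len(self.data)))
```

During training, mini-batches from the incoming stream would be mixed with `buffer.sample(k)` so the network keeps rehearsing earlier tasks while learning new ones.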

Sura Saad Basher

Northern Technical University, Iraq

IRJIET, Volume 9, Issue 8, August 2025, pp. 12-21

doi.org/10.47001/IRJIET/2025.908003
