International publications
Abstract: Case-base maintenance (CBM) becomes critically important when implementing a computer-aided diagnosis (CAD) system using case-based reasoning (CBR). Because learning must avoid degrading the case base, this work aims to build and maintain a quality case base while overcoming the difficulty of assembling labeled case bases, which are traditionally assumed to exist or are determined by human experts. The proposed approach takes advantage of large volumes of unlabeled data to select valuable cases to add to the case base, while monitoring retention to avoid performance degradation and to keep the case base compact. We use machine learning techniques to meet this challenge: an active semi-supervised learning approach is proposed to overcome the bottleneck caused by the scarcity of labeled data. To acquire a quality case base, we target its performance criterion. Case selection and retention are assessed according to three combined sampling criteria: informativeness, representativeness, and diversity. We support our approach with empirical evaluations on different benchmark data sets. In these experiments, the proposed approach achieves good classification accuracy with a small number of retained cases, using a small training set as a case base. Keywords: machine learning, case-based reasoning, case-base maintenance, semi-supervised learning, active learning
Abstract: The healthcare sector generates a large amount of medical data on a daily basis. Several machine learning (ML) methods have been developed and studied in order to exploit this information in a wide range of practical data mining applications. Yet an essential requirement for developing a competent computer-aided diagnosis (CAD) system is the supervision of data by expert annotators, a labelling process that is challenging because it is both very time-consuming and expensive. This survey paper examines the influence of the semi-supervised learning framework, which addresses the scarcity of supervised data in the development of computer-aided diagnosis systems. The methods used and results obtained are discussed, and key findings are highlighted. Further, in light of this review, some directions for future research are given: we present a proposed approach that uses a semi-supervised technique as the core of learning in a case-based reasoning (CBR) system in a CAD context. Keywords: labelled data, unlabelled data, semi-supervised learning, SSL, active learning, sample selection, computer-aided diagnosis, CAD, case-based reasoning, CBR
International conference papers
Abstract: Medical data presents several hidden challenges compared to conventional datasets. The first challenge encountered, before even considering data processing and analysis, is the presence of missing values. In this paper, we present a theoretical framework for missing-value imputation using the fuzzy c-means clustering algorithm and statistical measures of central tendency, namely the mean, median, and mode. The algorithm of the proposed approach is presented along with the experimental design, which aims to evaluate the performance of the proposed imputation on different medical datasets from the University of California Irvine Machine Learning Repository.
Abstract: In software and knowledge engineering, maintenance is generally defined as an activity that takes place after development of the system is complete and the application has been deployed. The success factor for case-based reasoning (CBR) systems is the quality of their case bases. To guarantee this quality, a maintenance process must be planned, which is how the field of case base maintenance (CBM) emerged. The goal of this paper is to propose a case base maintenance strategy that keeps the case base small, removes irrelevant cases, and retains only valuable cases in order to increase classification accuracy. We propose a case base maintenance approach, C_IRD, that focuses on balancing the efficiency of case retrieval and the competence of the case base by employing a soft clustering technique, fuzzy c-means (FCM). The method maintains case bases with satisfactory accuracy while reducing their size, which in turn reduces retrieval time. Keywords: case-based reasoning, case base maintenance, soft clustering, fuzzy c-means
Abstract: Developing a competent and accurate computer-aided diagnosis (CAD) system to assist medical experts in making diagnoses requires a substantial number of labeled (diagnosed) samples. However, collecting labeled data is challenging because expert annotation is both very time-consuming and expensive. The semi-supervised learning (SSL) framework addresses this problem by exploiting the abundant unlabeled (undiagnosed) data together with the limited labeled data in order to train precise classifiers with less human effort and time. This paper reviews different CAD systems that use SSL for numerous tasks. The methods used and results obtained are discussed, and key findings are highlighted. The paper concludes with a proposed approach for developing a CAD system that applies semi-supervised learning to case classification in order to improve the performance of a case-based reasoning (CBR) system.