      • The ISC International Journal of Information Security
      • Volume 11, Issue 3

      Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining

      Author(s)
      BaniMustafa, Ahmed
      File size:
      1.043 MB
      File type (MIME):
      PDF
      Document type
      Text
      ORIGINAL RESEARCH PAPER
      Document language
      English
      Abstract
      This paper presents a data mining application in metabolomics. It aims to build an enhanced machine learning classifier that can be used to diagnose cachexia syndrome and identify the biomarkers involved. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profiles. This dataset suffers from the problem of imbalanced classes, which is known to degrade the performance of classifiers and to affect their validity and generalizability. The classification models in this study were built using five machine learning algorithms: PLS-DA, MLP, SVM, C4.5 and ID3. The models were built after carrying out a number of intensive data preprocessing procedures to tackle the problem of imbalanced classes and improve the performance of the constructed classifiers. These procedures involve data transformation, normalization, standardization, re-sampling and data reduction using a number of variable importance scorers. The best performance was achieved by an MLP model that was trained and tested using five-fold cross-validation on datasets that were re-sampled using the SMOTE method and then reduced using an SVM variable importance scorer. This model was successful in classifying samples with excellent accuracy and in identifying the potential disease biomarkers. The results confirm the validity of metabolomics data mining for the diagnosis of cachexia. They also emphasize the importance of data preprocessing procedures such as sampling and data reduction for improving data mining results, particularly when data suffers from the problem of imbalanced classes.
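The re-sampling step named in the abstract can be illustrated with a minimal sketch of SMOTE-style oversampling. This is not the paper's implementation: the function names `smote` and `balance`, the toy data, the two-class assumption and the choice to pick base samples at random are all assumptions made for illustration. The idea shown is only the core of the method, namely that each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample the minority class until the two classes are equal in size.
    Assumes exactly two classes (an assumption of this sketch)."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_needed = (len(X) - len(minority)) - len(minority)
    new_points = smote(minority, max(n_needed, 0), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Hypothetical toy data: six majority samples and two minority samples.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.0], [0.2, 0.2],
     [1.0, 1.0], [1.2, 0.9]]
y = ["majority"] * 6 + ["minority"] * 2

X_bal, y_bal = balance(X, y, "minority", k=1)
print(len(X_bal), y_bal.count("minority"))  # 12 6
```

In practice a maintained library implementation would normally be preferred over a hand-written one; the sketch only makes the interpolation step concrete.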
      Keywords
      Data mining
      metabolomics
      cachexia
      preprocessing
      Imbalanced Classes
      Re-sampling
      Data Reduction

      Issue number
      3
      Publication date
      2019-08-01
      1398-05-10
      Publisher
      Iranian Society of Cryptology
      Author's organization
      American University of Madaba

      ISSN
      2008-2045
      2008-3076
      URI
      https://dx.doi.org/10.22042/isecure.2019.11.0.11
      http://www.isecure-journal.com/article_90821.html
      https://iranjournals.nlai.ir/handle/123456789/73449
