Data reduction techniques with missing values tolerance (Master thesis)

Κουκάρας, Πολυχρόνης


Full metadata record
DC Field | Value | Language
dc.contributor.author | Κουκάρας, Πολυχρόνης | el
dc.date.accessioned | 2023-01-25T12:25:09Z | -
dc.date.available | 2023-01-25T12:25:09Z | -
dc.identifier.uri | http://195.251.240.227/jspui/handle/123456789/15604 | -
dc.description | Master's thesis - School of Engineering - Department of Information and Electronic Engineering, 2020 (serial no. 11965) | el
dc.rights | Default License | -
dc.subject | Data Reduction Techniques | en
dc.subject | Missing Values Imputation | en
dc.subject | ERHC | en
dc.subject | Partial distance | en
dc.subject | K-means Clustering | en
dc.subject | Categorization of Neighboring Neighbors | en
dc.subject | Calculation | en
dc.title | Data reduction techniques with missing values tolerance | el
heal.type | masterThesis | -
heal.type.en | Master thesis | en
heal.secondaryTitle | Data reduction techniques with missing values tolerance
heal.generalDescription | Master's thesis | el
heal.identifier.secondary | 11965 | -
heal.dateAvailable | 2023-01-25T12:26:09Z | -
heal.language | el | -
heal.access | free | -
heal.recordProvider | School of Engineering - Department of Information and Electronic Engineering | el
heal.publicationDate | 2020-07-15 | -
heal.bibliographicCitation | Κουκάρας, Π. (2020). Data reduction techniques with missing values tolerance (Master's thesis). International Hellenic University (ΔΙΠΑΕ). | el
heal.abstract | In recent years, large amounts of training data from various sources have become available on a daily basis. Classification algorithms usually cannot use data at this scale because of the high computational cost and memory requirements. The data is therefore often pre-processed with Data Reduction Techniques to reduce both. Many data reduction techniques have been proposed in the literature, most of them targeting the k Nearest Neighbor classifier. However, these techniques cannot handle the missing values that commonly appear in real training data sets. A separate pre-processing step, Missing Values Imputation, must therefore be applied before the data reduction technique. Several imputation methods exist in the literature, and this thesis presents the most important ones. The extra pre-processing step is a major drawback because it adds computational cost, and this is the motivation for this thesis. The thesis proposes a new variant of a data reduction technique that can handle missing values without the additional imputation step. The technique is a Prototype Generation algorithm called Editing and Reduction through Homogeneous Clusters (ERHC). The new ERHC variant handles missing values by using the partial distance technique and by applying a k-means clustering that does not take the missing values into account. In addition, the performance of ERHC has been tested after imputing the missing values with the per-class mean imputation method.
The two ERHC variants are compared with each other and with the k Nearest Neighbor classifier applied without data reduction. The experiments use 13 data sets and measure the classification accuracy and the Reduction Rate achieved by the two ERHC variants. The experimental results show that both ERHC variants perform well. | en
heal.advisorName | Ουγιάρογλου, Στέφανος | el
heal.committeeMemberName | Ουγιάρογλου, Στέφανος | el
heal.committeeMemberName | Διαμαντάρας, Κωνσταντίνος | el
heal.committeeMemberName | Δέρβος, Δημήτριος | el
heal.academicPublisher | School of Engineering - Department of Information and Electronic Engineering
heal.academicPublisherID | ihu | -
heal.numberOfPages | 94 pp. | -
heal.fullTextAvailability | true | -
heal.type.el | Master's thesis | el
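
The abstract names two ways of coping with missing values: the partial distance technique and per-class mean imputation. The sketch below illustrates both ideas in plain Python. It is not the thesis's implementation. The function names `partial_distance` and `impute_class_mean`, the use of NaN to mark a missing value, and the rescaling of the partial distance by (total attributes / shared attributes) are assumptions made for illustration. The rescaling is one common formulation, and the thesis may define it differently.

```python
import math
from collections import defaultdict


def partial_distance(x, y):
    """Euclidean distance over attributes observed in BOTH vectors.

    Assumption: missing values are encoded as NaN, and the squared sum
    is rescaled by len(x) / (number of shared attributes).
    """
    shared = [(a, b) for a, b in zip(x, y)
              if not (math.isnan(a) or math.isnan(b))]
    if not shared:
        return math.nan  # no attribute is observed in both vectors
    squared = sum((a - b) ** 2 for a, b in shared)
    return math.sqrt(len(x) / len(shared) * squared)


def impute_class_mean(rows, labels):
    """Replace each NaN with the mean of that attribute within the row's class.

    If a class has no observed value for an attribute, the NaN is left as is.
    Returns new rows; the input is not modified.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            if not math.isnan(value):
                sums[label][j] += value
                counts[label][j] += 1

    filled = []
    for row, label in zip(rows, labels):
        new_row = []
        for j, value in enumerate(row):
            if math.isnan(value) and counts[label][j] > 0:
                value = sums[label][j] / counts[label][j]
            new_row.append(value)
        filled.append(new_row)
    return filled


if __name__ == "__main__":
    a = [1.0, math.nan, 3.0]
    b = [1.0, 2.0, 5.0]
    print(partial_distance(a, b))

    data = [[1.0, math.nan], [3.0, 4.0], [math.nan, 10.0]]
    print(impute_class_mean(data, [0, 0, 1]))
```

The partial distance lets a distance-based method compare incomplete records directly, while imputation fills the gaps in a separate pass before any distances are computed. That separate pass is the extra pre-processing step the abstract identifies as a cost.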
Appears in Collections: Πτυχιακές Εργασίες (Undergraduate Theses)

Files in This Item:
File | Description | Size | Format
ΚΟΥΚΑΡΑΣ ΠΟΛΥΧΡΟΝΗΣ-ΔΙΠΛΩΜΑΤΙΚΗ ERHC-IMP-PD.pdf | Master's thesis | 2.39 MB | Adobe PDF



 Please use this identifier to cite or link to this item:
http://195.251.240.227/jspui/handle/123456789/15604
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.