Nearest neighbour classification on data streams and systems with limited memory via prototype replacement policies (Bachelor thesis)

Αραμπατζής, Γεώργιος


Full metadata record
DC FieldValueLanguage
dc.contributor.authorΑραμπατζής, Γεώργιοςel
dc.date.accessioned2021-03-18T12:10:39Z-
dc.date.available2021-03-18T12:10:39Z-
dc.identifier.urihttp://195.251.240.227/jspui/handle/123456789/13324-
dc.descriptionπτυχιακή εργασία- Σχολή Τεχνολογικών Εφαρμογών- Τμήμα Μηχανικών Πληροφορικής, 2017(α/α 8413)el
dc.rightsDefault License-
dc.subjectk nearest neighbours, k-NN classification, data reduction, classification, clustering, prototypes, prototype selection, prototype abstraction, editing, condensing, dynamic environments, streaming environments, data streams, k-means, dRHC, classification accuracy, reduction rate, preprocessing cost, computational cost, Wilcoxon signed ranks testen
dc.subjectk πλησιέστεροι γείτονες, ταξινόμηση k-NN, μείωση δεδομένων, ταξινόμηση, ομαδοποίηση, πρωτότυπα, επιλογή πρωτοτύπου, αφαίρεση πρωτοτύπου, επεξεργασία, συμπύκνωση, δυναμικά περιβάλλοντα, περιβάλλοντα ροής, ροές δεδομένων, k-mean, dRHC, ακρίβεια ταξινόμησης, ποσοστό μείωσης, κόστος προεπεξεργασίας, υπολογιστικό κόστος, Wilcoxon δοκιμασμένη βαθμολογίαel
dc.titleΚατηγοριοποίηση εγγύτερων γειτόνων σε συστήματα ροών δεδομένων και συστήματα με περιορισμένη μνήμη μέσω πολιτικών αντικατάστασης αντιπροσωπευτικών αντικειμένωνel
dc.titleNearest neighbour classification on data streams and systems with limited memory via prototype replacement policiesen
heal.typebachelorThesis-
heal.type.enBachelor thesisen
heal.generalDescriptionπτυχιακή εργασίαel
heal.classificationWorkflow--Management--Computer programsen
heal.classificationΡοή εργασίας--Διαχείριση--Προγράμματα υπολογιστώνel
heal.classificationData reductionen
heal.classificationΜέιωση δεδομένωνel
heal.classificationHeuristic algorithmsen
heal.classificationΕυριστικοί αλγόριθμοιel
heal.identifier.secondary8413-
heal.accessaccount-
heal.recordProviderΣχολή Τεχνολογικών Εφαρμογών- Τμήμα Μηχανικών Πληροφορικήςel
heal.publicationDate2017-02-20-
heal.bibliographicCitationΑραμπατζής, Γ. (2017).Κατηγοριοποίηση εγγύτερων γειτόνων σε συστήματα ροών δεδομένων και συστήματα με περιορισμένη μνήμη μέσω πολιτικών αντικατάστασης αντιπροσωπευτικών αντικειμένων . Nearest neighbour classification on data streams and systems with limited memory via prototype replacement policies(πτυχιακή εργασία).Αλεξάνδρειο ΤΕΙ Θεσσαλονίκηςel
heal.abstractThe k Nearest Neighbour classifier, due to the fashion it functions, can not handle very large volumes of training data. Thus, rendering the use of k-NN as an inappropriate choice when it comes to streaming environments. Adding to that list, most conventional Data Reduction Techniques, that operate by either selecting training prototypes or by generating a set of training prototypes, are also categorized as unsuitable for such environments. The “dynamic Reduction through Homogeneous Clusters” algorithm, dynamic RHC in short, is a prototype abstraction algorithm. dRHC has the advantages, compared to other data reduction techniques, that it can update its condensing set every time a new training data arrives. Thus, being able to handle large amount of training data. Although, dynamic RHC does not take into account the phenomenon of concept drift, that resides in most data streams. In addition to this, after many consecutive condensing set updates, its size keeps increasing, making its use prohibitive for classification tasks. All the disadvantages mentioned earlier comprise the core motivation behind this thesis. Trying to tackle the problems dRHC algorithm was facing, the new dRHC2 algorithm was proposed. While keeping all the advantages dRHC has, dRHC2 uses a certain, predefined by the user, threshold in order to limit the infinite growth of the condensing set in dRHC. dRHC2 ranks its prototypes, inside the condensing set, in order to select the most relevant ones. More specific, dRHC2 has two different heuristics. One heuristics is considered more appropriate when the goal is solely to restraint the steadily increasing condensing set, while still maintaining high percentage during the classification accuracy. The other heuristic, takes into consideration the phenomenon of concept drift, tackling the other main weakness of dRHC. Both heuristics of the proposed algorithm dRHC2 were evaluated on several datasets, based on the classification accuracy, the reduction rate and the preprocessing cost. In addition, the algorithm was compared to its predecessor, dRHC, the experimental measurements of which were validated by statistical tests of significance. Finally, the results indicate that, not only dRHC2 manages to be a lot faster, but also excels in accuracy, establishing its superiority over dRHC.en
heal.advisorNameΔέρβος, Δημήτριοςel
heal.committeeMemberNameΔέρβος, Δημήτριοςel
heal.academicPublisherΣχολή Τεχνολογικών Εφαρμογών- Τμήμα Μηχανικών Πληροφορικήςel
heal.academicPublisherIDteithe-
heal.numberOfPages86 σελ.-
heal.fullTextAvailabilityfalse-
heal.type.elΠροπτυχιακή/Διπλωματική εργασίαel
Appears in Collections:Πτυχιακές Εργασίες

Files in This Item:
There are no files associated with this item.



 Please use this identifier to cite or link to this item:
http://195.251.240.227/jspui/handle/123456789/13324
  This item is a favorite for 0 people.

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.