Μείωση δεδομένων με διαμερισμό του χώρου χρησιμοποιώντας αλγόριθμους Convex Hull για την εύρεση των πιο απομακρυσμένων αντικειμένων (Bachelor thesis)
Γκιοργκίνης, Θωμάς
In the ever evolving scientific field of data categorization, the nearest
neighbors algorithm is a stable, efficient method. However, it has sev eral weaknesses that deem it inappropriate in some cases of data sets.
Its main drawbacks are: high cost of categorizing each object due to
multiple calculations, high storage requirements, and dependence of
the accuracy of the results on the quality of the training data. In or der to address its weaknesses, several data reduction algorithms have
been implemented, aiming at minimizing the processing burden of the
categorizer without affecting its accuracy. The Reduction by Space
Partitioning algorithm is one of the most well-known prototype gen eration algorithms for accelerating instance based categorists. RSP3
is based on a repetitive separation process of the original training
set. As the criterion for the most subset will be divided first, RSP3
adopts the larger diameter criterion. That is, the subset with the
largest diameter defined by the two most distant objects is divided
first. The process continues until the resulting subsets are homoge neous, that is, they include objects of the same class. The search for
the two most remote objects of each subset is a costly process, since
it requires the calculation of all the distances between the objects of
each sub-set. Thus, RSP3 has a high computational pre-processing
cost. In the framework of the thesis, computational geometry algo rithms will be studied to find the Convex Hull of each subset. Convex
Hull is composed by the objects of the data set that define its con tour. The motivation for studying these algorithms stems from the
following observation: If these objects that define Convex Hull are
found, searching for the most remote objects in each subset will be
done quickly since it will not be necessary to calculate all possible
distances to the subset but only all possible distances between the
objects of Convex Hull. In the experimental part of the thesis, two
Convex Hull algorithms will be implemented and incorporated into
RSP3. It will also be investigated whether it is possible to use Con vex Hull algorithms in multi-dimensional datasets. The new variants
of RSP3 will be tested in 15 sets of real data categorization data and
their speed will be compared to that of the “conventional” RSP3.
Institution and School/Department of submitter: | Σχολή Τεχνολογικών Εφαρμογών - Τμήμα Μηχανικών Πληροφορικής |
Subject classification: | Αλγόριθμοι Algorithms Μείωση δεδομένων Data reduction |
Keywords: | Convex Hull;απομακρυσμένα αντικείμενα;remotely objects;RSP3;αλγόριθμος των εγγύτερων γειτόνων;nearest neighbor algorithm;δεδομένα;data;Τεχνικές μείωσης δεδομένων;Data reduction techniques |
Description: | Πτυχιακή εργασία - Σχολή Τεχνολογικών Εφαρμογών - Τμήμα Μηχανικών Πληροφορικής, 2019 (α/α 11044) |
URI: | http://195.251.240.227/jspui/handle/123456789/15059 |
Appears in Collections: | Πτυχιακές Εργασίες |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
GKIORKINIS.pdf | 1.32 MB | Adobe PDF | View/Open |
Please use this identifier to cite or link to this item:
This item is a favorite for 0 people.
http://195.251.240.227/jspui/handle/123456789/15059
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.