WebImputer: A Web application for missing value imputation in datasets (Master thesis)
Αντωνιάδης, Δημήτριος
Missing values in datasets are an important research issue in big data analysis. Such datasets are often
used to train machine learning models, and if a significant percentage of the values is missing, the result
may be inaccurate predictions or incorrect model evaluations. To address this issue, several imputation
techniques have been proposed as part of the data cleaning process. However, applying these techniques
to real-world datasets can be challenging and time-consuming for researchers and data scientists. This
thesis presents the development of a web application that implements various imputation methods, offering
an easy, user-friendly way to handle missing values in datasets. Users can access the website, upload
a dataset with missing values in CSV format, choose one of the available imputation methods based
on the feature types of the dataset, and download the file containing the imputed values as soon as the
imputation process is complete. The web application, named WEBIMPUTER, offers imputation
solutions for numerical, categorical and mixed-feature datasets and provides a wide range of parameter
options for the imputation models. Finally, the thesis presents experiments in which every imputation
algorithm in the application was applied to datasets of different file sizes and the execution time was
measured, to help users gain a better understanding of the computational efficiency of the models.
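The upload, impute, download workflow described above can be illustrated with a minimal sketch. The abstract does not name the algorithms WEBIMPUTER implements, so this example substitutes a plain baseline (column mean for numerical features, most frequent value for categorical ones) and should not be read as the thesis's method. The function name `impute_csv`, the set of missing-value markers, and the sample data are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Assumed markers for a missing cell; a real dataset may use others.
MISSING = {"", "NA", "NaN", "nan", "?"}


def _is_number(text):
    """Return True if the cell parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def impute_csv(text):
    """Fill missing cells: column mean for numerical features,
    most frequent value (mode) for categorical ones.
    Assumes a header row and rows of equal length."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], [list(r) for r in rows[1:]]
    for col in range(len(header)):
        observed = [r[col] for r in data if r[col] not in MISSING]
        if not observed:
            continue  # no observed values to learn from; leave as-is
        if all(_is_number(v) for v in observed):
            fill = str(round(mean([float(v) for v in observed]), 4))
        else:
            fill = Counter(observed).most_common(1)[0][0]
        for r in data:
            if r[col] in MISSING:
                r[col] = fill
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(data)
    return out.getvalue()


# Hypothetical mixed-feature dataset with one missing cell per column.
sample = "age,city\n30,Athens\n,Athens\n40,\n"
result = impute_csv(sample)
print(result)
```

A mixed-feature dataset is handled here by deciding the strategy per column, which mirrors the abstract's point that the choice of method depends on the feature types.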
Institution and School/Department of submitter: | School of Engineering - Department of Information and Electronic Systems Engineering |
Subject classification: | Web applications -- Development; Missing observations (Statistics); Data sets |
Keywords: | Web applications;Missing values;Data analysis;Imputation methods;Webimputer |
Description: | Master's thesis - School of Engineering - Department of Information and Electronic Systems Engineering, 2023 (serial no. 14082) |
URI: | http://195.251.240.227/jspui/handle/123456789/16867 |
Item type: | masterThesis |
General Description / Additional Comments: | Master's thesis |
Submission Date: | 2024-08-27T22:41:58Z |
Item language: | el |
Item access scheme: | free |
Publication date: | 2023-10-16 |
Bibliographic citation: | Αντωνιάδης, Δ. (2023). WebImputer: A Web application for missing value imputation in datasets (Master's thesis). ΔΙ.ΠΑ.Ε. |
Advisor name: | Ουγιάρογλου, Στέφανος |
Examining committee: | Ουγιάρογλου, Στέφανος |
Publishing department/division: | School of Engineering - Department of Information and Electronic Systems Engineering |
Publishing institution: | ihu |
Number of pages: | 75 |
Appears in Collections: | Master's Theses |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Antoniadis.pdf | | 2.47 MB | Adobe PDF |
Please use this identifier to cite or link to this item:
http://195.251.240.227/jspui/handle/123456789/16867
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.