WebImputer: A Web application for missing value imputation in datasets (Master's thesis)

Αντωνιάδης, Δημήτριος


Missing values in datasets are an important research issue in big data analysis. Such datasets are often used to train machine learning models, and if a significant percentage of the values is missing, the result may be inaccurate predictions or incorrect model evaluations. To address this issue, several imputation techniques have been proposed as part of the data cleaning process. However, applying these techniques to real-world datasets can be challenging and time-consuming for researchers and data scientists. This thesis presents the development of a web application that implements various imputation methods, offering an easy, user-friendly way to handle missing values in datasets. Users can access the website, upload their datasets with missing values in CSV format, choose one of the available imputation methods based on the feature types of the dataset, and then download the file containing the imputed values as soon as the imputation process is complete. The web application, named WEBIMPUTER, offers a variety of imputation solutions for numerical, categorical and mixed-feature datasets, providing a wide range of parameter options for the imputation models. Finally, the thesis presents experiments in which all the imputation algorithms of the application were applied to datasets of different file sizes and their execution times were measured, to help users better understand the computational efficiency of the models.
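The upload, impute, download workflow described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `impute_csv`, the set of missing-value markers, and the choice of mean imputation for numerical columns and mode imputation for categorical columns are all assumptions made for illustration, and the input is assumed to be a rectangular CSV with a header row.

```python
import csv
import io
from collections import Counter
from statistics import mean

# Assumed markers for a missing cell; a real dataset may use others.
MISSING = {"", "NA", "NaN", "?"}


def is_number(value: str) -> bool:
    """Return True if the cell text parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def impute_csv(text: str) -> str:
    """Fill missing cells: column mean for numeric columns, mode otherwise.

    Hypothetical helper; assumes every row has as many cells as the header.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    for col in range(len(header)):
        observed = [r[col] for r in data if r[col] not in MISSING]
        if not observed:
            continue  # no observed values to learn from; leave column as-is
        if all(is_number(v) for v in observed):
            # Numerical feature: replace missing cells with the column mean.
            fill = f"{mean(float(v) for v in observed):g}"
        else:
            # Categorical feature: replace with the most frequent value.
            fill = Counter(observed).most_common(1)[0][0]
        for r in data:
            if r[col] in MISSING:
                r[col] = fill
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + data)
    return out.getvalue()
```

For example, `impute_csv("age,city\n30,Athens\n,Athens\n50,\n")` fills the missing age with 40 (the mean of 30 and 50) and the missing city with Athens. Mean and mode imputation are the simplest baselines; the model-based methods with tunable parameters that the abstract mentions would replace the two `fill` computations.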
Institution and School/Department of submitter: School of Engineering - Department of Information and Electronic Engineering
Subject classification: Web applications -- Development
Missing observations (Statistics)
Data sets
Keywords: Web applications;Missing values;Data analysis;Imputation methods;Webimputer
Description: Master's thesis - School of Engineering - Department of Information and Electronic Engineering, 2023 (serial no. 14082)
URI: http://195.251.240.227/jspui/handle/123456789/16867
Item type: masterThesis
General Description / Additional Comments: Master's thesis
Submission Date: 2024-08-27T22:41:58Z
Item language: el
Item access scheme: free
Publication date: 2023-10-16
Bibliographic citation: Αντωνιάδης, Δ. (2023). WebImputer: A Web application for missing value imputation in datasets (Master's thesis). International Hellenic University (ΔΙ.ΠΑ.Ε.)
Advisor name: Ουγιάρογλου, Στέφανος
Examining committee: Ουγιάρογλου, Στέφανος
Publishing department/division: School of Engineering - Department of Information and Electronic Engineering
Publishing institution: ihu
Number of pages: 75
Appears in Collections: Master's Theses

Files in This Item:
File: Antoniadis.pdf
Size: 2.47 MB
Format: Adobe PDF




Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.