Power DataMate Tool: Leveraging Logistic Regression Classification for Interactive Data Modeling

dc.contributor.author: Abu Alrub, Mahmoud Ibrahim
dc.contributor.advisor: Shaout, Adnan
dc.date.accessioned: 2024-05-10T16:58:33Z
dc.date.available: 2025-05-10 12:58:33 [en]
dc.date.issued: 2024-04-27
dc.date.submitted: 2024-04-18
dc.identifier.uri: https://hdl.handle.net/2027.42/193121
dc.description.abstract: The growing occurrence of binary classification problems in diverse fields has made efficient predictive modeling techniques crucial. Logistic regression classification is a strong candidate technique for data modeling, and this research examines its efficacy in capturing and analyzing correlations across varied datasets; the Power DataMate software tool was developed for this purpose. Logistic regression is examined for modeling complicated data structures because of its simplicity, interpretability, and adaptability to binary classification tasks. The research focuses on logistic regression because it can represent intricate interactions between predictors and a binary response variable. Specifically, the goal is to predict the likelihood of discovering Primary Keys (PK) and Foreign Keys (FK) among datasets. While many off-the-shelf data analytics software packages and studies of logistic regression classification are available, there is a lack of research or solutions in which an entity's data is analyzed using logistic regression to detect its PKs and features automatically or interactively. The research method involves acquiring six datasets, a combination of fictitious and real-world data. Four are data files and two are databases. The data is then preprocessed to verify its quality, followed by the deployment of training and prediction algorithms. Sufficient training and testing datasets were used to train the model and evaluate its performance. Breaking new ground, the tool allows users not only to have their data modeled automatically, but also to interactively review and confirm primary keys and features for further data analysis and modeling.
The research includes a comprehensive evaluation of model performance indicators, including accuracy, precision, and recall. Results show an accuracy of 89% for PK detection and 82% for FK detection. These results are the first of their kind and could be a starting point for further model enhancements and data analytics research, especially for projects that analyze data files, where the Power DataMate user can interactively feed the learning algorithm for better outcomes.
Keywords: Data Mining, Data Modeling, Classification Problem, Logistic Regression, Primary Key, Foreign Key. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Data modeling [en_US]
dc.subject: Data mining [en_US]
dc.subject: Data classification [en_US]
dc.subject: Logistic regression [en_US]
dc.subject: Primary Key or Foreign Key Discovery [en_US]
dc.subject.other: Computer and Information Science [en_US]
dc.title: Power DataMate Tool: Leveraging Logistic Regression Classification for Interactive Data Modeling [en_US]
dc.type: Thesis [en_US]
dc.description.thesisdegreename: Master of Science (MS) [en_US]
dc.description.thesisdegreediscipline: Software Engineering, College of Engineering & Computer Science [en_US]
dc.description.thesisdegreegrantor: University of Michigan-Dearborn [en_US]
dc.contributor.committeemember: Medjahed, Brahim
dc.contributor.committeemember: Watt, Paul
dc.identifier.uniqname: miabu [en_US]
dc.description.bitstreamurl: http://deepblue.lib.umich.edu/bitstream/2027.42/193121/1/Abu_Alrub_Thesis_Power_DataMate_Tool (1).pdf [en]
dc.identifier.doi: https://dx.doi.org/10.7302/22766
dc.description.mapping: febc42ae-d444-43ae-98fd-dc98ee638897 [en_US]
dc.identifier.orcid: 0000-0002-8916-0259 [en_US]
dc.description.filedescription: Description of Abu_Alrub_Thesis_Power_DataMate_Tool (1).pdf : Thesis
dc.identifier.name-orcid: Abu Alrub, Mahmoud; 0000-0002-8916-0259 [en_US]
dc.working.doi: 10.7302/22766 [en_US]
dc.owningcollname: Dissertations and Theses (Ph.D. and Master's)
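
Illustrative sketch (not part of the record, and not the thesis's implementation): the abstract describes using logistic regression to estimate how likely a column is to be a primary key, then evaluating the model with accuracy, precision, and recall. The record does not say which features or training procedure Power DataMate uses, so everything below is an assumption made for illustration: the two per-column features (uniqueness and completeness), the toy training columns, and the helper names `column_features`, `train_logistic`, `predict_proba`, and `evaluate`.

```python
import math


def column_features(values):
    """Return [uniqueness, completeness] for one column (hypothetical features)."""
    n = len(values)
    if n == 0:
        return [0.0, 0.0]
    non_null = [v for v in values if v is not None]
    return [len(set(non_null)) / n, len(non_null) / n]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Estimated probability that a column with features x is a primary key."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=1.0, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        errors = [predict_proba(w, b, xi) - yi for xi, yi in zip(X, y)]
        w = [wj - lr * sum(e * xi[j] for e, xi in zip(errors, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errors) / n
    return w, b


def evaluate(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = key)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy labeled columns (invented for this sketch): 1 = primary key, 0 = not.
training_columns = [
    ([1, 2, 3, 4, 5, 6], 1),                    # unique and complete
    (["a1", "a2", "a3", "a4", "a5", "a6"], 1),  # unique and complete
    ([10, 10, 20, 20, 30, 30], 0),              # repeated values
    (["x", "x", "x", "y", "y", "y"], 0),        # repeated values
    ([1, 2, None, None, 5, 6], 0),              # contains nulls
    ([7, 7, 7, 7, 7, 7], 0),                    # constant
]
X = [column_features(col) for col, _ in training_columns]
y = [label for _, label in training_columns]
w, b = train_logistic(X, y)

# Score two unseen columns: an id-like column and a repeated-value column.
p_id = predict_proba(w, b, column_features([101, 102, 103, 104]))
p_city = predict_proba(w, b, column_features(["NY", "NY", "LA", "LA"]))
```

In an interactive workflow like the one the abstract describes, scores such as `p_id` could be shown to the user as candidate keys to confirm or reject, and the confirmed labels could be added to the training set before refitting.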

