Conducted a research project to predict human attractiveness as numerical scores using Ridge regression, Logistic regression, and SVM classification, with PCA dimensionality reduction.
The goal of this project was to predict human attractiveness with several machine learning algorithms from user profile data, such as age, height, college, and self-taken pictures, obtained in cooperation with the largest dating service in South Korea. To join the dating service, each user must be rated on attractiveness on a 5.0 scale by random members of the opposite sex.
This project was conducted under strict security conditions, in cooperation with an authorized colleague located at the corporate branch.
NOTE: The model was designed and coded at Stony Brook University, while the code was applied to the actual data only in South Korea, under the conditions set by the corporation (research advisors only offered advice and direction and never saw or handled the data).
NOTE: The corporate team and my team agreed not to mention or disclose any data from the service. Detailed information about the data is therefore omitted from all further explanations.
Under the advisement of Professor Steven Skiena, ran Ridge regression on 10K samples, each with 54 profile features (age, height, college, etc.) plus 1024 image features extracted from self-taken pictures with a pre-trained deep learning model (Inception-v3).
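The Ridge regression setup can be illustrated with a minimal NumPy sketch. It is not the project's code: the data is random and synthetic (the real data cannot be shown), the sample count is reduced to 200, and the helper name fit_ridge and the penalty alpha=10.0 are assumptions chosen for illustration.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean          # center features so the intercept is not penalized
    yc = y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weight vector w
    A = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(A, Xc.T @ yc)
    b = y_mean - x_mean @ w
    return w, b

rng = np.random.default_rng(0)
n_samples = 200                                   # assumption: small synthetic sample
profile = rng.normal(size=(n_samples, 54))        # stand-in for the 54 profile features
image = rng.normal(size=(n_samples, 1024))        # stand-in for the 1024 image features
X = np.hstack([profile, image])                   # one row of 54 + 1024 features per user
scores = rng.uniform(1.0, 5.0, size=n_samples)    # synthetic ratings on the 5.0 scale

w, b = fit_ridge(X, scores, alpha=10.0)
pred = np.clip(X @ w + b, 1.0, 5.0)               # keep predictions on the rating scale
```

With far more features (1078) than samples in this toy setting, the penalty term is what keeps the linear system well conditioned; in practice alpha would be chosen by cross-validation.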
Under the advisement of Professor Minh Hoai Nguyen, ran binary classification between the Non Attractive (1.0 to 2.5) and Attractive (3.5 to 5.0) classes on the same data, using a Support Vector Machine with a One-vs-Rest strategy.
NOTE: The intermediate score range (greater than 2.5 and less than 3.5), which was the most populated, was excluded.
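The class construction for the SVM experiment can be sketched as follows. This is only an illustration of the score bands stated above, with made-up scores; the function name to_binary_labels and its parameter names are hypothetical.

```python
import numpy as np

def to_binary_labels(scores, low_max=2.5, high_min=3.5):
    """Drop the middle band and label the rest: 0 = Non Attractive, 1 = Attractive."""
    scores = np.asarray(scores, dtype=float)
    # Keep only scores in [1.0, 2.5] or [3.5, 5.0]; the (2.5, 3.5) band is excluded
    keep = (scores <= low_max) | (scores >= high_min)
    labels = (scores[keep] >= high_min).astype(int)
    return keep, labels

scores = np.array([1.0, 2.5, 2.6, 3.0, 3.4, 3.5, 5.0])   # made-up example ratings
keep, labels = to_binary_labels(scores)
# keep   -> [True, True, False, False, False, True, True]
# labels -> [0, 0, 1, 1]
```

The boolean mask keep would be applied to the feature matrix as well, so that features and labels stay aligned before training a classifier.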
Under the advisement of Professor Martin Radfar, applied PCA dimensionality reduction to 13K face images of men, extracted from self-taken profile pictures with the OpenCV face recognition algorithm. In addition, ran Logistic regression between the Non Attractive (1.0 to 2.5) and Attractive (2.5 to 5.0) classes.
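The PCA step can be sketched with NumPy's SVD. This is a sketch under stated assumptions rather than the project's pipeline: random arrays stand in for the face crops, and the 32x32 image size, the 20 components, and the helper names pca_fit and pca_transform are all hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA via SVD of the mean-centered data matrix."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, ordered by decreasing variance
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (s ** 2) / (X.shape[0] - 1)
    return mean, components, explained_variance[:n_components]

def pca_transform(X, mean, components):
    """Project data onto the retained principal directions."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
faces = rng.random(size=(100, 32 * 32))    # stand-in for flattened grayscale face crops
mean, components, explained_variance = pca_fit(faces, n_components=20)
reduced = pca_transform(faces, mean, components)   # shape (100, 20)
```

The reduced matrix is the kind of low-dimensional input a downstream classifier such as Logistic regression would be trained on.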
Although this project was intended for IEEE ICMLA 2018, legal issues emerged regarding publication of research on data from South Korea. As a result, all further work on this project has unfortunately been suspended.