PCA Psychophysical Analysis
Dimensionality reduction and clustering of psychophysical behavioral data – Computational Linear Algebra course
Project Description
An exploratory data analysis project developed for the Computational Linear Algebra for Large Scale Problems course at Politecnico di Torino (2025/2026). The project applies PCA to reduce the 93 initial variables to 5 principal components that capture the dominant directions of variance in human psychophysical profiles, then applies K-Means clustering to the component scores to identify distinct behavioral groups.
Key Findings
- The Reckless Hedonist (Cluster 1): Larger physical stature (avg. 178.6 cm, 73.3 kg), sensation-seeking behavior.
- The Anxious Conformist (Cluster 0): Smaller physical frames (avg. 169.3 cm, 59.3 kg).
- The Female-Dominant Profile (Cluster 2): 60.6% female, distinct psychological markers.
- The Heterogeneous Group (Cluster 3): High weight variability (σ = 16.8), physically diverse segment.
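Per-cluster figures like the averages, standard deviation, and gender share listed above can be obtained with a grouped aggregation. The sketch below is illustrative only: the table is a hypothetical toy example, the column names (`cluster`, `height_cm`, `weight_kg`, `gender`) are assumptions, and the values are not the project's data or results.

```python
import pandas as pd

# Hypothetical toy table: one row per participant, with a cluster label
# already assigned. These numbers are made up for illustration.
df = pd.DataFrame({
    "cluster": [0, 0, 1, 1, 2, 2, 3, 3],
    "height_cm": [168.0, 170.0, 177.0, 180.0, 165.0, 172.0, 160.0, 185.0],
    "weight_kg": [58.0, 60.0, 72.0, 75.0, 55.0, 68.0, 50.0, 90.0],
    "gender": ["female", "male", "male", "male",
               "female", "female", "male", "female"],
})

# One row per cluster: mean height, mean weight, sample standard deviation
# of weight, and the fraction of participants recorded as female.
profile = df.groupby("cluster").agg(
    mean_height_cm=("height_cm", "mean"),
    mean_weight_kg=("weight_kg", "mean"),
    std_weight_kg=("weight_kg", "std"),
    female_share=("gender", lambda s: (s == "female").mean()),
)

print(profile.round(2))
```

Note that pandas computes the sample standard deviation (`ddof=1`) by default; whether the σ reported above uses the sample or population form is not stated here.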
Methodology
- Data Preprocessing: Ordinal Encoding for categorical variables, StandardScaler for numerical normalization.
- PCA: Condensed 93 features into 5 principal components while preserving the main variance structure of the data.
- K-Means Clustering: Segmented the population into 4 psychophysical profiles based on the principal components.
- Statistical Profiling: Analyzed cluster centroids to interpret correlations between height/weight and behavioral factors (extraversion, anxiety).
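The steps above can be sketched as a single scikit-learn pipeline. This is a minimal sketch under stated assumptions, not the notebook's actual code: the data is synthetic, the column names are hypothetical, and scaling is applied to all columns after encoding, which may differ from how the project applied `StandardScaler`. The toy table has only 7 columns rather than 93, but the component and cluster counts (5 and 4) match the description.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Synthetic stand-in for the real dataset (hypothetical column names).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "height_cm": rng.normal(172, 9, n),
    "weight_kg": rng.normal(66, 12, n),
    "anxiety": rng.integers(1, 6, n).astype(float),
    "extraversion": rng.integers(1, 6, n).astype(float),
    "risk_taking": rng.integers(1, 6, n).astype(float),
    "gender": rng.choice(["female", "male"], n),
    "smoking": rng.choice(["never", "former", "current"], n),
})

categorical = ["gender", "smoking"]
numerical = [c for c in df.columns if c not in categorical]

# Encode categorical columns as integer codes; pass numerical ones through.
encode = ColumnTransformer([
    ("cat", OrdinalEncoder(), categorical),
    ("num", "passthrough", numerical),
])

# Assumption: every column is standardized before PCA so that no single
# feature dominates the components because of its units.
reducer = Pipeline([
    ("encode", encode),
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=5, random_state=0)),
])

# Project each participant onto the 5 principal components.
scores = reducer.fit_transform(df)

# Cluster the component scores into 4 groups.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

# Fraction of total variance retained by the 5 components.
explained = reducer.named_steps["pca"].explained_variance_ratio_
print(scores.shape, np.bincount(labels), explained.sum().round(3))
```

Inspecting `explained_variance_ratio_` is one way to check how much variance the retained components account for, and `kmeans.cluster_centers_` gives the centroids in component space for the profiling step.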
Technologies Used
Language: Python 3
Libraries: Scikit-Learn, Pandas, NumPy, Matplotlib
Environment: Jupyter Notebook
Techniques: PCA, K-Means Clustering, Ordinal Encoding, StandardScaler
Academic Context
Course: Computational Linear Algebra for Large Scale Problems
Academic Year: 2025/2026
Authors: Lucio Baiocchi, Leonardo Passafiume
Repository: GitHub