PCA Psychophysical Analysis

Dimensionality reduction and clustering of psychophysical behavioral data – Computational Linear Algebra course

Project Description

An exploratory data analysis project developed for the Computational Linear Algebra for Large Scale Problems course at Politecnico di Torino (2025/2026). The project applies Principal Component Analysis (PCA) to reduce 93 initial variables to 5 principal components that capture the dominant directions of variance in human psychophysical profiles, then runs K-Means clustering on those components to identify distinct behavioral groups.

Key Findings

  • The Reckless Hedonist (Cluster 1): Larger physical stature (avg. 178.6 cm, 73.3 kg), sensation-seeking behavior.
  • The Anxious Conformist (Cluster 0): Smaller physical frames (avg. 169.3 cm, 59.3 kg).
  • The Female-Dominant Profile (Cluster 2): 60.6% female, distinct psychological markers.
  • The Heterogeneous Group (Cluster 3): High weight variability (σ = 16.8 kg), the most physically diverse segment.
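
Per-cluster figures like the ones above (average height and weight, weight standard deviation, share of female participants) can be obtained with a grouped aggregation. The snippet below is a minimal sketch on a toy table: the column names (`cluster`, `height_cm`, `weight_kg`, `sex`) and all values are hypothetical and are not taken from the project dataset.

```python
import pandas as pd

# Toy data for illustration only; names and values are assumptions.
df = pd.DataFrame({
    "cluster": [0, 0, 1, 1, 2, 2],
    "height_cm": [168.0, 170.0, 177.0, 180.0, 165.0, 172.0],
    "weight_kg": [58.0, 60.0, 72.0, 75.0, 55.0, 70.0],
    "sex": ["female", "male", "male", "male", "female", "female"],
})

# One row per cluster: means, sample standard deviation, and % female.
profile = df.groupby("cluster").agg(
    avg_height_cm=("height_cm", "mean"),
    avg_weight_kg=("weight_kg", "mean"),
    weight_std_kg=("weight_kg", "std"),
    pct_female=("sex", lambda s: 100 * (s == "female").mean()),
)

print(profile.round(1))
```

Note that pandas computes the sample standard deviation (ddof=1) by default, which may differ slightly from a population σ.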

Methodology

  • Data Preprocessing: Ordinal Encoding for categorical variables, StandardScaler to standardize numerical variables (zero mean, unit variance).
  • PCA: Condensed 93 features into 5 principal components that retain the dominant variance structure of the data.
  • K-Means Clustering: Segmented the population into 4 psychophysical profiles based on the principal components.
  • Statistical Profiling: Analyzed cluster centroids to interpret correlations between height/weight and behavioral factors (extraversion, anxiety).
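
The preprocessing, PCA, and K-Means steps above can be sketched with scikit-learn as follows. This is an illustrative outline, not the project notebook: the data is synthetic, the column names are hypothetical, and scaling all columns after encoding is an assumption about how the steps were combined.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Synthetic stand-in data (hypothetical columns, not the 93 project variables).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "height_cm": rng.normal(172, 9, n),
    "weight_kg": rng.normal(68, 12, n),
    "extraversion": rng.integers(1, 6, n),
    "anxiety": rng.integers(1, 6, n),
    "risk_taking": rng.integers(1, 6, n),
    "sex": rng.choice(["female", "male"], n),
    "smoking": rng.choice(["never", "former", "current"], n),
})
categorical = ["sex", "smoking"]

# Ordinal-encode the categorical columns; numeric columns pass through.
encode = ColumnTransformer(
    [("cat", OrdinalEncoder(), categorical)],
    remainder="passthrough",
)

# Encode, standardize every column (assumption), then project onto 5 components.
reducer = Pipeline([
    ("encode", encode),
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=5, random_state=0)),
])
components = reducer.fit_transform(df)

# Cluster in the reduced space; 4 clusters as in the project description.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(components)

# Fraction of total variance captured by each retained component.
explained = reducer.named_steps["pca"].explained_variance_ratio_
print(explained.round(3), explained.sum().round(3))
```

Inspecting `explained_variance_ratio_` is one way to check how much variance the retained components actually capture before settling on their number.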

Technologies Used

Language: Python 3
Libraries: Scikit-Learn, Pandas, NumPy, Matplotlib
Environment: Jupyter Notebook
Techniques: PCA, K-Means Clustering, Ordinal Encoding, StandardScaler

Academic Context

Course: Computational Linear Algebra for Large Scale Problems
Academic Year: 2025/2026
Authors: Lucio Baiocchi, Leonardo Passafiume

Repository: GitHub