eHarmony matchings

This data set was provided by eHarmony, Inc. The data consists of pairs of individuals who either matched (positive example) or did not (negative example). The data is partitioned into two subsets corresponding to two equal-length segments of time. The data is stored in CSV files, organized as follows.

Feature files: each row describes an individual. The first column is an identification number for that individual, and all subsequent columns contain the (numeric) feature values.
Interaction files: each row describes a pairwise interaction. The first column indicates whether the interaction is positive (1) or negative (0). The second and third columns contain the identification numbers of the two individuals involved.
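
As a rough illustration of how the two CSV layouts described above fit together, the following Python sketch reads both kinds of rows and joins each interaction to the feature vectors of its two individuals. It is only a sketch under stated assumptions: the function names, the absence of a header row, the choice to keep IDs as strings, and the tiny in-memory sample are all hypothetical and are not taken from the actual files.

```python
import csv
import io


def load_features(handle):
    """Map each individual's ID to its list of numeric feature values.

    Assumes no header row: column 0 is the ID, the rest are features.
    """
    features = {}
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        features[row[0]] = [float(value) for value in row[1:]]
    return features


def load_pairs(handle):
    """Return (label, id_a, id_b) tuples; label is 1 (match) or 0 (no match).

    Assumes no header row: column 0 is the label, columns 1-2 are the IDs.
    """
    pairs = []
    for row in csv.reader(handle):
        if not row:
            continue
        # int(float(...)) tolerates labels written as "1" or "1.0".
        pairs.append((int(float(row[0])), row[1], row[2]))
    return pairs


# Made-up sample data standing in for the real files (hypothetical values).
sample_features = io.StringIO("101,0.5,-1.2\n102,0.1,0.3\n103,-0.7,0.9\n")
sample_pairs = io.StringIO("1,101,102\n0,101,103\n")

features = load_features(sample_features)
pairs = load_pairs(sample_pairs)

# Join each interaction to the feature vectors of both individuals.
examples = [(label, features[a], features[b]) for label, a, b in pairs]
```

To use this with the real data, pass file objects opened with `open(path, newline="")` in place of the `io.StringIO` samples. Any interaction that refers to an ID missing from the feature rows would raise a `KeyError` in the join step, which may or may not be the behavior you want.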

Please refer to the paper below for more details about this data set.


To protect the privacy of users, all features have been obfuscated and normalized. I cannot provide names for the features.



If you use this data, please cite the following paper:
Metric learning to rank
Twenty-seventh International Conference on Machine Learning (ICML).

Source code

The source code for MLR is now hosted on GitHub.