The motivation here is that if your unsupervised learning method assigns high probability to held-out data (similar data that wasn't used to fit the parameters), then it has probably done a good job of capturing the distribution of interest.
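As a concrete sketch of this idea, one could fit a density model and score held-out data by its average log-likelihood; a better model of the distribution should score higher. The example below uses scikit-learn's `GaussianMixture` on synthetic two-cluster data (the data, component counts, and seeds are illustrative assumptions, not part of the original answer):

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

# Synthetic data: two well-separated Gaussian blobs (an assumption for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),
               rng.normal(6, 1, size=(200, 2))])

# Hold out part of the data so the evaluation doesn't reuse the fitting data.
X_train, X_test = train_test_split(X, random_state=0)

# Fit two candidate models on the training split only.
gm1 = GaussianMixture(n_components=1, random_state=0).fit(X_train)
gm2 = GaussianMixture(n_components=2, random_state=0).fit(X_train)

# score() returns the mean log-likelihood per held-out sample;
# the model that better captures the distribution should score higher.
ll1 = gm1.score(X_test)
ll2 = gm2.score(X_test)
```

On data like this, the two-component model should assign noticeably higher held-out likelihood than the one-component model, which is exactly the signal being described.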
A good resource (with references) for clustering is sklearn's documentation page, Clustering Performance Evaluation.
This covers several methods, but all except one, the Silhouette Coefficient, assume ground-truth labels are available.
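For reference, the Silhouette Coefficient is available directly in sklearn as `silhouette_score`, which needs only the data and the predicted cluster labels, no ground truth. A minimal sketch (the synthetic data and KMeans settings are illustrative assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: two well-separated blobs (an assumption for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, size=(100, 2)),
               rng.normal(4, 0.5, size=(100, 2))])

# Cluster without any labels, then evaluate with the silhouette score.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Ranges over [-1, 1]; higher means tighter, better-separated clusters.
score = silhouette_score(X, labels)
```

Because the score uses only the data geometry and the assignments, it can be computed for any clustering, which is why it is the one metric on that page that works without ground-truth labels.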
Cluster analysis itself is not one specific algorithm, but the general task to be solved. It is a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.
This method is also mentioned in the question Evaluation measure of clustering, linked in the comments on this question.