Meowmeow (Author), posted April 20, 2020
Just now, ring_master said: Use K-means, but K-means assigns clusters to its members based on Euclidean distance between clusters
t-SNE is what I used for clustering in the earlier problem, based on Euclidean distance.

kathanayaka, posted April 20, 2020
First use the GridSearchCV and RandomizedSearchCV algorithms to get the best hyperparameters. Then try K-means (n = 1, 2, 3, ...) and SVM (best model) for the clustering, playing with the C and gamma values. Hierarchical clustering before K-means will also give you an estimate of the optimal number of clusters.

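The suggestion to run hierarchical clustering first to estimate the number of clusters can be sketched roughly as below. This is only an illustration under stated assumptions: it uses single-linkage merging and a "largest gap between successive merge distances" rule, neither of which is named in the thread, and the function names `single_linkage_merges` and `suggest_n_clusters` are made up for this example. In practice a library routine would normally replace the hand-written loop.

```python
import math


def single_linkage_merges(points):
    """Return the merge distances of single-linkage agglomerative clustering.

    Hypothetical helper: O(n^3) brute force, fine only for small examples.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append(d)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


def suggest_n_clusters(points):
    """Assumed heuristic: cut the merge sequence at the largest distance jump."""
    if len(points) < 3:
        raise ValueError("need at least 3 points to compare merge distances")
    merges = single_linkage_merges(points)
    gaps = [merges[k + 1] - merges[k] for k in range(len(merges) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    # After merge number k (0-based) there are n - (k + 1) clusters left.
    return len(points) - (k + 1)


# Two well-separated toy groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
n_suggested = suggest_n_clusters(toy)
```

The value of `n_suggested` could then be passed as the number of clusters to K-means. The largest-gap rule is a heuristic and can be misled by outliers or unevenly spaced groups.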
Meowmeow (Author), posted April 20, 2020
Just now, kathanayaka said: First use gridsearch and randomsearchcv algorithms to get the best hyper parameters Then try KNN(n=1,2,3,....) and SVM(best model) for the clustering(playing with c,gamma values)
Does that model cluster based on Euclidean distance between the clusters of the probability distribution of the cluster?

kathanayaka, posted April 20, 2020
2 minutes ago, Meowmeow said: Does that model cluster based on Euclidean distance between the clusters of the probability distribution of the cluster?
K-means supports only Euclidean distance.

ring_master, posted April 20, 2020
3 minutes ago, Meowmeow said: t-SNE is what I used for clustering in the earlier problem, based on Euclidean distance.
Clustering will be based on a similarity measure, so distance is used as the similarity measure. You don't want to use distance to measure similarity? Is that what you're trying to do?

Meowmeow (Author), posted April 20, 2020
2 minutes ago, kathanayaka said: K-means supports only Euclidean distance
Yeah, I used t-SNE for Euclidean distance. I need an algorithm for the probability distribution of the cluster.

kathanayaka, posted April 20, 2020
Just now, Meowmeow said: Yeah, I used t-SNE for Euclidean distance. I need an algorithm for the probability distribution of the cluster.
Why did t-SNE even come into the picture? It's just for data exploration and visualizing high-dimensional data, and it gives an intuition of how the data is arranged, that's all.

Ellen, posted April 20, 2020
Just now, Meowmeow said: Yeah, I used t-SNE for Euclidean distance. I need an algorithm for the probability distribution of the cluster.
There is some divergence-based clustering: special cases that compare probability distributions. Try KL divergence, or you can simply use chi-squared.

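For reference, the two measures mentioned here can be written out for discrete distributions as in the sketch below. It assumes each cluster has already been summarized as a histogram over the same bins, which the thread does not specify, and the helper names (`normalize`, `kl_divergence`, `symmetric_kl`, `chi_squared_distance`) are illustrative only. The `eps` floor for zero bins is also an assumption made to avoid dividing by zero.

```python
import math


def normalize(counts):
    """Turn raw bin counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same bins.

    Not symmetric: KL(p || q) generally differs from KL(q || p).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            # eps floors q so an empty bin in q does not divide by zero.
            total += pi * math.log(pi / max(qi, eps))
    return total


def symmetric_kl(p, q):
    """Average of the two directions, usable as a symmetric dissimilarity."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def chi_squared_distance(p, q):
    """Symmetric chi-squared distance between two histograms."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi + qi > 0:
            total += (pi - qi) ** 2 / (pi + qi)
    return 0.5 * total


# Hypothetical per-cluster histograms over three shared bins.
cluster_a = normalize([8, 1, 1])
cluster_b = normalize([1, 1, 8])
dissimilarity = symmetric_kl(cluster_a, cluster_b)
```

Either value can serve as a between-cluster dissimilarity; note that plain KL divergence is asymmetric, so the symmetric variant or chi-squared is usually easier to interpret as a "distance".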
Meowmeow (Author), posted April 20, 2020
1 minute ago, ring_master said: Clustering will be based on a similarity measure, so distance is used as the similarity measure. You don't want to use distance to measure similarity? Is that what you're trying to do?
Kind of. If I cluster based on Euclidean distance, I can only tell that the points in a single cluster are similar. But I cannot tell how different one cluster is from another based on how close or far it is. So I am trying to get that information.

Meowmeow (Author), posted April 20, 2020
1 minute ago, kathanayaka said: Why did t-SNE even come into the picture? It's just for data exploration and visualizing high-dimensional data, and it gives an intuition of how the data is arranged, that's all.
I am trying to visualize high-dimensional data, but I want the distance between the clusters to indicate how similar (or not) they are to each other.

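One way to get at the "how far apart are the clusters" question is to measure it directly in the original feature space instead of reading it off a t-SNE plot, since t-SNE mainly preserves local neighborhoods and the gaps between clusters in the embedding are not reliable. The sketch below assumes cluster labels are already available from whatever algorithm was used, and that Euclidean distance between cluster centroids is an acceptable summary. Both are assumptions, and the function names are made up for illustration.

```python
import math
from collections import defaultdict


def cluster_centroids(points, labels):
    """Mean point of each cluster, computed in the original feature space."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in groups.items()
    }


def centroid_distances(points, labels):
    """Euclidean distance between every pair of cluster centroids."""
    cents = cluster_centroids(points, labels)
    names = sorted(cents)
    return {
        (a, b): math.dist(cents[a], cents[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Toy 3-D data with labels from some earlier clustering step (hypothetical).
points = [[0, 0, 0], [0, 2, 0], [10, 0, 0], [10, 2, 0], [0, 0, 50], [0, 2, 50]]
labels = [0, 0, 1, 1, 2, 2]
distances = centroid_distances(points, labels)
```

In this toy data, clusters 0 and 1 come out much closer to each other than either is to cluster 2. The resulting pairwise table could then be visualized with a method that tries to preserve global distances.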
Meowmeow (Author), posted April 20, 2020
I might just give up on this question. It is for extra credit, and I think I did everything else right.

Scada, posted April 20, 2020
Just now, Meowmeow said: I might just give up on this question. It is for extra credit, and I think I did everything else right.
So in the end you fizzled out, I see..

kathanayaka, posted April 20, 2020
5 minutes ago, Meowmeow said: I am trying to visualize high-dimensional data, but I want the distance between the clusters to indicate how similar (or not) they are to each other.
Yes, you need a clustering algorithm, so give K-means or SVM a shot.

ring_master, posted April 20, 2020
8 minutes ago, Meowmeow said: Kind of. If I cluster based on Euclidean distance, I can only tell that the points in a single cluster are similar. But I cannot tell how different one cluster is from another based on how close or far it is. So I am trying to get that information.
OK, I got it.

Meowmeow (Author), posted April 20, 2020
3 minutes ago, kathanayaka said: Yes, you need a clustering algorithm, so give K-means or SVM a shot.
Hey, K-means clustering is not the answer, man. It is the traditional approach he taught in class, and I used that algorithm for all the earlier problems.