r - Hierarchical clustering on rows of varying length with sequences of numbers


I want to apply hierarchical clustering in one of my projects.

My original problem involves a huge graph on which I have iterated over a large number of paths and reported the nodes of each path in the format below. Each number in the sample represents a graph node, and each row represents a path. I want to cluster these paths on the basis of the number of nodes they share, as a way to segregate similar kinds of paths.

    1210, 158, 1222, 1468
    1210, 1222, 198
    158, 1468, 25, 26, 27, 28

Now I want to do hierarchical clustering between the rows based on the number of shared nodes. In the table above, rows (paths) 1 and 2 would be part of one cluster because they share nodes 1210 and 1222. Rows (paths) 1 and 3 would be part of a cluster because they share nodes 158 and 1468.

I checked and found that I can use the hclust function for hierarchical clustering. The function takes a dissimilarity matrix as its argument, but I am not sure how to create the distance metric. It seems I should use the Jaccard similarity measure, but I can't find an option in the dist method for Jaccard similarity with a variable-column format like the one above.

Regards,

Here's an example of hclust with Jaccard distance (using vegdist in the vegan package), based on an abstraction of your data as a binary dataset:

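    # Binary dataset: one row per path, one column per node (1 = node is on the path)
    dat <- read.table(header = TRUE, check.names = FALSE, text = "
      25 26 27 28 158 198 1210 1222 1468
    1  0  0  0  0   1   0    1    1    1
    2  0  0  0  0   0   1    1    1    0
    3  1  1  1  1   1   0    0    0    1
    ")

    library(vegan)
    # Jaccard distance between rows (paths); named d to avoid masking base dist()
    d <- vegdist(dat, method = "jaccard")
    # Plot the dendrogram without relying on the %>% pipe, which needs magrittr
    plot(hclust(d))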
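The example above starts from a ready-made 0/1 table, while the question has rows of varying length. The following is a minimal sketch of one way to get from the raw paths to that table, assuming the paths are already available as a list of numeric vectors (the name paths and the choice of average linkage are illustrative assumptions, not part of the original answer). It uses only base R: dist with method = "binary" treats non-zero entries as "on" and returns the proportion of mismatches among columns where at least one row is on, which is the Jaccard distance for 0/1 data.

```r
# Sample paths from the question; each list element is one path of node IDs.
paths <- list(
  c(1210, 158, 1222, 1468),
  c(1210, 1222, 198),
  c(158, 1468, 25, 26, 27, 28)
)

# All distinct node IDs, sorted, become the columns of the binary matrix.
nodes <- sort(unique(unlist(paths)))

# One row per path: 1 if the node lies on the path, 0 otherwise.
dat <- t(sapply(paths, function(p) as.integer(nodes %in% p)))
dimnames(dat) <- list(seq_along(paths), nodes)

# Base R "binary" distance equals the Jaccard distance on 0/1 rows.
d <- dist(dat, method = "binary")

# Average linkage is an arbitrary choice here; other linkages also work.
hc <- hclust(d, method = "average")

# Cut the tree into two groups to get a cluster label per path.
clusters <- cutree(hc, k = 2)
```

For the three sample paths the pairwise distances are 0.6 (paths 1 and 2), 0.75 (paths 1 and 3) and 1 (paths 2 and 3), so paths 1 and 2 are merged first. Calling plot(hc) draws the dendrogram.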

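If the graph is huge, a binary matrix with one column per node may become very wide. As a hedged alternative, the Jaccard distance can be computed directly on the node sets of each pair of paths and handed to hclust through as.dist. The helper jaccard_dist and the variable names below are hypothetical, introduced only for this sketch, and the pairwise loop is quadratic in the number of paths.

```r
# Sample paths from the question; each list element is one path of node IDs.
paths <- list(
  c(1210, 158, 1222, 1468),
  c(1210, 1222, 198),
  c(158, 1468, 25, 26, 27, 28)
)

# Hypothetical helper: Jaccard distance = 1 - |A intersect B| / |A union B|.
jaccard_dist <- function(a, b) {
  1 - length(intersect(a, b)) / length(union(a, b))
}

n <- length(paths)
m <- matrix(0, n, n, dimnames = list(seq_len(n), seq_len(n)))

# Fill the symmetric distance matrix one pair of paths at a time.
pairs <- combn(n, 2)
for (k in seq_len(ncol(pairs))) {
  i <- pairs[1, k]
  j <- pairs[2, k]
  m[i, j] <- m[j, i] <- jaccard_dist(paths[[i]], paths[[j]])
}

# hclust expects a "dist" object, so convert the full matrix.
hc <- hclust(as.dist(m), method = "average")
```

Here paths 1 and 2 share 2 of 5 distinct nodes, giving a distance of 0.6, which matches the result from the binary-matrix approach.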

