
Importing data for k-means clustering

I'm trying to follow this tutorial:

https://uc-r.github.io/kmeans_clustering

library(tidyverse)  # data manipulation
library(cluster)    # clustering algorithms
library(factoextra) # clustering algorithms & visualization

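# df is assumed to be the scaled USArrests data used in the linked tutorial
df <- scale(na.omit(USArrests))

distance <- get_dist(df)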
fviz_dist(distance, gradient = list(low = "#00AFBB", mid = "white", high = "#FC4E07"))

This works as expected.

It may be something really simple, but why is there no column name for what is obviously the state field?


If I try to use the same approach with a dataset like this:

ipl <- read.csv("https://query.data.world/s/3kadbuzyj25jwe42k6tgij56gscept?dws=00000", header=TRUE, stringsAsFactors=FALSE)
ipl <- na.omit(ipl)

distanceipl <- get_dist(ipl)
fviz_dist(distanceipl, gradient = list(low = "#00AFBB", mid = "white", high = "#FC4E07"))

Then, instead of the players' names on each axis, I get what I think are the row numbers. How do I get the player names from the PLAYER column onto the axes?

Solution:

From the docs:

fviz_dist(): returns a ggplot2

So you can just add labels the way you would with a normal ggplot2 object, i.e.:

fviz_dist(distanceipl, gradient = list(low = "#00AFBB", mid = "white", high = "#FC4E07")) + scale_y_discrete(labels = ipl$PLAYER)
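One caveat worth checking: fviz_dist() has an order argument that defaults to TRUE and rearranges the observations by hierarchical clustering, so labels passed in data order may not line up with the tiles unless order = FALSE is set. An alternative is to do what USArrests does. Assuming df in the question comes from USArrests as in the tutorial, the states are stored as row names rather than in a column, which is why that field has no column name, and the distance object picks its labels up from those row names. The sketch below applies the same idea to a small made-up data frame. Only the PLAYER column name comes from the question; ipl_toy, RUNS, AVG and the values are hypothetical. Base dist() is used so the label check runs without factoextra; get_dist() should be usable in its place.

```r
# In USArrests the states are row names, not a column
head(rownames(USArrests), 3)
names(USArrests)

# Hypothetical stand-in for the IPL data (only PLAYER is taken from the question)
ipl_toy <- data.frame(
  PLAYER = c("Player A", "Player B", "Player C", "Player D"),
  RUNS   = c(500, 120, 480, 90),
  AVG    = c(41.2, 15.0, 38.7, 11.3),
  stringsAsFactors = FALSE
)

# Move the names into row names; make.unique() guards against duplicates,
# which row names do not allow
rownames(ipl_toy) <- make.unique(ipl_toy$PLAYER)

# Keep only the numeric columns and scale them, as the tutorial does
num_cols <- vapply(ipl_toy, is.numeric, logical(1))
ipl_num <- scale(ipl_toy[, num_cols, drop = FALSE])

# The row names are carried through as the labels of the dist object
d <- dist(ipl_num)
attr(d, "Labels")

# Plot only if factoextra is installed; both axes should now use the names
if (requireNamespace("factoextra", quietly = TRUE)) {
  p <- factoextra::fviz_dist(
    d, gradient = list(low = "#00AFBB", mid = "white", high = "#FC4E07")
  )
  print(p)
}
```

With the real data, the same steps would apply to ipl after na.omit(), provided PLAYER holds the names and the remaining columns of interest are numeric.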
