In this tutorial, I would like to introduce wordcloud2
package which is used to build word cloud in R. First of all, install the library using install.packages ()
function if it hasn’t installed yet. Then, call the package from the CRAN using library()
function.
#install.packages("wordcloud2")
library(wordcloud2)
You should also consider webshot
and htmlwidgets
packages for rendering the output of wordcloud2
in R Markdown.
#install.packages("webshot")
#install.packages("htmlwidgets")
library(webshot)
library(htmlwidgets)
#webshot::install_phantomjs()
In order to create the cloud, wordcloud2()
function is used. You can arrange size of the words and shape of the cloud by using size
and shape
arguments.
Let’s start with an default example by considering demoFreq data being available at the interested package.
head(demoFreq)
## word freq
## oil oil 85
## said said 73
## prices prices 48
## opec opec 42
## mln mln 31
## the the 26
As seen that, the dataset for word cloud is supposed to have words and the corresponding frequencies.
cloud1=wordcloud2(data=demoFreq, size=1.6)
saveWidget(cloud1,"1.html",selfcontained = F)
webshot::webshot("1.html","1.png",vwidth = 700, vheight = 500, delay =10)
If you change shape,
cloud2=wordcloud2(data=demoFreq, size=0.5,shape='cardioid')
saveWidget(cloud2,"2.html",selfcontained = F)
webshot::webshot("2.html","2.png",vwidth = 700, vheight = 500, delay =10)
If you change color and background,
cloud3=wordcloud2(data=demoFreq, size=0.5,shape='cardioid',color = "red",backgroundColor = "yellow")
saveWidget(cloud3,"3.html",selfcontained = F)
webshot::webshot("3.html","3.png",vwidth = 700, vheight = 500, delay =10)
Let’s try this package on the real dataset. For this example, I will use the dataset taken from Istanbul Municipality Open Data Portal.[http://data.ibb.gov.tr] The dataset includes the observation for the total amount of domestic waste produced by each district in Istanbul between 2004 and 2019.
atik=read.table("atik.txt",header=T)
head(atik)
## Ilce Toplam
## 1 Adalar 197941.9
## 2 Arnavutkoy 831453.6
## 3 Atasehir 2105712.7
## 4 Avcilar 1899055.5
## 5 Bahcelievler 3027749.0
## 6 Bagcilar 3736761.3
str(atik)
## 'data.frame': 39 obs. of 2 variables:
## $ Ilce : Factor w/ 39 levels "Adalar","Arnavutkoy",..: 1 2 3 4 6 5 7 8 9 10 ...
## $ Toplam: num 197942 831454 2105713 1899055 3027749 ...
We need change the class of Toplam column from factor to numeric in order to avoid any problems that can be occured in the creation part.
atik$Toplam=as.numeric(as.character(atik$Toplam))
str(atik)
## 'data.frame': 39 obs. of 2 variables:
## $ Ilce : Factor w/ 39 levels "Adalar","Arnavutkoy",..: 1 2 3 4 6 5 7 8 9 10 ...
## $ Toplam: num 197942 831454 2105713 1899055 3027749 ...
Then
cloud_ist=wordcloud2(data=atik, size=0.25,minRotation = -pi/9, maxRotation = -pi/9, rotateRatio = 1)
saveWidget(cloud_ist,"4.html",selfcontained = F)
webshot::webshot("4.html","4.png",vwidth = 700, vheight = 500, delay =10)
cloud_ist1=wordcloud2(data=atik, size=0.25,shape='star',color='white',backgroundColor = 'red')
saveWidget(cloud_ist1,"5.html",selfcontained = F)
webshot::webshot("5.html","5.png",vwidth = 700, vheight = 500, delay =10)
You can download .Rmd file from my GitHub page. [http://github.com/ozancanozdemir]