libjohn
diff --git a/ANSWERS.nb.html b/ANSWERS.nb.html
Lines changed: 0 additions & 1951 deletions
diff --git a/ex_01.Rmd b/ex_01.Rmd
Lines changed: 76 additions & 0 deletions
diff --git a/ANSWERS.Rmd b/ex_01_ANSWERS.Rmd
ANSWERS.Rmd renamed to ex_01_ANSWERS.Rmd
Lines changed: 16 additions & 0 deletions
diff --git a/ex_01_ANSWERS.nb.html b/ex_01_ANSWERS.nb.html
Lines changed: 440 additions & 0 deletions
@@ -0,0 +1,76 @@
+---
+title: "Exercise 1. Your turn"
+---
+
+## Load library packages
+
+```{r message=FALSE, warning=FALSE}
+library(janeaustenr)
+library(tidyverse)
+library(tidytext)
+library(wordcloud2)
+```
+
+
+## Your Turn.  Exercise 1.
+
+Goal: Make a basic word cloud for the novel _Pride and Prejudice_, stored as `pride_prej_novel`.
+
+a. Prepare:  Add a line number to the text
+
+```{r}
+pride_prej_novel <-  tibble(text = prideprejudice) %>%
+  mutate(line = ________________)
+```
+
+b. Tokenize `pride_prej_novel` with `unnest_tokens()`
+
+```{r}
+pride_prej_novel %>% 
+  unnest_tokens(____, _____)
+```
+
+c. Remove stop-words
+
+```{r}
+pride_prej_novel %>% 
+  unnest_tokens(____, _____) %>% 
+  anti_join(____________)
+```
+
+d. Calculate word frequency
+
+```{r}
+pride_prej_novel %>% 
+  unnest_tokens(____, _____) %>% 
+  anti_join(____________) %>% 
+  count(____________) 
+```
+
+e. Make a simple word cloud
+
+```{r}
+pride_prej_novel %>% 
+  unnest_tokens(____, _____) %>% 
+  anti_join(____________) %>% 
+  count(____________)  %>% 
+  with(wordcloud::wordcloud(____, ____, max.words = ___))
+```
+
+
+f. Since "Friends don't let friends make word clouds", make a barplot of the word frequency.  
+
+```{r}
+pride_prej_novel %>% 
+  unnest_tokens(word, text) %>% 
+  anti_join(get_stopwords(), by = "word") %>% 
+  count(word, sort = TRUE) %>% 
+  slice_head(n = 10) %>% 
+  ggplot(aes(x = n, y = fct_reorder(word, n))) +
+  geom_col() +
+  labs(title = "Word Frequency",
+       subtitle = "Jane Austen novel",
+       x = "", y = "",
+       caption = "Source: janeaustenr")
+```
+
@@ -58,3 +58,19 @@ pride_prej_novel %>%
   with(wordcloud::wordcloud(word, n, max.words = 100))
 ```
 
+f. Since "Friends don't let friends make word clouds", make a barplot of the word frequency.  
+
+```{r}
+pride_prej_novel %>% 
+  unnest_tokens(word, text) %>% 
+  anti_join(get_stopwords(), by = "word") %>% 
+  count(word, sort = TRUE) %>% 
+  slice_head(n = 10) %>% 
+  ggplot(aes(x = n, y = fct_reorder(word, n))) +
+  geom_col() +
+  labs(title = "Word Frequency",
+       subtitle = "Jane Austen novel",
+       x = "", y = "",
+       caption = "Source: janeaustenr")
+```
+
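Reviewer note: the tokenize, filter, and count steps that feed the bar plot can be checked on a tiny input before running them on the whole novel. The sketch below is illustrative only. `toy_text`, `toy_novel`, and `toy_counts` are hypothetical names that do not appear in the commit, and `row_number()` is one plausible way to number the lines, not necessarily the one used in the answer file.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical two-line toy input (the novel's opening sentence), used here
# instead of the full `prideprejudice` vector from janeaustenr.
toy_text <- c(
  "It is a truth universally acknowledged,",
  "that a single man in possession of a good fortune must be in want of a wife."
)

# One row per line of text, plus a line number.
# Assumption: row_number() is used for numbering.
toy_novel <- tibble(text = toy_text) %>%
  mutate(line = row_number())

# Tokenize to one word per row, drop stop words, then count each word.
toy_counts <- toy_novel %>%
  unnest_tokens(word, text) %>%
  anti_join(get_stopwords(), by = "word") %>%
  count(word, sort = TRUE)

toy_counts
```

Because `unnest_tokens()` lowercases tokens and strips punctuation by default, common words such as "a" and "in" match the stop-word list and drop out, leaving content words such as "truth" and "wife" for the frequency count.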