Commit e240585

committed
exercises
1 parent ac715a8 commit e240585

4 files changed

Lines changed: 532 additions & 1951 deletions

ANSWERS.nb.html

Lines changed: 0 additions & 1951 deletions
This file was deleted.

ex_01.Rmd

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+---
+title: "Exercise 1. Your turn"
+---
+
+## Load library packages
+
+```{r message=FALSE, warning=FALSE}
+library(janeaustenr)
+library(tidyverse)
+library(tidytext)
+library(wordcloud2)
+```
+
+
+## Your Turn. Exercise 1.
+
+Goal: Make a basic word cloud for the novel _Pride and Prejudice_ (`pride_prej_novel`)
+
+a. Prepare: Add a line number to the text
+
+```{r}
+pride_prej_novel <- tibble(text = prideprejudice) %>%
+  mutate(line = ________________)
+```
+
+b. Tokenize `pride_prej_novel` with `unnest_tokens()`
+
+```{r}
+pride_prej_novel %>%
+  unnest_tokens(____, _____)
+```
+
+c. Remove stop words
+
+```{r}
+pride_prej_novel %>%
+  unnest_tokens(____, _____) %>%
+  anti_join(____________)
+```
+
+d. Calculate word frequency
+
+```{r}
+pride_prej_novel %>%
+  unnest_tokens(____, _____) %>%
+  anti_join(____________) %>%
+  count(____________)
+```
+
+e. Make a simple word cloud
+
+```{r}
+pride_prej_novel %>%
+  unnest_tokens(____, _____) %>%
+  anti_join(____________) %>%
+  count(____________) %>%
+  with(wordcloud::wordcloud(____, ____, max.words = ___))
+```
+
+
+f. Since "Friends don't let friends make word clouds", make a barplot of the word frequency.
+
+```{r}
+pride_prej_novel %>%
+  unnest_tokens(word, text) %>%
+  anti_join(get_stopwords(), by = "word") %>%
+  count(word, sort = TRUE) %>%
+  slice_head(n = 10) %>%
+  ggplot(aes(x = n, y = fct_reorder(word, n))) +
+  geom_col() +
+  labs(title = "Word Frequency",
+       subtitle = "Jane Austen novel",
+       x = "", y = "",
+       caption = "Source: janeaustenr")
+```
+
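For readers following the exercise, steps a through d can be tried on a tiny input before running them on the full novel. The sketch below is only an illustration under stated assumptions: the `toy` tibble and its two sentences are made up, and `row_number()` is one plausible way to fill the blank in step a (the commit does not show that answer). The `unnest_tokens()`, `anti_join()`, and `count()` calls mirror the ones used in step f. `get_stopwords()` may require the `stopwords` package to be installed.

```r
library(dplyr)
library(tibble)
library(tidytext)

# Hypothetical two-line input standing in for `prideprejudice`
toy <- tibble(text = c("It is a truth universally acknowledged",
                       "a truth that is a truth")) %>%
  mutate(line = row_number())   # assumption: one way to add a line number

toy_counts <- toy %>%
  unnest_tokens(word, text) %>%                    # one lowercase word per row
  anti_join(get_stopwords(), by = "word") %>%      # drop common stop words
  count(word, sort = TRUE)                         # frequency, most common first

toy_counts
```

Because `unnest_tokens()` lowercases tokens by default, the stop-word join matches "It" as "it"; the remaining rows are the content words with their counts.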

ANSWERS.Rmd renamed to ex_01_ANSWERS.Rmd

Lines changed: 16 additions & 0 deletions
@@ -58,3 +58,19 @@ pride_prej_novel %>%
   with(wordcloud::wordcloud(word, n, max.words = 100))
 ```
 
+f. Since "Friends don't let friends make word clouds", make a barplot of the word frequency.
+
+```{r}
+pride_prej_novel %>%
+  unnest_tokens(word, text) %>%
+  anti_join(get_stopwords(), by = "word") %>%
+  count(word, sort = TRUE) %>%
+  slice_head(n = 10) %>%
+  ggplot(aes(x = n, y = fct_reorder(word, n))) +
+  geom_col() +
+  labs(title = "Word Frequency",
+       subtitle = "Jane Austen novel",
+       x = "", y = "",
+       caption = "Source: janeaustenr")
+```
+
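The bar plot in step f relies on `slice_head()` to keep the top rows and on `fct_reorder()` to order the bars by count. A minimal sketch of that ordering step, using made-up counts (the `word_counts` values are hypothetical, not taken from the novel) and `arrange(desc(n))` standing in for `count(..., sort = TRUE)`:

```r
library(dplyr)
library(tibble)
library(forcats)

# Hypothetical word counts, deliberately unsorted
word_counts <- tibble(
  word = c("elizabeth", "darcy", "bennet", "said"),
  n    = c(30L, 20L, 10L, 40L)
)

top_words <- word_counts %>%
  arrange(desc(n)) %>%                  # same row order count(sort = TRUE) yields
  slice_head(n = 3) %>%                 # keep the three most frequent words
  mutate(word = fct_reorder(word, n))   # factor levels ordered by ascending n

levels(top_words$word)
```

ggplot2 places the first factor level at the bottom of a discrete y axis, so ordering levels by ascending `n` puts the most frequent word at the top of the chart.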

ex_01_ANSWERS.nb.html

Lines changed: 440 additions & 0 deletions
Large diffs are not rendered by default.
