
Commit bf51cf1

slides draft
1 parent 2c55088 commit bf51cf1

53 files changed

Lines changed: 25593 additions & 0 deletions


slides/images/r_logo.png

92.3 KB

slides/images/rfun.png

19.1 KB

slides/images/rfun_smaller.png

10 KB

slides/images/rmarkdown.png

160 KB

slides/images/tidyverse.png

248 KB

slides/index.Rmd

Lines changed: 251 additions & 0 deletions
@@ -0,0 +1,251 @@
1+
---
2+
title: "<br>Text Mining"
3+
subtitle: "R case study"
4+
author: "John Little"
5+
institute: "Cntr for Data & Viz"
6+
date: "April 13, 2021"
7+
output:
8+
xaringan::moon_reader:
9+
lib_dir: libs
10+
css:
11+
- mystyles/xaringan-themer.css # fonts I do want https://pkg.garrickadenbuie.com/xaringanthemer/articles/xaringanthemer.html
12+
- mystyles/adirondack/story.css # https://story.xaprb.com/slides/adirondack/
13+
- mystyles/adirondack/apron.css # layout
14+
#- mystyles/adirondack/adirondack.css # fonts I don't want
15+
- mystyles/adirondack/descartes.css # image position
16+
- mystyles/adirondack/tachyons.min.css # color, font weights, boxes
17+
# - mystyles/adirondack/monoblock.css # part of story/adirondack
18+
- mystyles/my-theme.css
19+
nature:
20+
ratio: '16:9'
21+
highlightStyle: github
22+
highlightLines: true
23+
countIncrementalSlides: true
24+
---
25+
26+
```{r setup, include=FALSE}
27+
options(htmltools.dir.version = FALSE)
28+
library(tidyverse)
29+
library(htmltools)
30+
tagList(rmarkdown::html_dependency_font_awesome())
31+
# library(xaringanthemer) # run once; or use the pre-run css found in mystyles (xaringan-themer.css)
32+
# style_duo_accent(
33+
# primary_color = "#012169",
34+
# secondary_color = "#005587",
35+
# header_font_google = google_font("Josefin Sans"),
36+
# text_font_google = google_font("Montserrat", "300", "300i"),
37+
# code_font_google = google_font("Fira Mono")
38+
# )
39+
```
40+
41+
## Duke University: Land Acknowledgement
42+
43+
.f4[I would like to take a moment to honor the land in Durham, NC. Duke University sits on the ancestral lands of the Shakori, Eno and Catawba people. This institution of higher education is built on land stolen from those peoples. These tribes were here before the colonizers arrived. Additionally, this land has borne witness to over 400 years of the enslavement, torture, and systematic mistreatment of African people and their descendants. Recognizing this history is an honest attempt to break out beyond persistent patterns of colonization and to rewrite the erasure of Indigenous and Black peoples. There is value in acknowledging the history of our occupied spaces and places. I hope we can glimpse an understanding of these histories by recognizing the origins of collective journeys.]
44+
45+
46+
---
47+
48+
layout: true
49+
50+
.footercc[
51+
<i class="fab fa-creative-commons"></i>&nbsp; <i class="fab fa-creative-commons-by"></i><i class="fab fa-creative-commons-nc"></i> <a href = "https://JohnLittle.info"><span class = "opacity30">https://</span>JohnLittle<span class = "opacity30">.info</span></a>
52+
<span class = "opacity30"> | <a href="https://github.com/libjohn/workshop_textmining">https://github.com/libjohn/workshop_textmining</a> | `r Sys.Date()` </span>
53+
]
54+
55+
---
56+
57+
## Demonstration Goals
58+
59+
- Gather some tweets
60+
61+
- Define APIs and the Twitter Developer portal (Academic Use)
62+
63+
- Rudimentary text analysis and visualization
64+
65+
- Point out useful documentation / resources
66+
67+
68+
***
69+
70+
.f6.i.moon-gray.center[This is not a text analysis workshop. The foundations of text analysis require considerably more time than we have.
71+
This is a demonstration of how to leverage two tidy packages (tidyverse and tidytext), plus a set of shared resources.]
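
As a rough illustration of the "rudimentary text analysis" goal, here is a minimal sketch using dplyr and tidytext. The `tweets` tibble is a hypothetical stand-in for gathered tweets (it is not real data), and the column names are assumptions. The fence is a plain `r` block, so it is displayed but not run when the slides are knit.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for gathered tweets (not real data)
tweets <- tibble(
  id   = 1:2,
  text = c("Text mining with R is fun",
           "Tidy tools make text mining easier")
)

word_counts <- tweets %>%
  unnest_tokens(word, text) %>%          # one lowercase word per row
  anti_join(stop_words, by = "word") %>% # drop common English stop words
  count(word, sort = TRUE)               # most frequent words first

word_counts
```

The `stop_words` data set ships with tidytext; a project with domain-specific vocabulary would likely need its own additions to that list.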
72+
73+
74+
---
75+
76+
class: img-right-full
77+
78+
![](images/attendance.png)
79+
80+
# Three tenets
81+
82+
83+
- Just numbers
84+
- Benefits of review
85+
- Dashboard fatigue is a real thing
86+
87+
88+
???
89+
90+
- The implications of dashboard fatigue might be the most interesting thing to discuss in the Q&A
91+
92+
---
93+
layout: false
94+
class: img-left-full
95+
96+
![](images/by_dept_compare.png)
97+
98+
## Drivers
99+
100+
- Goal: create a dashboard of workshop attendance
101+
- CDVS motivated by the possibility of exploring data
102+
- Dashboard can be the basis of another workshop
103+
104+
.footercc[
105+
<i class="fab fa-creative-commons"></i>&nbsp; <i class="fab fa-creative-commons-by"></i><i class="fab fa-creative-commons-nc"></i> <a href = "https://JohnLittle.info"><span class = "opacity30">https://</span>JohnLittle<span class = "opacity30">.info</span></a>
106+
<span class = "opacity30"> | <a href="https://github.com/libjohn/workshop_textmining">https://github.com/libjohn/workshop_textmining</a> | `r Sys.Date()` </span>
107+
]
108+
109+
???
110+
111+
- These are not exactly the best drivers for creating a dashboard. They’re not bad either.
112+
113+
114+
---
115+
layout: false
116+
class: middle, center
117+
118+
<br>
119+
120+
.bg-washed-blue.b--navy.ba.bw2.br3.shadow-5.ph4.mt5[
121+
122+
![Rfun](images/rfun.png# fl l-4 w-2-12th)
123+
124+
## John R Little
125+
126+
.prussian[
127+
.f5[Data Science Librarian
128+
Center for Data & Visualization Sciences
129+
Duke University Libraries
130+
]
131+
]
132+
133+
.f7[https://johnlittle.info
134+
https://Rfun.library.duke.edu
135+
https://library.duke.edu/data
136+
]
137+
]
138+
139+
140+
141+
<i class="fab fa-creative-commons fa-2x"></i> &nbsp; <i class="fab fa-creative-commons-by fa-2x"></i><i class="fab fa-creative-commons-nc fa-2x"></i>
142+
.f6.moon-gray[Creative Commons: Attribution-NonCommercial 4.0]
143+
.f7.moon-gray[https://creativecommons.org/licenses/by-nc/4.0]
144+
145+
---
146+
class: inverse
147+
148+
# Appendix
149+
150+
## Screenshots
151+
152+
---
153+
layout: true
154+
155+
.footercc[
156+
<i class="fab fa-creative-commons"></i>&nbsp; <i class="fab fa-creative-commons-by"></i><i class="fab fa-creative-commons-nc"></i> <a href = "https://JohnLittle.info"><span class = "opacity30">https://</span>JohnLittle<span class = "opacity30">.info</span></a>
157+
<span class = "opacity30"> | <a href="https://github.com/libjohn/workshop_textmining">https://github.com/libjohn/workshop_textmining</a> | `r Sys.Date()` </span>
158+
]
159+
160+
---
161+
162+
![Tidyverse](images/tidyverse.png# w-10pct t-1 db fr mr-4)
163+
164+
## Technology stack
165+
166+
167+
168+
- R
169+
- R is a data-first coding language
170+
- R can be a universal interface for analysis and workflow
171+
- Tidyverse is a well-developed approach to workflow & the data lifecycle
172+
- Bias towards enabling reproducibility
173+
- scripting
174+
- reporting
175+
176+
177+
![flexdashboards](images/flexdashboard.png# w-10pct fr fm mr-4)
178+
179+
![r logo](images/r_logo.png# fm fr w-10pct mr-4)
180+
181+
![rmarkdown](images/rmarkdown.png# w-10pct fr mr-4)
182+
183+
184+
???
185+
186+
- Reuse analysis code to produce reports, email alerts, interactive dashboards, etc.
187+
188+
---
189+
190+
## Lesson
191+
192+
.fl-10.w-60.bg.b.ba.bw1.br3.shadow-5.ph4.mt4.center.prussian[The last thing you should do is
193+
build the dashboard
194+
]
195+
196+
- Identify target audience and scope
197+
- Create summary reports
198+
- Build a static analysis
199+
- Generate push-reports based on dynamic thresholds
200+
- Advanced: Build a reporting application
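
One way to read "push-reports based on dynamic thresholds" is sketched below, under stated assumptions: the `attendance` data, its column names, and the 75% rule are all hypothetical and not taken from the actual project. The threshold is computed from each workshop's own history instead of being hard-coded, and only the rows that fall below it would trigger a report.

```r
library(dplyr)

# Hypothetical attendance records; column names are assumptions
attendance <- tibble(
  workshop = c("Intro to R", "Intro to R", "Text Mining", "Text Mining"),
  semester = c("2020 Fall", "2021 Spring", "2020 Fall", "2021 Spring"),
  attended = c(30, 12, 18, 20)
)

flagged <- attendance %>%
  group_by(workshop) %>%
  mutate(threshold = 0.75 * mean(attended)) %>%  # dynamic: derived from the data
  ungroup() %>%
  filter(semester == "2021 Spring", attended < threshold)

flagged  # rows that would trigger a pushed report
```

The same `flagged` table could feed a summary report or an email alert, which fits the idea of reusing analysis code before any dashboard is built.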
201+
202+
???
203+
Or, in this case, build a workshop attendance application
204+
205+
---
206+
## Other important question(s)
207+
208+
- If developing the dashboard in R...
209+
- Flexdashboard (dashboards)
210+
- Shiny (Web applications)
211+
212+
The two are not mutually exclusive, but flexdashboard has a significantly lower barrier to entry
213+
214+
.center[![people](images/happy_people2.jpg# h-10pct w-33pct)]
215+
216+
---
217+
## Actual Goals
218+
219+
- Host **cleaned and disaggregated data**
220+
221+
- Provide a **summary of attendance**
222+
223+
![survey](images/survey_1.png# absolute ofv r-3 w-75pct h-7-12th)
224+
225+
226+
227+
???
228+
229+
- Host **cleaned and disaggregated data**
230+
- A data archive for clean data
231+
- exported from the SpringShare registration system
232+
- accounts for attendance
233+
- Provide a **summary of attendance** so that staff can
234+
- Assess their workshop’s impact over time (as measured by attendance and registration)
235+
- See current semester attendance totals within the context of multi-year totals
236+
237+
---
238+
class: center
239+
240+
![](images/full_attendance.png# l-0 t-0 w-two-thirds h-80pct ofv absolute)
241+
![](images/full_demographics.png# w-third h-40pct t-0 r-0 ofv absolute)
242+
![](images/full_survey.png# w-third h-40pct t-40pct r-0 ofv absolute)
243+
![](images/slice_tables.jpg# l-0 t-80pct w-100pct ofv absolute)
244+
245+
.prussian[
246+
.absolute.w-5-12th.pa-3.l-4-12th.t-8-12th.b.ba.bw-4.br-4.shadow-5.bg-white-80[
247+
Collage of dashboard screens
248+
]
249+
]
250+
251+
