This repository houses an invited presentation for Kennesaw State's DS 7900 Applied Project in Analytics and Data Science.
I use the marathon data from the New York Times article "What Good Marathons and Bad Investments Have in Common."
The article links to the full data set of almost ten million records as a CSV file hosted on box.com. I have removed a few columns and provide the result in two formats on Dropbox.
You can find the same data in .feather and .parquet formats in this repository's arrow folder.
- `initial_setup.R` provides the script that drops columns from the original source.
- `create_arrow.R` provides an example of converting a large file from `.sas7bdat` to `.feather` and `.parquet`. The results are in `arrow`.
- `data_digest.R` provides size and parsing time for each format.
- `create_arrow.py` provides an example of converting a large file from `.sas7bdat` to `.feather` and `.parquet`. The results are in `py_arrow`.
- `data_digest.R` provides sizes and parsing for `.sas7bdat` and `.parquet`.
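
The size and parsing-time comparison described in the list above can be sketched in Python roughly as follows. This is a minimal illustration, not the code in `data_digest.R`: the `digest` and `read_csv_rows` helpers and the tiny `sample.csv` file are all made up for this example, and only the standard library is used so that it runs without the marathon data.

```python
import csv
import os
import tempfile
import time


def digest(path, reader):
    """Return (size_in_bytes, seconds_to_parse) for one file.

    `reader` is any callable that takes a path and fully loads the file.
    Both names are hypothetical helpers for this sketch.
    """
    size = os.path.getsize(path)
    start = time.perf_counter()
    reader(path)
    elapsed = time.perf_counter() - start
    return size, elapsed


def read_csv_rows(path):
    """Stand-in reader: load every row of a CSV file into a list."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


if __name__ == "__main__":
    # Tiny made-up file so the sketch runs without the marathon data.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.csv")
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows([["age", "chiptime"], [34, 245.1]])
        size, seconds = digest(path, read_csv_rows)
        print(f"{path}: {size} bytes, parsed in {seconds:.6f} s")
```

To compare the formats in this repository, you could pass a different reader for each file, for example `pandas.read_sas`, `pandas.read_feather`, or `pandas.read_parquet`, assuming pandas and pyarrow are installed. A single timing is noisy, so repeating each read a few times and taking the median gives a fairer comparison.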
The explore_bigdata.R file provides a short example.
GitHub Pages Slideshow with Remark
This template is made from Remark, an open-source tool to help create and display slideshows from markdown. For questions, see Remark's documentation.
The most important things to know are:
- Enable GitHub Pages from `master` for the slides to work
- Once enabled, the slides will be visible at `https://USERNAME.github.io/REPOSITORY-NAME/#1`, like https://brianamarie.github.io/slideshow-on-pages/#1
- Edit the `index.html` file to edit the slides
- Slides are separated by `---`
- Presenter notes after `???` within one slide
- Toggle presenter notes during presentation with `P`
- Read the full guide to remark markdown
- Press `C` to clone a display; then press `P` to switch to presenter mode. Open help menu with `h`