class: left, bottom, title-slide

.title[
# 3b. MOOC Network
]
.subtitle[
## Analytics Sandbox
]
.author[
### K. Bret Staudt Willet | Florida State University
]
.date[
### February 1, 2023
]

---
class: inverse, center, middle

# <br><br> **Part 3b:** <br> MOOC Network

**Data source:** [Massively Open Online Course for Educators (MOOC-Ed) network dataset](https://dataverse.harvard.edu/dataset.xhtml;jsessionid=9ad052693563b29056a88d490182?persistentId=doi%3A10.7910%2FDVN%2FZZH3UB&version=&q=&fileTypeGroupFacet=&fileAccess=&fileSortField=name&fileSortOrder=desc)

---

#
MOOC Network data

```r
# read_csv(), the dplyr verbs, and the pipe all come from the tidyverse
library(tidyverse)

# Weight = number of rows (messages) for each Sender-Receiver pair
edgelist2 <-
  read_csv("data/DLT1 Edgelist.csv", show_col_types = FALSE) %>%
  group_by(Sender, Receiver) %>%
  mutate(Weight = n()) %>%
  ungroup() %>%
  relocate(Sender, Receiver, Weight)

glimpse(edgelist2)
```

```
## Rows: 2,529
## Columns: 11
## $ Sender                  <dbl> 360, 356, 356, 344, 392, 219, 318, 4, 355, 355…
## $ Receiver                <dbl> 444, 444, 444, 444, 444, 444, 444, 444, 356, 4…
## $ Weight                  <int> 1, 2, 2, 1, 1, 3, 2, 4, 1, 2, 4, 1, 1, 2, 1, 1…
## $ Timestamp               <chr> "4/4/13 16:32", "4/4/13 18:45", "4/4/13 18:47"…
## $ `Discussion Title`      <chr> "Most important change for your school or dist…
## $ `Discussion Category`   <chr> "Group N", "Group D-L", "Group D-L", "Group O-…
## $ `Parent Category`       <chr> "Units 1-3 Discussion Groups", "Units 1-3 Disc…
## $ `Category Text`         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ `Discussion Identifier` <chr> "Most important change for your school or dist…
## $ `Comment ID`            <dbl> 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15…
## $ `Discussion ID`         <chr> "2", "1", "3", "4", "5", "6", "6", "2", "3", "…
```

---

#
MOOC Network data

```r
head(edgelist2, 10)
```

```
## # A tibble: 10 × 11
##    Sender Recei…¹ Weight Times…² Discu…³ Discu…⁴ Paren…⁵ Categ…⁶ Discu…⁷ Comme…⁸
##     <dbl>   <dbl>  <int> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>     <dbl>
##  1    360     444      1 4/4/13… Most i… Group N Units … <NA>    Most i…       2
##  2    356     444      2 4/4/13… Most i… Group … Units … <NA>    Most i…       3
##  3    356     444      2 4/4/13… DLT Re… Group … Units … <NA>    DLT Re…       4
##  4    344     444      1 4/4/13… Most i… Group … Units … <NA>    Most i…       5
##  5    392     444      1 4/4/13… Most i… Group … Units … <NA>    Most i…       6
##  6    219     444      3 4/4/13… Most i… Group M Units … <NA>    Most i…       7
##  7    318     444      2 4/4/13… Most i… Group M Units … <NA>    Most i…       8
##  8      4     444      4 4/4/13… Most i… Group N Units … <NA>    Most i…       9
##  9    355     356      1 4/4/13… DLT Re… Group … Units … <NA>    DLT Re…      10
## 10    355     444      2 4/4/13… Most i… Group … Units … <NA>    Most i…      11
## # … with 1 more variable: `Discussion ID` <chr>, and abbreviated variable names
## #   ¹Receiver, ²Timestamp, ³`Discussion Title`, ⁴`Discussion Category`,
## #   ⁵`Parent Category`, ⁶`Category Text`, ⁷`Discussion Identifier`,
## #   ⁸`Comment ID`
```

---
class: inverse, center, middle

#
<br><br> Try it Out!

---

# Try it Out!

What do you think this code will do?

```r
# mutate() on a graph and centrality_degree() both come from {tidygraph}
library(tidygraph)

network_graph2 <-
  tidygraph::as_tbl_graph(edgelist2) %>%
  mutate(popularity = centrality_degree(mode = 'in'))

network_graph2
```

```
## # A tbl_graph: 442 nodes and 2529 edges
## #
## # A directed multigraph with 1 component
## #
## # Node Data: 442 × 2 (active)
##   name  popularity
##   <chr>      <dbl>
## 1 360            0
## 2 356            3
## 3 344            1
## 4 392            0
## 5 219           17
## 6 318            2
## # … with 436 more rows
## #
## # Edge Data: 2,529 × 11
##    from    to Weight Timest… Discus… Discus… Parent… Catego… Discus… Commen…
##   <int> <int>  <int> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>     <dbl>
## 1     1    51      1 4/4/13… Most i… Group N Units … <NA>    Most i…       2
## 2     2    51      2 4/4/13… Most i… Group … Units … <NA>    Most i…       3
## 3     2    51      2 4/4/13… DLT Re… Group … Units … <NA>    DLT Re…       4
## # … with 2,526 more rows, and 1 more variable: `Discussion ID` <chr>
```

---

#
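Try it Out!

One way to check a guess about `centrality_degree(mode = 'in')` is to run it on a network small enough to count by hand. The sketch below is not part of the MOOC-Ed data: `toy_edges`, `toy_graph`, and `toy_popularity` are hypothetical names, and the three-person network is made up for illustration.

```r
library(dplyr)
library(tidygraph)

# Hypothetical toy edgelist: b and c each reply to a; a replies to b
toy_edges <- tibble(
  from = c("b", "c", "a"),
  to   = c("a", "a", "b")
)

# Same pattern as the slide: build the graph, then add in-degree as "popularity"
toy_graph <- as_tbl_graph(toy_edges) %>%
  mutate(popularity = centrality_degree(mode = 'in'))

# Pull the node table back out as a tibble to inspect it
toy_popularity <- toy_graph %>%
  activate(nodes) %>%
  as_tibble()

toy_popularity
```

In this toy network, `a` receives two replies, `b` receives one, and `c` receives none, so `popularity` here is a count of incoming edges per node.

---

#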
Try it Out!

What do you think this code will do?

```r
# ggraph() and the geom_edge_*/geom_node_* layers come from {ggraph}
library(ggraph)

network_graph2 %>%
  ggraph(layout = 'kk')
```

![](3-mooc-network_files/figure-html/unnamed-chunk-4-1.png)<!-- -->

---

# Try it Out!

What do you think this code will do?

```r
network_graph2 %>%
  ggraph(layout = 'kk') +
  geom_edge_arc()
```

![](3-mooc-network_files/figure-html/unnamed-chunk-5-1.png)<!-- -->

---

# Try it Out!

What do you think this code will do?

```r
network_graph2 %>%
  ggraph(layout = 'kk') +
  geom_edge_arc() +
  geom_node_point()
```

![](3-mooc-network_files/figure-html/unnamed-chunk-6-1.png)<!-- -->

---

# Try it Out!

What do you think this code will do?

```r
network_graph2 %>%
  ggraph(layout = 'kk') +
  geom_edge_arc() +
  geom_node_point(alpha = .4, aes(size = popularity)) +
  scale_size(range = c(1,10))
```

![](3-mooc-network_files/figure-html/unnamed-chunk-7-1.png)<!-- -->

---

# Try it Out!

What do you think this code will do?

```r
network_graph2 %>%
  ggraph(layout = 'kk') +
  geom_edge_arc(alpha = .2,
                width = .5,
                strength = .5,
                color = 'steelblue') +
  geom_node_point(alpha = .4, aes(size = popularity)) +
  scale_size(range = c(1,10))
```

![](3-mooc-network_files/figure-html/unnamed-chunk-8-1.png)<!-- -->

---

#
Try it Out!

What do you think this code will do?

```r
# theme_wsj() and scale_colour_wsj() come from {ggthemes}
library(ggthemes)

sociogram2 <-
  network_graph2 %>%
  ggraph(layout = 'kk') +
  geom_edge_arc(alpha = .2,
                width = .5,
                strength = .5,
                color = 'steelblue'
  ) +
  geom_node_point(alpha = .4, aes(size = popularity)) +
  scale_size(range = c(1,10)) +
  theme_wsj() +
  scale_colour_wsj("colors6") +
  theme(axis.line=element_blank(),
        axis.text.x=element_blank(),
        axis.text.y=element_blank(),
        axis.ticks.x =element_blank(),
        axis.ticks.y =element_blank(),
        axis.title.x=element_blank(),
        axis.title.y=element_blank(),
        panel.background=element_blank(),
        panel.border=element_blank(),
        panel.grid.major=element_blank(),
        panel.grid.minor=element_blank())

sociogram2
```

![](3-mooc-network_files/figure-html/unnamed-chunk-9-1.png)<!-- -->

---

#
Picture it! <img src="output/3-mooc-network.png" width="600px" style="display: block; margin: auto;" /> --- #
Look closer!

**MOOC Discussion - Social Network Analysis**

There are quite a few descriptive measures of networks:

- **Order:** number of nodes/vertices (students, in this case)
- **Size:** number of edges/connections (responses, in this case)
- **Reciprocity:** mutuality
- **Transitivity:** clustering
- **Diameter:** similar to degrees of separation
- **Density:** out of all possible connections, the proportion that have been made
- **Node degree:** number of connections
- **Sentiment score:** how positive or negative in aggregate
- Character count, Word count, Length of threads

---

#
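Look closer!

Several of the measures listed above can be checked by hand on a very small network. The sketch below is an illustration under stated assumptions, not part of the MOOC-Ed analysis: the four-edge network and the names `toy_edges`, `toy_graph`, and `toy_measures` are hypothetical.

```r
library(igraph)

# Hypothetical toy network: a and b reply to each other,
# b replies to c, and c replies to a
toy_edges <- data.frame(
  from = c("a", "b", "b", "c"),
  to   = c("b", "a", "c", "a")
)
toy_graph <- graph_from_data_frame(toy_edges, directed = TRUE)

toy_measures <- c(
  order       = gorder(toy_graph),       # number of nodes
  size        = gsize(toy_graph),        # number of edges
  reciprocity = reciprocity(toy_graph),  # share of edges that are returned
  diameter    = diameter(toy_graph),     # longest shortest path
  density     = edge_density(toy_graph)  # edges made / edges possible
)

toy_measures
```

With three nodes there are 3 × 2 = 6 possible directed edges and 4 are present, so density is 4/6. Two of the four edges (a to b, and b to a) are returned, so reciprocity is 0.5.

---

#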
Look closer!

**Order:** number of nodes/vertices (students, in this case)

```r
library(igraph)

gorder(network_graph2)
```

```
## [1] 442
```

<hr>

**Size:** number of edges/connections (responses, in this case)

```r
gsize(network_graph2)
```

```
## [1] 2529
```

---

# Look closer!

**Reciprocity:** mutuality

```r
reciprocity(network_graph2)
```

```
## [1] 0.1997544
```

<hr>

**Transitivity:** clustering

```r
transitivity(network_graph2)
```

```
## [1] 0.08880774
```

---

# Look closer!

**Diameter:** similar to degrees of separation

```r
diameter(network_graph2)
```

```
## [1] 8
```

<hr>

**Density:** out of all possible connections, the proportion that have been made

```r
edge_density(network_graph2)
```

```
## [1] 0.01297442
```

---

# Look closer!

**Node degree:** number of connections

```r
mean(degree(network_graph2))
```

```
## [1] 11.44344
```

<hr>

```r
degree(network_graph2) %>% mean()
```

```
## [1] 11.44344
```

<hr>

```r
median(degree(network_graph2))
```

```
## [1] 4
```

---

#
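Look closer!

The mean degree above (11.44) is well above the median (4). One possible reading is that a few highly connected nodes pull the mean up. The sketch below shows that pattern on a made-up star-shaped network; `toy_edges`, `toy_graph`, `toy_degree`, `toy_in`, and `toy_out` are hypothetical names, and nothing here uses the MOOC-Ed data.

```r
library(igraph)

# Hypothetical star-shaped network: four people each reply once to a hub, "a"
toy_edges <- data.frame(
  from = c("b", "c", "d", "e"),
  to   = c("a", "a", "a", "a")
)
toy_graph <- graph_from_data_frame(toy_edges, directed = TRUE)

# Total degree counts incoming and outgoing edges together
toy_degree <- degree(toy_graph)
mean(toy_degree)
median(toy_degree)

# Splitting by direction shows who receives replies and who sends them
toy_in  <- degree(toy_graph, mode = "in")
toy_out <- degree(toy_graph, mode = "out")
```

Here the hub has degree 4 and everyone else has degree 1, so the mean (1.6) sits above the median (1). The same `mode` argument could be applied to `network_graph2` to separate replies received from replies sent.

---

#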
MOOC Network data

```r
nodes2 <- read_csv("data/DLT1 Nodes.csv", show_col_types = FALSE)

glimpse(nodes2)
```

```
## Rows: 445
## Columns: 13
## $ UID         <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,…
## $ Facilitator <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ role1       <chr> "libmedia", "classteaching", "districtadmin", "classteachi…
## $ experience  <dbl> 1, 1, 2, 2, 3, 1, 2, 1, 1, 2, 3, 3, 2, 1, 3, 1, 1, 1, 3, 1…
## $ experience2 <chr> "6 to 10", "6 to 10", "11 to 20", "11 to 20", "20+", "4 to…
## $ grades      <chr> "secondary", "secondary", "generalist", "middle", "general…
## $ location    <chr> "VA", "FL", "PA", "NC", "AL", "AL", "SD", "BE", "NC", "NC"…
## $ region      <chr> "South", "South", "Northeast", "South", "South", "South", …
## $ country     <chr> "US", "US", "US", "US", "US", "US", "US", "BE", "US", "US"…
## $ group       <chr> "UZ", "DL", "OT", "N", "AC", "AC", "OT", "AC", "N", "N", "…
## $ gender      <chr> "female", "female", "female", "female", "female", "female"…
## $ expert      <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ connect     <chr> "1", "0", "1", "0", "0", "1", "0", "0", "0", "0", "1", "0"…
```

---

#
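MOOC Network data

The node file has 445 rows, while the graph built from the edgelist has 442 nodes, presumably because a graph built from an edgelist only contains people who appear in at least one edge. A natural next step is to attach node attributes such as `role1` to the graph. The sketch below shows one way this might be done with a join on made-up data: `toy_edges`, `toy_nodes`, `toy_graph`, and `toy_node_table` are hypothetical, and only the column names `Sender`, `Receiver`, `UID`, and `role1` are borrowed from the real files.

```r
library(dplyr)
library(tidygraph)

# Hypothetical edgelist and node table; person 4 never posts or receives
toy_edges <- tibble(Sender = c(1, 2, 2), Receiver = c(2, 1, 3))
toy_nodes <- tibble(
  UID   = c(1, 2, 3, 4),
  role1 = c("classteaching", "libmedia", "districtadmin", "classteaching")
)

# as_tbl_graph() stores node IDs as character in `name`,
# so convert UID to character before joining
toy_graph <- as_tbl_graph(toy_edges) %>%
  activate(nodes) %>%
  left_join(toy_nodes %>% mutate(name = as.character(UID)), by = "name")

# Node table with attributes attached; person 4 is absent from the graph
toy_node_table <- toy_graph %>%
  activate(nodes) %>%
  as_tibble()

toy_node_table
```

With attributes attached, a plot could map something like `aes(color = role1)` in `geom_node_point()`. Whether `UID` in the node file matches `Sender` and `Receiver` in the edgelist is an assumption worth verifying before running this on the real data.

---

#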
MOOC Network data

```r
head(nodes2, 10)
```

```
## # A tibble: 10 × 13
##      UID Facilitator role1   exper…¹ exper…² grades locat…³ region country group
##    <dbl>       <dbl> <chr>     <dbl> <chr>   <chr>  <chr>   <chr>  <chr>   <chr>
##  1     1           0 libmed…       1 6 to 10 secon… VA      South  US      UZ
##  2     2           0 classt…       1 6 to 10 secon… FL      South  US      DL
##  3     3           0 distri…       2 11 to … gener… PA      North… US      OT
##  4     4           0 classt…       2 11 to … middle NC      South  US      N
##  5     5           0 othere…       3 20+     gener… AL      South  US      AC
##  6     6           0 classt…       1 4 to 5  gener… AL      South  US      AC
##  7     7           0 instru…       2 11 to … gener… SD      Midwe… US      OT
##  8     8           0 specia…       1 6 to 10 secon… BE      Inter… BE      AC
##  9     9           0 classt…       1 6 to 10 middle NC      South  US      N
## 10    10           0 school…       2 11 to … middle NC      South  US      N
## # … with 3 more variables: gender <chr>, expert <chr>, connect <chr>, and
## #   abbreviated variable names ¹experience, ²experience2, ³location
```

---
class: inverse, center, middle

#
<br><br> Comparing <br> Class & MOOC Discussion

---

# Picture it!

<img src="output/3-both-networks.png" width="540px" style="display: block; margin: auto;" />

---

# Picture it!

<img src="output/3-comparison-table.png" width="100%" style="display: block; margin: auto;" />

---
class: inverse, center, middle

# <br><br> Try on your own!

---

# Try on your own!

- Download a copy of this repository.
- Use the saved data in the "data" folder to play around a bit more, changing different parameters.
- Reflect:
    - What other comparisons might you make?
    - How else might you analyze these data?

---
class: inverse, center, middle

#
<br><br> Appendix: <br> Helpful Resources <br> and Troubleshooting

---

# Resources

**Beginners:**

- [RStudio Beginners' Guide](https://education.rstudio.com/learn/beginner/)
- Book: [*Data Science in Education Using R*](https://datascienceineducation.com)
    - See [Chapter 12](https://datascienceineducation.com/c12.html) - Walkthrough 6: Exploring Relationships Using Social Network Analysis With Social Media Data
    - [Physical copy of DSIEUR](https://www.routledge.com/Data-Science-in-Education-Using-R/Estrellado-Freer-Mostipak-Rosenberg-Velasquez/p/book/9780367422257)
    - [Even more resources from DSIEUR](https://datascienceineducation.com/c18.html)

**Intermediates:**

- [RStudio Intermediates' Guide](https://education.rstudio.com/learn/intermediate/)
- [{tidytags} package notes](https://docs.ropensci.org/tidytags/index.html)
- Book: [*R for Data Science*](http://r4ds.had.co.nz/)

**Experts:**

- [RStudio Experts' Guide](https://education.rstudio.com/learn/expert/)
- Book: [*Learning Statistics with R*](https://learningstatisticswithr.com/)
- [*Data Science in Education Using R*](https://datascienceineducation.com)
    - See [Chapter 20.3 Appendix C](https://datascienceineducation.com/c20.html#c20c) - Social Network Influence and Selection Models
- SNA resources: [Dr. Ken Frank's website](https://sites.google.com/msu.edu/kenfrank/social-network-resources)

---

# Troubleshooting

- Try to find out what the specific problem is
    - Identify what is *not* causing the problem
- "Unplug and plug it back in" - restart R; close and reopen R
- Seek out workshops and other learning opportunities
- Reach out to others! Sharing what is causing an issue can often help to clarify the problem
    - [RStudio Community forum](https://community.rstudio.com/) (highly recommended!)
    - Twitter hashtag: [#RStats](https://twitter.com/search?q=%23RStats&src=typeahead_click&f=live)
    - [Contact Bret!](https://bretsw.com)
- General strategies on learning more: [Chapter 17 of *Data Science in Education Using R*](https://datascienceineducation.com/c17.html)