From Wrangling to Writing - Analyzing Data and Creating Production-Level Code


For our April event, we will have two guest speakers. Megan Robertson will talk about how to write production-level data science code, and Natalie O’Shea will discuss how to use the nyccensus package and how to wrangle and use census data for equity.

This event will be hosted over Zoom, and the link will be available on this Meetup page at noon on the day of the event.

Agenda (times approximate):

6:05 - 6:10 pm: R-Ladies NYC Announcements
6:10 - 6:40 pm: Natalie’s talk on NYC Census data with time for Q&A
6:40 - 7:00 pm: Megan’s talk on creating production level data science code with time for Q&A

— Talk Abstract —

“Introducing nyccensus: Leveraging Census Data for Equity in NYC”
Speaker: Natalie O’Shea

Abstract: This talk will introduce the nyccensus package and associated Shiny dashboard and share examples of how census data can be leveraged to help address equity issues. The nyccensus package is a simple data-sharing package meant to ease the pains of accessing and wrangling publicly available response rate data from the 2020 Decennial Census, as well as other relevant demographic data from the American Community Survey, at various geographic levels in New York City. I will share some lessons learned and examples of how these data can be used to better understand and address equity issues in NYC.

Bio: Natalie O’Shea is a researcher and data visualization specialist with a passion for equity, accessibility, and human-centered design. She started her data science journey during graduate school, falling in love with all things R in her coursework and research. She was part of CUNY’s first Data for Public Good cohort before taking a position as a Community Outreach Data Analyst with the NYC Census 2020 initiative and then working as a Data Visualization Specialist at NYC Test & Trace Corps. She is currently working as a Data Visualization Analyst on the Research and Learning Engineering team at Edmentum.

“Creating Production Level Data Science Code”
Speaker: Megan Robertson

Abstract: Writing code is a big part of working as a data scientist. You write code throughout all stages of a project, from exploratory data analysis to building models. This talk focuses on creating production code from a proof-of-concept analysis or model. How do you organize and adapt all the code created to get to a proof of concept? Do you feel overwhelmed by everything you need to do? You’ll walk away from this talk with tips and strategies to make your own production code.

Bio: Megan Robertson is a Senior Data Scientist at Nike. She has multiple years of experience applying data science in retail settings. She first became interested in math and statistics through sports analytics, and wrote her Master’s thesis in collaboration with an NBA team. Her background includes training in Bayesian methodology, statistical modeling, machine learning and more. Megan has delivered multiple talks on various aspects of data science from building a career and project management to writing code. She is currently writing her first data science book, Mastering Communication in Data Science, in partnership with Manning.
