🌟Creating a Data Science Portfolio Project from Start to Finish: Data Acquisition, Analysis, Modeling, and Presentation🌟
This talk will center around an example data science project that one might use to start building a portfolio of work. A portfolio of data science projects can be useful in a number of different ways: to demonstrate your knowledge and practical skills to potential employers; to learn a new technology or methodology for your own growth and development; or even as a way to have something to talk about at a PyData meetup!
Since this talk is a day after the election in November, I will be using a number of different online sources to pull past election results and dive into some of the local elections both in King County and Washington State. Can we predict who will win between two candidates based purely on their candidate statement? What impact does spending have on the outcome? Are there other insights we can gain or questions we can ask of the data?
I will walk through the different elements and techniques I use to develop this data science project in Python, including:
* Web scraping and data acquisition with BeautifulSoup
* EDA (Exploratory Data Analysis) using Jupyter and Matplotlib
* Text analytics and modeling with scikit-learn
* Showcasing the results with Flask
All of the code will be made available prior to the event so that you may follow along or run it on your own machine. The talk is intended to be accessible to people of all skill levels!
About the Presenter:
Jayson Stemmler is a data scientist with Neal Analytics located in Kirkland, WA. He focuses primarily on time series forecasting in the finance space.
Prior to his work at Neal, he spent 5 years as a Research Scientist for the University of Washington Department of Atmospheric Science, where he dealt with all manner of data tasks in Python.
Jayson has a BS in Atmospheric Science with a minor in Applied Math from the University of Washington and an MS in Atmospheric Science from the University of Wyoming Department of Engineering and Applied Science.
Geospatial data analysis with Python + demo
Although geospatial data analysis has been around for years, a myth persists that it requires advanced coding skills along with elements of data analysis, cartography, web development, and database management, among others. This talk gives the audience an introduction to geospatial data handling and aims to spark an interest in getting their hands dirty with quick access to geotagged data.
The talk includes a project demo: a spatial analysis of tweets relevant to the #MeToo movement, which began in 2017. Where did it begin? How did it spread? All this and more will be answered in the talk!
💚💙Thank you all for your support of @NumFOCUS. Your participation helps us bring awareness to NumFOCUS, a 501(c)(3) nonprofit that supports and promotes world-class, innovative, open source scientific computing projects for data science, including Pandas, NumPy, SymPy, IPython, Jupyter, Matplotlib, and Julia.
Learn more on the blog:
http://www.numfocus.org/blog/jupyter-2017-acm-software-system-award