Hello everyone!
It's really nice to see like-minded people here. I'm a big fan of David Robinson's TidyTuesday screencasts, and I was looking for TidyTuesday work done in Python. Unfortunately, there isn't much, and some of what exists isn't kept up consistently.
I found Michael Chow's TidyTuesday in Python screencasts (recommended by David) and found that his work is more R-ish than Python-ish. Michael Chow is the creator of siuba. After some research, I decided to replicate most of David's analyses using only pandas and plotly. The first time I tried to replicate one of his analyses, it took me a whole week to get about 40% of his results (R was not my thing). I followed every step David took in his analysis and still couldn't reproduce his results. I failed!
That failure motivated me to read the pandas documentation and pushed me to google a lot of questions like the following:
- What is the equivalent of fct_lump in Python pandas?
- What is the equivalent of fct_floor in Python pandas?
- What is the equivalent of gather() in Python pandas?
- What is the equivalent of unnest() (tidytext) in Python NLTK?
- And many more "What is...?" questions.
I could find answers for some, but in other situations I had to implement the functions myself in pure Python.
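For a few of the questions above, here is a minimal sketch of how the pandas side might look. It is only an illustration under assumptions: the data is made up, `lump_top_n` is a hypothetical helper (not a pandas function) that roughly mimics fct_lump by keeping the n most frequent values, `melt` stands in for gather(), and the `str.findall` + `explode` step is a rough stand-in for tidytext-style tokenizing, done in pandas rather than NLTK.

```python
import pandas as pd


def lump_top_n(series: pd.Series, n: int, other: str = "Other") -> pd.Series:
    """Keep the n most frequent values and replace the rest with `other`.

    Hypothetical helper, roughly in the spirit of forcats::fct_lump(n = ...).
    """
    top = series.value_counts().nlargest(n).index
    return series.where(series.isin(top), other)


# Made-up data, for illustration only.
df = pd.DataFrame(
    {
        "country": ["US", "US", "US", "UK", "UK", "FR", "DE"],
        "y2019": [1, 2, 3, 4, 5, 6, 7],
        "y2020": [7, 6, 5, 4, 3, 2, 1],
    }
)

# Lump rare categories together.
df["country_lumped"] = lump_top_n(df["country"], n=2)

# Wide-to-long reshape, similar in spirit to tidyr::gather().
long_df = df.melt(
    id_vars=["country", "country_lumped"],
    value_vars=["y2019", "y2020"],
    var_name="year",
    value_name="value",
)

# One row per word: lowercase, pull out word-like chunks, then explode the lists.
text_df = pd.DataFrame(
    {"line": [1, 2], "text": ["Tidy data is tidy", "Pandas can do it"]}
)
tokens = (
    text_df.assign(word=text_df["text"].str.lower().str.findall(r"[a-z']+"))
    .explode("word")
    .drop(columns="text")
    .reset_index(drop=True)
)
```

Ties and missing values are handled differently than in forcats, so the lumped result may not match fct_lump exactly on real data.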
After two or three screencasts, I started to feel comfortable with his work and with R (tidyverse) syntax. I am more of a Python guy and I don't know much R programming, but now I even know how to do data analysis with the tidyverse. Most importantly, I used to be someone who didn't know what to do with a dataset in pandas: how to stack datasets, how to melt them, or how to do a conditional count after a groupby.
Now I know that I can level up my pandas and data wrangling skills in Python by doing this project.
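As a rough illustration of two of the things mentioned above (stacking datasets and a conditional count after a groupby), here is a small sketch. The data, column names, and the "score of 5 or more" condition are all made up for the example; this is one possible way to do it in pandas, not the only one.

```python
import pandas as pd

# Made-up data, for illustration only.
week1 = pd.DataFrame({"team": ["A", "B", "A"], "score": [10, 3, 7]})
week2 = pd.DataFrame({"team": ["B", "A", "B"], "score": [8, 2, 9]})

# Stack the two datasets row-wise (similar in spirit to dplyr::bind_rows()).
games = pd.concat(
    [week1.assign(week=1), week2.assign(week=2)],
    ignore_index=True,
)

# Conditional count after groupby:
# per team, how many games were played and how many had a score of 5 or more?
summary = (
    games.assign(high=games["score"] >= 5)
    .groupby("team")
    .agg(n_games=("score", "size"), n_high=("high", "sum"))
    .reset_index()
)
```

Summing a boolean column counts its `True` values, which is why `("high", "sum")` works as a conditional count here.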
Plots created with Plotly
This project helps me -
- Know what kind of questions we should ask
- Know how to think like a data scientist
- Know how to approach different types of datasets
- Know a small amount of R (tidyverse).
- Know how to do advanced data analysis using pandas and numpy.
- Know how to make plots using plotly.
- Replicate all of David's TidyTuesday screencasts using Python
- Build Dashboards using plotly + dash
- Planning to build/translate/port some useful R packages into Python on top of pandas, much like keras is to TensorFlow and fastai is to PyTorch (I'm not smart enough to do this yet! :) )
- Write blog posts about my works (How to TidyTuesday using Python, etc).
- Don't worry! Everything is figure-out-able and doable using Python and pandas.
- Join Plotly Community, ask a lot of questions.
- Install RStudio to reproduce David's code line by line, to better understand what he did in his screencasts. (David does data analysis like he is speedrunning :) )
- Watch the screencasts bit by bit, with his code open in a new tab (the code can be found in the descriptions)