In support of the #66daysofdata initiative and the data science community, this repo is a TIL collection of data engineering and data science code and experiments for 66 days+. Just updated 2025H1 - GenAI*
"Why 66 days? Because it is the average amount of days needed to establish a new habit. Creating solid data science habits is one of the most powerful things we can do to have longevity in this ever dynamic field of technology. (Ken Jee)"
- LLMs in Production: A Comparison of GCP, Azure, and AWS by Rafal Lagowski
- Chain-of-Verification Reduces Hallucination in Large Language Models by Shehzaad Dhuliawala, et al.
- FACTIFY: A Multi-Modal Fact Verification Dataset by Ashwarya Reganti, et al.
- LLM Engineering Resources by Ed Donner, along with an 8-week hands-on course
- Deep Learning Recommender, Survey and Perspectives, by Zhang, Yao, et al.
- Develop a NLP Model in Python & Deploy It with Flask, by Susan Li
- NYU Deep Learning SP20, speakers Yann LeCun, Alfredo Canziani
- Redox Git and its HL7v2 parser/generator
- Fireside Chat with Aparna Ramani & Yann LeCun at May 2022 Data@Scale, which touches on self-supervised machine learning and emerging DS trends, along with Ms. Ramani's keynote
- Data Engineer Newsletter by Zach Wilson with even more DE resources
Microsoft Excel is still widely used today for starting analyses, e.g. in healthcare, government, and non-profits. Here is a sampling of useful resources.
- Via David Langer, free and paid training on Excel for Business and R, courses on Regression and Machine Learning, and a YouTube channel
- Via Chandoo, who used to offer a lot of free material and now also offers paid training
If you like this list, consider supporting me!
Nod to the authors above for their respective works.

