“There are plenty of resources, and I’m confused if I’m on the right path. Could you please help me out?” she said, sounding worried.
I wanted to scream, “You’ve been doing so well — I can’t believe you even doubt yourself!” but I held back, letting her finish her thoughts.
To give you context, I’m a senior data scientist, and in my free time, I mentor data enthusiasts to break into the field of data science. I look forward to these interactions as they give me a sense of what problems are faced by beginners and how I can help better.
The most recent concern they raised was both surprising and troubling.
“There’s a lot of information on how to become a data scientist to the extent that it’s overwhelming. There are plenty of resources, and I’m confused if I’m on the right path. Could you please help me out?”
What surprised me was that not one but many of my mentees had this concern. Despite doing everything right, they often need reassurance and want me to review their progress.
While I am happy to help whenever I can, I realized this could be a common problem among beginners. So I set out to identify the cause and find a solution.
It’s Not You — It’s Us
We give you 52-week roadmaps, 25 resource sheets, 7 cheat sheets, and 101-page PDFs, and expect you not to be overwhelmed. Every other week there’s a new blog post on becoming a data scientist, and we expect you to pick out the best among them.
That’s obviously not fair. You, as a beginner, will have a hard time being confident that you’re on the right path. Something has to change, so I decided to simplify the process of getting started in data science as much as I can.
This article will give you only 4 must-learn fundamental resources that you can use to get yourself started. Then, finally, I’ll tell you some hard-hitting truths to help you stay focused.
Let’s dive in, shall we?
The 4 Fundamental Courses to Help You Get Started
If you’re an absolute beginner, cut out all the noise on the internet and follow these courses in order. You can do them in parallel too, and if you stay on schedule, they shouldn’t take you more than 6 months. All 4 of the courses below are free to view and require payment only if you want a certificate.
Offered by the University of Michigan, this course teaches Python in a data-science-focused manner. It goes from data wrangling to data analysis to visualization to text mining and network analysis. It gradually takes you through a programming journey without diving deep into the theory.
It’s a great hands-on start for getting an overall understanding of the machine learning workflow. The exercises and assignments reinforce the hands-on nature of the course. So give this course a go, and you’ll know what I’m talking about.
It’ll be tempting to skip statistics, but sooner or later you’ll regret it. This course was worth all the time I patiently invested in it. Slowly, I started understanding all the statistical concepts.
The course is packed with examples, case studies, and exercises that are helpful for a beginner. I still use these concepts at work today, so you should gain clarity on these topics early on.
As you advance through this course, you’ll slowly feel yourself breaking into data science and machine learning. Many professionals, including me, owe most of our knowledge to this single course.
The only drawback is that the course dates back to 2012 and uses MATLAB/Octave for the assignments. You can follow Python versions of the same course’s assignments, which are available on YouTube.
Most people ignore SQL, the language of data, until they realize its importance.
Sooner or later, you’ll be required to use SQL heavily in your day-to-day job — some roles are entirely focused on SQL, so you must master it early on. I’ve tried multiple courses, but this one directly focuses on what we need from a data scientist's perspective.
It’s more than sufficient if you work through the first 2 courses of the specialization. The last two are quite advanced and will only be helpful when you start working on big data in a distributed setting.
Here are Some Hard-hitting Truths to Stay Focused in Your Journey
Of course, there are always alternatives to the courses above. There’ll be groups of people arguing over Python vs. R, projects vs. courses, hands-on vs. theory-first, and which courses are the best for each topic. Here’s the truth about all these debates:
- Opinions are biased and based on individual preferences.
- There exists more than one path to succeed in data science.
- Too much information (information overload) overwhelms you and derails you from every path.
- You need to stay focused on at least one path consistently to become a data scientist.
- The path you choose must be a simple one that helps you take action.
This article focuses on the bare minimum you need to get started. I want you to take action without worrying too much. So the next time one of my mentees gets confused by all the resources out there, I’m going to send them this.
I’m sure that when you’re done with these courses, you’ll have a clear sense of what to learn next and how to approach applying for jobs. If you still aren’t sure, please feel free to reach out; I’m more than happy to help you.
As a note of disclosure, this article may have some affiliate links to share the best resources I’ve used at no extra cost to you. Thanks for your support!