From the perspective of a biologist

How a biologist navigates his way through the data revolution

SudoPurge
Towards Data Science


This is my first Medium article, so to begin with, I would like to talk a little about myself. I am currently studying Molecular, Cellular, and Developmental Biology with a focus on Bioinformatics at the University of Alberta, Canada. I previously had literally zero background in Computer Science. In fact, I am taking my first CS class right now, in my second year of University.

Like most biologists, I lacked CS skills and was not the best at mathematics either. I took a Calculus class in my first year and finished with a GPA of 1.3. Yeah… But that is not really an accurate representation of my mathematical capabilities. It was my first year of University. I hated the Prof for overcomplicating most of the topics (I had learned those concepts on my own the previous year, when I was introduced to Calculus in high school and was awestruck by its elegance), and, well, I was just reckless. It was your regular first semester of University, when you are trying out so many new experiences and Calculus is really not your first priority, to say the least. I guess that is justified: your first semester of college is reserved for being a little reckless, because you won’t be able to afford it later on. At least that’s how I see it.

Anyhow, the first time I explored programming was when the pandemic hit and I had nowhere to go. I had been peering into the realm of programming and Bioinformatics for almost a year but was never motivated enough to get down to it. Every time I tried, even the process of setting up Python and an IDE seemed extremely intimidating, and inevitably I would give up in an instant. Now the pandemic was keeping me inside with no friends or family around (it was the first summer break of uni, I am an international student, and all my friends had gone back home because of Covid) and nothing else to do, so I had ample time to play around with this new thing I had been thinking about for so long. So I jumped right into it.

I opened up YouTube and searched for “Python tutorials”. Python was just the first language that came to mind, and I was lucky to find just the right tutorial. What made it stand out for me was how the instructor made the concepts relatable and simple, yet not oversimplified.

It's not that I followed only this tutorial and became proficient with Python. In fact, to this day, I have watched only around the first 60% of the video. But that was enough to introduce me to the basic ideas of programming languages in general, within the context of Python. After a while, I also looked at some basic Java tutorials, one of which particularly suited me, and this helped me further understand how programming languages, especially Object-Oriented Programming languages, are put together in terms of their grammar and structure.

While following this tutorial, I also looked up ways to learn a programming language effectively. Some of these tips were essential to my development as a programmer and as an independent learner. I would watch countless other videos to familiarize myself with a concept.

Every time I came across a new concept, I would watch a few videos about it on YouTube, and this was effective in the beginning. I would watch videos on the same concept from several YouTubers, just to understand it from different perspectives. This helped me tremendously, because if one instructor could not explain the topic clearly enough or in a way that worked for me, I would have another one ready. It was all about trial and error and seeing who fit my learning style best.

Later on, as I got to more complex concepts, I had to keep a notebook to write down what I understood. Not that I would look at these notes later. The act of retrieving the concepts from my head and writing them down in my own words right after learning something new was enough to embed them in my memory, at least for a few weeks. Of course, it got easier with time.

Another crucial practice was to use a mixture of top-down and bottom-up approaches throughout my learning journey. In fact, I still do so today. The bottom-up approach is when you learn the basics first, then move on to slightly more complex concepts, and eventually apply them. Top-down is when you select a particular application or use-case and then try to learn everything required to understand it. I decided to follow a bottom-up approach for the long term and a top-down approach for the short term. Let me elaborate. I started with relatively simple concepts like loops, data types, variables, etc., and then slowly moved up to more complex ones. This is quite standard. What I did differently was that, at each stage, I picked a particular use-case and tried to understand everything it required.

My choice of use-cases got more and more complex over time as I kept up with the long-term bottom-up strategy. For instance, I initially tested and applied what I learned from that first tutorial on biology problems from the Rosalind website. This was what got me interested in the first place: answering biological questions using modern computational tools. When these problems got too complicated for my limited knowledge, I moved on and decided to create a Space Invaders-style game with a Covid-19 theme. I followed a generic Space Invaders tutorial using Pygame but added my own modifications to turn it into a Covid-19 themed game. The project was fun and relatable. Most importantly, it was one I was fairly comfortable with given my skill level at the time: I could apply the basics I had already learned, yet it still gave me ample opportunities to pick up new concepts as I completed it. Striking this balance when choosing a use-case or project is probably the most important part of learning independently.

Image created by Author. The objective is to kill the viruses using injection droplets (pressing Space) before they get to the essential workers at the bottom. GitHub
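
To give a flavor of the kind of beginner biology exercise mentioned above, here is a minimal sketch in Python of a nucleotide-counting task, similar in spirit to Rosalind's introductory "Counting DNA Nucleotides" problem. The function name `count_nucleotides` and the example sequence are illustrative assumptions, not taken from the site or from the tutorial described in this article.

```python
from collections import Counter


def count_nucleotides(dna: str) -> dict:
    """Count how often each of A, C, G and T occurs in a DNA string."""
    # Upper-case first so that "acgt" and "ACGT" are treated the same.
    counts = Counter(dna.upper())
    # Report the four bases in a fixed order; absent bases get a count of 0.
    return {base: counts[base] for base in "ACGT"}


if __name__ == "__main__":
    # Hypothetical short sequence, made up for this example.
    sequence = "AGCTTTTCATTCTGACT"
    for base, n in count_nucleotides(sequence).items():
        print(base, n)
```

A small task like this only needs strings, loops, and dictionaries, which is why it works well as an early top-down use-case for the basics.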

Fast forward to today: I am now proficient with many Python Data Science libraries and am teaching myself Machine Learning using datasets from Kaggle, like the Covid-19 research datasets or the FIFA 19 dataset, with a similar combination of the bottom-up and top-down approaches. I am by no means an expert in Data Science or even programming. I am just like you, a Data Science enthusiast trying to learn this fascinating field on my own using the vast amount of Open Source information available on the internet, which, frankly speaking, can get overwhelming at times. But so far, I think I am doing quite well, and if I can do it, so can you. Think of me as your study buddy rather than an expert. I love to write and talk about things that interest me, and this journey on Medium is a creative outlet where I combine my passion for Biology, Data Science, and Programming. Keep an eye out for more of my articles to learn how to teach these to yourself.

The data revolution is going to transform many fundamental ways of life in the near future, and the more I learn about it, the more I fall in love with it. Let me know in the comments if you want me to talk about any particular aspect of it. I like to get straight to the point, which will probably be a consistent trait of all my articles. Happy learning!

P.S. For more short, to-the-point articles on Data Science, Programming, and how a biologist navigates his way through the data revolution, consider following my blog.

With thousands of videos being uploaded every minute, it’s important to filter them so that you consume only good-quality content. I will email you hand-picked educational videos on the topics you are interested in learning. Sign up here.

Thank you for reading!
