I am a member of Data Science Central, a large website that serves as a repository for everything related to data science. From time to time I post articles there, and a few of them have become featured articles. The list of contributors to the site is large as well, and since most of them hold PhDs, the quality of the content is generally high.
This brings me to a post from Data Science Central that caught my attention. It addresses an important topic: the need to increase the number of people with data science skills in practically every economy in the world, whether underdeveloped, emerging or developed. The need is growing, and the value of the expertise translates into highly paid jobs, usually coupled with rewarding and satisfying career paths.
Below are some highlights of the post by Bernard Marr, along with the picture he chose for it:
So, barring a data sciences degree program (which are popping up at prestigious universities around the world) what steps do you need to take to become a data scientist?
- Brush up on your math and statistics skills. A good data scientist must be able to understand what the data is telling you, and to do that, you must have solid basic linear algebra, an understanding of algorithms and statistics skills. More advanced mathematics may be required for certain positions, but this is a good place to start.
- Understand the concept of machine learning. Machine learning is emerging as the next buzzword but it is inextricably linked to big data. Machine learning uses artificial intelligence algorithms to turn data into value and learn without being explicitly programmed.
- Learn to code. Data scientists must know how to manipulate code in order to tell the computer how to analyse the data. Start with an open source language like Python and go from there.
- Understand databases, data lakes and distributed storage. Data is stored in databases, data lakes or across distributed networks, and how those data repositories are built can often dictate how you can access, use, and analyse that data. Failing to see the big picture or think ahead when you construct your data storage can have far-reaching consequences.
- Learn data munging and data cleaning techniques. Data munging is the process of converting “raw” data to another format that is easier to access and analyse. Data cleaning helps eliminate duplication and “bad” data. Both are essential tools in a data scientist’s toolbox.
- Understand the basics of good data visualisation and reporting. You don’t have to become a graphic designer, but you do need to be well versed in how to create data reports that a lay person — like your manager or CEO — can understand.
- Add more tools to your toolbox. Once you’ve mastered the above skills, it’s time to expand your data science toolbox to include programs like Hadoop, R and Spark. Knowledge of and experience with these tools will set you above a great many data science job applicants.
- Practice. How do you practice data science before you have a job in the field? Develop your own pet project from open source data, enter competitions, network with working data scientists, join a bootcamp, volunteer or intern. The best data scientists will have experience and intuition in the field and be able to show their work to a recruiter.
- Become a part of the community. Follow thought leaders in the industry, read industry blogs and websites, engage, ask questions, and stay abreast of current news and theory.
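To make the data munging and data cleaning step from the list above a bit more concrete, here is a minimal Python sketch of my own (it is not part of Bernard Marr's post). The sample data, the column names and the `munge` and `clean` helpers are all hypothetical, and the sketch uses only the standard library. Munging converts the raw text into tidy records, and cleaning then removes the duplicate and the "bad" value.

```python
import csv
import io

# Hypothetical raw export: stray spaces, inconsistent casing,
# one duplicated row and one "bad" (non-numeric) age.
RAW = """name, city ,age
 Ana ,Lisbon,34
Ben,PORTO, 41
 Ana ,Lisbon,34
Carla,Faro,n/a
"""


def munge(raw_text):
    """Munging: convert raw CSV text into a list of tidy dictionaries."""
    reader = csv.reader(io.StringIO(raw_text))
    # Normalise the header names so they can be used as dictionary keys.
    header = [column.strip().lower() for column in next(reader)]
    records = []
    for row in reader:
        if not row:  # skip completely empty lines
            continue
        values = [value.strip() for value in row]
        record = dict(zip(header, values))
        record["city"] = record["city"].title()  # "PORTO" becomes "Porto"
        records.append(record)
    return records


def clean(records):
    """Cleaning: drop duplicate rows and rows whose age is not a number."""
    seen = set()
    cleaned = []
    for record in records:
        key = (record["name"], record["city"], record["age"])
        if key in seen:  # duplicate of an earlier row
            continue
        seen.add(key)
        if not record["age"].isdigit():  # "bad" data such as "n/a"
            continue
        cleaned.append({**record, "age": int(record["age"])})
    return cleaned


records = clean(munge(RAW))
for record in records:
    print(record)
```

On real projects a library such as pandas would usually do most of this work, but the two stages stay the same: first reshape the raw data, then remove what is duplicated or invalid.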
Sound like a lot? Well, it is. Data science isn’t for everyone, but for the interested and the dedicated, it can be incredibly rewarding. If you don’t have the money to attend a university program, check out the resources on this infographic, which spells out how to accomplish many of these steps with free resources around the web.
Featured Image: Mango Solutions