How to start with Data Science?

Data Scientist – How do you become one?

You don’t need a fancy degree for data science as long as you have the passion for it.

To become a data scientist, you don’t really need a specific university degree. As a matter of fact, you don’t need a university degree at all. Having one will definitely help you get hired faster, but you don’t have to work for a company to be a data scientist. Freelancing is becoming more and more popular, especially among data scientists, and last time I checked, freelance sites don’t really ask for a university degree.
Data Scientist – How do you become one?

What are the most important soft skills for data scientists?

Data scientists are thought to be statistics wizards and tech gurus.

I have written about what tech skills a data scientist needs. In most cases, these beliefs are true. Data scientists are expected, most of the time, to work wonders using algorithms and tools with fancy names. Everyone focuses on their technical knowledge and expertise, their prior tech experience, and the projects they have been part of. This is all great, it is needed, and it is a big part of a data scientist’s everyday work.
The most important soft skills for data scientists

What skills does a data scientist need and how to get them?

Upgrading your skills constantly is the way to stay on top. What skills do you need to become a Data Scientist?
I have written about this before, but I’ll try to add some more information to help people who really want to take that path.
How to get the rest of Data Science skills

What is a good platform for Data Science?

The field of data science is experiencing great disruptions that are making the work of data scientists easier. There are many data science and machine-learning software products available for free and for purchase on the market today.

These products are ideal for data scientists who recognize the benefits of integrating machine-learning capabilities into their tasks. They also target data-driven organizations that want to cut the cost of hiring expert data scientists.
Picking one can be really confusing, so it’s better to start with the ones that are easiest to use and best supported.
What tools do I choose for Data science?

How can I use the vast amounts of data I was asking for?

When you are starting with Data Science, you are often overwhelmed by all the terms and methodologies you read about. Because of this, when the time comes for your first Data Science project, you simply don’t know where to start. I believe that knowing where to start, whether you have a vast amount of data or an interesting project in mind, is essential.
I have been asked numerous times: “I got access to the database, can you please tell me how to use the data?” Since I am asked this more and more often, I took some time to answer it here and help people with this question.
Here I give some ideas on how to start utilizing your data.

Data Cleaning

One of the things that beginners in data science especially need to pay attention to is Data Cleaning.
Data Cleaning is one of the most important procedures for improving model performance.

Most of the time, when we collect data, we must clean it to ensure that it is as accurate and complete as it can be. Data cleaning can involve many checks. For example, let’s say a survey questionnaire was put online and the data was collected via a website.
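To make the survey example concrete, here is a minimal sketch of a few typical cleaning checks in plain Python. The field names (email, age, country), the sample rows, and the plausible age range are all made up for illustration; a real survey will need its own rules.

```python
from typing import Optional

# Hypothetical raw survey rows, as they might arrive from a web form.
RAW_RESPONSES = [
    {"email": " Ann@Example.com ", "age": "34", "country": "germany"},
    {"email": "ann@example.com", "age": "34", "country": "Germany"},  # duplicate
    {"email": "bob@example.com", "age": "-5", "country": "France"},   # impossible age
    {"email": "cara@example.com", "age": "", "country": " spain"},    # missing age
    {"email": "", "age": "41", "country": "Italy"},                   # no identifier
]


def parse_age(text: str) -> Optional[int]:
    """Return the age as an int, or None if it is missing or implausible."""
    try:
        age = int(text.strip())
    except ValueError:
        return None
    # Assumed plausibility range; adjust it to the survey's audience.
    return age if 0 < age < 120 else None


def clean_responses(rows):
    """Normalize text fields, drop rows without an email, remove duplicates."""
    seen_emails = set()
    cleaned = []
    for row in rows:
        email = row["email"].strip().lower()
        if not email or email in seen_emails:
            continue  # skip unidentifiable or repeated submissions
        seen_emails.add(email)
        cleaned.append({
            "email": email,
            "age": parse_age(row["age"]),
            "country": row["country"].strip().title(),
        })
    return cleaned


CLEANED = clean_responses(RAW_RESPONSES)
for record in CLEANED:
    print(record)
```

Invalid ages are kept as missing values rather than dropping the whole row, so the other answers from that respondent are not lost. Whether that is the right choice depends on the analysis you plan to run.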
How to clean your data?

Data scientists can make great reports too!

Data Scientists are expected to develop dashboards, reports, and visualizations. Visualization is the practice of presenting the final calculations, clues, and predictions.
Unfortunately, data scientists often fail to deliver a good dashboard or report that will be broadly used by their audience.

I keep hearing frustrated comments that the dashboards are too complicated, that they don’t use precise terminology, that there are too many filters, and that the reports take too long to load.
While talking with the audience, I often realize that many users of the reports and dashboards give up, frustrated by waiting for a report that never loads.
Some tips on how to develop cool reports

Predictive analytics, the process of building it

Can predictive analytics tell my future? The sad answer is NO. Predictive analytics will not tell you for certain whether you are going to be rich. It will not guarantee 100% that your favorite team will win, so that you can put all your savings on a bet. And it won’t tell you for sure where you will end up next year.

However, predictive analytics can definitely forecast and give hints about what might happen in the future with an acceptable level of reliability, and it can include risk assessment and what-if scenarios.
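As a rough illustration of a forecast that comes with a margin of error and a what-if scenario, here is a minimal sketch in plain Python. The monthly sales figures are made up, the model is a simple straight-line trend, and the band of two residual standard deviations is only a crude stand-in for a proper prediction interval.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residual_std(xs, ys, slope, intercept):
    """Standard deviation of the fit errors (n - 2 degrees of freedom)."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / (len(xs) - 2))


# Made-up monthly sales, for illustration only.
MONTHS = [1, 2, 3, 4, 5, 6]
SALES = [100.0, 108.0, 121.0, 127.0, 142.0, 150.0]

SLOPE, INTERCEPT = fit_line(MONTHS, SALES)
SPREAD = residual_std(MONTHS, SALES, SLOPE, INTERCEPT)

# Forecast for month 7 if the past trend simply continues.
FORECAST = SLOPE * 7 + INTERCEPT
print(f"Trend forecast: {FORECAST:.1f} +/- {2 * SPREAD:.1f} (rough band)")

# What-if scenario: growth slows to half of the fitted trend next month.
WHAT_IF = (SLOPE * 6 + INTERCEPT) + 0.5 * SLOPE
print(f"What if growth halves: {WHAT_IF:.1f}")
```

The forecast is a hint with an error band, not a promise, which is exactly the limit described above.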

When you start building predictive analytics projects, here are a few things to take into account.

Introduction to SQL Server Analysis Services

Analysis Services

SQL Server Analysis Services was one of the first suites, alongside solutions from IBM, SAS, and a few other vendors, to be offered as a complete data science toolbox for data mining, data exploration, and model creation. Surprisingly, it is still in use, because ten years ago many big organizations invested heavily in SQL Server platforms and employed dozens of analysts who work on SQL Server BI solutions.
Analysis Services is an analytical data engine used in decision support and business analytics. It provides enterprise-grade semantic data models for business reports and client applications such as Power BI, Excel, Reporting Services reports, and other data visualization tools.

A typical workflow includes creating a tabular or multidimensional data model project in Visual Studio, deploying the model as a database to a server instance, setting up recurring data processing, and assigning permissions to allow data access by end-users. When it’s ready to go, your semantic data model can be accessed by client applications supporting Analysis Services as a data source.

Azure Analysis Services – Supports tabular models at the 1200 and higher compatibility levels. DirectQuery, partitions, row-level security, bi-directional relationships, and translations are all supported. To learn more, see Azure Analysis Services.

SQL Server Analysis Services – Supports tabular models at all compatibility levels, multidimensional models, data mining, and Power Pivot for SharePoint.

The first step to set up SQL Server Analysis Services