photo credit: http://www.visualizeu.net/
I am often asked: "I got access to the database; can you tell me how to use the data?"
Since this question keeps coming up, I am taking the time to answer it here and help the people who are asking it.
What do you want to do with it?
First and foremost, what are you trying to do with the data?
Ask yourself, your manager, a friend, or whoever asked you to work with the data: what do you want the data to show you?
Data is only as powerful as your understanding of it. Here is how you can build that understanding:
Understand the schema
Databases, whether relational or non-relational, have schemas. A schema shows which tables or objects hold which attributes, and how those tables or objects are linked to each other.
What is an attribute? An attribute is anything descriptive: name, surname, address, profit, and so on.
What is a table or object? A table or object is the structure that holds and groups the attributes.
What are links or keys? Keys and foreign keys are the pieces of information that let you link one table to another. For example, to link profit to a salesperson, or an address to a person, you join the tables. This is mostly done with a foreign key, or by joining on several attributes at once, which together form a composite key.
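To make the idea of linking concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The table names (salesperson, sale), the columns, and the sample rows are all invented for illustration; your own schema will look different.

```python
import sqlite3

# Hypothetical schema: every table name, column name and row is made up.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE salesperson (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE sale (
    id             INTEGER PRIMARY KEY,
    salesperson_id INTEGER NOT NULL REFERENCES salesperson(id),  -- foreign key
    profit         REAL NOT NULL
);
INSERT INTO salesperson (id, name) VALUES (1, 'Ana'), (2, 'Boris');
INSERT INTO sale (salesperson_id, profit) VALUES (1, 100.0), (1, 50.0), (2, 70.0);
""")


def sales_with_names(connection):
    """Link each profit to a salesperson by joining on the foreign key."""
    query = """
        SELECT sp.name, s.profit
        FROM sale AS s
        JOIN salesperson AS sp ON sp.id = s.salesperson_id
        ORDER BY s.id
    """
    return connection.execute(query).fetchall()


for name, profit in sales_with_names(conn):
    print(name, profit)
```

The foreign key column sale.salesperson_id is what lets the query attach a name to every profit figure.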
How to learn the schema?
Some people try to learn the schema all at once, studying every table, its attributes, and how the tables link to each other.
My suggestion is to learn by doing. In the world of big data, database systems have become too complex to learn all at once, and most of the time you won't need to know all of it. Working through practical use cases helps you understand not just the table structure but also the underlying data.
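One practical way to learn by doing is to ask the database itself what it contains, one table at a time. The sketch below assumes a SQLite database and uses its sqlite_master table and the table_info and foreign_key_list pragmas; other database systems expose the same kind of information differently (for example through information_schema). The helper name describe_schema and the person and address tables are hypothetical.

```python
import sqlite3


def describe_schema(connection):
    """Return {table: {"columns": [...], "links": [(column, other_table, other_column)]}}."""
    tables = [row[0] for row in connection.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )]
    schema = {}
    for table in tables:
        # table_info rows are (cid, name, type, notnull, dflt_value, pk)
        columns = [col[1] for col in connection.execute(f'PRAGMA table_info("{table}")')]
        # foreign_key_list rows are (id, seq, table, from, to, on_update, on_delete, match)
        links = [(fk[3], fk[2], fk[4])
                 for fk in connection.execute(f'PRAGMA foreign_key_list("{table}")')]
        schema[table] = {"columns": columns, "links": links}
    return schema


# Hypothetical demo database, made up for this example.
demo = sqlite3.connect(":memory:")
demo.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE address (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER REFERENCES person(id),
    city      TEXT
);
""")

schema = describe_schema(demo)
for table, info in schema.items():
    print(table, info["columns"], info["links"])
```

Running something like this against only the tables your current use case touches is usually enough to get started.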
How do I get the data?
Usually, we use SQL to query databases. SQL is usually the most direct way to do it, and because the query runs inside the database it generally performs well.
You can also query from code: Java, .NET, R, Python, and many other languages.
Excel: you can query data easily from Excel, for example when creating a PivotTable.
Lately, data scientists also use tools like KNIME and Alteryx to fetch data. This approach does not require knowing a query language, but you risk downloading gigabytes of data into memory or onto disk if the table you are querying is that big.
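Here is a small sketch of querying from code, again with Python's built-in sqlite3 module. It shows two ways to avoid pulling a whole table into memory: let SQL do the aggregation so that only the summary rows come back, or read the rows in small chunks with fetchmany. The sale table, its columns, and the helper iter_rows are assumptions made up for the example.

```python
import sqlite3


def iter_rows(connection, sql, params=(), chunk_size=1000):
    """Yield rows a chunk at a time instead of loading the full result at once."""
    cursor = connection.execute(sql, params)
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield from chunk


# Hypothetical table and data, made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, profit REAL)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("north", 10.0), ("south", 20.0), ("north", 5.0)],
)

# Option 1: aggregate inside the database; only one row per region is returned.
summary = conn.execute(
    "SELECT region, SUM(profit) FROM sale GROUP BY region ORDER BY region"
).fetchall()

# Option 2: stream the raw rows in small chunks and add them up in Python.
streamed_total = sum(
    profit for _, profit in iter_rows(conn, "SELECT region, profit FROM sale", chunk_size=2)
)

print(summary)
print(streamed_total)
```

When the question can be expressed in SQL, the first option is usually the better choice, because the heavy lifting stays in the database.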
Use your imagination
Once you have your data, use your imagination to come up with use cases that will help your business.
Plain data is boring, which is why we visualize it.
An easy way to do that is with Excel. Excel is powerful on its own and can create attractive charts.
Other popular tools include Tableau, QlikView, Jaspersoft, and Highcharts.
Of course, there are countless other solutions that can create attractive charts.
One word of caution about visualization: don't over-visualize, because charts can quickly become confusing. Also, use a few colors rather than the whole color palette so that other people can follow you.
Create a story
Now that you have your use case and a cool visualization, try to create a story.
People will understand what you want to say, and may even get new ideas, if you present your analysis as a good story.
Happy data mining, now that you know what to do with your data!