Everything about Data

Effective approach to analyze correlation coefficients

Learn how to use the corrplot and corrr packages

Correlation analysis is a key task when you’re exploring any dataset. The principal objective is to find linear relationships between features that can help you understand the big picture. Probably the best way to see correlations between variables is to use scatterplots, but most of the time you’re working with a high-dimensional dataset with a large number of variables. In these situations you have two major problems: It’s computationally expensive to plot lots of scatterplots, especially if you have a big dataset.

How to automate exploratory plots?

An awesome package combo: ggplot2 and purrr

When you are plotting different charts during your exploratory data analysis, you sometimes end up doing a lot of repeated coding. Those are the moments when you feel it would be faster to go back to Excel or other tools you are more comfortable with, and that’s fine if you have no time to learn a new technique or adjust some parameters in code. What I want to show here is a better way to do your EDA, with less unnecessary coding and more flexibility.

Hypothesis Testing by Computational Methodology - Part 1

Introduction: This is the first of two articles in which we’ll talk about two different approaches to performing hypothesis tests, covering the classical and computational methodologies. At the end I’ll show you an R package (Infer) capable of executing either of these methods in an easy, flexible, and less error-prone way. In the second article, we’ll go deeper with a hands-on experiment using the Infer package; if you already know the package and want to see more code than text, click here.

How to Perform Correlation Analysis in Time Series data using R?

What is correlation analysis? The concept of correlation is the same as the one used for non-time-series data: identifying and quantifying the relationship between two variables. Because time series data is continuous and chronologically ordered, the observations in a series are likely to be correlated with one another to some degree. Measuring and analyzing the correlation between two variables, in the context of time series analysis, can be understood from two different aspects: