A very common task in data processing is transforming numeric variables (continuous, discrete, etc.) into categorical ones by creating bins. For example, it is quite common to convert `age` to an `age group`. Let’s see how we can easily do that in R.

We will consider a random variable from the Poisson distribution with parameter λ = 20.

```
library(dplyr)

# Generate 1000 observations from the Poisson distribution
# with lambda equal to 20
df <- data.frame(MyContinuous = rpois(1000, 20))
```

## Create specific Bins

Let’s say that you want to create the following bins:

• Bin 1: (-inf, 15]
• Bin 2: (15, 25]
• Bin 3: (25, inf)

We can easily do that using the `cut` function. Let’s start:

```
df <- df %>% mutate(MySpecificBins = cut(MyContinuous, breaks = c(-Inf, 15, 25, Inf)))
```

Let’s have a look at the counts of each bin.

`df %>% group_by(MySpecificBins) %>% count()`

Note that you can also define your own labels through the `labels` argument of the `cut` function.
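
As a minimal sketch of the custom labels mentioned above, `cut` accepts one label per bin through its `labels` argument. The label names, the `MyLabelledBins` column and the `df_labels` data frame are made up for illustration, and the seed is an assumption added only so the snippet is reproducible on its own.

```r
library(dplyr)

# Hypothetical example data, mirroring the Poisson setup used in this post
set.seed(1)  # assumption: seed added only for reproducibility
df_labels <- data.frame(MyContinuous = rpois(1000, 20))

# Four breaks define three bins, so three labels are needed
df_labels <- df_labels %>%
  mutate(MyLabelledBins = cut(MyContinuous,
                              breaks = c(-Inf, 15, 25, Inf),
                              labels = c("Low", "Medium", "High")))

# Counts per labelled bin
df_labels %>% count(MyLabelledBins)
```

The number of labels must be one less than the number of breaks, otherwise `cut` stops with an error.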

## Create Bins based on Quantiles

Let’s say that you want each bin to have the same number of observations, for example 4 bins with 25% of the observations each. We can easily do it as follows:

`numbers_of_bins = 4; df <- df %>% mutate(MyQuantileBins = cut(MyContinuous, breaks = unique(quantile(MyContinuous, probs = seq.int(0, 1, by = 1 / numbers_of_bins))), include.lowest = TRUE)); head(df, 10)`
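
To illustrate why `include.lowest = TRUE` appears in the call above, here is a small sketch on a toy vector. The vector `x` and the variable names are hypothetical and chosen only to keep the arithmetic easy to follow.

```r
# Toy data (hypothetical): eight distinct values
x <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Quartile break points: the first break is the minimum of x
breaks <- quantile(x, probs = seq(0, 1, by = 0.25))

# By default the intervals are open on the left, so the minimum
# falls outside the first bin and becomes NA
without_lowest <- cut(x, breaks = breaks)

# include.lowest = TRUE closes the first interval on the left
with_lowest <- cut(x, breaks = breaks, include.lowest = TRUE)

is.na(without_lowest[1])  # TRUE: the minimum was dropped
table(with_lowest)        # two observations in each of the four bins
```

The `unique` wrapper in the original call serves a related purpose: with discrete data such as Poisson counts, two quantiles can coincide, and `cut` refuses duplicated break points.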

We can check whether the `MyQuantileBins` groups contain the same number of observations, and also look at their ranges:

`df %>% group_by(MyQuantileBins) %>% count()`

Note that if you want to split your continuous variable into bins with an equal number of observations, you can also use the `ntile` function of the `dplyr` package, but it does not label the bins by their ranges.
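
A minimal sketch of the `ntile` alternative, assuming the same kind of Poisson data as above. The `df_ntile` data frame and the `MyNtileBins` column are hypothetical names, and the seed is added only for reproducibility.

```r
library(dplyr)

# Hypothetical example data, mirroring the Poisson setup used in this post
set.seed(1)  # assumption: seed added only for reproducibility
df_ntile <- data.frame(MyContinuous = rpois(1000, 20))

# ntile() assigns each row a bucket number from 1 to 4
df_ntile <- df_ntile %>% mutate(MyNtileBins = ntile(MyContinuous, 4))

# The buckets are plain integers, so summarise each one
# to see which values it covers
df_ntile %>%
  group_by(MyNtileBins) %>%
  summarise(n = n(), min = min(MyContinuous), max = max(MyContinuous))
```

Because `ntile` ignores ties, the buckets are evenly sized even for discrete data, but the same value can end up in two adjacent buckets. The quantile-based `cut` approach behaves the other way round: equal values always share a bin, at the cost of slightly unequal counts.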