1. COMPUTING RECENCY, FREQUENCY, MONETARY VALUE

# Load the tab-separated text file into a local variable called 'data'
> data = read.delim(file = 'e:/courses/Big_data/R/Marketinganalzse/purchases.txt', header = FALSE, sep = '\t', dec = '.')
# Display what has been loaded.

> head(data)

    V1  V2         V3
1  760  25 2009-11-06
2  860  50 2012-09-28
3 1200 100 2005-10-25
4 1420  50 2009-07-09
5 1940  70 2013-01-25
6 1960  40 2013-10-29

 

> summary(data)

       V1               V2                  V3
 Min.   :    10   Min.   :   5.00   2013-12-31:  864
 1st Qu.: 57720   1st Qu.:  25.00   2006-12-31:  584
 Median :102440   Median :  30.00   2012-12-31:  583
 Mean   :108935   Mean   :  62.34   2011-12-31:  510
 3rd Qu.:160525   3rd Qu.:  60.00   2008-12-31:  503
 Max.   :264200   Max.   :4500.00   2014-12-31:  485
                                    (Other)   :47714

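summary() is a generic function: it invokes a particular method depending on the class of its first argument. For a data frame it summarises each column separately. Numeric columns (V1, V2) get the minimum, quartiles, mean and maximum, while V3 is shown as counts of its most frequent values, which suggests it was read in as a factor rather than as a date.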
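The class-based dispatch described above can be seen on small vectors. This is only a sketch on made-up values (the names amounts and dates are hypothetical and not part of the purchases file):

```r
# Hypothetical toy values, not taken from purchases.txt
amounts <- c(25, 50, 100, 50, 70, 40)
dates   <- factor(c("2009-11-06", "2012-09-28", "2009-11-06"))

# Numeric vector: summary() reports min, quartiles, mean and max
s_num <- summary(amounts)
print(s_num)

# Factor: summary() reports a count per level instead
s_fac <- summary(dates)
print(s_fac)
```

The same call produces two different kinds of result because the method is chosen from the class of the argument.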

 

# Define the column names. c() is a function that combines its arguments into a single vector.

> colnames(data) = c('customer_id', 'purchase_amount', 'date_of_purchase')

# Check the data

> head(data)

  customer_id purchase_amount date_of_purchase
1         760              25       2009-11-06
2         860              50       2012-09-28
3        1200             100       2005-10-25
4        1420              50       2009-07-09
5        1940              70       2013-01-25
6        1960              40       2013-10-29
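As an alternative to renaming after loading, the names can be supplied while the file is read. The sketch below assumes two made-up rows passed inline through the text argument instead of the real file; raw, col_names and purchases are hypothetical names used only here.

```r
# Two hypothetical tab-separated rows standing in for purchases.txt
raw <- "760\t25\t2009-11-06\n860\t50\t2012-09-28"

# c() builds one character vector holding the three names
col_names <- c("customer_id", "purchase_amount", "date_of_purchase")

# read.delim() forwards col.names to read.table(), so the columns
# are already named when the data frame is created
purchases <- read.delim(text = raw, header = FALSE, sep = "\t", dec = ".",
                        col.names = col_names)
print(purchases)
```

Whether to name columns at read time or afterwards is a matter of taste; doing it in one call simply avoids an intermediate V1, V2, V3 stage.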

# Interpret the column 'date_of_purchase' as a date and extract the year

> data$date_of_purchase = as.Date(data$date_of_purchase, "%Y-%m-%d")

> data$year_of_purchase = as.numeric(format(data$date_of_purchase, "%Y"))

> head(data)

  customer_id purchase_amount date_of_purchase year_of_purchase
1         760              25       2009-11-06             2009
2         860              50       2012-09-28             2012
3        1200             100       2005-10-25             2005
4        1420              50       2009-07-09             2009
5        1940              70       2013-01-25             2013
6        1960              40       2013-10-29             2013
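The date conversion can be tried in isolation on a few strings. This is a minimal sketch on made-up values; the reference date 2016-01-01 is an arbitrary assumption chosen only to show that parsed dates support arithmetic, which is what a recency measure would need.

```r
# Hypothetical date strings in the same year-month-day layout as the file
raw_dates <- c("2009-11-06", "2012-09-28", "2005-10-25")

# Parse text into Date objects; a string that does not match becomes NA
parsed <- as.Date(raw_dates, format = "%Y-%m-%d")

# format() turns a Date back into text, here only the four-digit year
years <- as.numeric(format(parsed, "%Y"))
print(years)

# Date arithmetic now works: days elapsed up to an assumed reference date
days_since <- as.numeric(difftime(as.Date("2016-01-01"), parsed, units = "days"))
print(days_since)
```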

# R can query data frames with SQL statements through the sqldf package, which has to be loaded first

> library(sqldf)

# Run a SQL query that counts the purchases in each year

> x = sqldf("select count(year_of_purchase) as counter, year_of_purchase from data group by 2 order by 2")

 

#Print a table with the result

> print(x)

   counter year_of_purchase
1     1470             2005
2     2182             2006
3     4674             2007
4     4331             2008
5     5054             2009
6     4939             2010
7     4785             2011
8     5960             2012
9     5912             2013
10    5739             2014
11    6197             2015
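The same count per year can also be obtained without SQL. The sketch below uses base R's table() on a short made-up vector; year_of_purchase and x_base are hypothetical stand-ins, and this is offered as an alternative rather than as the approach used above.

```r
# Hypothetical years standing in for data$year_of_purchase
year_of_purchase <- c(2005, 2006, 2006, 2007, 2007, 2007)

# table() counts the rows for each distinct value, ordered by value,
# which mirrors "count(...) group by ... order by ..."
counts <- table(year_of_purchase)

# Reshape into a data frame with the same two columns as the sqldf result
x_base <- data.frame(counter = as.vector(counts),
                     year_of_purchase = as.numeric(names(counts)))
print(x_base)
```

sqldf is convenient when the query is already known in SQL; table() avoids the extra package when only a simple count is needed.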

# Draw the result as a bar chart

> barplot(x$counter, names.arg = x$year_of_purchase)
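A chart is easier to read with a title and axis labels. The following is a sketch on made-up counts (yearly is a hypothetical data frame with the same two columns as the query result), using the standard main, xlab and ylab arguments of barplot().

```r
# Hypothetical counts with the same columns as the query result
yearly <- data.frame(counter = c(1, 2, 3),
                     year_of_purchase = c(2005, 2006, 2007))

# barplot() draws one bar per count and returns the bar midpoints
mids <- barplot(yearly$counter,
                names.arg = yearly$year_of_purchase,
                main = "Purchases per year",
                xlab = "Year of purchase",
                ylab = "Number of purchases")
```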
