Dates from serial number in R
Posted on Thu 30 January 2020 in UseR
You might have encountered that MS Excel converts dates into a float number, it may have happened to you before that after performing a couple of transformations in your spreed sheet (concatenating for instance) or when importing directly form Excel a short date like this 08/06/2019 is converted into something like this 42529.9821527778.
That strange float number 42529.9821527778 is the representation in days of the actual date counted typically from “1899-12-30” (see date systems in excel).
In R to convert a date from a numeric expression we can use two
functions: as.Date
to convert to calendar dates or POSIXct
convert
to calendar dates and times.
Using POSIXlt
To convert to calendar date and time using POSIXct
it is needed to
convert the numerical date to seconds and specify the origin (where to
start counting from) as:
as.POSIXct(42529.9821527778 * (60*60*24)
, origin="1899-12-30"
, tz="GMT")
## [1] "2016-06-08 23:34:18 GMT"
To convert an entire column of a data.frame
, simply pass the column as
vector to POSIXct
(coverting it to seconds) and assign the values to
the new column:
mytable = data.frame("num_date" = c(42529.98215, 42549.15492, 42611.92123))
mytable$date = as.POSIXct(mytable$num_date * (60*60*24)
, origin="1899-12-30"
, tz="GMT")
mytable
## num_date date
## 1 42529.98 2016-06-08 23:34:17
## 2 42549.15 2016-06-28 03:43:05
## 3 42611.92 2016-08-29 22:06:34
Both functions would return an object of class “POSIXct” “POSIXt”
Using as.Date
Specify the origin and time zone and you are ready to go:
mytable = data.frame("num_date" = c(42529.98215, 42549.15492, 42611.92123))
mytable$date = as.Date(mytable$num_date
, origin="1899-12-30"
, tz="GMT")
mytable
## num_date date
## 1 42529.98 2016-06-08
## 2 42549.15 2016-06-28
## 3 42611.92 2016-08-29
The new column (or object) is one of the class “Date”.
Converting multiple columns from data.frame (data.table)
In case you have multiple columns that need to be converted to Date, you
can use a function like the proposal below that applies a coversion
using as.Date
over a list of specified column names (col.names
) to
be converted all at once. In case you don’t want to use data.table
simply replace as.data.table
with as.data.frame
to return a
data.frame
object.
DT_Date_convert <- function(DT, col.names){
numeric_to_date <- function(c){
if(c %in% col.names){
as.Date(DT[[c]]
, origin="1899-12-30"
, tz="GMT")
}else{
DT[[c]]
}
}
DT_names = names(DT)
DT.dates = as.data.table(lapply(DT_names, numeric_to_date))
names(DT.dates) = DT_names
return(DT.dates)
}