Calculating hotel occupancy with R

Putting patientcounter to the test by calculating hotel occupancy.
I saw a question on reddit today which can be answered using {patientcounter}
The asker wants to know how to work out how many folk are in a hotel, at any given time, based on check in and check out dates.
Here’s the setup:
check_in_date <- c('2010-01-01', '2010-01-02' ,'2010-01-01',
'2010-01-08', '2010-01-08', '2010-01-15',
'2010-01-15', '2010-01-16', '2010-01-19', '2010-01-22')
check_out_date <- c('2010-01-07', '2010-01-04' ,'2010-01-09',
'2010-01-21', '2010-01-11', '2010-01-22',
NA, '2010-01-20', '2010-01-25', '2010-01-29')
#NB - the original question has a string instead of the NA value.
# patientcounter can handle the NA, but not the string.
Person = c("John", "Smith", "Alex", "Peter", "Will", "Matt", "Tim", "Kevin", "Tom", "Adam")
Make the dataframe:
checkin <- as.POSIXct(as.Date(check_in_date)) # make it a datetime
checkout <- as.POSIXct(as.Date(check_out_date)) # make it a datetime
hotel <- data.frame(checkin, checkout, Person)
hotel
## checkin checkout Person
## 1 2010-01-01 2010-01-07 John
## 2 2010-01-02 2010-01-04 Smith
## 3 2010-01-01 2010-01-09 Alex
## 4 2010-01-08 2010-01-21 Peter
## 5 2010-01-08 2010-01-11 Will
## 6 2010-01-15 2010-01-22 Matt
## 7 2010-01-15 <NA> Tim
## 8 2010-01-16 2010-01-20 Kevin
## 9 2010-01-19 2010-01-25 Tom
## 10 2010-01-22 2010-01-29 Adam
Now for patientcounter to do it’s thing…it’s designed for this exact task - getting a census, or count, of people / goods/ items in or at a place at a point in time.
The interval_census
function does the work:
#remotes::install_github("johnmackintosh/patientcounter")
library(patientcounter)
# calculate the number of occupants per day
occupancy <- interval_census(hotel,
identifier = "Person",
admit = "checkin",
discharge = "checkout",
time_unit = '1 day',
results = 'total')
# grab the date and count columns:
occupancy[,.(base_date, N)]
## base_date N
## 1: 2010-01-01 2
## 2: 2010-01-02 3
## 3: 2010-01-03 3
## 4: 2010-01-04 2
## 5: 2010-01-05 2
## 6: 2010-01-06 2
## 7: 2010-01-07 1
## 8: 2010-01-08 3
## 9: 2010-01-09 2
## 10: 2010-01-10 2
## 11: 2010-01-11 1
## 12: 2010-01-12 1
## 13: 2010-01-13 1
## 14: 2010-01-14 1
## 15: 2010-01-15 3
## 16: 2010-01-16 4
## 17: 2010-01-17 4
## 18: 2010-01-18 4
## 19: 2010-01-19 5
## 20: 2010-01-20 4
## 21: 2010-01-21 3
## 22: 2010-01-22 3
## 23: 2010-01-23 3
## 24: 2010-01-24 3
## 25: 2010-01-25 2
## 26: 2010-01-26 2
## 27: 2010-01-27 2
## 28: 2010-01-28 2
## base_date N
Job Done :)
There were several answers, and maybe a package is overkill, but this function is fast and scales well. It’s also pretty flexible and has applications outside of healthcare.