Case Study: Calibrating the Courier Service Model

Submitted: 27 Sep 2009

Operations Research Topics: Simulation Modelling

Application Areas: Logistics

Problem Description

This case study extends two previous case studies:

Modelling Requests to a Courier Service which builds a model of a courier service receiving requests and then making deliveries;
Input and Output for a Courier Service Model which uses Arena tools (the Input Analyzer and Output Analyzer) to find distributions for the input to the simulation model and compares the results for fitted distributions vs empirical distributions.

In this case study we will build a simulation that uses historical data directly (instead of within an empirical distribution). Using this simulation we can estimate the total time between a request for delivery arriving at the courier service and the delivery being made.

Once we have an accurate value for the actual time a delivery request takes to be delivered, we can calibrate our fitted distributions to accurately match the real-world situation.

First, we will use Arena's Process Analyzer to simultaneously compare several different possibilities for the fitted distribution using the uniform error in the estimate of average total time (i.e., the time between a delivery request arriving and the delivery being made). Then, in Extra for Experts, we will use Arena's OptQuest to find the best parameters for the chosen distributions by minimising the uniform error in the estimate. Finally, we will add the optimised simulation to the comparison in the Process Analyzer.

Problem Formulation

We can easily modify the existing Courier Service Model to use the data in courier.xls because both the arrival of delivery requests and the delivery runs themselves have been abstracted into submodels.

To use the historical data to generate arrivals we simply need to step through the interarrival times for the Inner City delivery requests, wait the required time and generate the appropriate arrival.

To use the historical data to implement delivery runs we wait until a delivery run is triggered (either by the number of deliveries waiting being large enough or the time since the first undelivered request being long enough) and then use the next delivery run time from the list as the time taken for the courier to make a delivery run.

The historical data simulation gives us a "real-world" value for the average time between a delivery request arriving and the corresponding delivery being made. We can use this "actual" value to calculate the uniform error in estimates of average total time:

$\begin{equation*} \varepsilon_{\text{uniform}} = \left| \frac{\text{Estimate} - \text{Actual}}{\text{Actual} + 1} \right| \end{equation*}$
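As a quick check of the formula, here is a minimal Python sketch. The function name `uniform_error` and the estimate of 22.0 minutes are illustrative assumptions; the actual value of 23.7312 minutes is the historical result quoted later in Extra for Experts. Note that the `+ 1` in the denominator keeps the error finite when the actual value is zero.

```python
def uniform_error(estimate: float, actual: float) -> float:
    """Uniform error |(estimate - actual) / (actual + 1)|."""
    return abs((estimate - actual) / (actual + 1.0))


# Illustrative estimate only: 22.0 minutes against an actual
# total time of 23.7312 minutes.
print(uniform_error(22.0, 23.7312))
```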

To calibrate our simulation we need to minimise $\varepsilon_{\text{uniform}}$ by changing the parameters to our fitted distributions. We can use black-box optimisation, in this case Tabu search via Arena's OptQuest, to determine the values of the fitted distribution parameters that provide the minimum uniform error.

Computational Model

First, we need to name the data from each of the relevant columns of the courier.xls spreadsheet. The following flash tutorial shows how to name the Inner City data areas in an Excel 2007 spreadsheet:

Once both areas have been named (Inner_City_Deliveries and Inner_City_Return_Times), save a copy of your spreadsheet as courier-named.xls.

Now, add this file to your courier service simulation using the File data module (in the Advanced Process template). The following flash tutorial shows how to add a file in Arena:

Name	`Historical Data`
Access Type	`Microsoft Excel (*.xls)`
Operating System File Name	<your directory> `\courier-named.xls`

Once the file has been added we use the names defined in courier-named.xls to define Recordsets. The following flash tutorial shows how to add recordsets from a file:

Recordsets (secondary dialog via Recordsets button)
	Recordset Name	`Inner City Interarrivals`
	Named Range	`Inner_City_Deliveries`
	Recordset Name	`Inner City Delivery Times`
	Named Range	`Inner_City_Return_Times`

Now our historical data is available for use in our simulation.

It is easiest to use the data from courier-named.xls for delivery runs first. The following flash tutorial shows how to read from a file's recordset into an attribute and then use that attribute in a Process module:

Assignments (secondary dialog via Add button)
Name	`Read Inner City Delivery Time`
Type	`Read from File`
Arena File Name	`Historical Data`
Recordset ID	`Inner City Delivery Times`
	Type	`Attribute`
	Attribute Name	`Delivery Time`

Name	`Make Inner City Delivery Run`
Delay Type	`Expression`
Units	`Minutes`
Expression	`Delivery Time`

Using the historical data to generate arrivals is more complicated as we need to create entities according to the historical data. We do this by creating a logical entity in a loop. This entity is created at the start of the replication and reads the interarrival time for the next arrival. The following flash tutorial shows how to create the generator logical entity and read from a file's recordset:

Name	`Create Inner City Request Generator`
Entity Type	`Request Generator`
Max Arrivals	`1`

Assignments (secondary dialog via Add button)
Name	`Read Inner City Interarrival Time`
Type	`Read from File`
Arena File Name	`Historical Data`
Recordset ID	`Inner City Interarrivals`
	Type	`Attribute`
	Attribute Name	`Interarrival Time`

Once the interarrival time has been read, the generator entity delays for the interarrival time and then uses a Separate module to create a request before looping back to read the next interarrival time. Once the new entity has been created, we need to turn it into the appropriate request and send it into the rest of our model. The following flash tutorial shows how to wait for the interarrival time, create a request entity and loop back to read more interarrival times, then create a delivery request entity and send it to the rest of the model:

Name	`Wait till Inner City Arrival`
Delay Time	`Interarrival Time`
Units	`Minutes`

Name	`Generate Inner City Request`

Assignments (secondary dialog via Add button)
Name	`Make Inner City Request Entity`
	Type	`Entity Type`
	Entity Type	`Inner City Delivery`
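The read, delay, separate loop above can be sketched outside Arena in a few lines of Python. This is only an illustration of the logic for a single replication: the function name `request_arrival_times` and the sample interarrival times are hypothetical, and the gaps are assumed to be positive.

```python
from itertools import cycle
from typing import Iterable, Iterator


def request_arrival_times(interarrivals: Iterable[float],
                          replication_length: float) -> Iterator[float]:
    """Yield arrival times (minutes) until the replication ends.

    Mirrors the loop: read an interarrival time, delay for it,
    emit a request, then loop back to read the next value.
    """
    clock = 0.0
    # cycle() restarts from the first value when the data runs out,
    # mimicking a recordset that wraps around.
    for gap in cycle(interarrivals):
        clock += gap          # Delay: wait for the interarrival time
        if clock > replication_length:
            return
        yield clock           # Separate: a new request enters the model


# Hypothetical interarrival times in minutes (not taken from courier.xls).
sample_gaps = [3.0, 5.5, 1.5, 4.0]
arrivals = list(request_arrival_times(sample_gaps, replication_length=20.0))
print(arrivals)
```

Because the sample data holds only four gaps, the fifth arrival reuses the first gap, which is the repetition the historical simulation has to avoid.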

Now we have a simulation that uses historical data. However, we need to see how long we can use the historical data before it repeats (Arena goes back to the start of a Recordset when it runs out of data). If we sum each of the columns in courier-named.xls and convert it from minutes into 8-hour days, we see that the Inner City Deliveries data (the time between Inner City delivery requests) provides just over 30 days of data. All the other columns provide enough data for longer durations. We set the number of replications for the historical data simulation to be 30, with each replication running for 8 hours.

Number of Replications	`30`
Replication Length	`8`
Time Units	`Hours`
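The conversion from a column of minutes to 8-hour days is simple arithmetic, sketched here in Python. The helper name `days_of_data` and the column contents are hypothetical (3300 identical gaps of 4.38 minutes stand in for the real spreadsheet column), so only the method, not the numbers, reflects courier-named.xls.

```python
from typing import Sequence


def days_of_data(minutes: Sequence[float], hours_per_day: float = 8.0) -> float:
    """Convert a column of durations in minutes into working days."""
    return sum(minutes) / (hours_per_day * 60.0)


# Hypothetical column: 3300 interarrival times of 4.38 minutes each.
column = [4.38] * 3300
print(days_of_data(column))
```

The number of replications should not exceed the whole number of days returned, otherwise the recordset wraps around and data is reused.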

Results

Now that we have a historical simulation, we can run it and look at statistics from the real-world courier service. Figure 1 shows the Category Overview report with the total time for Inner City deliveries, averaged over the 30 replications.

Figure 1 Outputs from the Historical Simulation

Comparing Multiple Simulations

First, we will use the Process Analyzer to compare the historical simulation with both the simulation with fitted distributions (saved as courier-io-input-stats.doe) and the simulation with empirical distributions (saved as courier-io-empirical-stats.doe).

IMPORTANT. You should set all the simulations (historical, fitted distributions, empirical distributions) to run in batch mode. You can do this by opening the simulation in Arena and selecting Run > Run Control > Batch Run (No Animation). To make sure the simulation's run file is properly set up for the Process Analyzer you should select Run > Check Model (or use the shortcut F4). This generates the *.p files.

The following flash shows how to add the three simulation models to the Process Analyzer.

Next, we will insert some responses to compare amongst our simulation models. The following flash shows how to insert responses and run the Process Analyzer:

After all the simulations have run in the Process Analyzer, we can use Box-and-Whisker plots to compare the confidence intervals for the total time from all the simulations simultaneously. The following flash shows how to create Box-and-Whisker plots in the Process Analyzer:

IMPORTANT. By identifying the best scenario and choosing bigger to be better, the Box-and-Whisker will discern between scenarios that are indistinguishable from the scenario with the biggest response, i.e., those scenarios with the 95% confidence interval (the box) overlapping with the best scenario. All best scenarios are coloured red, while other scenarios are coloured blue. This gives us a crude tool for checking if scenarios are different.
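The overlap check behind the colouring can be reproduced by hand. The sketch below assumes a Student-t based 95% confidence interval over 30 replications (we have not confirmed exactly how the Process Analyzer computes its boxes), and the three intervals at the end are hypothetical numbers chosen only to illustrate the test.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

# Two-sided 95% Student-t critical value for 29 degrees of freedom
# (30 replications), taken from standard t tables.
T_CRIT_29 = 2.045


def confidence_interval(values: Sequence[float],
                        t_crit: float = T_CRIT_29) -> Tuple[float, float]:
    """Return (low, high) of a t-based confidence interval for the mean."""
    half_width = t_crit * stdev(values) / len(values) ** 0.5
    centre = mean(values)
    return centre - half_width, centre + half_width


def intervals_overlap(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical intervals for total time, in minutes.
historical = (23.0, 24.5)
empirical = (22.4, 23.3)
fitted = (20.0, 21.0)
print(intervals_overlap(historical, empirical))  # would be coloured red
print(intervals_overlap(historical, fitted))     # would be coloured blue
```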

Box-and-Whisker plots show that the historical simulation has the largest total time for a delivery request to be fulfilled. Only the empirical distributions simulation gives a confidence interval that overlaps with the historical simulation's confidence interval for Inner City total delivery time (hence Fitted is blue while the other scenarios are red). However, there is no discernible difference between the fitted distributions simulation, the empirical distributions simulation and the historical simulation for the number of busy Inner City couriers. Note that since the historical simulation gave a small number of busy Inner City couriers, we used Smaller is better for this plot.

Changing Distribution Parameters

We can experiment with different distribution values for the fitted distributions to see if we can get a better match. The current distributions are:

Inner City Interarrivals	`EXPO(4.38)`
Inner City Delivery	`15 + 4 * BETA(4.28, 5.95)`

In order to change these values dynamically we need to parameterise the distributions. Add the following variables to the Variable data module:

Name	`Inner City Lambda`
Initial Values	`4.38`
Name	`Inner City Delivery Constant`
Initial Values	`15`
Name	`Inner City Delivery Multiplier`
Initial Values	`4`
Name	`Inner City Delivery Beta1`
Initial Values	`4.28`
Name	`Inner City Delivery Beta2`
Initial Values	`5.95`

and use them to define the appropriate distributions

Name	`Generate Inner City Delivery`
Expression	`EXPO(Inner City Lambda)`
First Creation	`EXPO(Inner City Lambda)`

Name	`Make Inner City Delivery Run`
Delay Type	`Expression`
Units	`Minutes`
Expression	`Inner City Delivery Constant + Inner City Delivery Multiplier * BETA(Inner City Delivery Beta1, Inner City Delivery Beta2)`
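To get a feel for what the parameterised expressions produce, the two distributions can be sampled in plain Python. This is a sketch under stated assumptions: the helper names are ours, `EXPO` is taken to be parameterised by its mean, and the first `BETA` argument is assumed to be the shape parameter attached to x (so the beta mean is Beta1 / (Beta1 + Beta2)). Check the parameter order against the Arena documentation before relying on it.

```python
import random


def sample_interarrival(rng: random.Random, mean_minutes: float) -> float:
    """Exponential sample parameterised by its mean, like EXPO(mean)."""
    # random.expovariate takes a rate, i.e. 1 / mean.
    return rng.expovariate(1.0 / mean_minutes)


def sample_delivery_time(rng: random.Random, constant: float,
                         multiplier: float, beta1: float, beta2: float) -> float:
    """Shifted, scaled beta sample: constant + multiplier * BETA(beta1, beta2)."""
    # Assumption: beta1 is the exponent on x, so the beta mean is
    # beta1 / (beta1 + beta2).
    return constant + multiplier * rng.betavariate(beta1, beta2)


rng = random.Random(2009)  # fixed seed so the sketch is repeatable
n = 100_000
mean_gap = sum(sample_interarrival(rng, 4.38) for _ in range(n)) / n
mean_run = sum(sample_delivery_time(rng, 15, 4, 4.28, 5.95) for _ in range(n)) / n
print(round(mean_gap, 2), round(mean_run, 2))
```

Under these assumptions the delivery run time always lies between 15 and 19 minutes, with a theoretical mean of 15 + 4 × 4.28 / 10.23, roughly 16.67 minutes, so raising the constant shifts every delivery run by the same amount.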

Now select Run > Check Model to create the simulation run file. Next, we create 2 new scenarios in your Process Analyzer file, both of which use your new simulation run file. We want to add the distribution parameters as controls so we can experiment with them in our 3 scenarios. We will see what happens when we experiment with the constant term Inner City Delivery Constant as a control. The following flash shows how to edit the Fitted Distributions scenario, add new scenarios and add controls to scenarios:

After editing and running these scenarios (you may need to select Run > Reset for your original Fitted Distributions scenario) you can generate your Box-and-Whisker plot for Inner City Delivery.!TotalTime again. The following flash shows how to edit the controls for the new Fitted Distributions scenarios and redraw the Box-and-Whisker plots:

You should see results like those in Figure 2.

Figure 2 Comparing scenarios in the Process Analyzer

Our experiments show that increasing the constant term in the Inner City Delivery distribution expression gives a better "match" for the total time to deliver Inner City requests, but a worse "match" for the number of busy Inner City couriers! To fully explore the possibilities for the fitted distributions we will minimise the uniform error using OptQuest (see Extra for Experts).

Conclusions

In this case study we have used historical data and the Process Analyzer to calibrate our simulation model.

The historical data was used within our simulation model to get actual values for the total time for courier deliveries.

Once these actual values were calculated, we compared them to our previous models, using the Process Analyzer to experiment with different parameters for the distributions previously fitted to the historical data (in Input and Output for a Courier Service Model) to try to calibrate our model.

In Extra for Experts we use OptQuest to minimise the uniform error and automatically calibrate our fitted distributions. The calibrated fitted model is then compared to the simulation model with historical data. The calibrated model seems to provide a better fit, although further simulation work and statistical analysis is needed to confirm this.

Extra for Experts

Minimising Uniform Error via OptQuest

The "actual" value for the total delivery time is 23.7312 minutes for an Inner City request and on average there are 0.7958 Inner City couriers busy. We can add these values to the Variable module in our model and then use OptQuest to find the best overall fit to the historical data. The following flash shows how to add the actual values to the Variable module and how to start OptQuest:

Next, we select all the parameters of the fitted distributions as well as our new Actual Total Time and Actual Num Busy (from our Variable module) as controls. (We assume that we have selected the correct distributions and only need to fine-tune their parameters.) The following flash shows how to set the OptQuest controls:

As responses we only need the total time for the Inner City requests to go through the courier system and the number of busy Inner City couriers. The following flash shows how to set the OptQuest responses and move to the objective:

There are no constraints for this optimisation problem, so we move on to the objective (i.e., to calibrate the distribution parameters). Rather than use the absolute value for the uniform error, we calculate the squared uniform error for both Inner City deliveries' total time and Inner City couriers' number busy and use the sum of these values. This squared sum will hopefully provide the Tabu search with better impetus to find the minimum uniform error (if the solution is far from the minimum error, the squared error will be much greater than the absolute error). The following flash shows how to set the OptQuest objective:
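Written out as code, the objective described above is the sum of two squared uniform errors. The Python sketch below uses the actual values quoted earlier in this section; the function and constant names are our own, and the expression entered in OptQuest itself may be laid out differently.

```python
ACTUAL_TOTAL_TIME = 23.7312  # minutes, from the historical simulation
ACTUAL_NUM_BUSY = 0.7958     # average number of busy Inner City couriers


def uniform_error(estimate: float, actual: float) -> float:
    """Uniform error |(estimate - actual) / (actual + 1)|."""
    return abs((estimate - actual) / (actual + 1.0))


def calibration_objective(total_time: float, num_busy: float) -> float:
    """Sum of squared uniform errors over the two responses."""
    return (uniform_error(total_time, ACTUAL_TOTAL_TIME) ** 2
            + uniform_error(num_busy, ACTUAL_NUM_BUSY) ** 2)


# Illustrative responses only (not results from any Arena run).
print(calibration_objective(total_time=22.0, num_busy=0.75))
```

A perfectly calibrated model would score 0, and both responses contribute on a comparable scale because each error is divided by its own actual value plus one.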

The objective is minimised by changing the controls within the ranges specified (we accepted the defaults when we set the controls). However, we don't want Actual Total Time and Actual Num Busy to change, so we must fix their ranges. The following flash shows how to adjust possible ranges of values for controls in OptQuest:

Now, we are all set to run our optimisation. We set some options to allow for the number of replications to vary between 10 and 50 until the 95% confidence interval is within 10% of the mean and then run OptQuest. The following flash shows how to set options and start OptQuest solving:
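The replication rule (at least 10, at most 50, stopping once the 95% confidence interval half-width is within 10% of the mean) can be sketched as a simple sequential loop. This is an illustration of the idea rather than OptQuest's actual implementation: the function name is hypothetical, a normal approximation is used for the critical value, and a Gaussian random number stands in for a simulation replication.

```python
import random
from statistics import NormalDist, mean, stdev
from typing import Callable, List


def replicate_until_precise(run_once: Callable[[], float],
                            min_reps: int = 10, max_reps: int = 50,
                            rel_precision: float = 0.10) -> List[float]:
    """Run replications until the CI half-width is within rel_precision of the mean."""
    # Normal approximation to the 95% critical value (about 1.96);
    # a Student-t value would be slightly wider for small samples.
    z = NormalDist().inv_cdf(0.975)
    results = [run_once() for _ in range(min_reps)]
    while len(results) < max_reps:
        half_width = z * stdev(results) / len(results) ** 0.5
        if half_width <= rel_precision * abs(mean(results)):
            break
        results.append(run_once())
    return results


# Stand-in for one replication: a noisy total time around 23.7 minutes.
rng = random.Random(1)
reps = replicate_until_precise(lambda: rng.gauss(23.7, 2.0))
print(len(reps), round(mean(reps), 2))
```

Parameter settings that produce noisy responses use more replications, while stable ones stop at the minimum of 10.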

After 380 simulation runs in OptQuest, the best values for the parameters (from simulation run 314) can be seen in Figure 3.

Figure 3 Automatic calibration using OptQuest

Using these parameters and checking the Box-and-Whisker plot from the Process Analyzer shows that the best solution after 380 simulations is a reasonable match to the historical simulation (see Figure 4).

Figure 4 Comparing OptQuest solution using Process Analyzer

Topic revision: r27 - 2018-11-23 - TWikiAdminUser
