06 October 2012

Formulation of a Service Quality Metric

The quantitative formulation of an overall quality metric, which can be extracted from an arbitrary timetable, is necessary to objectively answer the question “is proposed timetable A better than proposed timetable B?”

Such metrics facilitate the trade-study and optimization process of planning a new timetable, and must take into account several factors, including not just the quality of the service provided to passengers but also other factors that passengers don’t think about, such as robustness to disruption, fleet size and crew time considerations.

For today, however, we will focus exclusively on quantifying the quality of the service provided to passengers.  This particular formulation proceeds in eight reasonably simple steps, pulling together earlier information on timetable metrics and demographics.  It is only one example of how one might formulate a service quality metric, something that Caltrain has never explicitly done and could benefit greatly from doing as it shares the pros and cons of various blended service plans.  This is one way to do it; what's theirs?

Step 1: Extract trip time and wait time statistics for each origin and destination pair.  By straightforward analysis of the timetable, one can enumerate all the possible trips between any origin station A and destination station B (including transfers) within a one-hour span of the morning peak.  One can then determine (in units of time):
  1. The average trip time between A and B (Tmean_AB)
  2. The fastest trip time between A and B (Tmin_AB)
  3. The average wait between trips that connect A and B (Wmean_AB)
  4. The longest wait between trips that connect A and B (Wmax_AB)
The first two metrics measure trip time on board the train, and the next two can be used as a proxy for measuring typical wait times on the platform.  The trip time and wait time figures are intrinsic to the timetable and can be extracted by a computer program.
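
As a rough illustration, here is a minimal sketch of how these four statistics might be extracted for a single station pair. It assumes the timetable has already been reduced to a list of (departure, arrival) times in minutes for every usable trip from A to B within the hour. The function name `pair_stats`, the input layout, and the choice to treat the hour as repeating (so the gap after the last departure wraps around to the first) are illustrative assumptions, not part of the method described above.

```python
from statistics import mean

def pair_stats(trips, span=60.0):
    """Return (Tmean, Tmin, Wmean, Wmax) in minutes for one A-B pair.

    trips: list of (depart, arrive) minutes for each usable A-to-B trip
           within the analysis span (assumed input layout).
    span:  length of the analysis window; the pattern is assumed to
           repeat, so the last gap wraps around to the first departure.
    """
    if not trips:
        raise ValueError("no trips connect this pair")
    durations = [arrive - depart for depart, arrive in trips]
    departs = sorted(depart for depart, _ in trips)
    # Gaps between consecutive departures, plus the wrap-around gap.
    gaps = [b - a for a, b in zip(departs, departs[1:])]
    gaps.append(departs[0] + span - departs[-1])
    return mean(durations), min(durations), mean(gaps), max(gaps)

# Hypothetical pair: two 45-minute locals and one 30-minute express,
# leaving A at :05, :25 and :35.
t_mean, t_min, w_mean, w_max = pair_stats([(5, 50), (25, 70), (35, 65)])
```

In this made-up example the uneven spacing shows up as a longest wait (30 minutes) well above the average wait (20 minutes), which is exactly what Wmax is meant to capture.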

Step 2: Compute an “effective” trip time from A to B by computing a weighted sum of the time components extracted above. This is where judgment calls start to be made. Taking into account the waiting times Wmean and Wmax is just as important as the actual trip times Tmin and Tmean, in order to properly account for the frequency of service. For example, the effective trip time could be defined as:

Teff_AB = (30% of Tmin_AB + 70% of Tmean_AB) + (20% of Wmean_AB + 15% of Wmax_AB)

The trip time term (30% of Tmin_AB + 70% of Tmean_AB) accounts for some trips being shortened by express service. The waiting time term (20% of Wmean_AB + 15% of Wmax_AB) properly penalizes long service gaps, but for evenly spaced service it remains shorter than the waiting time incurred when the passenger shows up randomly, which is 50% of Wmean_AB.  This lower weighting reflects the fact that passengers don’t show up randomly, but usually time their arrival at origin A for a particular trip to destination B.  For example, when trips are available every 15 minutes, the waiting term works out to a quite reasonable 5 minutes or so (20% of 15 plus 15% of 15, or 5.25 minutes). The effective trip time is a reasonably good measure of how long it will take you to get from A to B.
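
The example weighting is easy to express as a small function. This is only a sketch: the function name `effective_trip_time` and the parameter names are made up for illustration, and the default weights are simply the example percentages quoted above, which remain a judgment call.

```python
def effective_trip_time(t_min, t_mean, w_mean, w_max,
                        k_tmin=0.30, k_tmean=0.70,
                        k_wmean=0.20, k_wmax=0.15):
    """Effective trip time Teff_AB, in the same time units as the inputs."""
    ride = k_tmin * t_min + k_tmean * t_mean    # on-board time term
    wait = k_wmean * w_mean + k_wmax * w_max    # waiting time term
    return ride + wait

# Uniform 15-minute headways: Wmean = Wmax = 15, so the waiting term alone
# is 0.20 * 15 + 0.15 * 15 = 5.25 minutes.
wait_only = effective_trip_time(0, 0, 15, 15)

# Hypothetical pair: fastest trip 30 min, average trip 40 min,
# average gap 20 min, longest gap 30 min.
teff = effective_trip_time(30, 40, 20, 30)
```

Keeping the four weights as parameters makes it simple to test how sensitive a timetable comparison is to the judgment calls made in this step.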

Step 3: Determine the “effective” speed between origin A and destination B. This is simply distance divided by time, or: V_AB = d_AB / Teff_AB where d_AB is the distance between A and B. This process is repeated for every origin and destination pair A-B, and describes not the speed of a train, but the average speed of a typical trip from A to B including waiting time, based only on the available service provided by the specific timetable being considered.
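
A short sketch of this step, assuming distances in km and effective trip times in minutes (the units and the function name `effective_speed` are choices made here for illustration; any consistent units would do):

```python
def effective_speed(distance_km, teff_min):
    """Effective speed V_AB in km/h, from d_AB in km and Teff_AB in minutes."""
    if teff_min <= 0:
        raise ValueError("effective trip time must be positive")
    return distance_km / (teff_min / 60.0)

# Hypothetical pair: a 40 km trip with a 48-minute effective trip time
# has an effective speed of about 50 km/h.
v_ab = effective_speed(40.0, 48.0)
```

Because Teff_AB includes the waiting term, this figure will always be lower than the speed of the train itself.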

Step 4: Compute weighting by population and jobs.  This is where census data enters the calculation, as it must.  For the morning rush hour, since ridership consists primarily of people going from their home near A to their work near B, we calculate a potential ridership weight based on how many people live near A and how many people work near B.  This simply reflects that if a lot of people live near A and work near B, it is more important to provide fast service between A and B than between other station pairs where fewer people and jobs are located.

The “home weight” Whome_A of origin station A is a simple gravity sum (1/r squared law) of the residential population, taken from the 2010 census, as described previously in greater detail.  Each person is divided by the square of how far they live from station A, to reflect that people who live further away from the station are less likely to use it. To prevent over-counting people who live very close to the station (where the 1/r squared term diverges), anyone living closer than ¼ mile from the station is considered to live ¼ mile away.  The resulting weights are shown at left, in orange.

Similarly, the “work weight” Wwork_B of destination station B is a simple gravity sum of the number of jobs paying over $40k, again taken from census data. Each job is divided by the square of how far it is from station B, to reflect that people who work farther away from the station are less likely to use it. Once again, to prevent over-counting jobs located very close to the station, any job closer than ¼ mile from the station is considered ¼ mile away.  The resulting weights are shown at right, in blue.
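
Both weights follow the same recipe, so one function can serve for residents and for jobs. The sketch below assumes the census data has been reduced to a list of locations (for example census blocks), each with a distance in miles from the station and a count of people or jobs. That data layout and the name `gravity_weight` are assumptions made for illustration.

```python
def gravity_weight(distances_miles, counts, r_min=0.25):
    """Gravity sum of count / r^2, with r clamped to at least r_min miles."""
    return sum(n / max(r, r_min) ** 2
               for r, n in zip(distances_miles, counts))

# Hypothetical station: 100 residents at 0.1 mile (treated as 0.25 mile),
# 400 residents at 1 mile and 900 residents at 3 miles.
w_home = gravity_weight([0.1, 1.0, 3.0], [100, 400, 900])
# 100 / 0.0625 + 400 / 1 + 900 / 9 = 1600 + 400 + 100 = 2100
```

The example shows how strongly the 1/r squared law favors people right next to the station: the 100 closest residents contribute far more than the 900 who live three miles away.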

Step 5: Compute weighting by distance. Regardless of where people live and work, there are upper and lower limits to how far they will typically commute by rail. Extremely short trips are less likely because of the overhead of access and egress to and from the station at each end of the journey. Conversely, extremely long trips are less likely because of their sheer duration.  As it turns out, the typical rush hour trip on Caltrain is about 25 miles, or 40 km.

For our purposes, the distance weighting is constructed by drawing a curve with a peak at 40 km. This distance weight starts off at zero for a trip distance of less than 7 km (reflecting no demand for such short trips), peaks at a distance of 40 km, and decays slowly thereafter.  Converted to miles, it looks like the figure at left.  The underlying math to draw this curve is a Rayleigh distribution with a peak at (d-7) = 33, where d is the trip distance in km.
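
A minimal sketch of such a curve follows. It uses the Rayleigh density shape x/s² · exp(−x²/(2s²)) with x = d − 7 and s = 33, which peaks at x = s. Rescaling the curve so that the peak value is exactly 1 is an assumption made here for convenience, since the weights are only relative; the name `distance_weight` is likewise illustrative.

```python
import math

def distance_weight(d_km, offset=7.0, mode=33.0):
    """Rayleigh-shaped weight: zero up to `offset` km, peak of 1.0 at
    offset + mode km (7 + 33 = 40 km with the defaults)."""
    x = d_km - offset
    if x <= 0:
        return 0.0
    # Rayleigh density shape in x, rescaled so the value at x = mode is 1.
    return (x / mode) * math.exp(0.5 - x * x / (2.0 * mode * mode))

# Weight for a few trip lengths in km, from very short to very long.
samples = {d: round(distance_weight(d), 3) for d in (5, 20, 40, 80, 120)}
```

Trips shorter than 7 km get zero weight, and the weight falls off gradually on the long side of the 40 km peak.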

Step 6: Combine the population, jobs and distance weights to obtain a ridership potential matrix.  The ridership potential matrix R is a matrix of size N squared, where N is the number of stations.  Each element R_AB of this matrix represents the "potential" ridership (in arbitrary relative units) that can be tapped into during the morning commute from origin A to destination B.  This ridership potential matrix has an important property: it is independent of any timetable, and concisely describes the underlying demand that inherently exists out there--regardless of how or whether that demand is met by rail service.  Each element R_AB is given by the product:

R_AB = Whome_A * Wwork_B * Wdistance_AB

Note that the matrix R is not symmetric, because the number of residents and jobs near each station differs.  For example, far more people will want to commute to SF Transbay in the morning than from it, since the number of jobs within a half mile of that station is greater than all the jobs within a half mile of every other Caltrain station all the way to Gilroy combined.
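
The product can be sketched with plain nested lists. The function name `ridership_potential` and the toy numbers below are invented for illustration and are not real census figures.

```python
def ridership_potential(w_home, w_work, w_dist):
    """R[a][b] = w_home[a] * w_work[b] * w_dist[a][b] for N stations."""
    n = len(w_home)
    return [[w_home[a] * w_work[b] * w_dist[a][b] for b in range(n)]
            for a in range(n)]

# Toy two-station line: station 0 is mostly residential, station 1 is
# job-rich. The distance weight is 1 between them and 0 on the diagonal.
R = ridership_potential(
    w_home=[8.0, 2.0],
    w_work=[1.0, 10.0],
    w_dist=[[0.0, 1.0], [1.0, 0.0]],
)
# R[0][1] = 8 * 10 * 1 = 80, while R[1][0] = 2 * 1 * 1 = 2
```

Even in this toy case the matrix is lopsided: the potential from the residential station to the job-rich one is forty times the potential in the reverse direction.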

Step 7: Compute the service quality matrix. The service quality matrix Q is again a matrix of size N squared, where N is the number of stations. Each element Q_AB of this matrix represents the quality of morning rush hour service from station A to station B, and is given by the following formula:

Q_AB = R_AB * V_AB

This combines R_AB, the timetable-independent ridership potential from origin A to destination B, with V_AB, the timetable-dependent effective speed from A to B.  If you have a preferred AM origin and destination (as most commuters do), then you can compare your Q_AB for various timetables to see how any given timetable will meet your own specific needs.
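
In code, this step is an element-by-element product of two N by N matrices. The sketch below uses nested lists and made-up numbers; the function name `service_quality` is illustrative.

```python
def service_quality(R, V):
    """Q[a][b] = R[a][b] * V[a][b], element by element."""
    return [[r * v for r, v in zip(r_row, v_row)]
            for r_row, v_row in zip(R, V)]

# Toy two-station example: ridership potential R (arbitrary units) and
# effective speeds V (km/h) for each origin and destination pair.
R = [[0.0, 80.0], [2.0, 0.0]]
V = [[0.0, 50.0], [40.0, 0.0]]
Q = service_quality(R, V)
# Q[0][1] = 80 * 50 = 4000 and Q[1][0] = 2 * 40 = 80
```

A commuter interested in a single pair would simply look up the same element Q[a][b] for each candidate timetable.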

Step 8: Extract overall service quality scores. The service quality metrics must be benchmarked against some reference, so they are simply normalized against the most current timetable.  That means today's timetable will score 100, by definition.  By adding the elements of Q over all possible origin and destination pairs, we can quantify the degree of service improvement and compute a score for the entire timetable as well as a score for each individual station. The overall timetable service quality score is S = 100 * ΣQ / ΣQref, i.e. the sum of all the elements of Q divided by the corresponding sum for today's timetable, scaled so that today's timetable scores 100.
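
The overall score can be sketched as follows. The factor of 100 is taken from the statement that today's timetable scores 100 by definition; the function name `timetable_score` and the toy matrices are illustrative only.

```python
def timetable_score(Q, Q_ref):
    """Overall score: sum of Q relative to the reference sum, scaled so
    that the reference timetable itself scores 100."""
    total = sum(sum(row) for row in Q)
    ref_total = sum(sum(row) for row in Q_ref)
    if ref_total <= 0:
        raise ValueError("reference timetable must have positive quality")
    return 100.0 * total / ref_total

# Toy example: a proposed timetable whose effective speeds are 10% higher
# than today's on every pair scores 110.
Q_today = [[0.0, 4000.0], [80.0, 0.0]]
Q_proposed = [[0.0, 4400.0], [88.0, 0.0]]
score_today = timetable_score(Q_today, Q_today)
score_proposed = timetable_score(Q_proposed, Q_today)
```

Because the ridership potential is the same for every timetable, any difference in score comes entirely from the effective speeds that each timetable delivers.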

An entire timetable can now be distilled to its essence, a single service quality score.

We are now empowered to compare various timetables and understand quantitatively the pros and cons of each.  This method will tell you objectively whether timetable A provides better overall service than timetable B--and if you happened to disagree with the scoring outcome, then your argument would be with the scoring method and not any particular detail of this or that proposed timetable.  Beyond the mathematical minutiae of the rather simple scoring method presented here, the larger point is that there needs to be a defined scoring process and a framework for stakeholders to discuss what makes a good timetable.  This scoring process is absolutely essential for planning future blended service on the peninsula.  Caltrain's approach so far has been to prescribe a certain skip-stop pattern (see Tables 7 and 8) and restrict all analysis to that particular pattern, seemingly without regard to overall service quality!