Power Law and Exponential Decay of Inter Contact Times between Mobile Devices
Thomas Karagiannis
Microsoft Research Cambridge, UK
JeanYves Le Boudec jeanyves.leboudec@ep?.ch 1.
EPFL Switzerland
Milan Vojnovi? ? c
Microsoft Research Cambridge, UK
thomkar@microsoft.com ABSTRACT
milanv@microsoft.com
INTRODUCTION
We examine the fundamental properties that determine the basic performance metrics for opportunistic communications. We ?rst consider the distribution of intercontact times between mobile devices. Using a diverse set of measured mobility traces, we ?nd as an invariant property that there is a characteristic time, order of half a day, beyond which the distribution decays exponentially. Up to this value, the distribution in many cases follows a power law, as shown in recent work. This power law ?nding was previously used to support the hypothesis that intercontact time has a power law tail, and that common mobility models are not adequate. However, we observe that the time scale of interest for opportunistic forwarding may be of the same order as the characteristic time, and thus the exponential tail is important. We further show that already simple models such as random walk and random waypoint can exhibit the same dichotomy in the distribution of intercontact time asc in empirical traces. Finally, we perform an extensive analysis of several properties of human mobility patterns across several dimensions, and we present empirical evidence that the return time of a mobile device to its favorite location site may already explain the observed dichotomy. Our ?ndings suggest that existing results on the performance of forwarding schemes based on powerlaw tails might be overly pessimistic.
Categories and Subject Descriptors
C.2.1 [Network Architecture and Design]: Wireless communication
General Terms
Algorithms, Human Factors, Measurement, Design
Keywords
Intercontact time, mobile devices, opportunistic communications, dichotomy, characteristic time
?
Authors in alphabetical order.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for pro?t or commercial advantage and that copies bear this notice and the full citation on the ?rst page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior speci?c permission and/or a fee. MobiCom’07, September 9–14, 2007, Montréal, Québec, Canada. Copyright 2007 ACM 9781595936813/07/0009 ...$5.00.
Over the past year, empirical studies have provided evidence suggesting that power laws characterize diverse aspects of human mobility patterns, such as intercontact times, contact, and pause durations. These studies are of high practical importance for (a) informed decisions in protocol design, and (b) realistic mobility models for protocol performance evaluation. Speci?cally, Chaintreau et al [2] were perhaps the ?rst to report credible empirical evidence suggesting that the CCDF (complementary cumulative distribution function) of intercontact time between humancarried mobile devices follows a power law over a wide range of values that span the timescales of a few minutes to half a day. This empirical ?nding has motivated Chaintreau et al to pose the hypothesis that intercontact time has a CCDF with power law tail. Under this assumption, they derived some interesting results on the feasibility and performance of opportunistic forwarding algorithms. In particular, their hypothesis implies that for any forwarding scheme the mean packet delay is in?nite, if the powerlaw exponent of the intercontact time is smaller than or equal to 1 (the case suggested to hold in practice by the empirical results so far). These results are in sharp contrast with previously known ?ndings on similar packet forwarding algorithms (e.g. Grossglauser and Tse [7]) which were obtained under a hypothesis of exponentially decaying CCDF of intercontact time. Furthermore, the authors argued that the powerlaw tail is not supported by common mobility models (e.g. random waypoint [9]), thus suggesting a need for new models. In this paper, we ?nd that the CCDF of intercontact time between mobile devices features a dichotomy described as follows. On the one hand, in many cases the CCDF of intercontact time follows closely a powerlaw decay up to a characteristic time, which con?rms earlier studies. On the other hand, beyond this characteristic time, we ?nd that the decay is exponential. This exponential decay appears to be a new ?nding, which we validate across a diverse set of mobility traces. The dichotomy has important implications on the performance of opportunistic forwarding algorithms and implies that recent statements on performance of such algorithms may be overpessimistic. We further provide analytical results showing that simple mobility models such as simple random walk on a circuit (onedimensional version of the Manhattan Street Network model dating from the 80’s [12] and used recently [6, 1]) and random waypoint [9] on a chain can exhibit the same qualitative properties observed in empirical traces. Whilst our results do not suggest that the considered mobility models are suf?cient for realistic simulations, they stress that existing models should not be discarded on the basis of not supporting the empirically observed dichotomy of intercontact time.
Name UCSD Vehicular MITcell MITbt Cambridge Infocom
Technology WiFi GPS GSM Bluetooth Bluetooth Bluetooth
Table 1: Traces studied. Duration Devices Contacts 77 days 275 116,383 6 months 196 9,588 16 months 89 1,891,024 16 months 89 114,046 11.5 days 36 21,203 3 days 41 28,216
Mean Intercontact Time 24 hours 20.8 hours 3.5 hours 87 hours 14 hours 3.3 hours
Year 2002 2004 2004 2004 2005 2005
To understand the origins of the observed dichotomy, we then examine several properties of device contacts across various dimensions. We provide empirical evidence suggesting that the return time of a mobile device to its favorite location site features the same dichotomy as the one observed for the intercontact time between device pairs. This is an interesting hypothesis as it refers to the return time of a device to a site, which is a more elementary characterization of human mobility than that of the intercontact times. Also, it is a particularly important metric for cases where some devices have ?xed locations (e.g. throwboxes [19]). It is also noteworthy that we established the same qualitative equivalence between return time to a site and intercontact time for our simple random walk model. Further, our empirical analysis suggests that mobile devices are typically in contact in a few sites which are speci?c to the given pair of devices. Combining the latter with the observed dichotomy of the return time to a site may already explain the intercontact time dichotomy in real settings. We also investigate different viewpoints on device inter contacts such as that of a speci?c device pair and the random observer viewpoint. The examination of these viewpoints are of interest in order to evaluate how representative the CCDF of intercontact time is, which following earlier studies, is de?ned as the CCDF of intercontact time samples aggregated over all device pairs over a measurement period. We argue that the aggregate CCDF of intercontact time when used to derive residual time until a contact, refers to a viewpoint that corresponds to a random observer and for a device pair picked uniformly at random. Finally, we also report on various breakdowns of the aggregate behavior across pairs to investigate time nonstationarity due to the synchronization of timeofday human activities. Overall our results provide valuable insights towards the design of opportunistic packet forwarding schemes. Our contributions can be summarized in the following points: ? Using 6 distinct traces, we verify the powerlaw decay of intercontact time CCDF between mobile devices. The datasets range from campuswide logs of device associations to existing infrastructure, to direct contact bluetooth traces and GPS logs of vehicle movements over several months. ? We demonstrate that beyond a characteristic time of the order of half a day which is present across all datasets the CCDF exhibits exponential decay. (Section 3.) The observed dichotomy and exponential decay appear to be new results. ? We provide analytical results that show that already simple mobility models can exhibit precisely the same qualitative dichotomy. (Section 4.) These results contradict statements that current mobility models cannot support power law CCDF of intercontact time. ? We examine several dimensions of device contacts in order to better understand the observed dichotomy, and we study the
time nonstationarity due to human synchrony with the time of day. In particular, we provide evidence that there exist realworld cases in which return time of a mobile device to its frequently visited site follows the same dichotomy as that of the intercontact time between devices. (Section 5.) Implications to a practitioner. Our ?ndings suggest that: (a) The issue of in?nite expected packet forwarding delay of opportunistic forwarding schemes under the power tail assumption [2] does not appear relevant with an exponentially decaying tail of the intercontact CCDF. The exponential decay beyond the characteristic time is of relevance as available data traces suggest that the mean intercontact time is in many cases of the same order as the characteristic time. (b) Widelyused mobility models should not be abandoned on the claim that they cannot support powerlaw decay of the intercontact CCDF. Finally, (c) our analysis implies that time nonstationarity and potential environmental or other idiosyncracies that may in?uence human behavior (e.g., conference sites vs. working environments) should be taken into account during the design of future systems as they can signi?cantly affect its primary performance metrics.
2. 2.1
DATASETS & DEFINITIONS Datasets
To study the properties of contacts between humancarried mobile devices, we analyze several traces with diverse characteristics in terms of their duration, wireless technology used and environment of collection. Table 1 presents a summary of the different datasets that we used, with aggregate statistics regarding the total duration of the trace, the number of mobile devices, the number of contacts and the mean intercontact time. We can group the datasets in three distinct types: ? Infrastructurebased traces that re?ect connectivity between existing infrastructure, e.g., Access Points (APs) or cells, and wireless mobile devices (UCSD [13] & MITcell [5, 4] datasets in Table 1). These datasets describe association times of a speci?c mobile device with an AP or cell. ? Direct contact traces that record contacts directly between mobile devices (e.g., imotes) and were collected by distributing devices to a number of people, usually students or conference attendees (Cambridge [2, 14], Infocom [2, 15] & MITbt [5, 4] datasets in Table 1). These datasets describe start and end contact times for each pair of mobile devices. ? GPSbased contacts through a private trace collected by tracking the movements of individual people of a large corporation through GPS units. The GPS units were placed in volunteer’s cars for approximately four months and overall the trace covers the metropolitan area of a large US city (Vehicular dataset [11] in Table 1). The dataset logs the latitude and longitude coordinates of each mobile device every approximately 10 seconds.
Figure 1: Intercontact time CCDF for the six datasets. Note that the ?rst two types of traces have been used in previous studies (e.g., [2]), while the latter which features exact mobility patterns based on the GPS trackers is unique to our study. Due to space limitations, we refer the interested reader to the references provided for a complete description of the experimental setting, the devices used and the limitations of each trace. Apart from the datasets specifying direct connectivity, contacts need to be inferred in the rest of the traces. For the infrastructurebased traces, we assume that two devices are in contact if they reside within the range of the same AP or cell in accordance with previous studies. For the vehicular trace, we assume that two mobile devices are in contact if their distance is less than or equal to a parameter r. For our experiments we chose r = 500 meters [8], while we experimented with values from 100m to 1km and found qualitatively similar results. Throughout the paper we will use all datasets interchangeably. Unless otherwise speci?ed, our observations apply to all traces listed in Table 1. traces introduced in the previous section. We have carefully examined all datasets in Table 1 and con?rmed the hypothesis that in many cases the aggregate CCDF of the intercontact times follows a powerlaw up to a characteristic time. We ?nd this time to be in the order of half a day. Note that this hypothesis was already tested in previous work [2]. However, we demonstrate here that beyond this characteristic time, the CCDF exhibits an exponential decay. To the best of our knowledge, the hypothesis that the CCDF of the intercontact times beyond the characteristic time exhibits an exponential decay has been neither posed nor tested before. We then argue that the exponential decay is an important property since it bears signi?cant impact on the mean intercontact time or more generally on the CCDF of the intercontact time observed from an arbitrary point in time. Finally, we discuss the practical implications of the observed dichotomy to opportunistic forwarding.
3.1
Power law and exponential decay
2.2
De?nitions
We use the following de?nitions. An intercontact time between two devices is de?ned as the length of the time interval over which the two devices are not in contact and are in contact at the end points of this interval. For a device pair, we call residual intercontact time, the time until the next contact of this device pair from a given observation time. A return time of a device to a set of a space is de?ned as the minimum time until the device enters the set, from a time instance at which the device exited the set. We call CCDF of intercontact time between two devices, the CCDF obtained for the intercontact time sampled per contact of these two devices. We further call the aggregate CCDF of intercontact time between all devices, the CCDF of per contact samples of intercontact time over all distinct pairs of devices. We often abuse this notation by omitting explicitly to mention the “aggregate” but the meaning should be clear. Finally, we consider the CCDF of the residual intercontact time at a speci?c observation time, de?ned for a value t ≥ 0 as the fraction of device pairs for which the residual intercontact time at the observation time is larger than t.
3.
DICHOTOMY OF INTERCONTACT TIMES
In this section, we examine the empirical distributions of intercontact times between mobile devices inferred from the mobility
In this section, we provide empirical evidence of a dichotomy in the CCDF of intercontact time. Up to a characteristic time in the order of half a day, the decay of the CCDF is well approximated as a power law, while beyond this characteristic time, the decay is exponential. Power law. We ?rst revisit the power law hypothesis in the examined datasets. To this end, we have inferred the intercontact time for each of the traces and estimated the aggregate CCDF of intercontact time between all devices. Fig. 1 shows the respective aggregate CCDFs of intercontact times in loglog scale. The CCDF values follow a straight line over a range of values spanning the order of a few minutes to half a day, thus suggesting a power law. These results are in line with observations of previous studies for datasets MIT, UCSD, Cambridge and Infocom. A new piece of information is however that the same property holds for the vehicular trace which is signi?cantly different in nature from the rest of the datasets. Exponential decay. Carefully examining Fig. 1, we observe that at roughly around half a day, the CCDF has a knee beyond which the decay is abruptly faster. We call this knee the characteristic time. In order to examine the CCDF of intercontact time beyond the characteristic time, we replot the same curves of Fig. 1 in Fig. 2 but this time in linlog scale. We now turn our attention to the distributions beyond the characteristic time. In the linlog scale (Fig. 2), the CCDF can be closely upper bounded with a straight line, thus indicating an exponential decay. In some traces, e.g., for Infocom and Cambridge, we also observe some variability in the
10
Intercontact time CCDF
0
Infocom
10
Intercontact time CCDF
0
UCSD
Intercontact time CCDF
10 10 10 10 10 10
0
Cambridge
10 10 10 10 10
1
10 10 10 10 10
1
1
2
2
2
3
3
3
4
4
mean 3.3 hours
mean 24 hours
4
mean 14 hours
5
5
5
0
0.5
1
1.5 2 Time (days)
2.5
3
0
10
20
30 40 50 Time (days)
0
60
70
0
2
4
6 8 Time (days)
10
12
10
Intercontact time CCDF
0
MIT (BT)
10
Intercontact time CCDF
Vehicular
10 10 10 10 10
1
10 10 10 10 10
1
2
2
3
3
4
4
mean 87 hours
mean 20.8 hours
5
5
0
50
100
150 200 250 Time (days)
300
350
0
10
20 Time (days)
30
40
Figure 2: Same as in Fig. 1 but plotted in linlog scale. The results con?rm the explonential decay of the CCDF beyond half a day.
tail that after close examination we found to be in line with daily periodicities (24 hours).
3.2
Implications of dichotomy to opportunistic forwarding
How does the observed dichotomy in the distribution of intercontact time affect the design of forwarding schemes? Motivated by the observed power law in the empirical aggregate CCDF of intercontact time up to half a day, Chainterau et al [2] made the hypothesis that the CCDF of intercontact time, denoted as F 0 (t), between any two mobile devices is a Pareto distribution, thus power law over [t0 , +∞), for some t0 > 0. Concretely, for α > 0, F 0 (t) = t0 t
α
, t ≥ t0 .
(1)
The authors then argued that the assumption that the CCDF of intercontact time has a power tail is in sharp contrast with prior work on packet forwarding; previous work would assume exponential tail for the CCDF distribution, such as for example, that of Grossglauser and Tse [7] that considers twohop packet relying schemes. Under the assumption that intercontact time between mobile devices are independent and identically distributed, Chaintreau et al [2] derived interesting results on the feasibility of twohop packet relying schemes. In summary, they show that under the assumptions therein, there exists a twohop relying scheme that ensures ?nite mean packet forwarding delay if α > 1 + 1/m, where
m is the number of packet replicas made from a source to distinct relay nodes, and that if α ≤ 1, for any packet forwarding scheme the mean packet forwarding delay is in?nite. It is precisely the latter case (α ≤ 1) that was suggested to hold in reallife by the mobility traces analyzed so far. However, Fig. 2 highlights that the observed dichotomy in the CCDF of intercontact times between mobile devices, rather suggests to take as a hypothesis that the CCDF of the intercontact times has exponentially decaying tail. This exponential tail entirely eliminates the issue of in?nite packet forwarding delay under the power tail assumption. Furthermore, in the datasets the mean intercontact time is of the same order as the characteristic time, and thus the exponential tail cannot be ignored by the time separation argument. This is of particular importance for practical schemes that were later proposed, such as throwboxes [19]. There, it is assumed that the mean intercontact time is ?nite and can be estimated. This is a valid hypothesis under the dichotomy that we observed in the traces is a general feature, but would not be valid under the hypotheses of the model in [2]. We further contrast the dichotomy in the distribution of intercontact time with the assumed powerlaw tail by Chainterau et al [2]. For analyses similar to those in [2], it is of interest to consider the residual intercontact time distribution, i.e. the time until the next contact for a node pair from an arbitrary point in time. Intuitively, the residual time re?ects how much time a device has to wait before being able to forward a message to another speci?c
0 m1 1
2
Figure 3: Residual intercontact time CCDF. device. Suppose for a moment that contacts between a node pair occur at instances of a stationary point process in time with a ?nite mean intercontact time. It is well known that the CCDF of residual intercontact time, F (t), relates to the CCDF of the intercontact time sampled at contact instances, F 0 (t), as follows:
+∞
Figure 4: Circuit of m sites. mobility models the dichotomy in the intercontact time CCDF is qualitatively precisely the same as observed in some empirical traces.
4.1
Simple random walk
F (t) = λ
t
F 0 (s)ds
(2)
where 1/λ is the mean intercontact time sampled at contacts, and is assumed to be ?nite. Under the assumptions of Chainterau et al, we have that Eq. (1) holds, and provided that α > 1, it follows F (t) = t0 t
α?1
We consider as mobility domain a circuit of m sites, enumerated as 0, 1, . . . , m ? 1. (Fig. 4.) A device moves according to a simple random walk: from a site i, it moves to either site i ? 1 mod m or i + 1 mod m with equal probability. We denote with Xk (n) the site on which a device k is at time n ≥ 0.
4.1.1
Return time
, t ≥ t0 ,
thus, again a Pareto distribution but with scale parameter α ? 1. As a result, if we consider the empirical CCDF of the residual time, we should observe a power law that would manifest itself as a straight line in a loglog plot. In contrast, if the empirical CCDF of intercontact time exhibits the aforementioned dichotomy, we should rather observe that the rate of decrease in a loglog plot of the residual intercontact time increases. This would be the case provided that the exponent is not too large and would follow from the tail integration in Eq. (2). In accordance with the previous discussion, Fig. (3) shows the empirical CCDF of the residual intercontact time for three datasets and con?rms the increasing rate of the decay.
We ?rst consider the return time R of a single device to an arbitrarily ?xed site. Without loss of generality, we may consider only the return time of device 1 to site 0, i.e. given that X1 (0) = 0, X1 (1) = 0, R = min{n > 0 : X1 (n) = 0}. The result shows that already the return time to a site features the aforementioned dichotomy. We show later that intercontact time CCDF features the same qualitative properties. T HEOREM 1 (R ETURN T IME ). For the return time R of a simple random walk to a ?xed site on a circuit of m sites: 1. Expected return time: E(R) = m 2. Powerlaw for in?nite circuit: P(R > n) ? 2 1 , large n π n1/2
4.
SIMPLE MOBILITY MODELS CAN SUPPORT THE DICHOTOMY
In this section, we show that already simple mobility models such as simple random walk on onedimensional torus feature the dichotomy in the CCDF of intercontact time in that it is close to a power law up to a characteristic time and beyond it has exponential decay. This model can be seen as an onedimensional version of simple random walk on a twodimensional torus, which was used as early as in [12] (Manhattan Street Network), and later used in recent studies (e.g. [6]), and can also be seen as a special case of random walk on torus model in [1]. We also show that random waypoint on a chain of discrete sites features the intercontact CCDF that is close to a power law over an interval and has exponentially decaying tail.1 These results contradict existing statements that current mobility models do not feature power law CCDF of intercontact time. The results also show that for some
1 The distribution of the intercontact time under a random waypoint model was analyzed by Sharma and Mazumdar [17]. They showed that this distribution is exponentially bounded on both sides under assumptions that (a) mobility domain is a sphere and (b) any trip between two successive waypoints is of a ?xed duration.
where for two functions f and g, f (n) ? g(n) means that f (n)/g(n) tends to 1 as n goes to in?nity. 3. Exponentially decaying tail: P(R > n) ? ?(n)e?βn , large n where ?(n) is a trigonometric polynomial in n and β > 0. We call trigonometric polynomial in n a function of the form ?(n) = K k=1 [ak cos(nωk ) + bk sin(nωk )] where ak , bk and ωk are constants. Item 1 shows that average return time to a site is equal to the time needed to circumvent the circuit. Item 2 shows that for a circle of in?nitely many sites, the asymptotic of return time CCDF is precisely the powerlaw with exponent 1/2. The result suggests that the asserted asymptotic may be a good approximation of the CCDF for large but ?nite circuit (Fig. 5 shows that this holds already for as few as 20 sites). It is noteworthy that the powerlaw
10
0
10
0
Intercontact time CCDF
10 10 10 10
1
Intercontact time CCDF
10 10 10 10
1
2
2
3
3
m = 20
4
m = 100
4
10 0 10
5
10
1
10 Time
2
10
3
10 0 10
5
10
1
10 Time
2
10
3
10
4
10
0
10
0
Intercontact time CCDF
10 10 10 10 10
1
Intercontact time CCDF
10
1
2
10
2
3
10
3
m = 20
4
10
4
m = 100
5
0
100
200 Time
300
400
500
10
5
0
1000
2000
3000
4000
5000
6000
Time
Figure 5: CCDF of intercontact time for simple random walks on a circuit of m sites in loglog and linlog scale. under item 2 holds more generally for any onedimensional aperiodic recurrent random walk with a ?nite variance σ 2 < ∞ such that we have [18, P3, p381] P(R > n) ? 2 1 σ √ , large n. π n Of our particular interest is f1 (z) that can be expressed as: √ 1 + (2a(m, z) ? 1) 1 ? z 2 f1 (z) = . z The assertion under item 2 follows by noting that
m→∞
(3)
lim a(m, z) = 0, for 0 < z < 1 √ 1 ? z2 . z
Item 3 shows that for a circle of a ?nite number of sites, the CCDF of the return time has exponential decay. P ROOF. Item 1. Let ri be the mean hitting time of site 0 starting from site i. We have that r0 = 0 and ri = 1+ 1 (ri?1 + ri+1 ), i = 1, . . . , m ? 1 2
and, hence, for an in?nite circuit, f1 (z) = 1?
Using the Binomial theorem for (1 ? z 2 )1/2 and some elementary calculus, we have
∞
where addition in indices is modulo m. It can be shown by induction on i that the solution is ri = i(m ? i), i = 0, 1, . . . , m ? 1. The assertion of item 1 follows, noting that E(R) = 1 + r1 . Item 2. We consider the ztransform of the return time R to site 0 started from a site i, i.e. fi (z) := E(z R X0 = i), i = 0, 1, . . . , m ? 1. We have the following system of linear equations f0 (z) fi (z) = = fm (z) = 1 1 z (fi?1 (z) + fi+1 (z)) , i = 1, . . . , m ? 1. 2 where
f1 (z) =
n=1: n
odd
1 2 n+1 2
(?1)
n?1 2
zn
1 2
k It thus follows that P(R > n) =
m
=
k?1 n=0
1 2
?n
k!
1 2 m+1 2
.
 odd,m>n
=
k≥ n +1 2

1 2
k

(4)
Further, using the Stirling formula 
1 2
This is a linear recurrence, which after some algebra (see [10] for details) gives f0 (z) fi (z) a(m, z) = = = 0
a(m,z)(1+
k
1  ? √ 3/2 , large k. 2 πk
(5)
√
for i = 1, . . . , n ? 1 √
z √ (1+
m
1?z 2 )i +(1?a(m,z))(1? zi 1?z 2 )m
√
1?z 2 )i
,
1 1 We now use the fact that if un ? n1 then ∞ um ? α?1 nα?1 α m=n n large, for α > 1. The asserted result follows from (5) and (4). Item 3: It follows from Lemma 1 (see below) with the subset ? reduced to the element 0.
?(1?
1?z 2 )m ?(1?
√
1?z 2 )m
.
0
m
random walk on a circuit considered in Theorem 1. In particular, item 2 asserts the same asymptotic CCDF as for the return time in Theorem 1 except only for different multiplicative constant. Similarly, item 3 is the same property as holding for the return time in Theorem 1.
0 m/2
(m/2,m/2)
m
P ROOF. We make use of a reduction to return time for a simple random walk described next. The two independent random walks amount to a random walk on a twodimensional lattice, with transition probabilities P(i,j) ((X1 (1), X2 (1)) = (i ± 1, j ± 1)) = 1 . 4
Figure 6: Reduction to return time of an onedimensional simple random walk.
The following lemma is used in the proof of Theorem 1 and in other places in this paper, so we give it in a fairly general form. L EMMA 1 ( R ETURN TIME FOR FINITE M ARKOV CHAIN ) . Let Xn be an irreducible Markov chain on some ?nite state space S and let ? be a subset of S (? = and ? = S). Let R be the return time to ?. The stationary distribution of R is such that P(R > n) ? ?(n)e?βn , large n where ?(n) is a trigonometric polynomial and β > 0. Note that R is the time between leaving ? and returning to ?. The hypotheses imply that the chain is positive recurrent and thus R is ?nite. The proof relies on spectral decomposition of nonnegative matrices, using results in [16, 3]. For lack of space, the proof is not given here. It can be found in [10].
Without loss of generality, we assume (X1 (0), X2 (0)) = (0, 0) and (X1 (1), X2 (1)) = (1, ?1). The intercontact time is the hitting time plus 1 of the twodimensional random walk (X1 , X2 ) with the hitting set {i · m + j, i, j = . . . , ?1, 0, 1, . . .}, starting from the point (1, ?1). This is equivalent to considering hitting time of the boundaries (0, i) and (m/2, i), i = . . . , ?1, 0, 1, . . ., for a simple twodimensional random walk started at the point (0,1). See Fig. 6 for an illustration. This hitting time can be represented as
H
T =1+
n=1
Vi
(6)
where H is the number of transitions along the x axis until hitting of the boundaries and Vi is the number of transitions along the vertical axis between the (i ? 1)st and ith horizontal transition. Note that H is return time to site 0 of a simple random walk on a circuit of m/2 sites started at site 1. We have that H and (V1 , V2 , . . . , VH ) are independent and that for any given H, (V1 , V2 , . . . , VH ) is a sequence of independent and identically distributed random variables with distribution 1 P(Vi = k) = k , k = 1, 2, . . . . 2 Item 1: From (6) and noted independency properties, we can use Wald’s lemma to assert E(T ) = 1 + E(H)E(V1 ). The random variable H is the return time for simple random walk on a circuit of m/2 sites, so E(H) = m/2 ? 1. Note also that E(V1 ) = 2. It follows E(T ) = m ? 1. Item 2. We consider the ztransform of the intercontact time T . We have E(z T ) = zgH z 2?z (7)
4.1.2
Intercontact time
We consider mobility of two devices according to two independent simple random walks on a circuit of m sites. We assume that m is even. The intercontact time between two devices is de?ned as, given X1 (0) = X2 (0) and X1 (1) = X2 (1), T = min{n > 0 : X1 (n) = X2 (n)}. We next examine the CCDF of intercontact time T . T HEOREM 2 (I NTER C ONTACT T IME ). Consider two independent simple random walks on a circuit of m sites, where m is assumed to be even. The intercontact time T between the two random walks has the following properties. 1. Mean intercontact time: E(T ) = m ? 1 2. Power law for an in?nite circuit: 2 1 , large n P(T > n) ? √ π n1/2 3. Exponentially decaying tail: P(T > n) ? ?(n)e
?βn
where gH (·) is the ztransform of the random variable H. To see this, ?rst note z E(z Vi ) = . 2?z The asserted identity Eq. (7) is direct by following simple calculus
H
, large n
E(z T )
= =
zE
i=1
z Vi = zgH z 2?z
where ?(n) is a trigonometric polynomial in n and β > 0. With regard to the asserted properties, the intercontact time is qualitatively the same as the return time to a site for a single simple
zE E(z V1 )H
where gH (z) := E(z H ).
10
0
10
0
Intercontact time CCDF
Intercontact time CCDF
10
1
10
1
10
2
10
2
10
3
10
3
10 1 10
4
10 10
2
4
10
3
Time
10
4
10
5
10
6
1
2 3 4 Time (x 100,000)
5
6
Figure 7: Intercontact time CCDF for random waypoint on a chain of m = 1000 sites on loglog and linlog scale
Now, in Eq. (3), we already derived the ztransform of the return time to site 0 for simple random walk on a circuit of m sites. From Eq. (7) and Eq. (3), we obtain √ (8) E(z T ) = 2 ? z + 2 (2b(m/2, z) ? 1) 1 ? z with b(n, z) = √ z n ? (2 ? z ? 2 1 ? z)n √ √ . (2 ? z + 2 1 ? z)n ? (2 ? z ? 2 1 ? z)n
The assertion under item 2 now follows from (8) by mimicking the proof of Theorem 1. Item 3. This follows from Lemma 1 with Markov chain X(n) = (X1 (n), X2 (n)), S the set of reachable states, and subset ? = {(i, i), i = 0, ..., m ? 1}.
Figure 8: Mobile positions moving according to random waypoint on
a chain of m = 1000 sites. The thick trajectory corresponds to a long intercontact time.
4.2
Random walk on a 2dim torus
Similarly one may consider mobility of a device de?ned as a random walk on a twodimensional torus of m and k sites in the respective two dimensions. One may extend the onedimensional mobility of a device to two dimensions by assuming that device mobility in each of the dimensions is a simple random walk and the two random walks are independent. This model resembles the well known Manhattangrid model [12]. From Lemma 1, we know that also for this model, the CCDF of intercontact times between two devices has exponential tail. A detailed analysis of the CCDF of intercontact time is beyond the scope of this paper. Such an analysis may follow the same steps as in this section, but for the difference random walk describing the difference between coordinates of the two independent random walkers. It is not clear that the same dichotomy would hold. For example, we know from [18, E1, p167] that the CCDF of the return time to a site for a simple random walk in two dimensions is π/ log(n), for large n. The interested reader may refer to [10] for simulation estimates of the intercontact time in two dimensions.
The results demonstrate that random waypoint can feature a power law like decay of the CCDF over an interval that covers the mean intercontact time. We also observe the presence of short and long intercontact times. Fig. 8 suggests that long intercontact times occur due to the assumptions that two devices are in contact only if in the same site.
5.
SPATIOTEMPORAL BREAKDOWN
4.3
Random waypoint on a chain
We consider random waypoint on a chain of m sites. This is a discrete time, discrete space version of well known random waypoint [9]. Each device is assumed to move stochastically independently. A movement of a device is speci?ed by its current site and next waypoint site. The device moves to its next waypoint site by one site per time instant. When it reaches the next waypoint, it updates the next waypoint to a sample drawn uniformly at random on the set of sites constituting the chain and the movement continues as described. Two devices are assumed to be in contact at a time t, if at this time they reside in the same site. We analyze this model by simulations. In Fig. 7, we show the empirical estimate of the CCDF of intercontact time, both in loglog and linlog scale.
Here, we breakdown device contacts along several dimensions. Our goal is to better understand individual elements that contribute to the aggregate measures reported in preceding sections. Note that our ?ndings thus far have been obtained by aggregating over individual device pairs and also time. First, we breakdown device inter contacts by analyzing return times of individual mobile devices to their respective most frequently visited sites. A site here refers to a location region such as a circular area for the vehicular data or an AP/cell for UCSD/MIT data. Our analysis suggests that return times exhibit the same dichotomy in the distribution as the one found for the intercontact times between device pairs. We then pose and con?rm the hypothesis that devices are in contact at a small set of distinct sites. These two ?ndings suggest that the dichotomy in the distribution of the return time may already explain the observed dichotomy in the distribution of intercontact time between devices. Second, we discuss how the aggregate CCDF of intercontact time between devices as obtained from aggregate samples of intercontact time over all device pairs over a measurement period relates to the CCDF of intercontact time for individual device pairs. Further, we ask the question what this aggregate CCDF yields when used to characterize the intercontact time between devices observed from an arbitrary point in time.
10
0
1
Return time to "home" site (CCDF)
UCSD 10
2
Sites to cover 90% of contacts pair (CDF)
10
1
0.8
UCSD
0.6 Vehicular 0.4
Vehicular 10
3
10 4 10
0
4
10
3
10
2
10 10 Time (days)
1
0
10
1
10
2
0.2 0
2
4
6 Sites
8
10
12
14
10
1
Sites to cover 90% of contact durations per pair (CDF)
0.9 0.8 0.7 0.6 0.5 0.4 0 2 4 6 Sites 8 10 12 14 Vehicular UCSD
Return time to "home" site (CCDF)
10
1
Vehicular
10
2
10
3
UCSD 10
4
0
5
10 Time (days)
15
20
Figure 9: Return time exhibits the same dichotomy as the intercontact time (in loglog and linlog scale). Third, we examine the extent of time nonstationarity in device inter contacts and non surprisingly con?rm the presence of strong time of day dependencies.
Figure 10: CDF of the number of sites that cover 90% of contacts
(top) and contact durations (bottom) per device pair.
5.1
Return versus intercontact time
Our evaluation in the previous sections reveals a characteristic time in the order of half a day, which could be attributed to the daily periodicity of human behavior. Our goal here is to capture features of human mobility that could play a role in the observed dichotomy and in the particular decay of the intercontact time distribution within the two timescales. To this end, we examine the return time of a device to a particular location or site. Note that the return time characterizes mobility of a single human and thus may be regarded as a more elementary characterization of human mobility than intercontact time. Having established in the previous section that for simple mobility models (e.g., independent random walks on a ?nite circuit) the CCDF of intercontact time between two random walkers and the CCDF of the return time to a speci?c site for a single random walker feature precisely the same dichotomy, we now examine whether this observation holds in real mobility cases. In a hypothetical scenario where two mobile devices would almost always meet at a particular site, the intercontact time between the two devices would be stochastically larger than the return time of any of the two devices to that given site. Supposing further that two devices are synchronized in time, then the return time to a site would closely characterize the intercontact time. In this section, we demonstrate that return times of a device to a speci?c site feature the observed dichotomy. The dichotomy characterizes the return times of individual devices to their “home” sites. In order to test the hypothesis of the dichotomy in the CCDF of device return time to a speci?c site, we conducted the following analysis. For each device, we infer a “home" site de?ned as the location region where the device spends
most of its time. A location region is either circular area of some radius r for the vehicular trace, or an AP/cell for the UCSD/MIT trace. In Fig. 9, we show the CCDF of the inferred return time of a device to its home site over all devices. The ?gure shows remarkable qualitative similarity to the corresponding CCDF of device intercontact time (Fig. 1). As we argued over the previous paragraphs, this is an interesting property as the return time is a more basic characterization of human mobility than intercontact time. We further test the hypothesis that typically two devices meet at a few sites. To that end, we counted the number of sites per each device pair ranked with respect to their frequency of contacts. Then for each device pair, we examined how many sites cover either 90% of their contacts or 90% of their total contact duration. The two corresponding CDFs are shown in Fig. 10, where the median number of sites is less than 2 and the 90% quantile is less than 4 sites in all the considered cases. The main theme of this paper is around the time dimension of device mobility. In this paragraph, we detour slightly to brie?y consider the spatial aspect of the return time to the home site. In Fig. 11, we show the CCDF of the trip distance incurred on the return trips to the home site of a device. We present results only for the vehicular dataset since this is the only trace with precise location information for each device. The CCDF is well approximated by a straight line in the loglog scale over a wide range of distances spanning 40 to 200 kilometers. For smaller distances, the distribution appears to decay exponentially. While the spatial aspect of human mobility is itself an interesting topic, it is beyond the scope of this paper to pursue this further in more detail.
5.2
Contacts across different viewpoints
In this section we consider different viewpoints on device inter contacts and their interpretations from the packet forwarding perspective. We address the following questions:
10
0
Vehicular
Distance traveled before returning to home site (CCDF)
10
1
10
2
10 10
0
0
10
1
Kilometres Vehicular
10
2
10
3
10
0
UCSD
Distance traveled before returning to home site (CCDF)
Intercontact time CCDF
10
1
10
1
10
2
10
2
0
200
400 Kilometres
600
800
10 2 10
3
10
1
10
0
10 Time (hours)
1
10
2
10
3
Figure 11: Travel distance on a return trip to device home site in loglog and linlog scale.
Figure 12: Aggregate vs. per devicepair CCDF of intercontact time
(top) and 7 individual devicepair CCDFs (bottom).
(a) Is the aggregate CCDF of intercontact time, derived from samples aggregated over all device pairs over a measurement interval, representative of the CCDF of intercontact time for a speci?c pair of devices? (b) What metric does the aggregate CCDF correspond to when used to evaluate the delay of an opportunistic forwarding scheme? (c) How does the intercontact time statistic depend on the time of day?
where 1(A) is the indicator whether the condition A holds2 and N (t) = p∈P Np (t) is the number of contacts over all pairs in P on [0, t]. We rewrite the above identity as: ? F 0 (t, T ) =
p∈P
Np (T ) ? 1 ? 0 Fp (t, T ) N (T )
(10)
?0 where Fp (t, T ) is the empirical CCDF of intercontact time for a device pair p given by ?0 Fp (t, T ) = 1 Np (T ) ? 1
Np (T )?1 p p 1(Tn+1 ? Tn > t). n=1
5.2.1
Aggregate vs per devicepair viewpoint
Previous studies and the analysis in Section 3 considered the CCDF of intercontact time obtained from samples aggregated over all device pairs over a measurement period. We call this the aggregate CCDF. We examine here how the aggregate CCDF of intercontact time relates to the CCDF of intercontact time for a device pair. In general, the two are different and the bias is such that the aggregate CCDF gives more weight to devices that meet more frequently. Consider a mobility trace over a measurement interval of duration T and let the time origin 0 be de?ned as the beginning of the measurement interval. We denote with P the set of all distinct device pairs that were in contact at least twice over the measurement p interval. Let Tn the time of the nth contact for a device pair p, with n = 1, 2, . . . and let for this pair, Np (t) be the number of contacts on [0, t]. The empirical aggregate CCDF for a value t ≥ 0 is de?ned as the fraction of intercontact times over all device pairs in P that are larger than t, i.e. ? F 0 (t, T ) = 1 N (T )
Np (T )?1 p p 1(Tn+1 ? Tn > t) p∈P n=1
Eq. (10) tells us that the aggregate CCDF is a weighted sum of the CCDFs over device pairs, with the weight for a device pair proportional to the number of contacts observed for this device pair. This is indeed intuitive as we expect to observe a larger number of intercontact samples for pairs of devices that meet more frequently. For the sake of discussion, suppose for a moment that contacts between mobile devices occur at instances of a point process that is assumed to be time stationary and ergodic, but not necessarily stochastically identical over pairs of devices. From Eq. (9), it fol? lows that as T tends to be large, F 0 (t, T ) converges to F 0 (t) =
p∈P
λp 0 Fp (t) λ
(11)
where 1/λp is the mean of intercontact times sampled at con0 tact instances of the device pair p and Fp (·) is the corresponding CCDF, and λ = p∈P λp is the total rate of contacts over all device pairs. We note that the aggregate CCDF, F 0 (t), exactly 0 matches each of the CCDFs Fp (t), p ∈ P, only if contacts for distinct pairs are stochastically identical. The aggregate CCDF
2
(9)
1(A) = 1 if A true, else 1(A) = 0.
18
Mean residual time (hours)
Infocom
10
Residual Time CCDF
0
Infocom
10
Intercontact time CCDF
0
UCSD
16 14 12 10 8 6 4 0 10 20 30 Time (hours) 40 50
10 10 10 10 10
1
Crossing midnight
10
1
2
3
10
2
4
5
Not crossing midnight
10 3 10
3
10
2
10 10 Time (hours)
1
0
10
1
10
2
10 3 10
6
10
2
10
1
10 10 Time ( hours)
0
1
10
2
10
3
Figure 13: LEFT: Mean residual time at various times of the day shows nonstationarity effects. MIDDLE: Residual time CCDF at different times
of day. RIGHT: CCDFs of intercontact times crossing midnight vs. intercontact times during the day.
of intercontact time is equal to the weighted sum of individual CCDFs given in Eq. (11) with weight for a device pair p proportional to the rate of contacts λp . Thec preceding discussion raises the question of how representative the aggregate CCDF of intercontact time is for an arbitrarily chosen device pair. To address this question, we explore how the aggregate CCDF differs from the CCDF of a pair of devices in the different datasets. Fig. 12top shows the aggregate CCDF of intercontact time along with percentiles of the CCDF over all devicepairs for the UCSD dataset. We observe that for each given time, more than half of node pairs have a CCDF of the intercontact time in a reasonably narrow neighborhood around the aggregate CCDF. In Fig. 12bottom, we show 7 distinct CCDFs of individual devicepairs, which on the other hand present some variability that could be hidden at the aggregate viewpoint. We have examined the discrepancy of the aggregate CCDF and the arithmetic mean of individual CCDFs and observed that the former lower bounds the latter but their difference was not substantially large.
After some straightforward but tedious calculus, it follows that we can rewrite Eq. (13) as N (T ) ? F (t, T ) = PT
+∞
? F 0 (s, T )ds + e(T )
t
(14)
where e(T ) is a term that captures the boundary effects and in all regular cases (e.g. stationary ergodic) diminishes with the length of the measurement interval T , so we ignore it for the sake of our discussion. From Eq. (12) and Eq. (14), we note that by using the aggregate CCDF of intercontact time to estimate the CCDF of the residual intercontact time, this in fact corresponds to time averaging and averaging over device pairs. This viewpoint may differ substantially from the viewpoint at a speci?c time of day due to time nonstationarity of device contacts. We explore this non stationarity in the following section.
5.2.3
Time of day viewpoint
5.2.2
Timeaverage viewpoint
In performance analyses of forwarding schemes, the CCDF of residual time until contact between two devices from an observation time is often derived from the CCDF of intercontact time sampled at contact instants of this device pair. The latter is often estimated by the aggregate CCDF of intercontact time. We would like to understand what does this residual time CCDF correspond to when we use the aggregate CCDF of intercontact time. We will see that this residual intercontact time distribution, in fact, corresponds to an observation time sampled uniformly at random on the measurement interval and for a device pair sampled uniformly at random. Hence, the resulting viewpoint is that of time averaging and averaging over device pairs. We revisit the earlier setting and now consider the fraction of device pairs for which the residual time until next contact is larger than t ≥ 0 as observed from a time instant s, i.e. 1 ? F (t, s) = P
p 1(TNp (s)+1 ? s > t). p∈P
(12)
By averaging over the measurement interval, we have 1 ? F (t, T ) = T
T
? F (t, s)ds.
0
(13)
We now con?rm from our datasets that device inter contacts exhibit strong timeofday nonstationarity. It is important to note presence of this nonstationarity as a claim based on the timeaverage viewpoint may not hold for the viewpoint of a particular time of day. Fig. 13 presents three sets of results highlighting the effects of time nonstationarity. In Fig. 13left, we show the mean residual intercontact time over all device pairs versus the time for the Infocom trace. The ?gure shows strong dependency on the time of day, with day and night periods resulting in the mean residual time ranging from about 6 to 17 hours. This effect is also evident in the aggregate CCDF of the residual intercontact time in Fig. (13)middle, where we plot the CCDF for three distinct times within the same day (midday, early evening, after midnight). We observe a signi?cant variation across the three curves in accordance with the mean variation (Fig. 13left). We further looked at the aggregate CCDF of intercontact time conditional on whether intercontact time cross over midnight or not for the UCSD trace. Fig. (13)right shows the discrepancy of the respective conditional distributions. In summary, the results con?rm the intuition that device contacts would typically exhibit strong time nonstationarity and particular time of day viewpoints may differ much from the timeaverage viewpoint. We subsequently demonstrate the time of day dependence by examining the contact durations. Fig. 14top shows samples of contact durations per device pair for the vehicular trace. These samples suggest a dichotomy of contact durations consisting of (a) short
erously provided us with the vehicular traces that they had collected and we used for analysis in this paper.
7.
REFERENCES
1
Vehicular
Contact duration CDF
0.8 0.6 0.4 Contacts at 4pm 0.2 0 0
Contacts at 9am
2
4
6 8 Time (hours)
10
12
Figure 14: TOP: Samples of contact durations over all device pairs
indicate dichotomy of contact durations. BOTTOM: CDFs of contact duration at two distinct times of day.
contacts in the order of half a minute, and (b) long contacts in the order of 10 hours. Examining the trace, we found that the short contacts occur while two vehicles drive by each other, while long contacts take place for spatially collocated vehicles during working hours. Fig. 14bottom further con?rms the previous discussion by showing two CDFs of samples of contact durations for devicepairs that initiated contacts within the hours of 9AM and 4PM. As previously mentioned, these distributions suggest that long contact durations occur during working hours while at other times short contacts may be more frequent.
6.
CONCLUDING REMARKS
The dichotomy hypothesis—power law decay of intercontact time distribution up to a point and exponential decay beyond— which we observed to hold across diverse mobility traces, implies that existing predictions on the performance of forwarding schemes based on the powerlaw tail might be overly pessimistic. The dichotomy is not at odds with current mobility models since we show that already simple models support it. The empirical results suggest the dichotomy to hold also for intercontact time between a mobile device and its frequently visited site, which may inform design of opportunistic communication systems provisioned with stationary infrastructure nodes. The diversity of viewpoints such as per device pair and at a time of day may deviate from the average viewpoint derived from the intercontact time characterization widely used in previous studies and also considered in this paper. Future work may study further the underlying mobility patterns to understand better the ?rst principles that induce the observed aggregate behavior of contact opportunities.
Acknowledgements
We are grateful to those who made their mobility traces publicly available, i.e., MIT [5, 4], UCSD [13], and Infocom [2, 15]. We also thank our colleagues John Krumm and Eric Horvitz who gen
[1] J.Y. L. Boudec and M. Vojnovi? . The Random Trip Model: c Stability, Stationary Regime, and Perfect Simulation. IEEE/ACM Trans. on Networking, 14(6):1153–1166, Dec 2006. [2] A. Chaintreau, P. Hui, J. Crowcroft, C. Diot, R. Gass, and J. Scott. Impact of Human Mobility on the Design of Opportunistic Forwarding Algorithms. In INFOCOM, 2006. [3] E. Cinlar. Introduction to Stochastic Processes. Prentice Hall, 1 edition, 1975. [4] N. Eagle and A. Pentland. CRAWDAD data set mit/reality (v. 20050701), July 2005. [5] N. Eagle and A. Pentland. Reality mining: Sensing complex social systems. In Journal of Personal and Ubiquitous Computing, 2005. [6] A. E. Gamal, J. Mammen, B. Prabhakar, and D. Shah. Optimal Throughputdelay Scaling in Wireless Networks – Part I: The Fluid Model. IEEE Trans. on Information Theory, 52(6):2568–2592, June 2006. [7] M. Grossglauser and D. Tse. Mobility increases the capacity of ad hoc wireless networks. IEEE/ACM Trans. on Networking, 10(4):477–486, 2002. [8] Intelligent Transportation Systems Standards Program. Dedicated Short Range Communications, April 2003. http://www.standards.its.dot.gov/Documents/advisories/ dsrc_advisory.htm. [9] D. B. Johnson and D. A. Maltz. Dynamic Source Routing in Ad Hoc Wireless Networks. In Mobile Computing. 1996. [10] T. Karagiannis, J.Y. L. Boudec, and M. Vojnovic. Power law and exponential decay of inter contact times between mobile devices. Technical Report MSRTR200724, Microsoft Research, March 2007. [11] J. Krumm and E. Horvitz. The Microsoft Multiperson Location Survey, August 2005. Microsoft Research Technical Report, MSRTR200513. [12] N. F. Maxemchuk. Routing in the Manhattan Street Network. In IEEE Trans. on Comm., volume 35, pages 503–512, 1987. [13] M. McNett and G. M. Voelker. Access and mobility of wireless pda users. In Mobile Computing Communications Review, 2005. [14] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau. CRAWDAD data set cambridge/haggle (v. 20060131), Jan. 2006. [15] J. Scott, R. Gass, J. Crowcroft, P. Hui, C. Diot, and A. Chaintreau. CRAWDAD trace cambridge/haggle/ imote/infocom (v. 20060131), Jan. 2006. [16] E. Seneta. NonNegative Matrices. Wiley and Sons, 1 ed., 1973. [17] G. Sharma and R. R. Mazumdar. Delay and Capacity Tradeoff in Wireless Ad Hoc Networks with Random Waypoint Mobility, 2005. Preprint, School of ECE, Purdue University, 2005. [18] F. Spitzer. Principles of Random Walk, Graduate Texts in Mathematics. Springer, 2nd edition, 1964. [19] W. Zhao, Y. Chen, M. Ammar, M. D. Corner, B. N. Levine, and E. Zegura. Capacity Enhancement using Throwboxes in DTNs. In Proc. IEEE Intl Conf on Mobile Ad hoc and Sensor Systems (MASS), Oct 2006.