Title: | Systematic Comparison of Trip Distribution Laws and Models |
---|---|
Description: | The main purpose of this package is to propose a rigorous framework to fairly compare trip distribution laws and models as described in Lenormand et al. (2016) <doi:10.1016/j.jtrangeo.2015.12.008>. |
Authors: | Maxime Lenormand [aut, cre] |
Maintainer: | Maxime Lenormand <[email protected]> |
License: | GPL-3 |
Version: | 1.1.1 |
Built: | 2025-03-06 10:30:13 UTC |
Source: | https://github.com/epivec/tdlm |
This function returns an estimate of the optimal parameter value based on the average surface area of the locations (in square kilometers) according to the law. This estimation has only been tested on commuting data (in kilometers).
calib_param(av_surf, law = "NGravExp")
av_surf |
A positive |
law |
A |
The estimation is based on Figure 8 in Lenormand et al. (2016) for four types of laws: the normalized gravity law with an exponential distance decay function (law = "NGravExp"), the normalized gravity law with a power distance decay function (law = "NGravPow"), Schneider's intervening opportunities law (law = "Schneider"), and the extended radiation law (law = "RadExt").
An estimate of the optimal parameter value based on the average surface area of the locations.
Maxime Lenormand ([email protected])
Lenormand M, Bassolas A, Ramasco JJ (2016) Systematic comparison of trip distribution laws and models. Journal of Transport Geography 51, 158-169.
Associated functions: extract_opportunities(), extract_spatial_information(), check_format_names().
data(county)
res <- extract_spatial_information(county, id = "ID")
av_surf <- mean(res$surface)
calib_param(av_surf = av_surf, law = "NGravExp")
calib_param(av_surf = av_surf, law = "NGravPow")
calib_param(av_surf = av_surf, law = "Schneider")
calib_param(av_surf = av_surf, law = "RadExt")
This function checks that the TDLM's inputs have the required format (and names).
check_format_names(vectors, matrices = NULL, check = "format_and_names")
vectors |
A |
matrices |
A |
check |
A |
The TDLM's inputs should be based on the same number of locations sorted in the same order. check = "format" will run basic checks to ensure that the structure of the inputs (dimensions, class, type...) is correct.
It is recommended to use the location ID as vector names, matrix rownames, and matrix colnames. Set check = "format_and_names" to check the inputs' names. The checks are run successively, so run the function as many times as needed to get the message indicating that the inputs passed the check successfully.
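As a rough illustration of what a names check involves, the base R sketch below compares the location IDs carried by vectors and matrices. It is not the package's implementation: same_names() is a hypothetical helper introduced here only for illustration.

```r
# Hypothetical helper (not part of TDLM): TRUE when every vector and matrix
# carries the same location IDs, in the same order.
same_names <- function(vectors, matrices = NULL) {
  ref <- names(vectors[[1]])
  if (is.null(ref)) return(FALSE)  # no names to compare against
  ok_vec <- all(vapply(vectors, function(v) identical(names(v), ref),
                       logical(1)))
  ok_mat <- all(vapply(matrices, function(m) {
    identical(rownames(m), ref) && identical(colnames(m), ref)
  }, logical(1)))
  ok_vec && ok_mat
}

ids <- c("a", "b", "c")
mi <- setNames(c(10, 20, 30), ids)
d <- matrix(0, 3, 3, dimnames = list(ids, ids))
same_names(list(mi = mi), list(distance = d))
```

Reordering the rows and columns of d without reordering mi would make the helper return FALSE, which is the kind of mismatch the names check is meant to catch.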
A message indicating if the check has passed or failed.
Maxime Lenormand ([email protected])
data(mass)
data(distance)
mi <- as.numeric(mass[, 1])
names(mi) <- rownames(mass)
mj <- mi
check_format_names(
  vectors = list(mi = mi, mj = mj),
  matrices = list(distance = distance),
  check = "format_and_names"
)
A dataset containing the geographical coordinates of US Kansas counties' centroids in 2000 (Longitude/Latitude).
coords
Longitude coordinate of the centroid of the county.
Latitude coordinate of the centroid of the county.
https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html
A dataset containing the geographical coordinates of US Kansas counties' centroids in 2000 (X/Y based on Web Mercator).
coords_xy
X coordinate of the centroid of the county.
Y coordinate of the centroid of the county.
https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html
A dataset containing the geometry of 105 US Kansas counties.
county
County ID.
Longitude coordinate of the centroid of the county.
Latitude coordinate of the centroid of the county.
Surface area of the county (in square kilometers).
Geometry of the county.
https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html
A dataset containing the great-circle distance (in kilometers) between 105 US Kansas counties.
distance
A matrix with 105 rows and 105 columns. Each element of the matrix represents the distance between two counties. County IDs are used as row names and column names.
https://www.census.gov/geographies/mapping-files/time-series/geo/carto-boundary-file.html
This function computes the distance between pairs of locations based on geographical coordinates.
extract_distances(
  coords,
  method = "Haversine",
  id = NULL,
  show_progress = FALSE
)
coords |
A two-column |
method |
A |
id |
A vector with length equal to the number of locations, used as
row names and column names for the output distance matrix (optional, |
show_progress |
A boolean indicating whether a progress bar should be displayed. |
coords must contain two columns: the first one for the longitude or "X" coordinates, and the second one for the latitude or "Y" coordinates. The "Haversine" method is used to compute great-circle distances from longitude/latitude, while the "Euclidean" method should be used for "X/Y" coordinates.
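For readers unfamiliar with the haversine formula, here is a minimal base R sketch of a great-circle distance matrix. It is independent of the package's internal code; the function names and the mean Earth radius of 6371 km are assumptions made for this illustration.

```r
# Great-circle distance (km) between lon/lat points given in degrees.
# Assumption: a spherical Earth with a mean radius of 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))  # pmin guards against rounding above 1
}

# Pairwise distance matrix for a two-column (longitude, latitude) matrix.
pairwise_haversine <- function(coords) {
  n <- nrow(coords)
  outer(seq_len(n), seq_len(n), function(i, j) {
    haversine_km(coords[i, 1], coords[i, 2], coords[j, 1], coords[j, 2])
  })
}

# Three points a quarter of a great circle apart from each other.
pts <- cbind(lon = c(0, 0, 90), lat = c(0, 90, 0))
d <- pairwise_haversine(pts)
round(d)
```

For projected "X/Y" coordinates, a Euclidean equivalent could be obtained with as.matrix(dist(coords)), keeping in mind that the result is expressed in the units of the projection.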
A square matrix representing the distance (in kilometers) between locations.
The outputs are based on the locations contained in coords, sorted in the same order. An optional id can also be provided to be used as names for the outputs.
Maxime Lenormand ([email protected])
Associated functions: extract_opportunities()
data(coords)
distance <- extract_distances(
  coords = coords,
  method = "Haversine",
  id = rownames(coords)
)
This function computes the number of opportunities between pairs of locations as defined in Lenormand et al. (2016). For a given pair of locations, the number of opportunities between the origin location and the destination location is based on the number of opportunities within a circle of radius equal to the distance between the origin and the destination, with the origin location as the center. The number of opportunities at the origin and destination locations are not included.
extract_opportunities(opportunity, distance, check_names = FALSE)
opportunity |
A |
distance |
A squared |
check_names |
A |
A square matrix in which each element represents the number of opportunities between a pair of locations.
opportunity and distance should be based on the same number of locations sorted in the same order. It is recommended to use the location ID as matrix rownames and matrix colnames and to set check_names = TRUE to verify that everything is consistent before running this function (check_names = FALSE by default). Note that the function check_format_names() can be used to validate all inputs before running the main package's functions.
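The definition of intervening opportunities given in the description can be sketched in a few lines of base R. This is an illustrative reimplementation, not the package's code: the function name is hypothetical, and counting locations at exactly the same distance as the destination (ties) as inside the circle is an assumption.

```r
# Hypothetical sketch: s[i, j] is the sum of opportunities located within a
# circle of radius distance[i, j] centered on origin i, excluding i and j.
intervening_opportunities <- function(opportunity, distance) {
  n <- length(opportunity)
  s <- matrix(0, n, n, dimnames = dimnames(distance))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i == j) next
      inside <- distance[i, ] <= distance[i, j]  # assumption: ties are inside
      inside[c(i, j)] <- FALSE                   # drop origin and destination
      s[i, j] <- sum(opportunity[inside])
    }
  }
  s
}

# Three locations on a line at positions 0, 1 and 3.
pos <- c(0, 1, 3)
d <- abs(outer(pos, pos, "-"))
m <- c(5, 7, 11)
s <- intervening_opportunities(m, d)
s
```

In this toy case the middle location is the only one lying between the two ends, so its 7 opportunities are the intervening opportunities for trips from either end to the other.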
Maxime Lenormand ([email protected])
Lenormand M, Bassolas A, Ramasco JJ (2016) Systematic comparison of trip distribution laws and models. Journal of Transport Geography 51, 158-169.
Associated functions: extract_distances(), extract_spatial_information().
data(mass)
data(distance)
opportunity <- mass[, 1]
sij <- extract_opportunities(
  opportunity = opportunity,
  distance = distance,
  check_names = FALSE
)
This function returns a matrix of distances between locations (in kilometers) along with a vector of surface areas for the locations (in square kilometers).
extract_spatial_information(geometry, id = NULL, show_progress = FALSE)
geometry |
A spatial object that can be handled by the |
id |
The name or number of the column to use as |
show_progress |
A |
The geometry must be projected in a valid coordinate reference system. It will be reprojected in degrees longitude/latitude to compute the great-circle distances between centroids of locations using an internal function and to compute the surface area using the function st_area from the sf package.
A list composed of two elements. The first element is a square matrix representing the great-circle distances (in kilometers) between locations. The second element is a vector containing the surface area of each location (in square kilometers).
The outputs are based on the locations contained in geometry and sorted in the same order. An optional id can also be provided to be used as names for the outputs.
Maxime Lenormand ([email protected])
Associated functions: extract_distances(), extract_opportunities().
data(county)
res <- extract_spatial_information(county, id = "ID")
dim(res$distance)
length(res$surface)
This function returns a data.frame where each row provides one or several goodness-of-fit measures between a simulated and an observed Origin-Destination (OD) matrix.
gof(
  sim,
  obs,
  measures = "all",
  distance = NULL,
  bin_size = 2,
  use_proba = FALSE,
  check_names = FALSE
)
sim |
An object of class |
obs |
A square |
measures |
A |
distance |
A square |
bin_size |
A |
use_proba |
A |
check_names |
A |
Several goodness-of-fit measures are considered, such as the Common Part of Commuters (CPC), the Common Part of Links (CPL), and the Common Part of Commuters based on the distance (CPC_d), as described in Lenormand et al. (2016). It also includes classical metrics such as the Normalized Root Mean Square Error (NRMSE), the Kullback–Leibler divergence (KL), and the Kolmogorov-Smirnov statistic and p-value (KS). The KS statistic and p-value are based on the observed and simulated flow distance distributions and are computed using the ks_test function from the Ecume package.
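To make the first two measures concrete, the sketch below computes CPC and CPL in base R following the definitions in Lenormand et al. (2016) as I read them. The helper names cpc() and cpl() are hypothetical and this is not the code used by gof().

```r
# Common Part of Commuters: share of trips common to both matrices.
cpc <- function(sim, obs) {
  2 * sum(pmin(sim, obs)) / (sum(sim) + sum(obs))
}

# Common Part of Links: share of non-zero links common to both matrices.
cpl <- function(sim, obs) {
  2 * sum(sim > 0 & obs > 0) / (sum(sim > 0) + sum(obs > 0))
}

obs <- matrix(c(0, 4, 6, 0), 2, 2)
sim <- matrix(c(0, 5, 5, 0), 2, 2)
cpc(sim, obs)
cpl(sim, obs)
```

Both measures range from 0 (no overlap) to 1 (identical matrices for CPC, identical sets of non-zero links for CPL).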
A data.frame providing one or several goodness-of-fit measures between simulated OD(s) and an observed OD. Each row corresponds to a matrix sorted according to the list (or list of lists) elements (names are used if provided).
By default, if sim is an output of run_law_model(), the measure(s) are computed only for the simulated OD matrices and not for the proba matrix (included in the output when write_proba = TRUE). The argument use_proba can be used to compute the measure(s) based on the proba matrix instead of the simulated OD matrix. In this case, the argument obs should also be a proba matrix.
All inputs should be based on the same number of locations, sorted in the same order. It is recommended to use the location ID as matrix rownames and matrix colnames and to set check_names = TRUE to verify that everything is consistent before running this function (check_names = FALSE by default). Note that the function check_format_names() can be used to validate all inputs before running the main package's functions.
Maxime Lenormand ([email protected])
Lenormand M, Bassolas A, Ramasco JJ (2016) Systematic comparison of trip distribution laws and models. Journal of Transport Geography 51, 158-169.
For more details illustrated with a practical example, see the vignette: https://epivec.github.io/TDLM/articles/TDLM.html#goodness-of-fit-measures.
Associated functions: run_law(), run_model(), run_law_model().
data(mass)
data(distance)
data(od)
mi <- as.numeric(mass[, 1])
mj <- mi
Oi <- as.numeric(mass[, 2])
Dj <- as.numeric(mass[, 3])
res <- run_law_model(
  law = "GravExp", mass_origin = mi, mass_destination = mj,
  distance = distance, opportunity = NULL, param = 0.01,
  model = "DCM", nb_trips = NULL, out_trips = Oi, in_trips = Dj,
  average = FALSE, nbrep = 1, maxiter = 50, mindiff = 0.01,
  write_proba = FALSE, check_names = FALSE
)
gof(
  sim = res, obs = od, measures = "CPC", distance = NULL,
  bin_size = 2, use_proba = FALSE, check_names = FALSE
)
A dataset containing the number of inhabitants, in-commuters, and out-commuters for 105 US Kansas counties in 2000.
mass
A data.frame with 105 rows and 4 columns:
County ID.
Number of inhabitants.
Number of out-commuters.
Number of in-commuters.
https://www2.census.gov/programs-surveys/decennial/tables/2000/county-to-county-worker-flow-files/
A dataset containing the number of commuters between 105 US Kansas counties in 2000.
od
A matrix with 105 rows and 105 columns. Each element of the matrix represents the number of commuters between two counties. County IDs are used as row names and column names.
https://www2.census.gov/programs-surveys/decennial/tables/2000/county-to-county-worker-flow-files/
This function estimates mobility flows using different distribution laws and models. As described in Lenormand et al. (2016), the function uses a two-step approach to generate mobility flows by separating the trip distribution law (gravity or intervening opportunities) from the modeling approach used to generate the flows based on this law. This function only uses the first step to generate a probability distribution based on the different laws.
run_law(
  law = "Unif",
  mass_origin,
  mass_destination = mass_origin,
  distance = NULL,
  opportunity = NULL,
  param = NULL,
  check_names = FALSE
)
law |
A |
mass_origin |
A |
mass_destination |
A |
distance |
A squared |
opportunity |
A squared |
param |
A |
check_names |
A |
We compute the matrix proba estimating the probability of observing a trip from one location to another. This probability is based on the demand (argument mass_origin) and the attractiveness (argument mass_destination). Note that the population is typically used as a surrogate for both quantities (this is why mass_destination = mass_origin by default). It also depends on the distance between locations (argument distance) OR the number of opportunities between locations (argument opportunity), depending on the chosen law. The effect of both the distance and the number of opportunities can be adjusted with a parameter (argument param), except for the original radiation law and the uniform law.
In this package we consider eight probabilistic laws described in detail in Lenormand et al. (2016): four gravity laws (Barthelemy, 2011), three intervening opportunity laws (Schneider, 1959; Simini et al., 2012; Yang et al., 2014), and a uniform law.
Gravity law with an exponential distance decay function (law = "GravExp"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Normalized gravity law with an exponential distance decay function (law = "NGravExp"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Gravity law with a power distance decay function (law = "GravPow"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Normalized gravity law with a power distance decay function (law = "NGravPow"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Schneider's intervening opportunities law (law = "Schneider"). The arguments mass_origin, mass_destination (optional), opportunity and param will be used.
Radiation law (law = "Rad"). The arguments mass_origin, mass_destination (optional) and opportunity will be used.
Extended radiation law (law = "RadExt"). The arguments mass_origin, mass_destination (optional), opportunity and param will be used.
Uniform law (law = "Unif"). The argument mass_origin will be used to extract the number of locations.
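To show how a law turns masses and distances into a probability matrix, the sketch below builds the two exponential gravity laws in base R, following the forms given in Lenormand et al. (2016) as I understand them. The function names are hypothetical, and details such as zeroing the diagonal are assumptions, so the output may not match run_law() exactly.

```r
# Gravity law, exponential decay: p_ij proportional to
# m_i * m_j * exp(-beta * d_ij), normalized over all pairs.
gravity_exp <- function(mass_origin, mass_destination, distance, beta) {
  w <- outer(mass_origin, mass_destination) * exp(-beta * distance)
  diag(w) <- 0               # assumption: no trips within a location
  w / sum(w)
}

# Normalized gravity law: the destination choice is normalized per origin,
# then each origin row is weighted by its demand m_i.
ngravity_exp <- function(mass_origin, mass_destination, distance, beta) {
  w <- sweep(exp(-beta * distance), 2, mass_destination, "*")
  diag(w) <- 0               # assumption: no trips within a location
  w <- w / rowSums(w)        # each row now sums to 1
  w <- w * mass_origin       # row i is scaled by mass_origin[i]
  w / sum(w)
}

m <- c(100, 50, 10)
d <- matrix(c(0, 10, 20, 10, 0, 15, 20, 15, 0), 3, 3)
p <- gravity_exp(m, m, d, beta = 0.05)
q <- ngravity_exp(m, m, d, beta = 0.05)
rowSums(q)
```

With the normalized law, the share of trips leaving each origin is proportional to its mass alone, which is the property that distinguishes it from the plain gravity law.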
An object of class TDLM. A list of lists of matrices containing, for each parameter value, the matrix of probabilities (called proba). If length(param) = 1 or law = "Rad" or law = "Unif", only a list of matrices will be returned.
All inputs should be based on the same number of locations, sorted in the same order. It is recommended to use the location ID as matrix rownames and matrix colnames and to set check_names = TRUE to verify that everything is consistent before running this function (check_names = FALSE by default). Note that the function check_format_names() can be used to validate all inputs before running the main package's functions.
Maxime Lenormand ([email protected])
Barthelemy M (2011) Spatial Networks. Physics Reports 499, 1-101.
Lenormand M, Bassolas A, Ramasco JJ (2016) Systematic comparison of trip distribution laws and models. Journal of Transport Geography 51, 158-169.
Schneider M (1959) Gravity models and trip distribution theory. Papers of the regional science association 5, 51-58.
Simini F, González MC, Maritan A & Barabási A (2012) A universal model for mobility and migration patterns. Nature 484, 96-100.
Yang Y, Herrera C, Eagle N & González MC (2014) Limits of Predictability in Commuting Flows in the Absence of Data for Calibration. Scientific Reports 4, 5662.
For more details illustrated with a practical example, see the vignette: https://epivec.github.io/TDLM/articles/TDLM.html#run-functions.
Associated functions: run_law_model(), run_model(), gof().
data(mass)
data(distance)
mi <- as.numeric(mass[, 1])
mj <- mi
res <- run_law(
  law = "GravExp", mass_origin = mi, mass_destination = mj,
  distance = distance, opportunity = NULL, param = 0.01,
  check_names = FALSE
)
# print(res)
This function estimates mobility flows using different distribution laws and models. As described in Lenormand et al. (2016), the function uses a two-step approach to generate mobility flows by separating the trip distribution law (gravity or intervening opportunities) from the modeling approach used to generate the flows based on this law.
run_law_model(
  law = "Unif",
  mass_origin,
  mass_destination = mass_origin,
  distance = NULL,
  opportunity = NULL,
  param = NULL,
  model = "UM",
  nb_trips = 1000,
  out_trips = NULL,
  in_trips = out_trips,
  average = FALSE,
  nbrep = 3,
  maxiter = 50,
  mindiff = 0.01,
  write_proba = FALSE,
  check_names = FALSE
)
law |
A |
mass_origin |
A |
mass_destination |
A |
distance |
A squared |
opportunity |
A squared |
param |
A |
model |
A |
nb_trips |
A |
out_trips |
A |
in_trips |
A |
average |
A |
nbrep |
An |
maxiter |
An |
mindiff |
A |
write_proba |
A |
check_names |
A |
First, we compute the matrix proba estimating the probability of observing a trip from one location to another. This probability is based on the demand (argument mass_origin) and the attractiveness (argument mass_destination). Note that the population is typically used as a surrogate for both quantities (this is why mass_destination = mass_origin by default). It also depends on the distance between locations (argument distance) OR the number of opportunities between locations (argument opportunity), depending on the chosen law. The effect of both the distance and the number of opportunities can be adjusted with a parameter (argument param), except for the original radiation law and the uniform law.
In this package we consider eight probabilistic laws described in detail in Lenormand et al. (2016): four gravity laws (Barthelemy, 2011), three intervening opportunity laws (Schneider, 1959; Simini et al., 2012; Yang et al., 2014), and a uniform law.
Gravity law with an exponential distance decay function (law = "GravExp"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Normalized gravity law with an exponential distance decay function (law = "NGravExp"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Gravity law with a power distance decay function (law = "GravPow"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Normalized gravity law with a power distance decay function (law = "NGravPow"). The arguments mass_origin, mass_destination (optional), distance and param will be used.
Schneider's intervening opportunities law (law = "Schneider"). The arguments mass_origin, mass_destination (optional), opportunity and param will be used.
Radiation law (law = "Rad"). The arguments mass_origin, mass_destination (optional) and opportunity will be used.
Extended radiation law (law = "RadExt"). The arguments mass_origin, mass_destination (optional), opportunity and param will be used.
Uniform law (law = "Unif"). The argument mass_origin will be used to extract the number of locations.
Second, we propose four constrained models to generate the flows from this probability distribution, as described in Lenormand et al. (2016). These models respect different levels of constraints. These constraints can preserve the total number of trips (argument nb_trips) OR the number of outgoing trips (argument out_trips) AND/OR the number of incoming trips (argument in_trips), depending on the model. The sum of outgoing trips should be equal to the sum of incoming trips.
Unconstrained model (model = "UM"). Only nb_trips will be preserved (arguments out_trips and in_trips will not be used).
Production constrained model (model = "PCM"). Only out_trips will be preserved (arguments nb_trips and in_trips will not be used).
Attraction constrained model (model = "ACM"). Only in_trips will be preserved (arguments nb_trips and out_trips will not be used).
Doubly constrained model (model = "DCM"). Both out_trips and in_trips will be preserved (argument nb_trips will not be used). The doubly constrained model is based on an Iterative Proportional Fitting process (Deming & Stephan, 1940). The arguments maxiter (50 by default) and mindiff (0.01 by default) can be used to tune the model. mindiff is the minimal tolerated relative error between the simulated and observed marginals. maxiter ensures that the algorithm stops even if it has not converged to the desired mindiff value.
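The role of maxiter and mindiff in the doubly constrained model can be illustrated with a generic Iterative Proportional Fitting loop. The base R sketch below is a textbook version written for this purpose, not the package's implementation: the function ipf() is hypothetical, it returns real-valued (average) flows only, and it assumes strictly positive marginals and at least one positive probability in every row and column.

```r
# Generic IPF sketch: rescale rows and columns in turn until the row
# marginals are within a relative error of mindiff, or maxiter is reached.
ipf <- function(proba, out_trips, in_trips, maxiter = 50, mindiff = 0.01) {
  stopifnot(isTRUE(all.equal(sum(out_trips), sum(in_trips))))
  flows <- proba
  for (iter in seq_len(maxiter)) {
    flows <- flows * (out_trips / rowSums(flows))             # fit row sums
    flows <- sweep(flows, 2, in_trips / colSums(flows), "*")  # fit column sums
    # After the column step, only the row marginals can still be off.
    err <- max(abs(rowSums(flows) - out_trips) / out_trips)
    if (err < mindiff) break
  }
  flows
}

proba <- matrix(c(0, 0.2, 0.1, 0.3, 0, 0.1, 0.2, 0.1, 0), 3, 3)
Oi <- c(30, 50, 20)
Dj <- c(40, 40, 20)
flows <- ipf(proba, Oi, Dj)
rowSums(flows)
colSums(flows)
```

Lowering mindiff tightens the match with the marginals at the cost of more iterations, while maxiter bounds the run time when the marginals cannot be matched exactly.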
By default, when average = FALSE, nbrep matrices are generated from proba with multinomial random draws that will take different forms according to the model used. In this case, the models will deal with positive integers as inputs and outputs. Nevertheless, it is also possible to generate an average matrix based on a multinomial distribution (corresponding to an infinite number of draws). In this case, the models' inputs can be either positive integers or real numbers and the output (nbrep = 1 in this case) will be a matrix of positive real numbers.
An object of class TDLM. A list of lists of matrices containing, for each parameter value, the nbrep simulated matrices and the matrix of probabilities (called proba) if write_proba = TRUE. If length(param) = 1 or law = "Rad" or law = "Unif", only a list of matrices will be returned.
All inputs should be based on the same number of locations, sorted in the same order. It is recommended to use the location ID as matrix rownames and matrix colnames and to set check_names = TRUE to verify that everything is consistent before running this function (check_names = FALSE by default). Note that the function check_format_names() can be used to validate all inputs before running the main package's functions.
Maxime Lenormand ([email protected])
Barthelemy M (2011) Spatial Networks. Physics Reports 499, 1-101.
Deming WE & Stephan FF (1940) On a Least Squares Adjustment of a Sample Frequency Table When the Expected Marginal Totals Are Known. Annals of Mathematical Statistics 11, 427-444.
Lenormand M, Bassolas A, Ramasco JJ (2016) Systematic comparison of trip distribution laws and models. Journal of Transport Geography 51, 158-169.
Schneider M (1959) Gravity models and trip distribution theory. Papers of the regional science association 5, 51-58.
Simini F, González MC, Maritan A & Barabási A (2012) A universal model for mobility and migration patterns. Nature 484, 96-100.
Yang Y, Herrera C, Eagle N & González MC (2014) Limits of Predictability in Commuting Flows in the Absence of Data for Calibration. Scientific Reports 4, 5662.
For more details illustrated with a practical example, see the vignette: https://epivec.github.io/TDLM/articles/TDLM.html#run-functions.
Associated functions: run_law(), run_model(), gof().
data(mass)
data(distance)
mi <- as.numeric(mass[, 1])
mj <- mi
N <- 1000
res <- run_law_model(
  law = "GravExp", mass_origin = mi, mass_destination = mj,
  distance = distance, opportunity = NULL, param = 0.01,
  model = "UM", nb_trips = N, out_trips = NULL, in_trips = NULL,
  average = TRUE, nbrep = 2, maxiter = 50, mindiff = 0.01,
  write_proba = FALSE, check_names = FALSE
)
print(res)
This function estimates mobility flows using different distribution laws and models. As described in Lenormand et al. (2016), the function uses a two-step approach to generate mobility flows by separating the trip distribution law (gravity or intervening opportunities) from the modeling approach used to generate the flows based on this law. This function only uses the second step to generate mobility flows based on a matrix of probabilities using different models.
run_model(
  proba,
  model = "UM",
  nb_trips = 1000,
  out_trips = NULL,
  in_trips = out_trips,
  average = FALSE,
  nbrep = 3,
  maxiter = 50,
  mindiff = 0.01,
  check_names = FALSE
)
proba |
A squared |
model |
A |
nb_trips |
A |
out_trips |
A |
in_trips |
A |
average |
A |
nbrep |
An |
maxiter |
An |
mindiff |
A |
check_names |
A |
We propose four constrained models to generate the flows from this probability distribution, as described in Lenormand et al. (2016). These models respect different levels of constraints. These constraints can preserve the total number of trips (argument nb_trips) OR the number of outgoing trips (argument out_trips) AND/OR the number of incoming trips (argument in_trips), depending on the model. The sum of outgoing trips should be equal to the sum of incoming trips.
Unconstrained model (model = "UM"). Only nb_trips will be preserved (arguments out_trips and in_trips will not be used).
Production constrained model (model = "PCM"). Only out_trips will be preserved (arguments nb_trips and in_trips will not be used).
Attraction constrained model (model = "ACM"). Only in_trips will be preserved (arguments nb_trips and out_trips will not be used).
Doubly constrained model (model = "DCM"). Both out_trips and in_trips will be preserved (argument nb_trips will not be used). The doubly constrained model is based on an Iterative Proportional Fitting process (Deming & Stephan, 1940). The arguments maxiter (50 by default) and mindiff (0.01 by default) can be used to tune the model. mindiff is the minimal tolerated relative error between the simulated and observed marginals. maxiter ensures that the algorithm stops even if it has not converged to the desired mindiff value.
By default, when average = FALSE, nbrep matrices are generated from proba with multinomial random draws that will take different forms according to the model used. In this case, the models will deal with positive integers as inputs and outputs. Nevertheless, it is also possible to generate an average matrix based on a multinomial distribution (corresponding to an infinite number of draws). In this case, the models' inputs can be either positive integers or real numbers and the output (nbrep = 1 in this case) will be a matrix of positive real numbers.
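The difference between a random replication and the average matrix can be sketched with base R's rmultinom(). The code below reflects my reading of the description for the unconstrained and production constrained cases; the variable names are illustrative and the package's actual sampling scheme may differ.

```r
set.seed(1)  # reproducible draws for this illustration
proba <- matrix(c(0, 0.2, 0.1, 0.3, 0, 0.1, 0.2, 0.1, 0), 3, 3)
nb_trips <- 1000

# Unconstrained case, one replication: a single multinomial draw over all
# cells, so only the total number of trips is preserved.
um_draw <- matrix(rmultinom(1, size = nb_trips, prob = as.vector(proba)),
                  nrow = nrow(proba))

# Unconstrained case, average matrix: the expectation of that draw.
um_avg <- nb_trips * proba / sum(proba)

# Production constrained case, one replication: one multinomial draw per
# origin, so each row sum matches the corresponding out_trips value.
out_trips <- c(30, 50, 20)
pcm_draw <- t(sapply(seq_along(out_trips), function(i) {
  as.vector(rmultinom(1, size = out_trips[i], prob = proba[i, ]))
}))

sum(um_draw)
rowSums(pcm_draw)
```

The random draws are integer matrices that vary from one replication to the next, whereas the average matrix is deterministic and real-valued, which is why a single replication suffices when average = TRUE.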
An object of class TDLM. A list of matrices containing the nbrep simulated matrices.
All inputs should be based on the same number of locations, sorted in the same order. It is recommended to use the location ID as matrix rownames and matrix colnames and to set check_names = TRUE to verify that everything is consistent before running this function (check_names = FALSE by default). Note that the function check_format_names() can be used to validate all inputs before running the main package's functions.
Maxime Lenormand ([email protected])
Deming WE & Stephan FF (1940) On a Least Squares Adjustment of a Sample Frequency Table When the Expected Marginal Totals Are Known. Annals of Mathematical Statistics 11, 427-444.
Lenormand M, Bassolas A, Ramasco JJ (2016) Systematic comparison of trip distribution laws and models. Journal of Transport Geography 51, 158-169.
For more details illustrated with a practical example, see the vignette: https://epivec.github.io/TDLM/articles/TDLM.html#run-functions.
Associated functions: run_law_model(), run_law(), gof().
data(mass)
data(od)
proba <- od / sum(od)
Oi <- as.numeric(mass[, 2])
Dj <- as.numeric(mass[, 3])
res <- run_model(
  proba = proba, model = "DCM", nb_trips = NULL,
  out_trips = Oi, in_trips = Dj, average = FALSE, nbrep = 3,
  maxiter = 50, mindiff = 0.01, check_names = FALSE
)
# print(res)