src.create_initial_states.add_weekly_ids

Module Contents

Functions

add_weekly_ids(states, weekly_dist, seed, query, col_prefix, county_assortativeness)

Add a column for every possible weekly contact.

_create_pairs(states, nr_of_weekly_contacts, county_assortativeness, seed)

_create_pairs_numba(to_match, indexer, first_stage_cum_probs, group_codes_per_individual, seed)

Parameter to_match: 2d boolean array with one row per individual and one column per sub-contact model.

_create_participation_array(nr_of_contacts, seed)

Draw randomly in which pairs an individual participates.

_create_group_indexer(states: pandas.DataFrame, assort_by: Dict[str, List[str]]) → numba.typed.List

Create the group indexer.

add_weekly_ids(states, weekly_dist, seed, query, col_prefix, county_assortativeness)

Add a column for every possible weekly contact.

For each person, we first draw the number of weekly contact models she participates in from the distribution of weekly contacts, and then randomly choose the columns in which she is paired with someone. Finally, for each column, we match the people who participate in the respective contact model, with the specified geographic assortativeness.

Parameters
  • states (pandas.DataFrame) – sid states DataFrame

  • weekly_dist (pandas.Series) – the index is the support of the number of weekly contacts, the values are the target frequencies of each number of weekly contacts in the synthetic population. One pair id column is created for every possible contact.

  • seed (int) – seed.

  • query (str) – query selecting the subset of the population that participates, e.g. "occupation == 'working'". If None, everyone is grouped.

  • col_prefix (str) – prefix for the columns to be created.

  • county_assortativeness (float) – share of weekly contacts that should belong to the same county.

Returns

sid states DataFrame with additional columns specifying the weekly work group pairs.

Return type

states (pandas.DataFrame)
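The first step described above, drawing a number of weekly contacts per person from weekly_dist, can be sketched as follows. This is a minimal illustration and not the module's implementation: the helper names, the example distribution and the column naming scheme (prefix plus a running number) are assumptions.

```python
import numpy as np
import pandas as pd


def draw_nr_of_contacts(weekly_dist, n_individuals, seed):
    """Hypothetical helper: draw a number of weekly contacts for each person.

    The index of weekly_dist is the support, the values are the target shares.
    """
    rng = np.random.default_rng(seed)
    return rng.choice(
        weekly_dist.index.to_numpy(),
        size=n_individuals,
        p=weekly_dist.to_numpy(),
    )


def pair_id_columns(weekly_dist, col_prefix):
    """Hypothetical helper: one pair id column per possible contact.

    The naming scheme f"{col_prefix}_{i}" is an assumption for illustration.
    """
    n_columns = int(weekly_dist.index.max())
    return [f"{col_prefix}_{i}" for i in range(n_columns)]


# Example distribution (made up): 50% have no, 30% one, 20% two weekly contacts.
weekly_dist = pd.Series([0.5, 0.3, 0.2], index=[0, 1, 2])
nr_of_contacts = draw_nr_of_contacts(weekly_dist, n_individuals=1_000, seed=0)
columns = pair_id_columns(weekly_dist, col_prefix="work_weekly")
```

With a support of 0 to 2 contacts, two pair id columns are needed, because nobody participates in more than two weekly contact models.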

_create_pairs(states, nr_of_weekly_contacts, county_assortativeness, seed)

_create_pairs_numba(to_match, indexer, first_stage_cum_probs, group_codes_per_individual, seed)

Parameters
  • to_match (np.ndarray) – 2d boolean array with one row per individual and one column per sub-contact model.

  • indexer (numba.typed.List) – Numba list that maps the id of a county to a numpy array with the row positions of all individuals from that county.

  • first_stage_cum_probs (numpy.ndarray) – Array of shape (n_groups, n_groups). first_stage_cum_probs[i, j] is the probability that an individual from group i meets someone from group j or lower.

  • group_codes_per_individual (np.ndarray) – 1d array with assortative matching group ids, coded as integers.

Returns

2d integer array with meeting ids.

Return type

pairs_of_workers (np.ndarray)
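To make the meaning of first_stage_cum_probs concrete, the sketch below builds such an array and uses it to draw the group of a contact partner by inverse-CDF sampling. It assumes that the same-group probability equals the assortativeness and that the remainder is spread evenly over the other groups. Both helper functions are hypothetical and only illustrate the data structure, not how the module constructs or consumes it.

```python
import numpy as np


def build_cum_probs(n_groups, assortativeness):
    """Hypothetical helper: row-wise cumulative meeting probabilities.

    Assumption: an individual meets someone from her own group with
    probability `assortativeness`; the rest is split evenly over other groups.
    """
    if n_groups == 1:
        return np.ones((1, 1))
    off_diagonal = (1 - assortativeness) / (n_groups - 1)
    probs = np.full((n_groups, n_groups), off_diagonal)
    np.fill_diagonal(probs, assortativeness)
    return probs.cumsum(axis=1)


def draw_group(cum_probs_row, u):
    """Return the first group j with u < cum_probs_row[j] for u in [0, 1)."""
    j = int(np.searchsorted(cum_probs_row, u, side="right"))
    # Guard against floating point error in the last cumulative value.
    return min(j, len(cum_probs_row) - 1)


cum_probs = build_cum_probs(n_groups=3, assortativeness=0.8)
rng = np.random.default_rng(0)
partner_group = draw_group(cum_probs[0], rng.uniform())
```

Each row of the array ends at (approximately) one, so a single uniform draw per individual is enough to pick the partner's group.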

_create_participation_array(nr_of_contacts, seed)

Draw randomly in which pairs an individual participates.

Parameters
  • nr_of_contacts (pandas.Series) – number of contacts, i.e. number of pairs in which every individual is supposed to participate. The specific pair columns will be randomly drawn here.

  • seed (int) – seed

Returns

boolean array of shape (len(nr_of_contacts), nr_of_contacts.max()). If participation[i, mod] is True, individual i was drawn to participate in mod.

Return type

participation (numpy.ndarray)
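The shape and meaning of the returned array can be illustrated with a simple, unoptimized sketch. The function name and the row-by-row loop are assumptions for illustration; the actual function may draw the columns differently.

```python
import numpy as np
import pandas as pd


def sketch_participation_array(nr_of_contacts, seed):
    """Hypothetical re-implementation for illustration only.

    For each individual, pick nr_of_contacts[i] distinct pair columns
    at random and mark them as True.
    """
    rng = np.random.default_rng(seed)
    n_columns = int(nr_of_contacts.max())
    participation = np.zeros((len(nr_of_contacts), n_columns), dtype=bool)
    for i, n in enumerate(nr_of_contacts.to_numpy()):
        chosen = rng.choice(n_columns, size=int(n), replace=False)
        participation[i, chosen] = True
    return participation


nr_of_contacts = pd.Series([0, 1, 2, 3])
participation = sketch_participation_array(nr_of_contacts, seed=0)
```

Every row sums to the individual's number of contacts, and an individual with the maximum number of contacts participates in every column.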

_create_group_indexer(states: pandas.DataFrame, assort_by: Dict[str, List[str]]) → numba.typed.List

Create the group indexer.

The indexer is a list where the positions correspond to the group number defined by assortative variables. The values inside the list are one-dimensional integer arrays containing the indices of states belonging to the group.

If there are no assortative variables, all individuals are assigned to a single group with code 0 and the indexer is a list where the first position contains all indices of states.

For efficiency reasons, we assign each group a number instead of identifying it by the values of the assort_by variables directly.

Note: This function is from sid commit 206886a14eeb3257deb71db91aba4e7fb2385fc2.

Parameters
  • states (pandas.DataFrame) – The states.

  • assort_by (List[str]) – List of variables that influence matching probabilities.

Returns

The i-th entry contains the indices of the i-th group.

Return type

indexer (numba.typed.List)
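The structure of the indexer can be sketched with plain pandas and NumPy. This is a hedged illustration, not the sid function: it takes assort_by as a flat list of column names, returns a regular Python list instead of a numba.typed.List, stores row positions rather than index labels, and additionally returns the group codes. The function name is hypothetical.

```python
import numpy as np
import pandas as pd


def sketch_group_indexer(states, assort_by):
    """Hypothetical sketch: map each group code to the row positions of its members."""
    if not assort_by:
        # No assortative variables: a single group with code 0 holds everyone.
        codes = np.zeros(len(states), dtype=np.int64)
        return [np.arange(len(states))], codes
    # Number the groups 0, 1, ... in sorted order of the assort_by values.
    codes = states.groupby(assort_by, sort=True).ngroup().to_numpy()
    indexer = [np.flatnonzero(codes == g) for g in range(codes.max() + 1)]
    return indexer, codes


states = pd.DataFrame({"county": ["b", "a", "b", "a", "c"]})
indexer, codes = sketch_group_indexer(states, ["county"])
single_group, _ = sketch_group_indexer(states, [])
```

Working with integer codes means that later lookups are simple list accesses such as indexer[codes[i]], rather than comparisons on the values of the assort_by variables.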