src.calculate_moments

Module Contents

Functions

calculate_weekly_incidences_from_results(results, outcome, groupby=None)

Create the weekly incidences from a list of simulation runs.

smoothed_outcome_per_hundred_thousand_sim(df, outcome, groupby=None, window=DEFAULT_WINDOW, min_periods=DEFAULT_MIN_PERIODS, take_logs=DEFAULT_TAKE_LOGS, center=DEFAULT_CENTER)

Calculate a daily smoothed outcome per 100 000 people on simulated data.

calculate_period_outcome_sim(df, outcome, groupby=None)

Calculate an outcome on a dataset of one period.

aggregate_and_smooth_period_outcome_sim(simulate_result, outcome, groupby=None, window=DEFAULT_WINDOW, min_periods=DEFAULT_MIN_PERIODS, take_logs=DEFAULT_TAKE_LOGS, center=DEFAULT_CENTER)

Aggregate and smooth a list of per-period outcomes in simulate_result.

smoothed_outcome_per_hundred_thousand_rki(df, outcome, groupby=None, window=DEFAULT_WINDOW, min_periods=DEFAULT_MIN_PERIODS, group_sizes=None, take_logs=DEFAULT_TAKE_LOGS, center=DEFAULT_CENTER)

Calculate a smoothed outcome per 100 000 people on empirical data.

_smooth_and_scale_daily_outcome_per_individual(sr, window, min_periods, groupby, take_logs, center)

_process_inputs(window, min_periods, groupby)

Attributes

DEFAULT_WINDOW = 7
DEFAULT_TAKE_LOGS = True
DEFAULT_CENTER = False
DEFAULT_MIN_PERIODS = 1
calculate_weekly_incidences_from_results(results, outcome, groupby=None)

Create the weekly incidences from a list of simulation runs.

Parameters

results (list) – list of dask DataFrames with the time series data from sid simulations.

Returns

DataFrame in which every column is the weekly incidence over time for one simulation run. If groupby is None, the index holds the dates of the simulation period; otherwise the index is a MultiIndex of date and the groups.

Return type

weekly_incidences (pandas.DataFrame)
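
The shape of the returned DataFrame can be illustrated with a minimal sketch. It does not call the function itself: the input Series are made-up values, and treating weekly incidence as a trailing 7-day sum of daily values per 100 000 people is an assumption, not something this page confirms.

```python
import pandas as pd

dates = pd.date_range("2020-10-01", periods=8, name="date")

# Hypothetical daily new cases per 100 000 people, one Series per simulation run.
daily_per_run = [
    pd.Series(10.0, index=dates),
    pd.Series(20.0, index=dates),
]

# Assumption: weekly incidence is the sum of the daily values over the last 7 days.
weekly_per_run = [sr.rolling(window=7, min_periods=1).sum() for sr in daily_per_run]

# One column per simulation run, with the dates as the index.
weekly_incidences = pd.concat(weekly_per_run, axis=1)
weekly_incidences.columns = range(len(weekly_per_run))

print(weekly_incidences)
```

With a groupby, the same layout would apply, except that the index would carry the group levels in addition to date.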

smoothed_outcome_per_hundred_thousand_sim(df, outcome, groupby=None, window=DEFAULT_WINDOW, min_periods=DEFAULT_MIN_PERIODS, take_logs=DEFAULT_TAKE_LOGS, center=DEFAULT_CENTER)

Calculate a daily smoothed outcome per 100 000 people on simulated data.

Parameters
  • df (pandas.DataFrame or dask.dataframe) – Simulated time series.

  • outcome (str) – Selects a column in df.

  • groupby (list, str or None) – Defines the subgroups for which the outcome is calculated.

  • window (int) – Over how many days results are averaged to smooth the outcome.

  • min_periods (int) – Minimum number of days that must be present in the smoothing window for the outcome not to be NaN.

  • take_logs (bool) – Whether the log of the outcome should be returned. If True, the smoothing is carried out on the logged values.

  • center (bool) – Whether the smoothing window is centered or forward looking.

Returns

Series with a smoothed outcome. The first index level is date. If groupby is specified, there are additional index levels.

Return type

pd.Series
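
How the parameters interact can be sketched in plain pandas. The function below is a hypothetical stand-in, not the module's implementation: it assumes the outcome is a boolean column, that the daily value is the share of individuals with the outcome, and that a "date" column exists. It ignores groupby for brevity.

```python
import numpy as np
import pandas as pd


def smooth_per_hundred_thousand(df, outcome, window=7, min_periods=1,
                                take_logs=True, center=False):
    """Hypothetical sketch of daily smoothing per 100 000 people."""
    # Assumption: the daily outcome is the share of individuals with outcome == True.
    daily_share = df.groupby("date")[outcome].mean()
    scaled = daily_share * 100_000
    if take_logs:
        # Take logs first so that the rolling mean is computed on logged values.
        scaled = np.log(scaled)
    return scaled.rolling(window=window, min_periods=min_periods, center=center).mean()


# Made-up data: 3 days with 4 simulated individuals each.
dates = pd.date_range("2020-10-01", periods=3)
df = pd.DataFrame(
    {
        "date": dates.repeat(4),
        "newly_infected": [
            True, False, False, False,
            True, True, False, False,
            True, False, False, False,
        ],
    }
)

smoothed = smooth_per_hundred_thousand(df, "newly_infected", window=2, take_logs=False)
print(smoothed)
```

With min_periods=1 the first day is already defined even though the window of 2 is not yet full; raising min_periods to 2 would turn that first entry into NaN.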

calculate_period_outcome_sim(df, outcome, groupby=None)

Calculate an outcome on a dataset of one period.

This uses a groupby over the date column such that the date is preserved as the first index level of the result. Only meant to be used during the msm estimation.

Parameters
  • df (pandas.DataFrame) – Simulated states DataFrame for one period.

  • outcome (str) – Selects a column in df.

  • groupby (list, str or None) – Defines the subgroups for which the outcome is calculated.

Returns

Series with an unsmoothed outcome for one day. The first index level is date, even though it is meant to be the same for all entries. If groupby is specified, there are additional index levels.

Return type

pd.Series
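
The index layout described above can be sketched as follows. The helper is hypothetical and assumes that the outcome is aggregated with a mean and that the date is stored in a column named "date"; neither detail is confirmed by this page.

```python
import pandas as pd


def period_outcome(df, outcome, groupby=None):
    """Hypothetical sketch: aggregate one period, keeping date as the first level."""
    # Normalise groupby to a list so it can be appended to the date key.
    if groupby is None:
        groups = []
    elif isinstance(groupby, str):
        groups = [groupby]
    else:
        groups = list(groupby)
    # Grouping by date first preserves it as the first index level of the result.
    return df.groupby(["date"] + groups)[outcome].mean()


# Made-up states for a single period (one date, four individuals).
states = pd.DataFrame(
    {
        "date": pd.Timestamp("2020-10-01"),
        "age_group": ["young", "young", "old", "old"],
        "newly_infected": [True, False, False, False],
    }
)

result = period_outcome(states, "newly_infected", groupby="age_group")
print(result)
```

Keeping date in the index, even though it is constant within one period, means the per-period results can later be concatenated into a time series without extra bookkeeping.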

aggregate_and_smooth_period_outcome_sim(simulate_result, outcome, groupby=None, window=DEFAULT_WINDOW, min_periods=DEFAULT_MIN_PERIODS, take_logs=DEFAULT_TAKE_LOGS, center=DEFAULT_CENTER)

Aggregate and smooth a list of per-period outcomes in simulate_result.

Parameters
  • simulate_result (dict) – Dictionary with a "period_outputs" entry.

  • outcome (str) – The name of the outcome in simulate_result["period_outputs"] that should be selected.

  • groupby (list, str or None) – Defines the subgroups for which the outcome is calculated.

  • window (int) – Over how many days results are averaged to smooth the outcome.

  • min_periods (int) – Minimum number of days that must be present in the smoothing window for the outcome not to be NaN.

  • take_logs (bool) – Whether the log of the outcome should be returned. If True, the smoothing is carried out on the logged values.

  • center (bool) – Whether the smoothing window is centered or forward looking.

Returns

Series with a smoothed outcome. The first index level is date. If groupby is specified, there are additional index levels.

Return type

pd.Series
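
The aggregation step can be sketched as below. The layout of simulate_result is an assumption made for illustration: "period_outputs" is taken to map the outcome name to a list of single-day Series indexed by date. The values are made up, and the smoothing is shown without logs and without groupby.

```python
import pandas as pd


def one_day(date, value):
    """Build a hypothetical single-period output: one value indexed by date."""
    return pd.Series([value], index=pd.DatetimeIndex([date], name="date"))


# Assumed structure: outcome name -> list of per-period Series.
simulate_result = {
    "period_outputs": {
        "newly_infected": [
            one_day("2020-10-01", 0.25),
            one_day("2020-10-02", 0.50),
            one_day("2020-10-03", 0.25),
        ]
    }
}

# Aggregate: stack the per-period results into one daily time series.
daily = pd.concat(simulate_result["period_outputs"]["newly_infected"]).sort_index()

# Smooth: scale to 100 000 people and take a rolling mean over the window.
smoothed_periods = (daily * 100_000).rolling(window=2, min_periods=1).mean()
print(smoothed_periods)
```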

smoothed_outcome_per_hundred_thousand_rki(df, outcome, groupby=None, window=DEFAULT_WINDOW, min_periods=DEFAULT_MIN_PERIODS, group_sizes=None, take_logs=DEFAULT_TAKE_LOGS, center=DEFAULT_CENTER)

Calculate a smoothed outcome per 100 000 people on empirical data.

Parameters
  • df (pandas.DataFrame) – Empirical dataset.

  • outcome (str) – Selects a column in df.

  • groupby (list, str or None) – Defines the subgroups for which the outcome is calculated.

  • window (int) – Over how many days results are averaged to smooth the outcome.

  • min_periods (int) – Minimum number of days that must be present in the smoothing window for the outcome not to be NaN.

  • take_logs (bool) – Whether the log of the outcome should be returned. If True, the smoothing is carried out on the logged values.

  • center (bool) – Whether the smoothing window is centered or forward looking.

Returns

Series with a smoothed outcome. The first index level is date. If groupby is specified, there are additional index levels.

Return type

pd.Series

_smooth_and_scale_daily_outcome_per_individual(sr, window, min_periods, groupby, take_logs, center)
_process_inputs(window, min_periods, groupby)