Add Reporting Information to `print/summary.epi_df` by JavierMtzRdz · Pull Request #691 · cmu-delphi/epiprocess

JavierMtzRdz · 2026-02-28T01:51:16Z

Checklist

Please:

Make sure this PR is against "dev", not "main" (unless this is a release
PR).
Request a review from one of the current main reviewers:
brookslogan, nmdefries.
Make sure to bump the version number in DESCRIPTION. Always increment
the patch version number (the third number), unless you are making a
release PR from dev to main, in which case increment the minor version
number (the second number).
Describe changes made in NEWS.md, making sure breaking changes
(backwards-incompatible changes to the documented interface) are noted.
Collect the changes under the next release number (e.g. if you are on
1.7.2, then write your changes under the 1.8 heading).
Styling and documentation checks. Make a PR comment with:
- /style to check the style and fix any issues.
- /document to check the package documentation and fix any issues.
- /preview-docs to preview the docs.
- See Actions GitHub tab to track progress of these commands.
See DEVELOPMENT.md for more information on the development
process.

Add Reporting Information to `print/summary.epi_df`

The print.epi_df method now includes a signal-level latency report that computes each signal's reporting lag relative to the as_of metadata. Signals with notable latency are marked with a (!) flag, and the output names any lagging keys. For objects with many signals, the report is truncated and ends with a count of the remaining signals to preserve readability.
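
As a rough illustration of the latency idea described above, the following base R sketch computes, for each signal column, the number of days between as_of and the latest non-missing observation. The helper name signal_latency and the fixed 7-day threshold are assumptions made for this example (the threshold mirrors the "lag > 7 days" note in the printed output); this is not the code added in the PR.

```r
# Hypothetical helper for illustration; not the epiprocess implementation.
# Returns the lag in days from `as_of` to the latest non-NA time_value per signal.
signal_latency <- function(df, as_of, signal_cols) {
  vapply(signal_cols, function(col) {
    observed <- df$time_value[!is.na(df[[col]])]
    if (length(observed) == 0L) {
      return(NA_real_) # all-NA signal: no latest observation
    }
    as.numeric(as_of - max(observed))
  }, numeric(1))
}

df <- data.frame(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(as.Date("2024-01-01") + 0:5, 2),
  value = 1:12
)

lags <- signal_latency(df, as_of = as.Date("2024-01-15"), signal_cols = "value")
notable <- lags > 7 # assumed threshold, taken from the printed note
print(lags)
print(notable)
```

With the data from the first example below (latest observation 2024-01-06, as_of 2024-01-15), the lag for value is 9 days, which exceeds the assumed threshold.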

The summary.epi_df method has been expanded to include a regularity analysis, as mentioned in #688. This feature identifies whether the minimum and maximum time values are even or uneven across all epikeys. The summary also diagnoses implicit and explicit gaps.
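
To make the distinction between the two kinds of gaps concrete, here is a minimal base R sketch under stated assumptions: daily data, keys given by geo_value only, and a single signal column. The helper summarise_series is hypothetical and is not the function added in this PR. It treats a missing row inside a key's time range as an implicit gap and an NA that occurs before the key's last observed value as an explicit gap. The evenness check reuses the length(unique(...)) <= 1 idea visible in the reviewed diff.

```r
# Hypothetical helper for illustration; not the epiprocess implementation.
# Assumes daily data keyed by geo_value with a single signal column.
summarise_series <- function(df, signal = "value") {
  pieces <- split(df, df$geo_value)
  rows <- lapply(pieces, function(d) {
    # Every day between the first and last row for this key
    expected <- seq(min(d$time_value), max(d$time_value), by = "day")
    observed <- d$time_value[!is.na(d[[signal]])]
    last_obs <- if (length(observed) > 0L) max(observed) else as.Date(NA)
    data.frame(
      geo_value = d$geo_value[1],
      min_t = min(d$time_value), # range of rows present, NA or not
      max_t = max(d$time_value),
      implicit_gap = !all(expected %in% d$time_value),
      explicit_gap = any(is.na(d[[signal]]) & d$time_value < last_obs, na.rm = TRUE)
    )
  })
  do.call(rbind, rows)
}

base <- data.frame(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(as.Date("2024-01-01") + 0:5, 2),
  value = 1:12
)

# Drop 'ca' on Jan 3 (implicit gap) and 'hi' on Jan 1 (late start)
uneven <- base[-c(3, 7), ]
res <- summarise_series(uneven)
min_even <- length(unique(res$min_t)) <= 1
max_even <- length(unique(res$max_t)) <= 1

# Explicit NA for 'ca' on Jan 2; trailing NA for 'hi' on Jan 6
with_na <- base
with_na$value[c(2, 12)] <- NA
res_na <- summarise_series(with_na)

print(res)
print(res_na)
```

In this sketch the trailing NA for hi is not counted as an explicit gap, because it falls after that key's last observed value and is better read as latency.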

I put the time analysis in summary.epi_df because the print.epi_df output was becoming too long. Both analyses are implemented as helper functions in case we want to move them around, though combining them may be more efficient if they end up in the same method.

musing: Using cli to print those summaries could improve their appearance.

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

Resolves #688 (In print.epi_df: add notes about even/uneven min and max time_value by epikey, whether there are gaps, implicit or explicit)

Examples:

# Demonstrating Improved print.epi_df and summary.epi_df reporting
pkgload::load_all(".")
#> ℹ Loading epiprocess
#> Loading required package: epidatasets
#> 
#> Registered S3 method overwritten by 'tsibble':
#>   method               from 
#>   as_tibble.grouped_df dplyr
library(dplyr)
#> Warning: package 'dplyr' was built under R version 4.5.2
#> 
#> Attaching package: 'dplyr'
#> 
#> The following object is masked from 'package:epiprocess':
#> 
#>     between
#> 
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> 
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

# Setup basic parameters
start_date <- as.Date("2024-01-01")
as_of_date <- as.Date("2024-01-15")

# Standard clean data
(case1 <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = c(start_date + 0:5, start_date + 0:5),
  value = 1:12
) %>% as_epi_df(as_of = as_of_date))
#> An `epi_df` object, 12 x 3 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-15
#> Latency info:
#> * value: lag 9 days (max time 2024-01-06) (!)
#> (!): notable latency (lag > 7 days, lagging keys, or deviates from mode by > 7 days)
#> 
#> # A tibble: 12 × 3
#>    geo_value time_value value
#>  * <chr>     <date>     <int>
#>  1 ca        2024-01-01     1
#>  2 ca        2024-01-02     2
#>  3 ca        2024-01-03     3
#>  4 ca        2024-01-04     4
#>  5 ca        2024-01-05     5
#>  6 ca        2024-01-06     6
#>  7 hi        2024-01-01     7
#>  8 hi        2024-01-02     8
#>  9 hi        2024-01-03     9
#> 10 hi        2024-01-04    10
#> 11 hi        2024-01-05    11
#> 12 hi        2024-01-06    12

summary(case1)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-15
#> ----------
#> * min time value              = 2024-01-01 (even across epikeys)
#> * max time value              = 2024-01-06 (even across epikeys)
#> * time gaps                   = none detected
#> * average rows per time value = 2


# Uneven coverage and gaps
(edf_uneven <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = c(start_date + 0:5, start_date + 0:5),
  value = 1:12
))
#> # A tibble: 12 × 3
#>    geo_value time_value value
#>    <chr>     <date>     <int>
#>  1 ca        2024-01-01     1
#>  2 ca        2024-01-02     2
#>  3 ca        2024-01-03     3
#>  4 ca        2024-01-04     4
#>  5 ca        2024-01-05     5
#>  6 ca        2024-01-06     6
#>  7 hi        2024-01-01     7
#>  8 hi        2024-01-02     8
#>  9 hi        2024-01-03     9
#> 10 hi        2024-01-04    10
#> 11 hi        2024-01-05    11
#> 12 hi        2024-01-06    12
edf_uneven <- edf_uneven[-7, ] # 'hi' starts at day 2 (uneven min)
edf_uneven <- edf_uneven[-3, ] # 'ca' missing day 3 (implicit gap)

(case2 <- as_epi_df(edf_uneven, as_of = as_of_date))
#> An `epi_df` object, 10 x 3 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-15
#> Latency info:
#> * value: lag 9 days (max time 2024-01-06) (!)
#> (!): notable latency (lag > 7 days, lagging keys, or deviates from mode by > 7 days)
#> 
#> # A tibble: 10 × 3
#>    geo_value time_value value
#>  * <chr>     <date>     <int>
#>  1 ca        2024-01-01     1
#>  2 ca        2024-01-02     2
#>  3 ca        2024-01-04     4
#>  4 ca        2024-01-05     5
#>  5 ca        2024-01-06     6
#>  6 hi        2024-01-02     8
#>  7 hi        2024-01-03     9
#>  8 hi        2024-01-04    10
#>  9 hi        2024-01-05    11
#> 10 hi        2024-01-06    12
summary(case2)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-15
#> ----------
#> * min time value              = 2024-01-01 (uneven across epikeys)
#> * max time value              = 2024-01-06 (even across epikeys)
#> * time gaps                   = implicit (in 1/2 epikeys)
#> * average rows per time value = 1


# Explicit gaps and lagging
edf_lags <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = c(start_date + 0:5, start_date + 0:5),
  value = 1:12
)
edf_lags$value[12] <- NA # 'hi' ends at day 5 (lagging key vs ca)
edf_lags$value[2] <- NA # Explicit NA row for 'ca' at day 2

(case3 <- as_epi_df(edf_lags, as_of = start_date + 7) )
#> An `epi_df` object, 12 x 3 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-08
#> Latency info:
#> * value: lag 2 days (max time 2024-01-06); lagging keys: hi (!)
#> (!): notable latency (lag > 7 days, lagging keys, or deviates from mode by > 7 days)
#> 
#> # A tibble: 12 × 3
#>    geo_value time_value value
#>  * <chr>     <date>     <int>
#>  1 ca        2024-01-01     1
#>  2 ca        2024-01-02    NA
#>  3 ca        2024-01-03     3
#>  4 ca        2024-01-04     4
#>  5 ca        2024-01-05     5
#>  6 ca        2024-01-06     6
#>  7 hi        2024-01-01     7
#>  8 hi        2024-01-02     8
#>  9 hi        2024-01-03     9
#> 10 hi        2024-01-04    10
#> 11 hi        2024-01-05    11
#> 12 hi        2024-01-06    NA
summary(case3)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-08
#> ----------
#> * min time value              = 2024-01-01 (even across epikeys)
#> * max time value              = 2024-01-06 (uneven across epikeys)
#> * time gaps                   = explicit (in 2/2 epikeys)
#> * average rows per time value = 2


# Many signals
df_many <- tibble(geo_value = "ca", time_value = start_date + 0:5)
for (i in 1:10) {
  df_many[[paste0("sig", i)]] <- 1:6
}
df_many$sig5[5:6] <- NA

(case4 <- as_epi_df(df_many, as_of = start_date + 7))
#> An `epi_df` object, 6 x 12 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-08
#> Latency info:
#> * sig1: lag 2 days (max time 2024-01-06)
#> * sig2: lag 2 days (max time 2024-01-06)
#> * sig3: lag 2 days (max time 2024-01-06)
#> * ... and 7 other signals
#> 
#> # A tibble: 6 × 12
#>   geo_value time_value  sig1  sig2  sig3  sig4  sig5  sig6  sig7  sig8  sig9
#> * <chr>     <date>     <int> <int> <int> <int> <int> <int> <int> <int> <int>
#> 1 ca        2024-01-01     1     1     1     1     1     1     1     1     1
#> 2 ca        2024-01-02     2     2     2     2     2     2     2     2     2
#> 3 ca        2024-01-03     3     3     3     3     3     3     3     3     3
#> 4 ca        2024-01-04     4     4     4     4     4     4     4     4     4
#> 5 ca        2024-01-05     5     5     5     5    NA     5     5     5     5
#> 6 ca        2024-01-06     6     6     6     6    NA     6     6     6     6
#> # ℹ 1 more variable: sig10 <int>
summary(case4)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-08
#> ----------
#> * min time value              = 2024-01-01 (even across epikeys)
#> * max time value              = 2024-01-06 (even across epikeys)
#> * time gaps                   = none detected
#> * average rows per time value = 1

Created on 2026-02-27 with reprex v2.1.1

JavierMtzRdz · 2026-02-28T01:52:09Z

/style

JavierMtzRdz · 2026-02-28T02:27:05Z

/document

JavierMtzRdz · 2026-02-28T02:27:16Z

/preview-docs

github-actions · 2026-02-28T06:36:09Z

🚀 Deployed on https://69a28cd1fbb453711951fddb--epiprocess.netlify.app

brookslogan

Thanks for all the test cases! This is looking generally good. I just want to try to make this a bit quicker to understand.

brookslogan · 2026-03-25T21:12:59Z

+  max_even <- length(unique(smry$max_t[!is.na(smry$max_t)])) <= 1
+
+  min_desc <- if (min_even) "even across epikeys" else "uneven across epikeys"
+  max_desc <- if (max_even) "even across epikeys" else "uneven across epikeys"


issue: "even" didn't first parse as an adjective for me
issue: I don't think we have user-facing definitions of "epikey", and it isn't standard in the community
issue: this could be misleading if we have differing time ranges for different signals.

suggestion: either
(a) change this to be about epikey x signal combos, and the messages to "same for every time series", "but some of the time series start later", "but some of the time series end earlier"
(b) re-use some of the by-signal latency information added to print.epi_df

(a) Done! It is worth noting that I used epikey in accordance with revision_analysis, which prints this term. Should we remove it from there as well?
(b) Done! Also, the by-signal latency information was moved here.

(a) Yes, we probably should. (File an Issue?)

JavierMtzRdz

I have addressed the previous issues. Since we provide latency summaries in both summary and print, I moved all the key combination x signal calculations into epi_ts_range. To handle empty time series and missing signal columns, it returns a row for every combination and adds a column indicating whether the time series is empty.
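
The "one row per key combination x signal, with an emptiness flag" shape described above could look roughly like the following base R sketch. The function name ts_range_sketch, its arguments, and the column names are assumptions made for illustration; the actual epi_ts_range in the PR may differ in both interface and output.

```r
# Hypothetical sketch; the real epi_ts_range may differ in interface and output.
# Returns one row per key x signal, keeping combinations that have no data.
ts_range_sketch <- function(df, key_col, signal_cols) {
  combos <- expand.grid(
    key = unique(df[[key_col]]),
    signal = signal_cols,
    stringsAsFactors = FALSE
  )
  combos$min_t <- as.Date(NA)
  combos$max_t <- as.Date(NA)
  combos$is_empty <- TRUE
  for (i in seq_len(nrow(combos))) {
    in_key <- df[[key_col]] == combos$key[i]
    has_value <- !is.na(df[[combos$signal[i]]])
    times <- df$time_value[in_key & has_value]
    if (length(times) > 0L) {
      combos$min_t[i] <- min(times)
      combos$max_t[i] <- max(times)
      combos$is_empty[i] <- FALSE
    }
  }
  combos
}

df <- data.frame(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(as.Date("2024-01-01") + 0:5, 2),
  value = 1:12,
  value2 = NA # an all-NA signal, as in the "all NA signal" example
)

ranges <- ts_range_sketch(df, key_col = "geo_value", signal_cols = c("value", "value2"))
print(ranges)
```

Keeping the empty combinations as rows, rather than dropping them, lets downstream printing code report "all NA" for a signal without special-casing missing rows.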

As a result, the latency output that remains in print is handled by print_latency_info.

Because the by-signal latency moved to the summary, summary_time_latency reuses the time range, latency, and gap information from epi_ts_range. The messages for each section are built within their respective functions for clarity.

Finally, I reused time_delta_to_n_steps to calculate latency. However, with time_type = "week", as_of does not necessarily fall on the same day of the week as the time values, so the result can be a non-integer number of steps. To handle this, I added a require_integer parameter to time_delta_to_n_steps.
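
The non-integer case is easy to reproduce with plain dates. The sketch below uses a hypothetical stand-in, delta_to_steps, whose arguments are assumptions for this example; it only borrows the require_integer name from the description above and does not reproduce the real time_delta_to_n_steps signature.

```r
# Hypothetical stand-in for illustration; not the real time_delta_to_n_steps.
delta_to_steps <- function(delta_days, step_days, require_integer = TRUE) {
  n_steps <- delta_days / step_days
  if (require_integer && any(n_steps != round(n_steps))) {
    stop("time difference is not a whole number of steps")
  }
  n_steps
}

as_of <- as.Date("2024-01-17")    # a Wednesday
max_time <- as.Date("2024-01-07") # latest weekly time value (a Sunday)
lag_days <- as.numeric(as_of - max_time)

# Strict mode fails because 10 days is not a whole number of weeks
strict <- try(delta_to_steps(lag_days, step_days = 7), silent = TRUE)

# Relaxed mode returns the fractional number of weeks
relaxed <- delta_to_steps(lag_days, step_days = 7, require_integer = FALSE)
print(relaxed)
```

Whether a fractional lag should then be rounded, floored, or shown as-is is a separate presentation decision that this sketch leaves open.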

Below is a detailed list of examples.

# Demonstrating Improved print.epi_df and summary.epi_df reporting
pkgload::load_all(".")
#> ℹ Loading epiprocess
#> Loading required package: epidatasets
library(dplyr)
#> Warning: package 'dplyr' was built under R version 4.5.2
#> 
#> Attaching package: 'dplyr'
#> 
#> The following object is masked from 'package:epiprocess':
#> 
#>     between
#> 
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> 
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tibble)
#> Warning: package 'tibble' was built under R version 4.5.2

# Setup basic parameters
start_date <- as.Date("2024-01-01")
as_of_date <- as.Date("2024-01-15")

# Standard clean data ---
## No signal
(case0 <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(start_date + 0:5, 2),
) %>% as_epi_df(as_of = as_of_date))
#> An `epi_df` object, 12 x 2 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-15
#> Latency (lag from as_of to latest observation by time series):
#> * No time series detected
#> # A tibble: 12 × 2
#>    geo_value time_value
#>  * <chr>     <date>    
#>  1 ca        2024-01-01
#>  2 ca        2024-01-02
#>  3 ca        2024-01-03
#>  4 ca        2024-01-04
#>  5 ca        2024-01-05
#>  6 ca        2024-01-06
#>  7 hi        2024-01-01
#>  8 hi        2024-01-02
#>  9 hi        2024-01-03
#> 10 hi        2024-01-04
#> 11 hi        2024-01-05
#> 12 hi        2024-01-06

summary(case0)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-15
#> ----------
#> Time range:
#> * min time value              = 2024-01-01
#> * max time value              = 2024-01-06
#> Gaps:
#> * time gaps                   = none detected
#> * average rows per time value = 2.00
#> Latency (lag from as_of to latest observation by time series):
#> * No time series detected

## Standard
(case1 <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(start_date + 0:5, 2),
  value = 1:12
) %>% as_epi_df(as_of = as_of_date))
#> An `epi_df` object, 12 x 3 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-15
#> Latency (lag from as_of to latest observation by time series):
#> * lag  = 9 days
#> 
#> # A tibble: 12 × 3
#>    geo_value time_value value
#>  * <chr>     <date>     <int>
#>  1 ca        2024-01-01     1
#>  2 ca        2024-01-02     2
#>  3 ca        2024-01-03     3
#>  4 ca        2024-01-04     4
#>  5 ca        2024-01-05     5
#>  6 ca        2024-01-06     6
#>  7 hi        2024-01-01     7
#>  8 hi        2024-01-02     8
#>  9 hi        2024-01-03     9
#> 10 hi        2024-01-04    10
#> 11 hi        2024-01-05    11
#> 12 hi        2024-01-06    12

summary(case1)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-15
#> ----------
#> Time range:
#> * min time value              = 2024-01-01 (same for every time series)
#> * max time value              = 2024-01-06 (same for every time series)
#> Gaps:
#> * time gaps                   = none detected
#> * average rows per time value = 2.00
#> Latency (lag from as_of to latest observation by time series):
#> * value: lag 9 days (max time 2024-01-06) (!)
#> (!): notable latency (lag > 7 days)

## all NA signal
(case1.5 <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(start_date + 0:5, 2),
  value = 1:12,
  value2 = NA
) %>% as_epi_df(as_of = as_of_date))
#> An `epi_df` object, 12 x 4 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-15
#> Latency (lag from as_of to latest observation by time series):
#> * lag  = 9 days
#> * Empty time series detected
#> # A tibble: 12 × 4
#>    geo_value time_value value value2
#>  * <chr>     <date>     <int> <lgl> 
#>  1 ca        2024-01-01     1 NA    
#>  2 ca        2024-01-02     2 NA    
#>  3 ca        2024-01-03     3 NA    
#>  4 ca        2024-01-04     4 NA    
#>  5 ca        2024-01-05     5 NA    
#>  6 ca        2024-01-06     6 NA    
#>  7 hi        2024-01-01     7 NA    
#>  8 hi        2024-01-02     8 NA    
#>  9 hi        2024-01-03     9 NA    
#> 10 hi        2024-01-04    10 NA    
#> 11 hi        2024-01-05    11 NA    
#> 12 hi        2024-01-06    12 NA

summary(case1.5)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-15
#> ----------
#> Time range:
#> * min time value              = 2024-01-01 (same for every time series)
#> * max time value              = 2024-01-06 (same for every time series)
#> Gaps:
#> * time gaps                   = none detected
#> * average rows per time value = 2.00
#> Latency (lag from as_of to latest observation by time series):
#> * value: lag 9 days (max time 2024-01-06) (!)
#> * value2: all NA
#> (!): notable latency (lag > 7 days)

# Integer time indices ---
(case2 <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(100:105, 2),
  value = 1:12
) %>% as_epi_df(as_of = 110))
#> An `epi_df` object, 12 x 3 with metadata:
#> * geo_type  = state
#> * time_type = integer
#> * as_of     = 110
#> Latency (lag from as_of to latest observation by time series):
#> * lag  = 5
#> 
#> # A tibble: 12 × 3
#>    geo_value time_value value
#>  * <chr>          <int> <int>
#>  1 ca               100     1
#>  2 ca               101     2
#>  3 ca               102     3
#>  4 ca               103     4
#>  5 ca               104     5
#>  6 ca               105     6
#>  7 hi               100     7
#>  8 hi               101     8
#>  9 hi               102     9
#> 10 hi               103    10
#> 11 hi               104    11
#> 12 hi               105    12

summary(case2)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 110
#> ----------
#> Time range:
#> * min time value              = 100 (same for every time series)
#> * max time value              = 105 (same for every time series)
#> Gaps:
#> * time gaps                   = none detected
#> * average rows per time value = 2.00
#> Latency (lag from as_of to latest observation by time series):
#> * value: lag 5 (max time 105) (!)
#> (!): notable latency (lag > 2 )

# Other Keys  ---
(case3 <- tibble(
  geo_value = rep(c("ca", "hi"), each = 4),
  age_group = rep(rep(c("0-17", "18+"), each = 2), 2),
  time_value = rep(c(start_date, start_date + 1), 4),
  value = 1:8
) %>% as_epi_df(as_of = as_of_date, other_keys = "age_group"))
#> An `epi_df` object, 8 x 4 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * other_keys = age_group
#> * as_of     = 2024-01-15
#> Latency (lag from as_of to latest observation by time series):
#> * lag  = 13 days
#> 
#> # A tibble: 8 × 4
#>   geo_value age_group time_value value
#> * <chr>     <chr>     <date>     <int>
#> 1 ca        0-17      2024-01-01     1
#> 2 ca        0-17      2024-01-02     2
#> 3 ca        18+       2024-01-01     3
#> 4 ca        18+       2024-01-02     4
#> 5 hi        0-17      2024-01-01     5
#> 6 hi        0-17      2024-01-02     6
#> 7 hi        18+       2024-01-01     7
#> 8 hi        18+       2024-01-02     8

summary(case3)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * other_keys = age_group
#> * as_of     = 2024-01-15
#> ----------
#> Time range:
#> * min time value              = 2024-01-01 (same for every time series)
#> * max time value              = 2024-01-02 (same for every time series)
#> Gaps:
#> * time gaps                   = none detected
#> * average rows per time value = 4.00
#> Latency (lag from as_of to latest observation by time series):
#> * value: lag 13 days (max time 2024-01-02) (!)
#> (!): notable latency (lag > 7 days)

# Late start and implicit gaps ---
edf_base <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(start_date + 0:5, 2),
  value = 1:12
)

edf_uneven <- edf_base[-c(3, 7), ]
# 'hi' starts late
# 'ca' missing day 3 (implicit gap)

(case4 <- as_epi_df(edf_uneven, as_of = as_of_date))
#> An `epi_df` object, 10 x 3 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-15
#> Latency (lag from as_of to latest observation by time series):
#> * lag  = 9 days
#> 
#> # A tibble: 10 × 3
#>    geo_value time_value value
#>  * <chr>     <date>     <int>
#>  1 ca        2024-01-01     1
#>  2 ca        2024-01-02     2
#>  3 ca        2024-01-04     4
#>  4 ca        2024-01-05     5
#>  5 ca        2024-01-06     6
#>  6 hi        2024-01-02     8
#>  7 hi        2024-01-03     9
#>  8 hi        2024-01-04    10
#>  9 hi        2024-01-05    11
#> 10 hi        2024-01-06    12
summary(case4)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-15
#> ----------
#> Time range:
#> * min time value              = 2024-01-01 (but some time series start later)
#> * max time value              = 2024-01-06 (same for every time series)
#> Gaps:
#> * implicit (missing rows in 1/2 key combinations, affecting 1 signal)
#> * average rows per time value = 1.67
#> Latency (lag from as_of to latest observation by time series):
#> * value: lag 9 days (max time 2024-01-06) (!)
#> (!): notable latency (lag > 7 days)

# Explicit NAs ---
edf_lags <- tibble(
  geo_value = rep(c("ca", "hi"), each = 6),
  time_value = rep(start_date + 0:5, 2),
  value = 1:12
)
edf_lags$value[2] <- NA  # 'ca' gap at Jan 2
edf_lags$value[12] <- NA # 'hi' missing Jan 6 (lag)

(case5 <- as_epi_df(edf_lags, as_of = start_date + 7))
#> An `epi_df` object, 12 x 3 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-08
#> Latency (lag from as_of to latest observation by time series):
#> * lag  = 2–3 days
#> 
#> # A tibble: 12 × 3
#>    geo_value time_value value
#>  * <chr>     <date>     <int>
#>  1 ca        2024-01-01     1
#>  2 ca        2024-01-02    NA
#>  3 ca        2024-01-03     3
#>  4 ca        2024-01-04     4
#>  5 ca        2024-01-05     5
#>  6 ca        2024-01-06     6
#>  7 hi        2024-01-01     7
#>  8 hi        2024-01-02     8
#>  9 hi        2024-01-03     9
#> 10 hi        2024-01-04    10
#> 11 hi        2024-01-05    11
#> 12 hi        2024-01-06    NA
summary(case5)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-08
#> ----------
#> Time range:
#> * min time value              = 2024-01-01 (same for every time series)
#> * max time value              = 2024-01-06 (but some time series end earlier)
#> Gaps:
#> * explicit (non-lag NAs in 1/2 key combinations, affecting 1 signal)
#> * average rows per time value = 2.00
#> Latency (lag from as_of to latest observation by time series):
#> * value: lag 2–3 days (max time 2024-01-06); lagging keys: hi (!)
#> (!): notable latency (lagging keys)

# Many signals and multivariate lags ---
df_many <- tibble(geo_value = "ca", time_value = start_date + 0:9)
for (i in 1:10) {
  df_many[[paste0("sig", i)]] <- 1:10
}
# sig5 lags by 2 days, sig7 has an internal hole
df_many$sig5[9:10] <- NA
df_many$sig7[4:6] <- NA

(case6 <- as_epi_df(df_many, as_of = start_date + 12))
#> An `epi_df` object, 10 x 12 with metadata:
#> * geo_type  = state
#> * time_type = day
#> * as_of     = 2024-01-13
#> Latency (lag from as_of to latest observation by time series):
#> * lag across all time series = 3–5 days
#> 
#> # A tibble: 10 × 12
#>    geo_value time_value  sig1  sig2  sig3  sig4  sig5  sig6  sig7  sig8  sig9
#>  * <chr>     <date>     <int> <int> <int> <int> <int> <int> <int> <int> <int>
#>  1 ca        2024-01-01     1     1     1     1     1     1     1     1     1
#>  2 ca        2024-01-02     2     2     2     2     2     2     2     2     2
#>  3 ca        2024-01-03     3     3     3     3     3     3     3     3     3
#>  4 ca        2024-01-04     4     4     4     4     4     4    NA     4     4
#>  5 ca        2024-01-05     5     5     5     5     5     5    NA     5     5
#>  6 ca        2024-01-06     6     6     6     6     6     6    NA     6     6
#>  7 ca        2024-01-07     7     7     7     7     7     7     7     7     7
#>  8 ca        2024-01-08     8     8     8     8     8     8     8     8     8
#>  9 ca        2024-01-09     9     9     9     9    NA     9     9     9     9
#> 10 ca        2024-01-10    10    10    10    10    NA    10    10    10    10
#> # ℹ 1 more variable: sig10 <int>
summary(case6)
#> An `epi_df` x, with metadata:
#> * geo_type  = state
#> * as_of     = 2024-01-13
#> ----------
#> Time range:
#> * min time value              = 2024-01-01 (same for every time series)
#> * max time value              = 2024-01-10 (but some time series end earlier)
#> Gaps:
#> * explicit (non-lag NAs in 1/1 key combinations, affecting 1 signal)
#> * average rows per time value = 1.00
#> Latency (lag from as_of to latest observation by time series):
#> * sig1: lag 3 days (max time 2024-01-10)
#> * sig2: lag 3 days (max time 2024-01-10)
#> * sig3: lag 3 days (max time 2024-01-10)
#> * sig4: lag 3 days (max time 2024-01-10)
#> * sig5: lag 5 days (max time 2024-01-08)
#> * sig6: lag 3 days (max time 2024-01-10)
#> * sig7: lag 3 days (max time 2024-01-10)
#> * sig8: lag 3 days (max time 2024-01-10)
#> * ... and 2 other signals

Created on 2026-04-02 with reprex v2.1.1

JavierMtzRdz · 2026-04-02T20:38:39Z

JavierMtzRdz added 6 commits February 27, 2026 14:35

fead(print.epi_pdf): add latency info

66de8b5

enh(print.epi_df): add limit of info printed

d955252

refactor(latency_info_epi_df) reporting with mode lag and notable latency flags

5e5e736

chore: update.Rd docs

561cede

docs: Update NEWS.md to document print.epi_df() latency info.

f2fcc98

feat: enhance summary.epi_df to report time value evenness and detect implicit/explicit gaps.

cd245ac

JavierMtzRdz requested a review from brookslogan February 28, 2026 01:51

fix: prevent git commit failures when no changes are present in PR commands workflow.

320d23c

JavierMtzRdz marked this pull request as ready for review March 2, 2026 15:00

brookslogan requested changes Mar 25, 2026

JavierMtzRdz and others added 8 commits March 27, 2026 19:10

Update R/methods-epi_df.R

bab97c7

Co-authored-by: brookslogan <lcbrooks+github@andrew.cmu.edu>

feat: rename latency_info_epi_df to print_latency_info

1b8bf8f

Changes include: * displaying only the latency range * printing a message about empty time series

refactor: overhaul epi_df summary logic to provide detailed time range, gap, and latency reporting based on feedback

8e016e2

refactor: move epi_ts_range function definition below print_latency_info for better code organization

91e5589

fix: resolved issue with reusing time_delta_to_n_steps for non-strict integers

f24ff9a

Merge branch 'origin/jmr/print.epi_df' into jmr/print.epi_df

c13481b

style: add relevant comment about require_integer = FALSE

d4dfb41

style: identation correction

c9cbf0c

JavierMtzRdz commented Apr 2, 2026

JavierMtzRdz requested a review from brookslogan April 2, 2026 21:16

Merge remote-tracking branch 'upstream/dev' into jmr/print.epi_df

f1486b0

brookslogan approved these changes Apr 20, 2026

brookslogan merged commit 3263371 into dev Apr 20, 2026
3 checks passed

brookslogan deleted the jmr/print.epi_df branch April 20, 2026 07:46
