Join method for epi_df #690

Open
JavierMtzRdz wants to merge 11 commits into dev from joins

@JavierMtzRdz
Contributor

Checklist

Please:

  • Make sure this PR is against "dev", not "main" (unless this is a release
    PR).
  • Request a review from one of the current main reviewers:
    brookslogan, nmdefries.
  • Make sure to bump the version number in DESCRIPTION. Always increment
    the patch version number (the third number), unless you are making a
    release PR from dev to main, in which case increment the minor version
    number (the second number).
  • Describe changes made in NEWS.md, making sure breaking changes
    (backwards-incompatible changes to the documented interface) are noted.
    Collect the changes under the next release number (e.g. if you are on
    1.7.2, then write your changes under the 1.8 heading).
  • Styling and documentation checks. Make a PR comment with:
    • /style to check the style and fix any issues.
    • /document to check the package documentation and fix any issues.
    • /preview-docs to preview the docs.
    • See Actions GitHub tab to track progress of these commands.
  • See DEVELOPMENT.md for more information on the development
    process.

Join method for epi_df

Here's how *_join.epi_df performs across different scenarios:

  • When keys are unique and valid, the output remains an epi_df:
    • Joining two epi_df objects with different keys correctly updates the combined other_keys metadata.
    • The original grouping structure of the input epi_df is preserved.
    • When *_type internal attributes do not match, it issues a warning.
  • The output converts to a standard tibble if:
    • The join results in multiple rows for the same key combination. This includes cases where two epi_df objects are not joined by geo_value or time_value.
    • A key column is missing from the result. A warning is issued.
    • The join introduces NA values into key columns. A warning is issued suggesting the use of inner_join.
    • cross_join() is used.
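
The fallback rules above can be sketched in base R. This is only an illustration, not the code in this PR: join_result_class() is a hypothetical helper, the order of the checks is an assumption, and the warning texts are made up.

```r
# Hypothetical helper (not part of epiprocess): decides whether a join result
# could stay an epi_df or should fall back to a plain tibble.
join_result_class <- function(res, other_keys = character()) {
  keys <- c("geo_value", other_keys, "time_value")

  # A key column is missing from the result.
  missing_keys <- setdiff(keys, names(res))
  if (length(missing_keys) > 0) {
    warning("missing key column(s): ", paste(missing_keys, collapse = ", "))
    return("tibble")
  }

  # The join introduced NA values into key columns.
  has_na <- vapply(res[keys], anyNA, logical(1))
  if (any(has_na)) {
    warning("NA values in key column(s); consider inner_join()")
    return("tibble")
  }

  # Several rows share the same key combination.
  if (anyDuplicated(res[keys]) > 0) {
    return("tibble")
  }

  "epi_df"
}

# Toy inputs: `ok` has unique keys, `dup` repeats the first key combination.
ok <- data.frame(
  geo_value = c("ca", "ny"),
  time_value = as.Date("2020-01-01"),
  val = 1:2
)
dup <- rbind(ok, ok[1, ])

join_result_class(ok)
join_result_class(dup)
```

The sketch only classifies a finished join result. It does not cover the other_keys metadata update, the grouping structure, or the *_type mismatch warning described above.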

I also added an internal helper, force_meta, to adjust geo_type and time_type within a pipe, as shown in the example below.

library(dplyr)       # provides %>%
library(epiprocess)  # provides as_epi_df()

week <- tibble::tibble(geo_value = "ca", time_value = as.Date("2020-01-01"), val2 = 2) %>%
    as_epi_df() %>%
    epiprocess:::force_meta(time_type = "week")

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

@JavierMtzRdz JavierMtzRdz changed the title from "Joins" to "Join method for epi_df" Feb 14, 2026
@JavierMtzRdz JavierMtzRdz marked this pull request as ready for review February 17, 2026 20:42
Comment thread R/methods-epi_df.R

# NA checks for essential keys
has_na_geo <- anyMissing(res$geo_value)
has_na_time <- anyMissing(res$time_value)
Contributor

todo: anyMissing is sort of deprecated; replace with vctrs::vec_any_missing or anyNA; vctrs one is a bit more general, which might make it easier to generalize the types we accept for geo & time values later.
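
For illustration, the swap might look like the sketch below. The res data frame here is a made-up stand-in for the join result, and the vctrs call runs only if vctrs is installed.

```r
# Made-up stand-in for the join result.
res <- data.frame(
  geo_value = c("ca", NA),
  time_value = as.Date(c("2020-01-01", "2020-01-02"))
)

# Base R replacement for anyMissing().
has_na_geo <- anyNA(res$geo_value)
has_na_time <- anyNA(res$time_value)

# vctrs variant, run only when vctrs is available.
if (requireNamespace("vctrs", quietly = TRUE)) {
  has_na_geo_vctrs <- vctrs::vec_any_missing(res$geo_value)
}
```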

Development

Successfully merging this pull request may close these issues.

epi_df joins to data frames with richer keys yields invalid epi_df

2 participants