rlearner is a Python package that implements the R learner (Nie and Wager, 2021) for heterogeneous treatment effect estimation and validation, with flexible choices of nuisance and second-stage models.
Install the package via pip:

```shell
pip install "git+https://github.com/andyjiayuwang/Python-based-R-Learner.git"
```

A full example workflow is available in demo.ipynb.
A minimal import example is:

```python
from rlearner import (
    CrossFittedNuisanceEstimator,
    RLearner,
    RLossStacking,
    SuperLearnerClassifier,
    SuperLearnerRegressor,
)
```

The first step estimates the nuisance functions needed by the R learner:
- m(X) = E[Y | X], the outcome regression
- e(X) = E[W | X], the propensity score
These nuisance estimates are used to build the residualized quantities
- Y_tilde = Y - m_hat(X)
- W_tilde = W - e_hat(X)
which are then passed into the second-stage R-loss optimization.
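The residualization and the R-loss it feeds can be illustrated with a short, self-contained NumPy sketch; the data-generating process and the oracle nuisances here are hypothetical, and this is not package code:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
e = 1 / (1 + np.exp(-X[:, 0]))      # true propensity score e(X)
W = rng.binomial(1, e)              # binary treatment indicator
tau = 0.5 + X[:, 1]                 # true heterogeneous effect tau(X)
Y = X[:, 0] + W * tau + rng.normal(size=n)
m = X[:, 0] + e * tau               # m(X) = E[Y | X]

# Residualized quantities (using oracle nuisances for illustration)
Y_tilde = Y - m
W_tilde = W - e

# R-loss of a candidate constant effect c: mean((Y_tilde - c * W_tilde)^2)
def r_loss(c):
    return np.mean((Y_tilde - c * W_tilde) ** 2)
```

The second stage replaces the constant `c` with a flexible model tau(X) that minimizes this loss.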
The package provides two ways to handle step 1.
Use `CrossFittedNuisanceEstimator` when you want the package to fit nuisance models directly. It supports:

- K-fold cross-fitting for both the outcome model and the treatment model
- Any sklearn-style regressor for the outcome model
- Any sklearn-style binary classifier with `predict_proba` for the treatment model
- Optional grid search on the full nuisance model object through `outcome_param_grid` and `treatment_param_grid`
- Full-sample refitting after cross-fitting so the fitted nuisance models can be reused for prediction
Default settings for `CrossFittedNuisanceEstimator` are:

`n_folds=10`, `shuffle=True`, `random_state=42`, `propensity_clip=1e-6`, `stratify_treatment=True`, `refit_full=True`, `outcome_search_cv=5`, `treatment_search_cv=5`, `treatment_scoring="neg_log_loss"`
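As a rough illustration of what cross-fitting under these defaults does (an independent scikit-learn re-sketch, not the package's internal implementation), one can produce out-of-fold nuisance predictions like this:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(42)
n = 500
X = rng.normal(size=(n, 3))
W = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # binary treatment
Y = X[:, 0] + W * X[:, 1] + rng.normal(size=n)

y_hat = np.empty(n)  # out-of-fold estimates of m(X)
d_hat = np.empty(n)  # out-of-fold estimates of e(X)

# 10 folds, shuffled, stratified on treatment (mirroring the defaults above)
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
for train_idx, test_idx in skf.split(X, W):
    m_model = LinearRegression().fit(X[train_idx], Y[train_idx])
    e_model = LogisticRegression().fit(X[train_idx], W[train_idx])
    y_hat[test_idx] = m_model.predict(X[test_idx])
    d_hat[test_idx] = e_model.predict_proba(X[test_idx])[:, 1]

# Clip propensities away from 0 and 1 (cf. propensity_clip=1e-6)
d_hat = np.clip(d_hat, 1e-6, 1 - 1e-6)
```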
Use manual nuisance inputs when you already have trusted out-of-fold nuisance predictions from an external workflow. In that case, pass:

- `y_hat`, the out-of-fold estimate of m(X)
- `d_hat`, the out-of-fold estimate of e(X)

through `ManualNuisanceEstimator` or directly through `RLearner.fit(..., y_hat=..., d_hat=...)`.
The package also provides constrained super learners for nuisance prediction:
- `SuperLearnerRegressor`
- `SuperLearnerClassifier`
These models support:
- Multiple base learners
- Nonnegative stacking weights
- Optional normalization of weights to sum to 1 through `normalize_weights=True`
- Separate grid search for each base learner via `estimator_param_grids`
- Stable internal sample splitting for hyperparameter tuning
- Weight inspection through `get_weights()`
- Best-parameter inspection through `get_best_params()`
Default settings for the super learners are:
`search_cv=5`, `search_shuffle=True`, `random_state=42`, `normalize_weights=False`, `tolerance=1e-10`, `max_iter=1000`
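The core idea of nonnegative stacking weights can be shown with SciPy's nonnegative least squares applied to out-of-fold base-learner predictions; the two prediction vectors below are synthetic stand-ins, and the super learners' actual solver and options may differ:

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(42)
n = 400
X = rng.normal(size=(n, 2))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=n)

# Hypothetical out-of-fold predictions from two base learners
pred_linear = X[:, 0]                       # misses the quadratic term
pred_quad = X[:, 0] + 0.5 * X[:, 1] ** 2    # captures the full signal

P = np.column_stack([pred_linear, pred_quad])
w, _ = nnls(P, y)          # nonnegative stacking weights
w_norm = w / w.sum()       # optional normalization (cf. normalize_weights=True)
```

As expected, the better base learner receives most of the weight, and the normalized weights sum to 1.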
For treatment prediction, the built-in step 1 implementation currently assumes a binary treatment indicator.
The second step learns the conditional average treatment effect tau(X) using the residualized outcome and treatment from step 1.
The package provides two main components for this stage.
Use `RLossWrapper` to fit a single sklearn-style regressor under the R-loss construction. This is the simplest way to estimate a single CATE model once Y_tilde and W_tilde are available.
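One standard way to carry out this minimization, shown here as a self-contained sketch rather than the `RLossWrapper` implementation: the R-loss sum_i (Y_tilde_i - tau(X_i) W_tilde_i)^2 equals a weighted least-squares objective on the pseudo-outcome Y_tilde / W_tilde with weights W_tilde^2, so any sklearn regressor accepting `sample_weight` can fit it:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 3000
X = rng.normal(size=(n, 2))
e = 1 / (1 + np.exp(-X[:, 0]))            # true propensity score
W = rng.binomial(1, e)
tau = 1.0 + 2.0 * X[:, 1]                 # true CATE, linear in X
Y = X[:, 0] + W * tau + rng.normal(size=n)

Y_tilde = Y - (X[:, 0] + e * tau)          # residual vs. oracle m(X)
W_tilde = W - e

# R-loss: sum_i (Y_tilde_i - tau(X_i) * W_tilde_i)^2
#       = sum_i W_tilde_i^2 * (Y_tilde_i / W_tilde_i - tau(X_i))^2,
# i.e. weighted least squares on the pseudo-outcome Y_tilde / W_tilde
model = LinearRegression().fit(X, Y_tilde / W_tilde, sample_weight=W_tilde ** 2)
tau_hat = model.predict(X)
```

With oracle nuisances and a well-specified linear model, the fit recovers the true intercept and slope of tau(X).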
Use `RLearner` with `cate_learners={...}` when you want to fit multiple second-stage learners and combine them. The package then:

- Fits one `RLossWrapper` per learner
- Produces one CATE estimate from each learner
- Optionally combines them with `RLossStacking`
`RLossStacking` follows the positive linear-combination idea used in the R-loss stacking step. The fitted object reports:

- `a_hat`, the constant shift term
- `b_hat`, the scale of the coefficient vector
- `alpha_hat`, the nonnegative relative weights of the second-stage learners

Default settings for `RLossStacking` are:

`lambda_reg=1.0`, `tolerance=1e-10`, `max_iter=1000`
In step 2, the stacking weights are constrained to be nonnegative, but they are not required to sum to 1.
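Under one plausible reading of the fitted quantities (`a_hat` as shift, `b_hat` as scale, `alpha_hat` as nonnegative weights), combining per-learner CATE predictions could look like the following; the combination rule and every number here are illustrative assumptions, not the package's exact formula:

```python
import numpy as np

# Hypothetical fitted stacking parameters (names mirror the attributes above)
a_hat = 0.1                       # constant shift
b_hat = 1.5                       # overall scale
alpha_hat = np.array([0.8, 0.2])  # nonnegative weights (need not sum to 1)

# CATE predictions from two second-stage learners on three test points
tau_preds = np.array([
    [0.5, 1.0, -0.2],   # learner 1
    [0.4, 1.2,  0.0],   # learner 2
])

# Assumed combination rule: tau(x) = a_hat + b_hat * sum_k alpha_hat[k] * tau_k(x)
tau_stacked = a_hat + b_hat * (alpha_hat @ tau_preds)
```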
The third step validates the fitted treatment-effect model using the out-of-fold nuisance estimates and the fitted CATE predictions. The validation routines implemented here follow the discussions in Chernozhukov et al. (2024).
All validation routines are available in two ways:
- As standalone functions in `rlearner`
- As convenience methods on a fitted `RLearner` instance
The BLP test runs the no-intercept regression
Y_tilde = alpha * W_tilde + beta * W_tilde * tau_hat(X)
and reports:
- Point estimates for `alpha` and `beta`
- HC2 standard errors
- Normal-based z statistics
- p-values
- Confidence intervals
Default setting:
`confidence_level=0.95`
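The no-intercept BLP regression and its HC2 standard errors can be reproduced in plain NumPy; this is a sketch of the textbook formulas on synthetic inputs, not the package's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
tau_hat = rng.normal(1.0, 0.5, size=n)            # fitted CATE predictions
W_tilde = rng.normal(size=n)                      # residualized treatment
Y_tilde = W_tilde * tau_hat + rng.normal(size=n)  # residualized outcome

# Design matrix for the no-intercept BLP regression
Z = np.column_stack([W_tilde, W_tilde * tau_hat])

# OLS point estimates: (Z'Z)^{-1} Z'y
ZtZ_inv = np.linalg.inv(Z.T @ Z)
coef = ZtZ_inv @ Z.T @ Y_tilde                 # [alpha_hat, beta_hat]

# HC2 covariance: leverage-adjusted squared residuals in the "meat"
resid = Y_tilde - Z @ coef
h = np.einsum("ij,jk,ik->i", Z, ZtZ_inv, Z)    # leverages h_ii
meat = Z.T @ (Z * (resid ** 2 / (1 - h))[:, None])
cov_hc2 = ZtZ_inv @ meat @ ZtZ_inv
se = np.sqrt(np.diag(cov_hc2))
z_stats = coef / se                            # normal-based z statistics
```

Since the synthetic outcome was generated with alpha = 0 and beta = 1, the point estimates land near those values.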
The calibration test bins observations by predicted treatment effect and compares:
- The average predicted treatment effect within each bin
- The doubly robust bin-level treatment effect estimate
It returns both:
- `CAL_1`, the weighted L1 calibration criterion
- `CAL_2`, the weighted L2 calibration criterion
and also exposes the full bin-level table.
Default setting:
`n_bins=5`
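A minimal sketch of the binning logic: here the per-observation doubly robust scores are replaced by a synthetic stand-in (`dr_scores`), so only the bin-and-compare mechanics mirror the description above, not the actual DR estimator:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
tau_hat = rng.normal(1.0, 0.5, size=n)               # predicted treatment effects
dr_scores = tau_hat + rng.normal(scale=0.3, size=n)  # stand-in for DR scores

n_bins = 5  # the default
edges = np.quantile(tau_hat, np.linspace(0, 1, n_bins + 1))
bin_idx = np.clip(np.searchsorted(edges, tau_hat, side="right") - 1, 0, n_bins - 1)

rows = []
for b in range(n_bins):
    mask = bin_idx == b
    rows.append((mask.mean(),              # bin weight (share of sample)
                 tau_hat[mask].mean(),     # average predicted effect in the bin
                 dr_scores[mask].mean()))  # bin-level DR effect estimate
weights, pred_mean, dr_mean = map(np.array, zip(*rows))

cal_1 = np.sum(weights * np.abs(pred_mean - dr_mean))  # weighted L1 criterion
cal_2 = np.sum(weights * (pred_mean - dr_mean) ** 2)   # weighted L2 criterion
```

The `rows` list plays the role of the bin-level table that the package exposes.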
The uplift test performs ranking-based validation using a DR uplift curve. Observations are sorted by tau_hat(X) from high to low, top-fraction subgroups are formed, and a DR subgroup effect is computed for each fraction.
The output includes:
- The uplift curve table `(fraction, subgroup size, theta_dr)`
- `AUUC`, the area under the uplift curve
Default setting:
fractions = 0.1, 0.2, ..., 1.0
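The ranking-and-area computation can be sketched as follows; the per-unit DR scores are synthetic stand-ins, and the package's exact subgroup estimator may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
tau_hat = rng.normal(1.0, 0.5, size=n)               # fitted CATE predictions
dr_scores = tau_hat + rng.normal(scale=0.5, size=n)  # stand-in for DR scores

order = np.argsort(-tau_hat)             # sort high to low by predicted effect
fractions = np.linspace(0.1, 1.0, 10)    # the default grid 0.1, 0.2, ..., 1.0

rows = []
for q in fractions:
    k = int(round(q * n))                   # subgroup size for the top fraction
    theta_dr = dr_scores[order[:k]].mean()  # effect estimate in the top-q subgroup
    rows.append((q, k, theta_dr))

# AUUC: trapezoid-rule area under the (fraction, theta_dr) curve
thetas = np.array([r[2] for r in rows])
auuc = np.sum((thetas[1:] + thetas[:-1]) / 2 * np.diff(fractions))
```

When `tau_hat` ranks units well, the subgroup effect decreases as the fraction grows, since lower-effect units are mixed in.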
- The import name is `rlearner`, even though the GitHub repository is named `Python-based-R-Learner`.
- The package currently declares support for Python `>=3.10`.