Ideas for refining the structure of FFTrees objects #226

@ndphillips

@hneth In the context of #224 I realized that the structure of FFTrees objects is making development challenging, for example when deciding how to modularize the plot.FFTrees() function.

The current code to create FFTrees objects is contained here:

FFTrees/R/fftrees_create.R

Lines 765 to 852 in 71b9da0

# Create x (as list):
x <- list(
  # Names of criterion vs. cues:
  criterion_name = criterion_name,
  cue_names = cue_names,
  # Formula:
  formula = formula, # original formula
  # Tree info:
  trees = list(
    n = NULL,
    best = NULL,
    definitions = NULL,
    inwords = NULL,
    stats = NULL,
    level_stats = NULL,
    decisions = list(
      train = list(),
      test = list()
    )
  ),
  # Raw training data:
  data = list(
    train = data,
    test = data.test
  ),
  # Store parameters (as list):
  params = list(
    algorithm = algorithm,
    #
    goal = goal,
    goal.chase = goal.chase,
    goal.threshold = goal.threshold,
    #
    max.levels = max.levels,
    numthresh.method = numthresh.method,
    numthresh.n = numthresh.n,
    repeat.cues = repeat.cues,
    stopping.rule = stopping.rule,
    stopping.par = stopping.par,
    #
    sens.w = sens.w,
    #
    cost.outcomes = cost.outcomes,
    cost.cues = cost.cues,
    #
    main = main,
    decision.labels = decision.labels,
    #
    my.goal = my.goal,
    my.goal.fun = my.goal.fun,
    my.tree = my.tree,
    #
    quiet = quiet
  ),
  # One row per algorithm competition:
  competition = list(
    train = data.frame(
      algorithm = NA,
      n = NA,
      hi = NA, fa = NA, mi = NA, cr = NA,
      sens = NA, spec = NA, far = NA,
      ppv = NA, npv = NA,
      acc = NA, bacc = NA,
      cost = NA, cost_dec = NA, cost_cue = NA
    ),
    test = data.frame(
      algorithm = NA,
      n = NA,
      hi = NA, fa = NA, mi = NA, cr = NA,
      sens = NA, spec = NA, far = NA,
      ppv = NA, npv = NA,
      acc = NA, bacc = NA,
      cost = NA, cost_dec = NA, cost_cue = NA
    ),
    models = list(lr = NULL, cart = NULL, rf = NULL, svm = NULL)
  ) # competition.
) # x.

Here are the core issues I see:

  • Inconsistent and confusing naming
    • Ex) How do the definitions, inwords, stats, level_stats, and decisions objects in FFTrees relate to each other?
  • Information at different levels of abstraction isn't stored consistently
    • Ex) Why are tree definitions stored at the tree level but not the node level? Why aren't overall tree accuracy stats located close to the tree level accuracy stats?
  • Inconsistent storage locations
    • Ex) Why are criterion_name, cue_names and formula stored at the same level as trees, data and params? Could these be stored in a list such as metadata?
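
To make the last question concrete, here is a minimal sketch of what grouping those three elements could look like. It is only an illustration: the `metadata` name comes from the question above, but the helper `new_fftrees_skeleton()`, the toy data, and the exact layout are hypothetical and are not taken from the package or the design doc. It also assumes a formula with the criterion on the left-hand side.

```r
# Hypothetical helper (not part of FFTrees): builds an empty object skeleton
# in which criterion_name, cue_names and formula live in a `metadata` list
# instead of sitting next to trees, data and params.
new_fftrees_skeleton <- function(formula, data, data.test = NULL) {
  # Assumption: the criterion is the first variable named in the formula.
  criterion_name <- all.vars(formula)[1]

  list(
    metadata = list(
      criterion_name = criterion_name,
      cue_names = setdiff(names(data), criterion_name),
      formula = formula
    ),
    trees = list(),                               # tree info (left empty here)
    data = list(train = data, test = data.test),  # raw data, as in the current object
    params = list()                               # parameters (left empty here)
  )
}

# Toy data, for illustration only:
d <- data.frame(y = c(TRUE, FALSE), a = 1:2, b = c("u", "v"))
x <- new_fftrees_skeleton(y ~ ., data = d)

names(x)              # top-level elements
x$metadata$cue_names  # cue names derived from the data
```

With a layout like this, code such as plot.FFTrees() would look up names via `x$metadata` rather than from top-level elements, at the cost of a breaking change for anything that currently reads `x$criterion_name` directly.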

To solve these issues, I've drafted an object design doc at https://github.com/ndphillips/FFTrees/wiki/%5B80%25%5D-FFTrees-Object-Design. I'm eager for feedback.
