reduce on GroupedDTable of DTable of DataFrames returns NamedTuple #65

@schlichtanders

I think `reduce` should return a DataFrame here, preserving the inner table type of the DTable.

Here is an example:

using Distributed
# add two more Julia worker processes, which could also run on other machines
addprocs(2, exeflags="--threads=2")
# Distributed.@everywhere executes code on all processes
@everywhere using Dagger  # needed for all_processors
# Dagger treats both threads and worker processes as processors
Dagger.all_processors()

using DTables, DataFrames, CSV

url = "https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv"
files = [url, url, url, url, url]

d = DTable(DataFrame ∘ CSV.File ∘ download, files)
g = DTables.groupby(d, :species)
r = reduce(+, g, cols=[:sepal_width])
fetch(r)
# returns
# (species = String15["virginica", "setosa", "versicolor"], result_sepal_width = [743.5, 856.9999999999998, 692.4999999999995])
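
As a possible workaround until this changes, the fetched result can be wrapped in a DataFrame by hand. This is only a minimal sketch: it assumes the result is a NamedTuple of equal-length column vectors, as in the output above. `nt` is a stand-in for `fetch(r)` with the values copied from that output, and plain `String`s are used in place of `String15`.

```julia
using DataFrames

# Stand-in for `fetch(r)`: a NamedTuple of column vectors, copied from the output above
nt = (species = ["virginica", "setosa", "versicolor"],
      result_sepal_width = [743.5, 856.9999999999998, 692.4999999999995])

# A NamedTuple of vectors is a Tables.jl column table, so the DataFrame constructor accepts it
df = DataFrame(nt)
```

With the real data this would be `DataFrame(fetch(r))`, but that still leaves the conversion to the caller, which is what I would like `reduce` to handle.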
