deprecate aggregate functions in larray module namespace (la.min, la.mean, ...) #1167

@gdementen

Description

Steps

  • allow specifying the with_total op by name
  • deprecate the agg functions
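The second step could be sketched roughly as below. This is only an illustration under assumptions: `deprecated_agg` and `ToyArray` are hypothetical names invented for the example (not part of larray), and the choice of `FutureWarning` is a guess rather than a decision recorded in this issue.

```python
import warnings


def deprecated_agg(method_name):
    """Build a module-level function that warns, then calls array.<method_name>().

    Hypothetical helper, not larray's actual implementation.
    """
    def wrapper(array, *args, **kwargs):
        warnings.warn(
            f"{method_name}() as a module-level function is deprecated; "
            f"use array.{method_name}() instead",
            FutureWarning,
            stacklevel=2,  # point the warning at the caller, not at this wrapper
        )
        return getattr(array, method_name)(*args, **kwargs)

    wrapper.__name__ = method_name
    return wrapper


class ToyArray:
    """Hypothetical stand-in for larray.Array, holding a flat list of numbers."""

    def __init__(self, values):
        self.values = list(values)

    def mean(self):
        return sum(self.values) / len(self.values)


# module-level aggregate kept only for backward compatibility
mean = deprecated_agg('mean')
```

Calling `mean(ToyArray([1, 2, 3]))` still returns the aggregate but emits a warning pointing users to the method form.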

Historical context of why this exists

I implemented those aggregate functions to make it easier to specify the aggregate function to use in with_total (to avoid having to specify the aggregate by name). For example:

arr.with_total(op=mean, label='mean')
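For comparison, accepting the op by name (the first step above) could work along these lines. This is a minimal sketch on plain Python data, not larray code: `AGG_BY_NAME`, `resolve_op` and this toy `with_total` are hypothetical, and the set of accepted names is an assumption.

```python
import statistics

# Hypothetical name -> function table; the accepted names are illustrative only.
AGG_BY_NAME = {
    'sum': sum,
    'min': min,
    'max': max,
    'mean': statistics.mean,
}


def resolve_op(op):
    """Return a callable for `op`, which may be a callable or a known name."""
    if callable(op):
        return op
    try:
        return AGG_BY_NAME[op]
    except KeyError:
        raise ValueError(
            f"unknown aggregate {op!r}; expected one of {sorted(AGG_BY_NAME)}"
        ) from None


def with_total(values, op='sum', label='total'):
    """Toy version: return the (label, value) pairs plus one aggregated pair."""
    func = resolve_op(op)
    return list(values.items()) + [(label, func(list(values.values())))]
```

With this, `with_total(data, op='mean', label='mean')` and `with_total(data, op=statistics.mean, label='mean')` give the same result, so no module-level aggregate function has to be imported just to name the operation.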

The ugly min() implementation was done because at that time we recommended using from larray import *, so the min() aggregate collided with the builtin min().

Here is an excerpt from my email to Vesna:

Normally, in Python without LArray involved, min(*args) returns the lowest of N scalar values. It only works on scalars, and if you try to use it on arrays, you get a weird error message.

When LArray is involved, you suddenly have 4 different ways to compute the “minimum”:

  • la.minimum(a, b), which does the same thing as the Python builtin function but supports arrays in either or both arguments. This is probably what you should use. EDIT: the crux of the problem is that she wanted to compute the minimum of more than 2 values; that is why she looked for something other than la.minimum() and found la.min().

    >>> arr1 = ndtest(3)
    >>> arr2 = ndtest(3) * 0.5
    >>> arr1
    a  a0  a1  a2
        0   1   2
    >>> arr2
    a   a0   a1   a2
       0.0  0.5  1.0
    >>> la.minimum(arr1, arr2)
    a   a0   a1   a2
       0.0  0.5  1.0
  • Array.min(), which is an aggregate method taking an axis to compute the minimum of an array (along one, several, or all axes)

    >>> arr = ndtest((3, 4))
    >>> arr
    a\b  b0  b1  b2  b3
    a0   0   1   2   3
    a1   4   5   6   7
    a2   8   9  10  11
    
    >>> arr.min('a')
    b  b0  b1  b2  b3
        0   1   2   3
    
    >>> arr.min()
    0
  • You still have the original Python builtin “min” function, which is always available from the “builtins” module and, depending on how you imported larray, may also be available as just “min()”. If you imported larray with “from larray import *”, the original Python min() is overwritten by a weird larray function (see below). If you need the original, unmodified Python “min” function in that case, you can retrieve it like this:

    >>> from builtins import min as py_min
    >>> py_min(1, 2)
    1
  • Finally, you have la.min(), which is a weird beast that tries to be clever, but I should probably never have written that one ☹. This function is also what you get if you use just “min()” but imported larray using “from larray import *”. What it does depends on the type of the first argument: if it is an Array, it treats the second argument as an axis and calls first_array.min(second_argument), i.e. it does the aggregate-over-axes version. But if the first argument is a scalar, it uses the Python builtin and thus returns the minimum of the two (supposedly scalar) arguments. I realize now that this can lead to several problems and very unexpected behavior. I am sorry for the confusion.
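The dispatch described in the last bullet can be mimicked in a few lines. This is a rough sketch of the behavior as described in the email, not the actual larray source: `clever_min` and `ToyArray` are hypothetical, and the toy array has a single axis named 'a'.

```python
import builtins


class ToyArray:
    """Hypothetical stand-in for larray.Array with a single axis named 'a'."""

    def __init__(self, values):
        self.values = list(values)

    def min(self, axis=None):
        # the toy only knows one axis, so anything else is rejected
        if axis is not None and axis != 'a':
            raise ValueError(f"unknown axis: {axis!r}")
        return builtins.min(self.values)


def clever_min(first, *args):
    """Dispatch on the type of the first argument, as described above."""
    if isinstance(first, ToyArray):
        # remaining arguments are interpreted as axes, not as values to compare
        return first.min(*args)
    # scalar first argument: fall back to the Python builtin
    return builtins.min(first, *args)


print(clever_min(1, 2))                       # prints 1
print(clever_min(ToyArray([3, 1, 2]), 'a'))   # prints 1
```

The trap is a call such as `clever_min(arr1, arr2)`: the second array is taken as an axis rather than compared element-wise, so in this toy version it raises a ValueError instead of computing a minimum.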
