
Abstract optimizations #5

@botev

Description

To begin with, we more or less need only a single optimization: kernel fusion. The optimization basically states that whenever we have a composition of nodes, and the first one's result will never be used again, we can "fuse" the computation and avoid the extra looping. This would require a special FusionOp, which would contain a graph in itself.
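One minimal shape such an op could take (names like FusionOp and Node here are illustrative sketches, not an existing API) is a node that owns a small elementwise subgraph and evaluates it per element in a single pass:

```rust
// Hypothetical sketch: a fused op owns an inner graph of elementwise nodes.
enum Node {
    Input(usize), // index into the fused op's inputs
    Add(Box<Node>, Box<Node>),
    Tanh(Box<Node>),
}

struct FusionOp {
    graph: Node, // the subgraph evaluated per element, in one pass
}

impl FusionOp {
    fn eval(&self, inputs: &[f64]) -> f64 {
        fn go(n: &Node, inputs: &[f64]) -> f64 {
            match n {
                Node::Input(i) => inputs[*i],
                Node::Add(l, r) => go(l, inputs) + go(r, inputs),
                Node::Tanh(x) => go(x, inputs).tanh(),
            }
        }
        go(&self.graph, inputs)
    }
}

fn main() {
    // tanh(a + b) expressed as a single fused op applied per element.
    let op = FusionOp {
        graph: Node::Tanh(Box::new(Node::Add(
            Box::new(Node::Input(0)),
            Box::new(Node::Input(1)),
        ))),
    };
    let v = op.eval(&[0.5, 0.25]);
    assert!((v - 0.75f64.tanh()).abs() < 1e-12);
    println!("{}", v);
}
```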

Example: f = tanh(a + b). This would normally become n0 = a + b followed by n0 = tanh(n0) (assuming the memory optimizer reuses the buffer well). However, on a GPU these are still 2 kernels; on a CPU, two loops. Fusing this would mean we move from:

for (ni, ai, bi) in Zip::new((&mut n0, &a, &b)) {
    *ni = ai + bi;
}
for ni in n0.iter_mut() {
    *ni = ni.tanh();
}

to:

for (ni, ai, bi) in Zip::new((&mut n0, &a, &b)) {
    *ni = (ai + bi).tanh();
}
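As a runnable sketch of the payoff (using plain slices rather than any particular array crate; the function names `unfused` and `fused` are illustrative), both versions compute the same values, but the fused one makes a single pass over memory:

```rust
fn unfused(a: &[f64], b: &[f64]) -> Vec<f64> {
    // First loop/kernel: elementwise add into a temporary buffer.
    let mut n0: Vec<f64> = a.iter().zip(b).map(|(ai, bi)| ai + bi).collect();
    // Second loop/kernel: apply tanh in place over the same buffer.
    for ni in n0.iter_mut() {
        *ni = ni.tanh();
    }
    n0
}

fn fused(a: &[f64], b: &[f64]) -> Vec<f64> {
    // Single loop: the add feeds tanh directly, no second pass over memory.
    a.iter().zip(b).map(|(ai, bi)| (ai + bi).tanh()).collect()
}

fn main() {
    let a = [0.5, -1.0, 2.0];
    let b = [0.25, 1.0, -3.0];
    assert_eq!(unfused(&a, &b), fused(&a, &b));
    println!("fused and unfused agree: {:?}", fused(&a, &b));
}
```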
