mean vs. geomean #37

@grst

Hi @ebecht and @FPetitprez,

I've noticed an inconsistency between the code and the paper, and I'm wondering what the original intention was.

The paper states:

Given a set of transcriptomic markers of a given category, we computed a corresponding per-sample score, called hereafter a MCP-counter score, using the log2 geometric mean of this set of markers.

The implementation, however, just calculates an arithmetic mean:

apply(xp[intersect(row.names(xp),x),,drop=F],2,mean,na.rm=T)

Now, if the input data were log2-transformed, the result would correspond to the log2 of the geometric mean, but not to the geometric mean itself, because that would require exponentiating the arithmetic mean of the logarithms:

geomean = exp(mean(log(X)))
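
To make the relationship concrete, here is a minimal sketch with made-up expression values (the vector `x` is purely illustrative and not taken from MCPcounter). It compares the geometric mean, the arithmetic mean of log2 values, and the arithmetic mean on the raw scale:

```r
# Hypothetical expression values for three marker genes in one sample
x <- c(2, 8, 32)

# Geometric mean: exponentiate the arithmetic mean of the logarithms
geomean <- exp(mean(log(x)))

# Arithmetic mean of log2-transformed values
mean_log2 <- mean(log2(x))

# Arithmetic mean on the raw scale, which is what the code computes
# when the input is not log-transformed
arith <- mean(x)

# The mean of log2 values matches log2 of the geometric mean,
# while the raw arithmetic mean is pulled up by the largest value
print(c(geomean = geomean, mean_log2 = mean_log2, arith = arith))
print(all.equal(mean_log2, log2(geomean)))
```

So on log2 input the arithmetic mean reproduces the "log2 geometric mean" from the paper, whereas on raw input it yields a different quantity.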

I'm mostly asking because in immunedeconv we recommend that users supply raw TPM values, which we forward to MCPcounter unchanged, because I was assuming that it calculates a geometric mean internally. However, given the actual implementation, I think it would be more appropriate to log1p-transform the TPM values first so that more highly expressed genes do not receive disproportionate weight.
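
As a rough illustration of that concern, the sketch below applies the same scoring expression to a made-up TPM matrix, once on the raw values and once after `log1p()`. The matrix `xp`, the gene names, and the helper `score()` are hypothetical and only serve this example. Note that `log1p()` uses the natural logarithm, so `log2(xp + 1)` would be the log2 analogue.

```r
# Hypothetical TPM matrix: genes in rows, samples in columns.
# geneA and geneB double from s1 to s2; geneC is high and constant.
xp <- matrix(
  c(1,    2,
    1,    2,
    1000, 1000),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2"))
)
markers <- c("geneA", "geneB", "geneC")

# Same per-sample arithmetic mean as in the expression quoted above
# (hypothetical wrapper, not part of MCPcounter's API)
score <- function(m) {
  apply(m[intersect(row.names(m), markers), , drop = FALSE], 2, mean, na.rm = TRUE)
}

raw_score <- score(xp)         # dominated by geneC, so s1 and s2 barely differ
log_score <- score(log1p(xp))  # change in geneA and geneB becomes visible

print(raw_score)
print(log_score)
```

On the raw scale the two samples get nearly identical scores even though two of the three markers doubled; after the log1p transform the difference shows up clearly.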
