I have the following setup:
- Start Julia with `julia --project -t1 --heap-size-hint=3G`
- Add 4 processes with `addprocs(4; exeflags = "--heap-size-hint=3G")`
- Worker 1 receives a query request and then tells worker 2 to do the work
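
For reference, the dispatch in the last step can be sketched with the Distributed standard library alone. This is a minimal illustration under stated assumptions, not the actual code: `run_query` is a hypothetical stand-in for the real query, and "worker 1" is taken to mean process 1, the process that calls `addprocs`.

```julia
using Distributed

# Same worker flags as in the setup above.
addprocs(4; exeflags = "--heap-size-hint=3G")

# Make sure `myid` is available in Main on every process.
@everywhere using Distributed

# Hypothetical stand-in for the real query; it only reports where it ran.
@everywhere run_query() = myid()

# Process 1 receives the request and delegates the work to process 2.
result = remotecall_fetch(run_query, 2)
```

Here `result` holds the id of the process that executed `run_query`, which confirms that the work ran on process 2 rather than on the requesting process.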
The actual query includes loading a table from a .csv file into a DTable (with a DataFrame table type). Operations include selecting columns, fetching the table into a DataFrame for adding/removing rows/columns and other processing as needed, and re-wrapping the table in a DTable to later be processed further. At the end of processing, the result is returned as a DataFrame.
The .csv file contains a table with 233930 rows and 102 columns: 1 column of InlineStrings.String15, 2 columns of InlineStrings.String1, 45 columns of Int64, and 54 columns of Float64.
The issue: I noticed that if I keep running the same query repeatedly, the MemPool.datastore on worker 2 consumes more and more memory, as determined by:

    remotecall_fetch(2) do
        Base.summarysize(MyPackage.Dagger.MemPool.datastore)
    end

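
One way to separate retained references from garbage that simply has not been collected yet is to force a full collection on the worker before each measurement and record the size after every run. The following is a minimal sketch of such a loop; `track_growth` is a hypothetical helper, not part of Dagger.jl or MemPool.jl, and the commented usage assumes Dagger is loaded on all processes and that `run_query` stands in for the real query.

```julia
# Hypothetical helper (not part of any package): run `query` `runs` times and
# record the value of `measure()` after each run.
function track_growth(query, measure; runs = 5)
    sizes = Int[]
    for _ in 1:runs
        query()
        push!(sizes, measure())
    end
    return sizes
end

# Intended use in the setup described above (assumptions: `using Distributed`,
# Dagger loaded on all processes, `run_query` defined everywhere):
#
#   track_growth(
#       () -> remotecall_fetch(run_query, 2),
#       () -> remotecall_fetch(2) do
#           GC.gc()  # full collection first, so garbage is not counted
#           Base.summarysize(Dagger.MemPool.datastore)
#       end,
#   )
```

If the recorded sizes keep increasing even with the explicit `GC.gc()` call, that points to something still holding references to the data rather than to delayed collection.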
Eventually, the memory usage grows enough to cause my WSL 2 Linux OOM manager to kill worker 2, crashing my program.
Notably, I do not observe this growth in memory usage in the following scenarios:
- when running everything on a single process (i.e., not calling `addprocs`), or
- when using DataFrames exclusively (i.e., not using DTables.jl at all).
I do observe this growth in memory usage in the following additional scenarios:
- when using `NamedTuple` as the table type for the DTables, or
- when running everything on a single process, but with multiple processes available. (To clarify, my code exclusively uses worker 1 in this scenario, but it appears DTables.jl/Dagger.jl uses the other available workers. In this case, the MemPool.datastore on worker 1 (not worker 2) is what consumes more and more memory. However, I never ran into any issues with the OOM manager killing my processes.)
I'm posting this issue in DTables.jl in case there's something DTables.jl is doing that somehow causes the MemPool.jl data store to keep references around longer than expected, but of course please transfer this issue to Dagger.jl or MemPool.jl as needed.
Please let me know if there is any other information that would help with finding the root cause of this issue.