Commit 4ab3e8e

Renato2000 (Renato Azevedo) and artpdr (Artur Pedroso) authored
Add support for Alpine on LGBM (#145)
* Add support for Alpine on LGBM
  This commit bumps lightgbm to v0.9.14 and makes it possible to use LGBM on nodes with AMD64 architecture using Alpine.
* Change build structure when loading dependencies
  This commit changes the paths when loading the dependencies on amd64 architecture in order to reflect the changes made on the jar.
* Change build structure when loading dependencies
  This commit changes the path when loading the dependencies on amd64 architecture in order to reflect the change on the jar structure from amd64/alpine to amd64/musl.
* Update make-lightgbm submodule
* Refactor code and add tests
  This commit changes the code structure by adding a new enum representing the libc implementation available. To detect the libc implementation, the env variable FDZ_OPENML_JAVA_LIBC must be set to either 'glibc' or 'musl'; glibc is used if this env variable is not defined. This new enum is also used alongside the CPU architecture enum to represent the infrastructure running the code.
* Add github action to test LightGBM on Alpine
* Try action change
* Try action change
* Try action change
* Try action change
* Try action change
* Add toString method to Infrastructure
* Add unit test for Infrastructure
* Try action change
* Try action change
* Try action change
* Try action change
* Try action change
* Try action change
* Update README
  This commit includes information in the README about the new ENV variable used to load dependencies for musl-based OS.
* Add Override annotation to toString method on Infrastructure
  Co-authored-by: Artur Pedroso <artpdr@users.noreply.github.com>
* Fix toString method on Infrastructure
* Add ENV var name as a constant
* Make FDZ_OPENML_JAVA_LIBC variable private
* Refactor Infrastructure#getLgbmNativeLibsFolder
  This commit changes the if statement on the method Infrastructure#getLgbmNativeLibsFolder into a switch statement.
* Fix unknownLibcImplementationThrowsException test
* Remove unnecessary variable on getLgbmNativeLibsFolder
* Make getLibcImplementation method protected

---------

Co-authored-by: Renato Azevedo <renato.azevedo@feedzai.com>
Co-authored-by: Artur Pedroso <artur.pedroso@feedzai.com>
Co-authored-by: Artur Pedroso <artpdr@users.noreply.github.com>
1 parent fe58343 commit 4ab3e8e

11 files changed

Lines changed: 228 additions & 13 deletions

.github/workflows/build.yml

Lines changed: 18 additions & 1 deletion
@@ -49,9 +49,26 @@ jobs:
       - name: Build and test
         run: mvn test -B

+      # The skipped tests is because h2o-xgboost isn't ready to use with musl
+      - name: Test with musl
+        run: |
+          docker run --rm -t -v ~/.m2:/root/.m2 -v $(pwd):/feedzai-openml-java -e FDZ_OPENML_JAVA_LIBC="musl" alpine:3.18.4 \
+          /bin/sh -c 'apk add clang openjdk8 maven bash git && \
+          git config --global --add safe.directory /feedzai-openml-java && \
+          git config --global --add safe.directory /feedzai-openml-java/openml-lightgbm/lightgbm-builder/make-lightgbm && \
+          cd /feedzai-openml-java && \
+          cp openml-lightgbm/lightgbm-builder/make-lightgbm/build/amd64/musl/lib_lightgbm.so /lib/lib_lightgbm.so && \
+          mvn test -B -Dsurefire.failIfNoSpecifiedTests=false -Dtest=!ClassifyUnknownCategoryTest#test,!H2OModelProviderTrainTest#trainModelsForAllAlgorithms'
+
       # The skipped tests is because h2o-xgboost isn't ready to use on arm64
       - name: Test on arm64
-        run: docker run --rm --platform=arm64 -t -v ~/.m2:/root/.m2 -v $(pwd):/feedzai-openml-java maven:3.8-openjdk-8-slim /bin/bash -c 'apt update && apt install -y --no-install-recommends git && git config --global --add safe.directory /feedzai-openml-java && git config --global --add safe.directory /feedzai-openml-java/openml-lightgbm/lightgbm-builder/make-lightgbm && cd /feedzai-openml-java && mvn test -B -DfailIfNoTests=false -Dtest=!ClassifyUnknownCategoryTest#test,!H2OModelProviderTrainTest#trainModelsForAllAlgorithms'
+        run: |
+          docker run --rm --platform=arm64 -t -v ~/.m2:/root/.m2 -v $(pwd):/feedzai-openml-java maven:3.8-openjdk-8-slim \
+          /bin/bash -c 'apt update && apt install -y --no-install-recommends git && \
+          git config --global --add safe.directory /feedzai-openml-java && \
+          git config --global --add safe.directory /feedzai-openml-java/openml-lightgbm/lightgbm-builder/make-lightgbm && \
+          cd /feedzai-openml-java && \
+          mvn test -B -Dsurefire.failIfNoSpecifiedTests=false -Dtest=!ClassifyUnknownCategoryTest#test,!H2OModelProviderTrainTest#trainModelsForAllAlgorithms'

       - uses: codecov/codecov-action@v1
         with:

README.md

Lines changed: 3 additions & 0 deletions
@@ -20,6 +20,9 @@ If the build fails compiling for ARM64 you may need to run:
 docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
 ```

+To use the models on Operating Systems with musl (supported only for AMD64 architectures), the ENV variable `FDZ_OPENML_JAVA_LIBC`
+must be set to `musl`. This variable can take the values `glibc` and `musl`. H2O-xgboost can't be used with musl.
+
 ## Releasing

 For all releases, as the hotfix branch is ready all that's needed to actually release is to create an annotated tag pointing to the hotfix branch head.

openml-lightgbm/lightgbm-builder/pom.xml

Lines changed: 3 additions & 3 deletions
@@ -15,7 +15,7 @@

     <groupId>com.feedzai.openml.lightgbm</groupId>
     <artifactId>lightgbm-lib</artifactId>
-    <version>v0.1.0</version>
+    <version>v0.9.14</version>

     <packaging>jar</packaging>
     <name>Openml LightGBM lib</name>
@@ -33,8 +33,8 @@
         <!-- Feedzai's FairGBM! -->
         <lightgbm.repo.url>https://github.com/feedzai/fairgbm.git</lightgbm.repo.url>

-        <lightgbm.version>v0.1.0</lightgbm.version>
-        <lightgbmlib.version>v0.1.0</lightgbmlib.version>
+        <lightgbm.version>v0.9.14</lightgbm.version>
+        <lightgbmlib.version>v0.9.14</lightgbmlib.version>
     </properties>

     <build>

openml-lightgbm/lightgbm-provider/pom.xml

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@
     <description>OpenML LightGBM Machine Learning Model and Classifier provider</description>

     <properties>
-        <lightgbmlib.version>v0.1.0</lightgbmlib.version>
+        <lightgbmlib.version>v0.9.14</lightgbmlib.version>
     </properties>

     <dependencies>
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+package com.feedzai.openml.provider.lightgbm;
+
+import java.io.IOException;
+
+
+/**
+ * Enum that represents the infrastructure where code is running and consequent lgbm native libs locations.
+ */
+public class Infrastructure {
+
+    /**
+     * The CPU architecture used.
+     */
+    private final CpuArchitecture cpuArchitecture;
+
+    /**
+     * The libc implementation available.
+     */
+    private final LibcImplementation libcImpl;
+
+    public Infrastructure(final CpuArchitecture cpuArchitecture, final LibcImplementation libcImpl) {
+        this.cpuArchitecture = cpuArchitecture;
+        this.libcImpl = libcImpl;
+    }
+
+    @Override
+    public String toString() {
+        return "Infrastructure{" +
+                "cpuArchitecture=" + cpuArchitecture +
+                ", libcImpl=" + libcImpl +
+                '}';
+    }
+
+    /**
+     * Gets the native libraries folder name according to the cpu architecture and libc implementation.
+     *
+     * @return the native libraries folder name.
+     */
+    public String getLgbmNativeLibsFolder() throws IOException {
+
+        switch (cpuArchitecture) {
+            case AARCH64:
+                if (libcImpl == LibcImplementation.MUSL) {
+                    throw new IOException("Trying to use LightGBM on a musl-based OS with unsupported arm64 architecture.");
+                }
+                return cpuArchitecture.getLgbmNativeLibsFolder() + "/";
+            case AMD64:
+                return cpuArchitecture.getLgbmNativeLibsFolder() + "/" + libcImpl.getLibcImpl() + "/";
+            default:
+                throw new IllegalStateException("Unexpected value: " + cpuArchitecture);
+        }
+    }
+}
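
The folder resolution in the `Infrastructure` diff above can be tried in isolation. The following is a minimal sketch, not the project's code: `SketchCpuArchitecture`, `SketchLibc` and `NativeLibsFolderSketch` are hypothetical stand-ins, and the `arm64` / `amd64` folder names are assumed from the expected values in the unit tests of this commit, since `CpuArchitecture` itself is not part of the diff.

```java
import java.io.IOException;

/** Hypothetical stand-in for CpuArchitecture; folder names assumed from this commit's tests. */
enum SketchCpuArchitecture {
    AARCH64("arm64"),
    AMD64("amd64");

    final String folder;

    SketchCpuArchitecture(final String folder) {
        this.folder = folder;
    }
}

/** Hypothetical stand-in for LibcImplementation. */
enum SketchLibc {
    MUSL("musl"),
    GLIBC("glibc");

    final String folder;

    SketchLibc(final String folder) {
        this.folder = folder;
    }
}

final class NativeLibsFolderSketch {

    private NativeLibsFolderSketch() {
    }

    /**
     * Mirrors the switch in the diff: arm64 has a single folder and rejects musl,
     * while amd64 has one sub-folder per libc implementation.
     */
    static String resolve(final SketchCpuArchitecture cpu, final SketchLibc libc) throws IOException {
        switch (cpu) {
            case AARCH64:
                if (libc == SketchLibc.MUSL) {
                    throw new IOException("musl is not supported on arm64");
                }
                return cpu.folder + "/";
            case AMD64:
                return cpu.folder + "/" + libc.folder + "/";
            default:
                throw new IllegalStateException("Unexpected value: " + cpu);
        }
    }
}
```

Under these assumptions, `resolve(AMD64, MUSL)` yields `amd64/musl/`, `resolve(AARCH64, GLIBC)` yields `arm64/`, and `resolve(AARCH64, MUSL)` throws an `IOException`. The trailing slash lets the caller append the library file name directly.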
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+package com.feedzai.openml.provider.lightgbm;
+
+/**
+ * Enum that represents the libc implementation available on the machine and consequent lgbm native libs locations.
+ */
+public enum LibcImplementation {
+    MUSL("musl"),
+    GLIBC("glibc");
+
+    /**
+     * This is the name of available libc implementation and indicates the folder where the lightgbm native libraries are.
+     */
+    private final String libcImpl;
+
+    LibcImplementation(final String libcImpl){
+        this.libcImpl = libcImpl;
+    }
+
+    /**
+     * Gets the native libraries folder name according to the libc implementation.
+     *
+     * @return the native libraries folder name according to the libc implementation.
+     */
+    public String getLibcImpl() {
+        return libcImpl;
+    }
+}
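
As the commit message describes, the enum value is chosen from the `FDZ_OPENML_JAVA_LIBC` environment variable, with glibc as the default. The sketch below shows one possible way to parse such a value; it is an illustration under stated assumptions, not the method added in this commit. `LibcParsingSketch` and its nested `Libc` enum are hypothetical names. Two details are deliberate choices of the sketch: it trims before the emptiness check, so a whitespace-only value also falls back to glibc, and it upper-cases with `Locale.ROOT`, so the result does not depend on the JVM's default locale (under a Turkish default locale, `"glibc".toUpperCase()` would not produce `GLIBC`).

```java
import java.util.Locale;

final class LibcParsingSketch {

    /** Hypothetical stand-in for the LibcImplementation enum. */
    enum Libc { MUSL, GLIBC }

    private LibcParsingSketch() {
    }

    /**
     * Parses a raw environment variable value into a libc implementation.
     * Null, empty and whitespace-only values default to GLIBC; any other
     * unknown value makes Enum.valueOf throw IllegalArgumentException.
     */
    static Libc parse(final String raw) {
        if (raw == null) {
            return Libc.GLIBC;
        }
        final String normalized = raw.trim();
        if (normalized.isEmpty()) {
            return Libc.GLIBC;
        }
        // Locale.ROOT keeps the upper-casing independent of the default locale.
        return Libc.valueOf(normalized.toUpperCase(Locale.ROOT));
    }
}
```

A caller would typically pass `System.getenv("FDZ_OPENML_JAVA_LIBC")` as the argument.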

openml-lightgbm/lightgbm-provider/src/main/java/com/feedzai/openml/provider/lightgbm/LightGBMUtils.java

Lines changed: 30 additions & 7 deletions
@@ -46,6 +46,11 @@ public class LightGBMUtils {
      */
     static final int BINARY_LGBM_NUM_CLASSES = 1;

+    /**
+     * Environment variable with the implementation of libc available on the system.
+     */
+    private static final String FDZ_OPENML_JAVA_LIBC = "FDZ_OPENML_JAVA_LIBC";
+
     /**
      * State variable to know if it loadLibs was ever called.
      */
@@ -60,10 +65,13 @@ static public void loadLibs() {

         if (!libsLoaded) {
             final CpuArchitecture cpuArchitecture = getCpuArchitecture(System.getProperty("os.arch"));
+            final LibcImplementation libcImpl = getLibcImplementation(System.getenv(FDZ_OPENML_JAVA_LIBC));
+            final Infrastructure infrastructure = new Infrastructure(cpuArchitecture, libcImpl);
+
             try {
-                loadSharedLibraryFromJar("libgomp.so.1.0.0", cpuArchitecture);
-                loadSharedLibraryFromJar("lib_lightgbm.so", cpuArchitecture);
-                loadSharedLibraryFromJar("lib_lightgbm_swig.so", cpuArchitecture);
+                loadSharedLibraryFromJar("libgomp.so.1.0.0", infrastructure);
+                loadSharedLibraryFromJar("lib_lightgbm.so", infrastructure);
+                loadSharedLibraryFromJar("lib_lightgbm_swig.so", infrastructure);
             } catch (final IOException ex) {
                 throw new RuntimeException("Failed to load LightGBM shared libraries from jar.", ex);
             }
@@ -82,22 +90,37 @@ static CpuArchitecture getCpuArchitecture(final String cpuArchName) {
         }
     }

+    protected static LibcImplementation getLibcImplementation(final String libcImpl) {
+        if (libcImpl == null || libcImpl.isEmpty()) {
+            logger.debug("{} not set, assuming glibc as libc implementation.", FDZ_OPENML_JAVA_LIBC);
+            return LibcImplementation.GLIBC;
+        }
+
+        try {
+            return LibcImplementation.valueOf(libcImpl.toUpperCase().trim());
+        } catch (final IllegalArgumentException ex) {
+            logger.error("Trying to use LightGBM with an unsupported libc implementation {}.", libcImpl, ex);
+            throw ex;
+        }
+    }
+
     /**
      * Loads a single shared library from the Jar.
      *
      * @param sharedLibResourceName library "filename" inside the jar.
-     * @param cpuArchitecture cpu architecture.
+     * @param infrastructure infrastructure used composed of CPU architecture and libc implementation.
      * @throws IOException if any error happens loading the library.
      */
     static private void loadSharedLibraryFromJar(
             final String sharedLibResourceName,
-            final CpuArchitecture cpuArchitecture
+            final Infrastructure infrastructure
     ) throws IOException {

-        logger.debug("Loading LightGBM shared lib: {} for {}.", sharedLibResourceName, cpuArchitecture);
+        logger.debug("Loading LightGBM shared lib: {} for {}.", sharedLibResourceName, infrastructure);
+        final String libraryPath = infrastructure.getLgbmNativeLibsFolder() + sharedLibResourceName;

         final InputStream inputStream = LightGBMUtils.class.getClassLoader()
-                .getResourceAsStream(cpuArchitecture.getLgbmNativeLibsFolder() + "/" + sharedLibResourceName);
+                .getResourceAsStream(libraryPath);
         final File tempFile = File.createTempFile("lib", ".so");
         final OutputStream outputStream = new FileOutputStream(tempFile);

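
The last hunk above is cut off before the copy and load steps, so the rest of `loadSharedLibraryFromJar` is not visible here. The general technique it starts (copy a classpath resource to a temporary file, then hand the file to `System.load`) can be sketched as follows. This is an illustrative sketch under assumptions, not the project's implementation: `NativeLibExtractionSketch` and both method names are hypothetical, and the explicit null check, `Files.copy` and try-with-resources are choices made for the example. `ClassLoader.getResourceAsStream` returns `null` when the resource is missing, which is why the check is there.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

final class NativeLibExtractionSketch {

    private NativeLibExtractionSketch() {
    }

    /**
     * Copies an already opened resource stream into a temporary ".so" file.
     * A null stream (resource not found) is reported as an IOException.
     */
    static File extractToTempFile(final InputStream resource, final String resourcePath) throws IOException {
        if (resource == null) {
            throw new IOException("Native library not found on classpath: " + resourcePath);
        }
        final File tempFile = File.createTempFile("lib", ".so");
        tempFile.deleteOnExit();
        // The temp file already exists, so the copy has to replace it.
        try (InputStream in = resource) {
            Files.copy(in, tempFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
        }
        return tempFile;
    }

    /** Extracts the resource at resourcePath (e.g. "amd64/musl/lib_lightgbm.so") and loads it. */
    static void loadFromClasspath(final String resourcePath) throws IOException {
        final InputStream resource = NativeLibExtractionSketch.class.getClassLoader()
                .getResourceAsStream(resourcePath);
        System.load(extractToTempFile(resource, resourcePath).getAbsolutePath());
    }
}
```

Splitting extraction from loading keeps the file-handling part testable without a real native library on the classpath.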
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+/*
+ * The copyright of this file belongs to Feedzai. The file cannot be
+ * reproduced in whole or in part, stored in a retrieval system,
+ * transmitted in any form, or by any means electronic, mechanical,
+ * photocopying, or otherwise, without the prior permission of the owner.
+ *
+ * © 2023 Feedzai, Strictly Confidential
+ */
+package com.feedzai.openml.provider.lightgbm;
+
+import org.assertj.core.api.Assertions;
+import org.junit.Test;
+
+import java.io.IOException;
+
+/**
+ * Tests the retrieval of native libs folder path for Infrastructure.
+ *
+ * @author Renato Azevedo (renato.azevedo@feedzai.com)
+ */
+public class TestInfrastructure {
+    @Test
+    public void unknownInfrastructureCombination() {
+        Infrastructure infrastructure = new Infrastructure(CpuArchitecture.AARCH64, LibcImplementation.MUSL);
+
+        Assertions.assertThatThrownBy(infrastructure::getLgbmNativeLibsFolder)
+                .isInstanceOf(IOException.class);
+    }
+
+    @Test
+    public void knowsCorrectInfrastructureCombination() throws IOException {
+
+        Infrastructure infra_arm64_glibc = new Infrastructure(CpuArchitecture.AARCH64, LibcImplementation.GLIBC);
+        Assertions.assertThat(infra_arm64_glibc.getLgbmNativeLibsFolder())
+                .isEqualTo("arm64/");
+
+        Infrastructure infra_amd64_glibc = new Infrastructure(CpuArchitecture.AMD64, LibcImplementation.GLIBC);
+        Assertions.assertThat(infra_amd64_glibc.getLgbmNativeLibsFolder())
+                .isEqualTo("amd64/glibc/");
+
+        Infrastructure infra_amd64_musl = new Infrastructure(CpuArchitecture.AMD64, LibcImplementation.MUSL);
+        Assertions.assertThat(infra_amd64_musl.getLgbmNativeLibsFolder())
+                .isEqualTo("amd64/musl/");
+    }
+}
Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+/*
+ * The copyright of this file belongs to Feedzai. The file cannot be
+ * reproduced in whole or in part, stored in a retrieval system,
+ * transmitted in any form, or by any means electronic, mechanical,
+ * photocopying, or otherwise, without the prior permission of the owner.
+ *
+ * © 2023 Feedzai, Strictly Confidential
+ */
+package com.feedzai.openml.provider.lightgbm;
+
+import org.assertj.core.api.Assertions;
+import org.junit.Test;
+
+/**
+ * Tests the retrieval of libc implementation.
+ *
+ * @author Renato Azevedo (renato.azevedo@feedzai.com)
+ */
+public class TestLibcImplementation {
+    @Test
+    public void unknownLibcImplementationThrowsException() {
+        Assertions.assertThatThrownBy(() -> LightGBMUtils.getLibcImplementation("klibc"))
+                .isInstanceOf(IllegalArgumentException.class);
+    }
+
+    @Test
+    public void defaultLibcImplementationIsGlibc() {
+        Assertions.assertThat(LightGBMUtils.getLibcImplementation(""))
+                .isEqualTo(LibcImplementation.GLIBC);
+
+        Assertions.assertThat(LightGBMUtils.getLibcImplementation(null))
+                .isEqualTo(LibcImplementation.GLIBC);
+    }
+
+    @Test
+    public void canReadKnownLibcImplementation() {
+        Assertions.assertThat(LightGBMUtils.getLibcImplementation("glibc"))
+                .isEqualTo(LibcImplementation.GLIBC);
+
+        Assertions.assertThat(LightGBMUtils.getLibcImplementation("musl"))
+                .isEqualTo(LibcImplementation.MUSL);
+    }
+}
