- Fixes vcpkg bad hashes (vcpkg/#38974).
- Updates arrow to 17.0.0.
- Changed the build process to statically link Apache Arrow. With this change and the PyCapsule interface, PyBNesian can interoperate with different versions of pyarrow>=14.0.0. You can now upgrade pyarrow (pip install --upgrade pyarrow) without breaking PyBNesian. The dependencies are also managed by vcpkg, so the build process is simpler and orchestrated by scikit-build-core and a CMakeLists.txt.
- Some tests failed because pandas and scipy were updated. These issues have been fixed.
- A bug in the DiscreteFactor.sample() function has been fixed. The previous implementation sampled equally from the first and last category of the DiscreteFactor.
- Fixed a bug in DiscreteFactor and other hybrid factors, such as CLinearGaussianCPD and HCKDE, where categorical data would not be correctly validated. This could lead to erroneous results or undefined behavior (often leading to a segmentation fault). Thanks to Carlos Li for reporting this bug.
- Support for Python 3.10 and pyarrow>=9.0 has been added. Support for Python 3.6 has been deprecated, as pyarrow no longer supports it.
- manylinux2014 wheels are now used instead of manylinux2010, since pyarrow no longer provides manylinux2010 wheels.
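
As an illustration of the DiscreteFactor.sample() fix noted above, the following is a minimal sketch of drawing from a categorical distribution by inverting its cumulative probabilities, so that every category is drawn in proportion to its probability. It is plain Python written for this note; the function name sample_categorical and its parameters are hypothetical and are not part of PyBNesian.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def sample_categorical(categories, probabilities, n, seed=0):
    """Draw n labels from `categories` with the given `probabilities`.

    Hypothetical helper for illustration only (not PyBNesian's code).
    """
    if len(categories) != len(probabilities):
        raise ValueError("categories and probabilities must have equal length")

    # Running sums of the probabilities, e.g. [0.2, 0.7, 1.0].
    cumulative = list(accumulate(probabilities))
    total = cumulative[-1]
    rng = random.Random(seed)

    samples = []
    for _ in range(n):
        u = rng.random() * total  # uniform draw in [0, total)
        # First category whose cumulative probability exceeds u.
        index = bisect_right(cumulative, u)
        # Guard against floating-point rounding at the upper edge.
        samples.append(categories[min(index, len(categories) - 1)])
    return samples


draws = sample_categorical(["a", "b", "c"], [0.2, 0.5, 0.3], n=10_000)
frequencies = {c: draws.count(c) / len(draws) for c in ("a", "b", "c")}
```

With enough draws, the empirical frequencies should approach the requested probabilities for all categories, not only the first and last ones.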
- Fixed an important bug in OpenCL on NVIDIA GPUs, which provide only a small amount of OpenCL constant memory. See https://stackoverflow.com/questions/63080816/opencl-small-constant-memory-size-on-nvidia-gpu.
- Added support for Apache Arrow 7.0.0.
- Added method ConditionalBayesianNetworkBase.interface_arcs().
- GreedyHillClimbing and MMHC now accept a blacklist of FactorType.
- BayesianNetworkType.data_default_node_type() now returns a list of FactorType indicating the priority of each FactorType for each data type.
- BayesianNetworkBase.set_unknown_node_types() now accepts a FactorType blacklist argument.
- Changed the HeterogeneousBN constructor and HeterogeneousBNType.default_node_types() to accept lists of default FactorType.
- Added constructors for HeterogeneousBN and CLGNetwork that can set the FactorType for each node.
Bug Fixes:
- An overflow error in the ChiSquare hypothesis test was raised when the statistic was close to 0.
- Arc blacklists/whitelists with repeated arcs were not correctly processed.
- Fixed an error in the use of the patience parameter. Previously, the algorithm was executed as if with a patience - 1 value.
- Improved the validation of objects returned from Python class extensions, so that an error is raised when the extensions are not correctly implemented.
- Fixed many serialization bugs. In particular, there were multiple bugs related to the serialization of models with Python extensions.
- Included a fix for the Windows build (by setting a correct __cplusplus value).
- Fixed a bug in LinearGaussianCPD.fit() with 2 parents. In some cases, it was detecting a linear dependence between the parents that did not exist.
- Fixed a bug that caused the Python-class extension functionality to be removed. Related to: pybind/pybind11#1333.
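
To make the patience off-by-one mentioned above concrete, here is a minimal sketch of patience-based early stopping. It assumes that patience is the number of consecutive non-improving iterations tolerated before the search stops; that reading, the function name run_with_patience and the example scores are assumptions made for illustration, not PyBNesian's actual code.

```python
def run_with_patience(validation_scores, patience):
    """Return how many iterations run before early stopping triggers.

    Assumption (for illustration): the search stops once the number of
    consecutive non-improving iterations exceeds `patience`.
    """
    best = float("-inf")
    stale = 0      # consecutive iterations without improvement
    executed = 0

    for score in validation_scores:
        executed += 1
        if score > best:
            best = score
            stale = 0
        else:
            stale += 1
            if stale > patience:
                break
    return executed


scores = [1.0, 2.0, 1.5, 1.4, 1.3, 3.0]
intended = run_with_patience(scores, patience=2)
# Emulates the kind of off-by-one described above: behaving as if
# patience - 1 had been requested stops the search one iteration earlier.
off_by_one = run_with_patience(scores, patience=2 - 1)
```

Under this assumption, the run with the intended patience executes one more iteration than the run that behaves as patience - 1.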
- Improved the code that checks whether a matrix is positive definite.
- A bug affecting the learning of conditional Bayesian networks with MMHC has been fixed. This bug also affected DMMHC.
- Fixed a bug that affected the type of the parameter bn_type of MMHC.estimate(), MMHC.estimate_conditional() and DMMHC.estimate().
- Adds support for pyarrow 5.0.0 in the PyPI wheels.
- Added Arguments.args() to access the args and kwargs for a node.
- Added BayesianNetworkBase.underlying_node_type() to get the underlying node type of a node given some data.
- Improves the fitting of hybrid factors. Now, a specific discrete configuration can be left unfitted if the base continuous factor raises SingularCovarianceData.
- Improves the LinearGaussianCPD fit when the covariance matrix of the data is singular.
- Improves the NormalReferenceRule, ScottsBandwidth, and UCV estimation when the covariance of the data is singular.
- Fixes a bug loading a heterogeneous Bayesian network from a file.
- Introduces a check that a needed category exists in discrete data.
- Assignment now supports integer numbers, converting them automatically to float.
- Fixes a bug in GreedyHillClimbing that caused the return of Bayesian networks with UnknownFactorType.
- Reduces memory usage when fitting and printing a hybrid Factor.
- Fixes a precision bug in GreedyHillClimbing.
- Improves CrossValidation parameter checking.
- Fixed a bug in the UCV bandwidth selector that could cause a segmentation fault.
- Added some checks to ensure that the categorical data is of type string.
- Fixed the GreedyHillClimbing iteration counter, which was being increased twice per iteration.
- Added a default parameter value for include_cpd in BayesianNetworkBase::save() and DynamicBayesianNetworkBase::save().
- Added more checks to detect ill-conditioned regression problems. The BIC score returns -infinity for ill-conditioned regression problems.
- Fixed the build process to support CMake versions older than 3.13.
- Fixed a bug that might raise an error with a call to FactorType::new_factor() with *args and **kwargs arguments. This bug was only reproducible if the library was compiled with gcc.
- Added CMake as prerequisite to compile the library in the docs.
- Removed all the submodules to simplify the imports. Now, all the classes are accessible directly from the pybnesian root module.
- Added a ProductKDE class that implements KDE with diagonal bandwidth matrix.
- Added an abstract class BandwidthSelector to implement bandwidth selection for KDE and ProductKDE. Three concrete implementations of bandwidth selection are included: ScottsBandwidth, NormalReferenceRule and UCV.
- Added Arguments, Args and Kwargs to store a set of arguments to be used to create new factors through FactorType::new_factor(). The Arguments are accepted by BayesianNetworkBase::fit() and the constructors of CVLikelihood, HoldoutLikelihood and ValidatedLikelihood.
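
For readers unfamiliar with bandwidth selection for a diagonal bandwidth matrix, the following is a minimal sketch of the textbook form of Scott's rule, where each variable gets the bandwidth h_i = sigma_i * n^(-1/(d+4)). It is plain Python written for this note; the function name scott_bandwidths is hypothetical, and PyBNesian's ScottsBandwidth may differ in its details.

```python
import statistics


def scott_bandwidths(columns):
    """Per-variable bandwidths from the textbook Scott's rule.

    `columns` is a list of equally long lists, one per variable.
    Hypothetical helper for illustration only (not PyBNesian's code).
    """
    d = len(columns)          # number of variables
    n = len(columns[0])       # number of observations
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")

    factor = n ** (-1.0 / (d + 4))
    # One bandwidth per variable: sample standard deviation times the factor.
    return [statistics.stdev(col) * factor for col in columns]


x = [float(v) for v in range(64)]
y = [2.0 * v for v in x]
bandwidths = scott_bandwidths([x, y])
```

In this formulation, the squares of the returned values would form the diagonal of the bandwidth matrix.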
- An error related to the processing of categorical data with too many categories has been corrected.
- Removed -march=native flag in the build script to avoid the use of instruction sets not available on some CPUs.
- Added conditional linear Gaussian networks (CLGNetworkType, CLGNetwork, ConditionalCLGNetwork and DynamicCLGNetwork).
- Implemented ChiSquare (and DynamicChiSquare) independence test.
- Implemented MutualInformation (and DynamicMutualInformation) independence test. This is valid for hybrid data.
- Implemented BDe (Bayesian Dirichlet equivalent) score (and DynamicBDe).
- Added UnknownFactorType as default FactorType for Bayesian networks when the node type could not be deduced.
- Added Assignment class to represent the assignment of values to variables.
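
As background for the chi-square independence test listed above, here is a minimal sketch of Pearson's chi-square statistic for a two-way contingency table of counts. It covers only the marginal (unconditional) case in plain Python; the function name chi_square_statistic is hypothetical and this is not PyBNesian's ChiSquare implementation.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts.
    Hypothetical helper for illustration only (not PyBNesian's code).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


independent = chi_square_statistic([[10, 10], [10, 10]])
dependent = chi_square_statistic([[20, 0], [0, 20]])
```

A statistic near 0 is consistent with independence, while a large statistic relative to the degrees of freedom is evidence against it.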
API changes:
- Added method Score::data().
- Added BayesianNetworkType::data_default_node_type() for non-homogeneous BayesianNetworkType.
- Added constructor for HeterogeneousBN to specify a default FactorType for each data type. Also added HeterogeneousBN::default_node_types() and HeterogeneousBN::single_default().
- Added BayesianNetworkBase::has_unknown_node_types() and BayesianNetworkBase::set_unknown_node_types().
- Changed signature of BayesianNetworkType::compatible_node_type() to include the new node type as argument.
- Removed FactorType::opposite_semiparametric(). This functionality has been replaced by BayesianNetworkType::alternative_node_type().
- Included model as parameter of Operator::opposite().
- Added method OperatorSet::set_type_blacklist(). Added a type blacklist argument to ChangeNodeTypeSet constructor.
- First release! =).