A Smart Cache for the RDF Dataset of the knowledge channel.
SCG (Smart Cache Graph) provides SPARQL access using the SPARQL protocol and SPARQL Graph Store Protocol to RDF data with ABAC data security.
It is a container that consists of:
- Apache Jena Fuseki server
- Fuseki-Kafka bridge (please refer to the README.md in that repository if you require additional Kafka configuration, e.g. for Kafka authentication)
- JSON Web Token (JWT) based authentication
- Telicent RDF ABAC data security
- Telicent Authorization policies
- Telicent GraphQL extensions
Smart Cache Graph is configured using a Fuseki configuration file (see the documentation). There is an example `config.ttl` file.
You can find further example configurations later under Try It Out.
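The general shape of such a configuration file can be sketched as follows. This is a minimal, hypothetical Fuseki assembler file for a plain in-memory dataset served at `/ds`; the example `config.ttl` in the repository layers the ABAC and Kafka elements on top of a service definition like this:

```ttl
PREFIX :       <#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX ja:     <http://jena.hpl.hp.com/2005/11/Assembler#>

:service rdf:type fuseki:Service ;
    fuseki:name "ds" ;                                  # served at /ds
    fuseki:endpoint [ fuseki:operation fuseki:query ] ; # SPARQL query
    fuseki:endpoint [ fuseki:operation fuseki:gsp-rw ] ;# Graph Store Protocol
    fuseki:dataset :dataset .

:dataset rdf:type ja:MemoryDataset .
```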
The following environment variables can be used to control Smart Cache Graph:
This is the network location of the user attribute server, which also provides the hierarchy management.
The URL value is a template including `{user}`. Example: `http://some-host/users/lookup/{user}`
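The substitution works like an ordinary string template: the authenticated username replaces `{user}` before the lookup request is made. As a sketch (the host and username here are placeholders, not real deployment values), the same expansion in bash:

```shell
# Hypothetical template value; use your deployment's real attribute server URL.
template='http://some-host/users/lookup/{user}'

# SCG substitutes the authenticated username for {user}; the same in bash:
lookup_url="${template/\{user\}/alice}"
echo "$lookup_url"   # http://some-host/users/lookup/alice
```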
This specifies the JSON Web Key Set (JWKS) URL to use to obtain the public keys for verifying JSON Web Tokens (JWTs). The value "disabled" turns off token verification.
From 0.91.0 onwards, this configures the User Info lookup URL that is used to exchange the authenticated JWT for a User Info
response, which is used to help enforce Telicent Authorization policies. If authentication is disabled then
this has no effect.
This should be set to the `/userinfo` endpoint (or equivalent) of your OAuth 2/OIDC compatible Identity Provider which
is issuing the JWTs used to authenticate users to Smart Cache Graph.
From 0.91.0 onwards, setting FEATURE_FLAG_AUTHZ to false disables the Telicent Authorization policy features. Note that this form
of authorization only applies if authentication is enabled.
Since 0.91.0 the Telicent Authorization policy feature enforces that authenticated users require specific roles and
permissions in order to access the different endpoints provided by the server. This is determined from both the
information in the authenticated JWT and the User Info obtained from the configured `/userinfo`
endpoint of the OAuth 2/OIDC compliant identity provider.
All endpoints require either the USER or ADMIN_SYSTEM role, and additionally require specific permissions depending
on the endpoint.
For each dataset configured via the Fuseki configuration file, all configured endpoints will have an authorization policy dynamically defined for them:

- If the endpoint has a known Fuseki `Operation` registered for it then the permissions are `api.<dataset>.read` for read-only operations, or `api.<dataset>.read` and `api.<dataset>.write` for read/write operations.
- The catch-all `/<dataset>` endpoint requires both the `api.<dataset>.read` and `api.<dataset>.write` permissions, since with that endpoint the request is dynamically dispatched to the appropriate endpoint based upon the request method and body.
- If the operation is unknown, no specific policy is applied; internally this causes these endpoints to default to the `DENY_ALL` policy.
Please refer to the `SCG_AuthPolicy` class for which Fuseki Operations are considered read-only versus read/write.
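The per-dataset rules above can be illustrated with a small sketch. This is not SCG code, just the catch-all `/<dataset>` check re-expressed in shell, assuming a hypothetical dataset named `ds` and a user whose token carries both the `api.ds.read` and `api.ds.write` permissions:

```shell
# Permissions extracted from the authenticated user's JWT / User Info
# (hypothetical values for a dataset named "ds").
user_permissions="api.ds.read api.ds.write"

dataset="ds"
decision="deny"   # endpoints without a specific policy default to DENY_ALL

# The catch-all /<dataset> endpoint requires BOTH read and write permissions,
# since the request is dispatched dynamically based on method and body.
case " $user_permissions " in
  *" api.${dataset}.read "*)
    case " $user_permissions " in
      *" api.${dataset}.write "*) decision="allow" ;;
    esac ;;
esac

echo "/${dataset}: ${decision}"   # /ds: allow
```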
For the other endpoints provided by the various Telicent modules added to Fuseki, the following policies apply:
| Endpoint | Roles Required | Permissions Required |
|---|---|---|
| `/$/backups/create` | ADMIN_SYSTEM | `backup.write` |
| `/$/backups/delete` | ADMIN_SYSTEM | `backup.delete` |
| `/$/backups/restore` | ADMIN_SYSTEM | `backup.restore` |
| `/$/backups/*` | ADMIN_SYSTEM | `backup.read` |
| `/$/compactall` | ADMIN_SYSTEM | `api.<dataset>.compact` for all configured datasets |
| `/$/compact/<dataset>` | ADMIN_SYSTEM | `api.<dataset>.compact` |
| `/$/labels/<dataset>` | USER | `api.<dataset>.read` |
| `/<dataset>/access/*` | USER | `api.<dataset>.read` |
If your Identity Provider is not able to manage roles and permissions information in a way compatible with Smart Cache
Graph then you can disable this via the aforementioned FEATURE_FLAG_AUTHZ environment variable.
If you disable this you may wish to limit access to these endpoints via other mechanisms available in your deployment
environment, e.g. service mesh policy, proxy server rules etc.
Setting this to true will enable the security label query endpoint at `http://{hostname}/$/labels/{datasetName}`. More
information about this endpoint can be found in the API docs. You can also run a Docker
container with the endpoint enabled, which can be accessed from the API docs, by running:
```shell
scg-docker/docker-run.sh --config config/config-labels-query-test.ttl
```
To populate this instance with sample security labelled data you can run:
```shell
curl --location 'http://localhost:3030/securedDataset1/upload' \
     --header 'Security-Label: !' \
     --header 'Content-Type: application/trig' \
     --data-binary '@scg-system/src/test/files/sample-data-labelled-1.trig'
curl --location 'http://localhost:3030/securedDataset2/upload' \
     --header 'Security-Label: !' \
     --header 'Content-Type: application/trig' \
     --data-binary '@scg-system/src/test/files/sample-data-labelled-2.trig'
```
You can then query these endpoints for label data, e.g. for securedDataset1:
```shell
curl --location 'http://localhost:3030/$/labels/securedDataset1' \
     --header 'Content-Type: application/json' \
     --header 'Authorization: ••••••' \
     --data '{
       "triples":[
         {
           "subject": "http://dbpedia.org/resource/London",
           "predicate": "http://dbpedia.org/ontology/country",
           "object": {
             "value": "http://dbpedia.org/resource/United_Kingdom"
           }
         }
       ]
     }'
```
Which should return the following:
```json
{
  "results": [
    {
      "subject": "http://dbpedia.org/resource/London",
      "predicate": "http://dbpedia.org/ontology/country",
      "object": "http://dbpedia.org/resource/United_Kingdom",
      "labels": [
        "everyone"
      ]
    }
  ]
}
```

Or to query securedDataset2:
```shell
curl --location 'http://localhost:3030/$/labels/securedDataset2' \
     --header 'Content-Type: application/json' \
     --header 'Authorization: ••••••' \
     --data '{
       "triples":[
         {
           "subject": "http://dbpedia.org/resource/Birmingham",
           "predicate": "http://dbpedia.org/ontology/populationTotal",
           "object": {
             "value": 2919600,
             "dataType": "xsd:nonNegativeInteger"
           }
         }
       ]
     }'
```
Which should return the following:
```json
{
  "results": [
    {
      "subject": "http://dbpedia.org/resource/Birmingham",
      "predicate": "http://dbpedia.org/ontology/populationTotal",
      "object": "\"2919600\"",
      "labels": [
        "census",
        "admin"
      ]
    }
  ]
}
```

Building Smart Cache Graph is a two-step process. The Java artifacts are built using the Maven release plugin; when these are released, the Docker container is automatically built.
Check versions in release-setup.
On branch main:
Edit and commit release-setup to set the correct versions.
```shell
source release-setup
```
This prints the dry-run command. If you need to change this file, edit it, then simply source the file again.
Dry run:

```shell
mvn $MVN_ARGS -DdryRun=true release:clean release:prepare
```

and for real:

```shell
mvn $MVN_ARGS release:clean release:prepare
```
This updates the version number. Our automated GitHub Actions pipeline handles publishing the release build to Maven Central and Docker Hub.
After release, do `git pull` to sync the local and remote git repositories.

To rebuild with the updated version for development:

```shell
mvn clean install
```
The Docker container is automatically built by a GitHub Action on release of the Smart Cache Graph jar artifacts.
In the Docker container we have:

```
/fuseki/logs/
/fuseki/databases/
/fuseki/config/
```

and configuration files go into the host's mnt/config/ directory.
The provided script, latest-docker-run.sh, runs the latest published image of SCG in a Docker container, with the contents of the local mnt/config directory mounted into the newly created container for ease of use. Similarly, mnt/databases and mnt/logs are mounted for easier analysis.
```shell
scg-docker/latest-docker-run.sh
```

Passing no parameters means it defaults to `--mem /ds`. This specifies an in-memory dataset at /ds which replays the "RDF" topic on start-up; it assumes Kafka is up and running prior to launch.

The Fuseki server is available at http://localhost:3030/ds.
```shell
scg-docker/latest-docker-run.sh --config config/config-local-abac.ttl
```

This runs the server using the configuration file [config-abac-local.ttl](scg-docker/mnt/config/config-abac-local.ttl). It specifies an in-memory dataset at /ds with Attribute Based Access Control enabled.

Note: see the caveat below re: authentication.
```shell
scg-docker/latest-docker-run.sh --config config/config-replay-abac.ttl
```

As this suggests, this runs the server using the configuration file config/config-replay-abac.ttl (config-replay-abac.ttl as it's known locally). It specifies an in-memory dataset at /ds which replays the "RDF" topic on start-up; it assumes Kafka is up and running prior to launch.

The Fuseki server is available at http://localhost:3030/ds.
To run a local instance you can use the other scripts. You will need mvn installed in order to build the code (as described above). You can then run docker-run.sh to use the newly built images.

```shell
scg-docker/docker-run.sh
```

It takes the same parameters as the latest-docker-run.sh script above.
Alternatively, you can use the d-run script, which maps the relevant config and database directories from the local filesystem, pulling down the given image and running it directly (not in -d mode). It requires a number of environment variables to be set, as indicated in the script. It can be run with exactly the same configuration as latest-docker-run.sh, except that there is no default configuration if nothing is provided.
Open Telemetry for SCG will be enabled if any environment variables with OTEL in the name are present at runtime. If
this is not the case then the Open Telemetry Agent is not attached to the JVM and no metrics/traces will be exported.
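Since the detection is name-based, exporting any standard OTEL variable (for example OTEL_SERVICE_NAME or OTEL_EXPORTER_OTLP_ENDPOINT, both standard OpenTelemetry settings; the value below is illustrative) is enough to attach the agent. The same check, sketched in shell:

```shell
# Illustrative: any environment variable with OTEL in its name enables the agent.
export OTEL_SERVICE_NAME=smart-cache-graph

# SCG's detection amounts to scanning the environment for OTEL in variable names:
if env | cut -d= -f1 | grep -q 'OTEL'; then
  otel_status="attached"
else
  otel_status="not attached"
fi
echo "Open Telemetry agent: $otel_status"   # Open Telemetry agent: attached
```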