Skip to content

fix: tedge flows cli loads mapper specific builtin transformers#4020

Merged
didier-wenzek merged 7 commits intothin-edge:mainfrom
didier-wenzek:fix/tedge-flows-cli-using-cloud-specific-steps
Mar 21, 2026
Merged

fix: tedge flows cli loads mapper specific builtin transformers#4020
didier-wenzek merged 7 commits intothin-edge:mainfrom
didier-wenzek:fix/tedge-flows-cli-using-cloud-specific-steps

Conversation

@didier-wenzek
Copy link
Copy Markdown
Contributor

@didier-wenzek didier-wenzek commented Mar 3, 2026

Proposed changes

  • tedge flows loads c8y specific builtin transformers
  • aws and az specific builtin transformers
  • builtin transformers are available to all mappers.
  • tedge flows test --context <path-to-or-inlined-json> initializes the tests context with a JSON object
  • Documentation for all the builtin transformers

This raises several questions:

  • Do we have to load all the builtin transformers, even when the flow under test is only related to one of the mappers?
    • Yes, all the builtin transformers are always available, whatever the mapper, cloud-specific or local.
  • Can we simply make all the transformers, available to any mapper, being cloud specific or local?
    • Yes, so there is no prerequisite on running a specific mapper to run a transformer.
    • Note there is no impact on tedge binary size as this executable already includes all the daemons and mapper variants.
  • What about creating the builtin flows? For all the mapper?
    • No, the builtin flows are only created by their respective mapper.
    • We want to avoid the creation of flow definitions that are never used.
  • What about transformers expecting a configuration that is mapper specific (e.g. the cache which has to be aware of the MQTT topics for birth messages)?
  • What about the transformers that expect the context to be populated? Currently, the c8y mapper populates the context using a very c8y specific piece of code: the entity cache. Should this cache be promoted to a builtin transformer?

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

#3992

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s. You can activate automatic signing by running just prepare-dev once)
  • I ran just format as mentioned in CODING_GUIDELINES
  • I used just check as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

@didier-wenzek didier-wenzek temporarily deployed to Test Pull Request March 3, 2026 10:51 — with GitHub Actions Inactive
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 3, 2026

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 3, 2026

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
888 0 3 888 100 2h50m47.714171s

@didier-wenzek didier-wenzek force-pushed the fix/tedge-flows-cli-using-cloud-specific-steps branch from 0d0d408 to 932fbea Compare March 17, 2026 08:54
@didier-wenzek didier-wenzek force-pushed the fix/tedge-flows-cli-using-cloud-specific-steps branch from ef26ec3 to 940b101 Compare March 17, 2026 17:31
@didier-wenzek didier-wenzek marked this pull request as ready for review March 18, 2026 16:07
@reubenmiller
Copy link
Copy Markdown
Contributor

@didier-wenzek After trying out the new functionality, I'd be happy merging it as is as it satisfies the main requirements. Below shows what does and doesn't work.

Main Requirements

  • tedge flows list --mapper c8y lists the c8y mappers successfully
  • tedge flows list --mapper local can load a flow which uses every built-in function/transformer (this isn't a use-case but shows it is "recognized" )

Nice to haves (which currently don't work)

The following doesn't work, but more due to the fact that the builtin functions like into-c8y-measurements have some additional dependencies on the context with isn't available yet within the flows:

  • Using tedge flows test for testing c8y related topics

    echo '[te/device/main///m/foo] {"temp":1.23}' | tedge flows test --mapper c8y --debug
    
    # no output is shown due to the messages being discarded at the "into-c8y-measurements" step.

If the above requirement could be made to work by providing some context on the command line, then that might be good enough. Otherwise, if it is too much effort then we can postpone this part.

@didier-wenzek
Copy link
Copy Markdown
Contributor Author

The following doesn't work, but more due to the fact that the builtin functions like into-c8y-measurements have some additional dependencies on the context with isn't available yet within the flows:

* Using `tedge flows test` for testing c8y related topics
  ```shell
  echo '[te/device/main///m/foo] {"temp":1.23}' | tedge flows test --mapper c8y --debug
  
  # no output is shown due to the messages being discarded at the "into-c8y-measurements" step.
  ``` 

Yeah, this is why the system tests are a bit involved by an helper flow that feeds the context with entity metadata published over MQTT.

If the above requirement could be made to work by providing some context on the command line, then that might be good enough. Otherwise, if it is too much effort then we can postpone this part.

This should be straightforward. I will a --context option to populate the context from a JSON object.

pub fn load_builtin_transformers(flows: &mut impl FlowRegistryExt) {
c8y_mapper_ext::load_builtin_transformers(flows);
az_mapper_ext::load_builtin_transformers(flows);
aws_mapper_ext::load_builtin_transformers(flows);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: as it is now, it breaks conditional compilation (e.g. cargo check -p tedge-mapper --no-default-features --features aws for only aws mapper), but should be easy to fix with #[cfg(...)] attributes. What's interesting though is we're not testing that configuration in CI, so all the tests pass.

[package]
name = "az_mapper_ext"
description = "thin-edge aws extension adding support for azure data model"
description = "thin-edge extension adding support for azure data model"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch

Comment on lines +51 to +60
Install Flow
[Arguments] ${directory}
${start} Get Unix Timestamp
Execute Command sleep 0.1
ThinEdgeIO.Transfer To Device ${CURDIR}/${directory}/* /etc/tedge/mappers/local/flows/${directory}/
Execute Command ls -lh /etc/tedge/mappers/local/flows/${directory}
Should Have MQTT Messages
... topic=te/device/main/service/tedge-mapper-local/status/flows
... date_from=${start}
... message_contains=${directory}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: very similar to Install Flow in main tedge_flows suite, we could extract it to a shared resource file

@didier-wenzek
Copy link
Copy Markdown
Contributor Author

If the above requirement could be made to work by providing some context on the command line, then that might be good enough. Otherwise, if it is too much effort then we can postpone this part.

This should be straightforward. I will a --context option to populate the context from a JSON object.

Added 9080896

Test c8y specific transformers with tedge flows cli and initial context
Install Flow custom-measurements
${transformed_msg} Execute Command
... tedge flows test te/device/main///m/environment '{"temperature": 258}' --context '{ "device/main//": {"@id":"raspberry-007","@type":"device","@topic-id":"device/main//","@name":"raspberry-007","@type-name":"thin-edge"} }'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarity, which scope is the context added to on the JavaScript level? I was half expecting to see either a flow, script, or mapper in the keys

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The runtime can only inject values in the mapper context. So this is implied.

Copy link
Copy Markdown
Contributor

@reubenmiller reubenmiller Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having a problem getting the test code to work for the c8y mapper and a child device.

I'm re-using the context.json you have in the test files.

# works
$ tedge flows test --mapper c8y te/device/main///m/environment '{"temperature": 42}' --context ./context.json
[c8y/measurement/measurements/create] {"temperature":{"temperature":{"value":42.0}},"time":"2026-03-19T15:44:53.991Z","type":"environment"}

# no output
$ tedge flows test --mapper c8y te/device/child-xyz///m/environment '{"temperature": 42}' --context ./context.json

I'm assuming something is just missing from the context object which is required when mapping data for a child device, but I'm not sure what.

This is the context.json that I'm using (copied from the test)

$ cat context.json 
{
  "device/main//": {
    "@id":"raspberry-007",
    "@type":"device",
    "@topic-id":"device/main//",
    "@name":"raspberry-007",
    "@type-name":"thin-edge"
  },
  "device/child-xyz//": {
    "@id":"raspberry-007-child-xyz",
    "@type":"child-device",
    "@topic-id":"device/child-xyz//",
    "@name":"child-xyz",
    "@type-name":"thin-edge"
  }
}

Copy link
Copy Markdown
Contributor

@reubenmiller reubenmiller Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After doing some more digging, it seems the cache-early-messages is the step which is causing the messages for child entities to be discarded (in this case probably cached).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After doing some more digging, it seems the cache-early-messages is the step which is causing the messages for child entities to be discarded (in this case probably cached).

Oh yeah. The cache-early-messages awaits for a child birth message before releasing the measurements to the next step. => We really need to document all these builtin-transformers.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest commit adds the example that I was after. Being able to replay multiple input messages is a very useful feature!

@didier-wenzek didier-wenzek force-pushed the fix/tedge-flows-cli-using-cloud-specific-steps branch from 9080896 to ca518d1 Compare March 19, 2026 15:56
- `{ builtin = "update-context", config.topics = "te/+/+/+/+/m/+/meta" }`
- If a message doesn't match the configured topic, this message is passed unchanged to the subsequent transformation steps.

### `into-c8y-alarms`
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Starting with alarms. I will add sections for measurements and events, after a first series of comments.

- The default for the Cumulocity mapper is to publish birth messages on `te/device/main/service/tedge-mapper-c8y/status/entities`
- This is configurable. A custom mapper (which topic id is `a/b/c/d`) is expected to publish finalized entity registration
on `te/a/b/c/d/status/entities`.
- A birth message is composed of
Copy link
Copy Markdown
Contributor Author

@didier-wenzek didier-wenzek Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documenting the cache-early-messages transformer and birth messages highlights that something is wonky.

First the config parameter for cache-early-messages should be a topic not a topic-id from which is derived. Why a custom mapper should be forced to publish these messages on a specific topic?

Second and more importantly, the contract is difficult to explain because based on fragile assumptions including another component. This other component is either the c8y mapper or a user-provided flow which is responsible of populating the context before sending birth messages.

  1. The c8y mapper or a user-provided flow shares entity metadata in the context.mapper.
    And only then emits birth messages for these entities.
  2. cache-early-messages reacts on birth messages releasing any message cached for these entities.
    assuming blindly that the cache is actually properly populated for those.
  3. The step downstream proceeds transforming the messages released by the birth messages;
    but discarding them if step one has not be done properly.

I do think this must be fixed. I propose the following:

  • Deprecate birth messages before release 1.8. This is a fragile mechanism aimed to signal the context is populated but not directly related to the context.
  • Add an onContextUpdate callback to flow scripts and transformers. This will be called when the context.mapper is updated.
  • Re-implement cache-early-messages to leverage onContextUpdate callbacks to release cached messages when the context is actually populated and not when a tier says it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though we might need to talk about how the onContextUpdate handler would work, and if it would react from any changes in the context, or just a subset, and what order is the handler called etc....though it does sound interesting a slightly more fitting for what we are using it for (in this case at least)

Copy link
Copy Markdown
Contributor

@reubenmiller reubenmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very nice improvement to be able to run flows in other mappers and make it easier to run the tedge flows list and tedge flows test against any mappers (even though mappers like c8y require a bit more of a setup)

@didier-wenzek didier-wenzek force-pushed the fix/tedge-flows-cli-using-cloud-specific-steps branch from f28517b to 56a1f00 Compare March 20, 2026 16:54
@didier-wenzek didier-wenzek enabled auto-merge March 20, 2026 17:00
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
…t value

When shared by all clouds the cache-early-messages transformer
can no more be configured with a mapper topic id when added to the transformer registry.
This topic id must be configured by the flows.

Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
The key point is to generate birth-messages for the entities which metadata
have been stored in the flows mapper context. The `cache-early-messages`
transformer push then downstream the measurements cached for these entities.

Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
Signed-off-by: Didier Wenzek <didier.wenzek@free.fr>
@didier-wenzek didier-wenzek force-pushed the fix/tedge-flows-cli-using-cloud-specific-steps branch from 56a1f00 to 7a74894 Compare March 21, 2026 12:55
@didier-wenzek didier-wenzek added this pull request to the merge queue Mar 21, 2026
Merged via the queue into thin-edge:main with commit 0acfd10 Mar 21, 2026
32 of 34 checks passed
@didier-wenzek didier-wenzek deleted the fix/tedge-flows-cli-using-cloud-specific-steps branch March 21, 2026 14:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants