Add default region scope that matches any scoped packet that doesn't match any explicitly defined scopes #1791
…match any explicitly defined scopes to allow repeaters by default to transport scoped packets not explicitly prevented from doing so.
|
Oh, yes please! 'any' is nice and descriptive but makes me wish '*' was actually called 'none' to begin with. On the flip side I wonder if any should be '?' to make it more obviously special and be similar to '*'. Downside is it continues the pattern of using wildcard characters to mean something different in meshcore than in other contexts. I can see the argument for not worrying about gracefully dealing with old firmware that might have an 'any' region but I bet there will be one person somewhere who gets really confused because of it. |
On balance, I think a verbal description would be better for UI/UX purposes than another character. * isn't a region scope; 'any' is, albeit a special one. As for a potential 'any' collision on older firmware, I think it would probably be sufficient to update the region configuration docs (I'll put out a new PR if/when this is merged) and call out this gotcha. Given that the documentation (IMO) should be updated for this change anyway, that may be enough. However, if the mods feel this case should be handled, I'm happy to revisit this and update the PR. |
|
Howdy! Thanks for the PR. I've listed my thoughts below: Current Region System: Proposed Changes:
The We were using To avoid the issue you mentioned about existing regions maybe already being called For the default settings of these:
|
In principle making use of My initial thought would be that we might have to consider a standalone However, setting Simple truth is that many repeater owners are not active enough for this level of coordination to be practical, especially when you are potentially talking 100+ repeaters in a region. Thus we need to handle the "put it up and semi-forget about it" repeaters gracefully. Thus, whatever we do will need to be done gradually over a period of time. By setting the default to So in our case, our working group alone explicitly setting region scopes, followed by On a pure technical level it may be better to go with a default of I would also add that for those who feel |
We can probably bump the repeater protocol version, which gets sent in the login response, so clients know if it supports the new tags. The
Yeah,
I think the biggest issue we have on the mesh right now is that there's too much flood traffic. As you noted, with the popularity of MeshCore, this flood traffic is only going to increase as more companions start sending messages, and more repeaters come online and forward those packets. This situation of repeater region management, and what you want in the PR, is a bit of a catch-22 really. The intention of scopes is to limit where these flood packets are going. Currently, as implemented, repeater firmware will not pass on scoped packets unless configured to do so, which instantly prevents scoped packets from flooding the entire mesh. If we changed the default logic so that all repeaters forward all packets regardless of scope, then scopes effectively become useless as they're always flooded across the entire mesh. So there would be no benefit to scoping the packet.
This is why I think it should be denied by default, with the ability for repeater admins to allow it if they really need it. As you said, many people will set up the repeater and forget about it, which means when the time comes for encouraging existing repeater owners to implement regions, it's too late because no one can contact the owners of those repeaters that have been abandoned/forgotten, and scoped packets would always flood the network. If the thought is that repeater admins could set up the regions later, they might as well set them up right now, and it's sorted. Even having the option to forward all scoped packets as a setting could become detrimental to the network, especially if people start recommending to turn it on by default, and again, those admins never come back later. I think the way it's currently implemented, with scopes only being forwarded by repeaters that configure the scopes, is the best way to prevent the unwanted flood traffic. I also understand the onboarding for region management isn't the best, and this could be improved in the app. Some discussions have been had on Discord, such as having the mobile app suggest regions based on repeater location. Some standardised list of region codes would help users pick the same settings easily. I've also envisioned some sort of "map ui" where a user could tap a geographic area, or administrative zone or country on the map, and that would automatically pick the correct region code that should be used. I'd also love to have the ability to filter the contacts/discover maps in the app by regions, but distributing this information across the entire mesh is not cheap, so I'm not sure if/how that would happen... Just one use case example for reference.
If I were to set up a bunch of repeaters and sensors at home and on the farm, on the public mesh frequency, and I wanted them to send out data on a shared private channel, I could add a region "#liams-private-region" to all my repeaters, and use it as flood scope on all my sensor units. By default, the rest of the mesh would never forward this traffic. However, if repeaters are set to flood scoped traffic by default, now all of my private sensor data that's flooded within my private region scope goes out to the entire mesh. ^ Just some thoughts from me after midnight :) |
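To make the private-region concern above concrete, here is a minimal Python sketch of the two defaults being debated. It is an illustration only, not the firmware's logic: the `Repeater` class, `configured_scopes`, and the `flood_unmatched_scoped` switch are hypothetical names, and it assumes unscoped flood traffic is always repeated.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Repeater:
    # Scopes the admin has explicitly configured for flood forwarding.
    configured_scopes: Set[str] = field(default_factory=set)
    # Hypothetical switch standing in for "flood scoped packets that match nothing".
    flood_unmatched_scoped: bool = False

    def forwards(self, scope: Optional[str]) -> bool:
        if scope is None:
            return True  # assumption: unscoped flood traffic is repeated
        if scope in self.configured_scopes:
            return True
        return self.flood_unmatched_scoped


PRIVATE = "#liams-private-region"
farm_repeater = Repeater(configured_scopes={PRIVATE})
neighbour_deny = Repeater()                              # deny unmatched (current behaviour)
neighbour_allow = Repeater(flood_unmatched_scoped=True)  # allow unmatched (proposed default)

print(farm_repeater.forwards(PRIVATE))    # True
print(neighbour_deny.forwards(PRIVATE))   # False
print(neighbour_allow.forwards(PRIVATE))  # True
```

With the deny default the private scope stays on the farm's own repeaters; with the allow default a neighbouring repeater that was never configured also rebroadcasts it.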
|
just my 2c: all unmaintained repeaters with v1.10.0 and older will always forward both scoped and non-scoped flood traffic |
I agree. And it is the biggest meshes that have the greatest problem. And the feedback I have gathered from key repeater owners and users in these meshes is that they/we (I am one of them) need a way to gradually phase in region scoping.
Not really. If clients don't set scopes, repeaters have no information and scoping is by definition impossible. If clients do set scopes, repeaters then have the ability or option to make use of them. So the first priority must, by definition be, to get clients to scope. Theoretically, you could achieve this by going down the route of forcing a breaking change on them, but unless you are willing to remove or deny
In practice, in the big meshes at least, no clients are bothering to scope their packets and not many repeater owners are bothering to set scopes, because currently scoping is not being used. Currently almost all clients that set a scope end up with nothing being delivered. So they remove the scope. So in practice, any theoretical benefit of the existing setup is moot, because scopes aren't being used. The first priority of the users in our meshes is that "it works". If you try and drop a breaking change on them and they have the ability to just not bother setting scopes, they will do what they are currently doing: not bother with scopes at all.
They are already useless, because they aren't being used. My proposal is that we initially go permissive to make it practical for clients to set the scopes in the first place. Then, through communication and/or firmware changes, we can move to restrictive and make scopes useful in their intended role.
If they are entirely abandoned - then either of our proposals are ultimately moot points. If, as seems to be more often the case, the repeaters are semi-actively maintained then your concerns could be alleviated, without tanking traffic during a switchover by initially releasing the firmware with a default of There is also the topology of many meshes to consider, where you have a handful of repeaters being responsible for inter-region traffic - which is exactly the traffic we are trying to reduce. So, in the UK's case, the actions of a handful of individuals can make a significant contribution to this. And we have the group ready to make the change. And what that group needs is clients to set scopes on their packets. And for that to happen, initially , clients need to be able to set a scope without their packets disappearing into the ether. Also in our experience the important traffic nodes tend to be the ones that are actively managed. Finally, in practice, MeshCore has been developed on the premise that we don't want to stop traffic moving due to a new firmware update, hence
There are some repeaters that have set regions scopes (including me) but at the moment, it is a pointless exercise because clients aren't using scopes. My references to setting later refers to a gradual rollout amongst "Trunk" repeater owners where, initially we are permissive with
We currently have, in practice, exactly that situation and this PR isn't going to change that. I don't know of a single repeater that has
If clients don't set scopes, then we can't do a darn thing. In practice, the only way to get the existing clients in big meshes to set scopes is if they can do so without their packets disappearing into the ether. And the only way to do that is to do a rolling change rather than a sudden change.
If we are talking about private scopes, that could always be handled differently. As far as I am aware, we currently only have public scopes in production. If the above isn't the case; If so, in the above use case, you/they used the current region functionality in a way that isn't the intended use case or the key functionality and isn't to my mind a good reason not to proceed with this change. It might be justification for a new feature however. I have no desire to inconvenience other users with my proposed changes. However, the point of region scoping is to reduce unnecessary traffic. In the meshes where this need is greatest, the current setup is simply not workable and scoping is not being used. Scoping is effectively un-implemented. And practically speaking, given the sizes of the big meshes, we have to pursue a gradual approach. If clients won't set scopes, scoping will not work. The only practical way to get these clients to set it is if initially, they can do so in a relatively pain-free manner. Any attempt at trying to ease that burden via breaking changes followed by "app" helpers would be stuck in the Catch 22 that you mentioned earlier. Where as get clients scoping first, then start enforcing solves the paradox, without, practically speaking, risking an increase in traffic because where it matters, everything is Rough order of events I would imagine would be;
|
|
I've been thinking about the "region question" a lot recently, and I think I have a solution that solves the catch-22 problem, using @liamcottle 's new region picker (work in progress), some UI changes, and a new channel type, while retaining the current local flood messaging model for existing Public and hashtag channels for backwards compatibility and frictionless new user onboarding UX. I like the proposal for explicitly naming the special regions All of the following are currently true:
My proposal... example scenario is a user with a companion and a repeater located in Vancouver, British Columbia (BC), Canada App UI - Repeater
I think this might be the plan already? This next bit is novel I think, and critical to bootstrapping adoption: App UI - Companion
New - Region Channels These new region channels would be locked to the same name and scope as the location segment (name and scope immutable) and should probably be indicated by a new prefix type, maybe In addition, having a banner message in region channels to the effect of "All repeaters in the path must have regions configured for this channel to deliver messages" would create social pressure on repeater admins to configure regions, with the ability to still use flood in Public or direct messages to contact those admins (note that This would be self-moderating, in that the region channels are already ready to use, and if there is too much local chatter in Public or #test or #bot, users will suggest "Please move this discussion to +region-name, you're flooding our mesh" (region channel clickable like hashtag channels - and if it hasn't already been created, would launch the region picker to promote adoption again). Misc This should avoid most collisions when regions have the same name, eg There are a few edge cases where users and admins would have to configure custom regions, for example the Twin Cities of Minneapolis–Saint Paul, or Vancouver WA and Portland OR, which are in separate states but culturally one region. I haven't really thought about private regions, as my main concern is large mesh traffic congestion and the biggest wins with the least change. Conclusion One of the advantages of a mesh topology is that not every repeater owner has to set up regions, some critical number will route the messages around repeaters that have not been configured. As @ElectroMW pointed out, trunk repeater admins are much more engaged and if they are prompted to use a region picker on repeater admin login I believe most would accept (look at how quickly multibyte path is being adopted). 
The worst case scenario is people do not move to region channels, traffic does not decrease, and we're back to choosing between If a repeater admin ignores the region picker prompt and does not set a region, with the default flood behaviour, region channel messages would not be forwarded, but in most cases the mesh could route around it. The question remains which approach is more desirable:
I think that the UI changes above, especially auto-adding region channels so they are instantly visible to users, and prompting repeater admins to set location, would "bootstrap" region adoption quickly enough that option 1 (the current behaviour) would be sufficient, with the added motivation that encouraging others to use region channels would have an immediate benefit of improved reliability through congestion reduction for yourself and for everyone else. These social/incentive effects are very powerful! I believe this proposal fits with the "meshy ethos" of a cooperative network topology arising from individual decisions, with guidance on best practices from the UI design. Thoughts? |
|
What if *scoped was combined with a hop count limit as a happy medium during transition? So default allow flood non-matched *scoped, but it only has a few hops to find a region scoped repeater or it dies. This would cover the gaps of the slow to adopt while still reducing the congestion. Then proceed as proposed: slowly get more restrictive on scoped as adoption grows. |
Moving the question of whether to initially I can imagine using existing standardised region boundaries will work quite well in some parts of the world but not others. To give you an example from the UK. The area I live in is close to the boundaries of 3 counties and 3 regions. We have channels for 2 of those regions and three of those counties and these are rarely used. The one that is primarily used is a composite channel of those three counties combined. This wouldn't show up on any standardised region list and directing new users to those standardized channels wouldn't be the best approach in our case. So, in addition to the above standardized selection, IMO, there needs to be provision for non-standardised regions and channels that are already in use and are effectively the local channel for that area/region. You would also need to consider that there is likely (at least where I am) to be more than one scope in use, including possibly a national and international scopes to consider.
This ultimately is "just" a mechanism to enforce using region scopes if I have understood it correctly. If that is the case;
I do like the idea of requiring clients to set a scope. I think that if region channels were intended as an eventual replacement of #hashtag channels, I would perhaps agree with them as a way forward. However, I doubt they would work out if the initial rollout was along with On the point of critical repeater owners - if local nodes don't have their scopes set correctly, then the traffic won't get to the critical nodes for their active approach to be useful. The diligence of critical nodes is a benefit for a generally permissive approach, as they can prevent much inter-region traffic on their own. You bring up some nice ideas but fundamentally this is no different to Liam's approach IMO. I would agree that yours and Liam's approach on a theoretical level is better but it flies in the face of the practical experience here when it comes to region scoping rollout attempts. There is a crucial difference between this change and other changes, such as multibyte paths. I, as a client could say try and set 2 byte hops. The repeaters locally don't yet have the required firmware version. No problem, I drop back to 1 byte hops and my traffic goes through. If we take yours and Liam's approach and initially default to deny, then unless enough relevant repeaters have set the scopes I need, I will either;
No one has yet, it seems to me, come up with a convincing way to square that circle. The only way I see region scoping being widely used with yours and Liam's suggestions is to force clients to use them, and I think sadly we would likely see a significant break in mesh connectivity, to the extent that some and possibly many people might walk away. Here at least, the biggest carrot MeshCore had over OtherMesh was that messaging "just worked". We need to get clients scoping and repeaters acting on that information in smart and useful ways. If we are not careful and go down the "move fast and break it" road, we risk blowing up what so many of us have built.
Sounds like a reasonable mitigation to concerns of *scoped traffic to me. What sort of limit do you envision? At the end of my last response I included a brief checklist that I imagined how initially being permissive might work - I would update that slightly with the proposed app changes so that at stage 2, the app would start to nag clients and repeater owners to be setting scopes. |
My area is geographically wide but relatively small and supported by a dedicated crew of admins. So 2 hops and you are likely going to find a complete path of region configured repeaters. To be honest we're likely in a position to fully adopt without much friction. However, I was experimenting with the concept of non-configured scope rules and the ability to limit hop count on various conditions before finding this PR. So I was going in the direction of a configurable number. But that is for another use case and a departure from this PR, so I'd say just hardcode a limit of 16 or less and reduce it each release. I can't speak for more congested areas however. There is also the issue that it could go many hops through region configured repeaters only to get dropped by a critical unconfigured repeater. There are possibly other issues with mixed hop count limits. I do welcome the undefined *scoped addition in general as I can see edge case situations where you might want to go with a blacklist strategy over an implicit deny. Maybe trying to go to great lengths to mitigate the issue of those who are not engaged isn't the right strategy, and it may come down to just pulling the trigger and designating a few ambassador nodes that allow null region to monitor and onboard non-scoped users. Most areas have a majority of engaged admins and it wouldn't take that much adoption before it becomes practical to force, or strongly direct towards, region selection in the app. |
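The hop-limit idea discussed above can be sketched roughly as follows. This is an illustration under stated assumptions, not MeshCore code: `should_forward`, `hops_taken`, and `HOP_LIMIT` are hypothetical names, the value 16 is taken from the "16 or less" suggestion, and the hop count is assumed to be the total number of hops the packet has already travelled.

```python
from typing import Optional, Set

# Hypothetical hard-coded limit, per the "16 or less" suggestion in the thread.
HOP_LIMIT = 16


def should_forward(scope: Optional[str], hops_taken: int,
                   configured: Set[str], hop_limit: int = HOP_LIMIT) -> bool:
    """Decide whether a repeater rebroadcasts a flood packet."""
    if scope is None or scope in configured:
        # Unscoped traffic (assumed allowed) or an explicitly configured scope.
        return True
    # Scoped packet with no matching rule: allow only while it is still "young".
    return hops_taken < hop_limit


# An unconfigured repeater close to the sender still helps the packet along.
print(should_forward("uk", 2, set()))
# The caveat raised above: after many hops through configured repeaters,
# a single unconfigured repeater drops the packet.
print(should_forward("uk", 20, set()))
# A configured repeater forwards regardless of hop count.
print(should_forward("uk", 20, {"uk"}))
```

Counting total hops is what produces the caveat in the second call; a counter that only tracked hops through unconfigured repeaters would behave differently, but that would need extra per-packet state and is outside what the thread proposes.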
I appreciate this approach is a relatively long and difficult one. Without wanting to attack or criticise the original authors of Region scoping, we are here because of the limitations of the original/current implementation of scoping. Any route forward will incur a great deal of effort. The only significant decision to my mind is whether we handle this in a more gradual manner with fewer risks, where we (devs, active users) do more, or with a relatively hard change with, to my mind, unnecessary and potentially significant risks to connectivity. Sadly, at the moment I foresee 2 likely outcomes if things continue in the current vein, based upon my and others' experience in a busy mesh. It will likely lead either to clients not bothering to scope, which leads to congestion issues (we are starting to see evidence of this with critical repeaters), or, if we go down the route of enforcing scopes without an interim, permissive phase, to an unpleasant period of connectivity issues. Neither is good for the health of the project and either might well lead to us losing users. Hence why I've continued to push for this change despite lack of interest from the powers that be, although among local users I've talked to the feedback for this proposal has been almost unanimously positive. |
Seems they are interested? Can't expect these kinds of changes to happen immediately, especially when there are alternative proposals still coming in. I was in the camp of default allow for a forced gradual rollout, but I've come around the other way. Regions didn't exist before, they were created to solve the congestion issue, and the current implementation is deny by default, for better or worse. Switching to allow by default will be a breaking change, along with negating the reason for it. It seems the simple proposal of adding *scoped/*nonscoped rules gives the admins the ability to manage the rollout in a way that suits their mesh without introducing changes. And it is a fix that could happen sooner rather than later, which is the greater concern for some. Default allow takes this power away from the active admins and in fact punishes a well coordinated mesh. After that, this is more of a community management issue I feel. Collaborative meshing is always going to require some amount of engagement and coordination. Are admins that are so disengaged that they can't coordinate on rollout going to be up to date with the firmware that would force it on them, or coordinate on future network health issues? If there are users that don't want to seek out community sources on network use or listen to local ambassadors because it doesn't "just work", wouldn't those users be happier on a monolithic service anyway? |
I meant with respect to taking the permissive then tight, gradual approach as opposed to region scoping generally.
As I've discussed previously - at the moment, particularly where scoping is most needed, in the larger, most congested meshes, scoping isn't really being used and thus in practice this isn't a breaking change, because the feature isn't really being used. In practice going permissive now isn't changing existing traffic patterns but it does provide a pathway to get clients to scope their packets in a relatively pain-free way. That can then be followed by restrictive. What would be in practice a breaking change would be to enforce scoping without a rollout phase. Again, without wanting to in any way criticize the individuals who made the decisions regarding scoping as it currently exists but I think the current lack of scoping by clients is hard evidence that the initially restrictive approach they took hasn't worked. But, as take-up is still low to 0, we have the opportunity to resolve that.
Can you provide examples of meshes that are already using scoping as currently implemented, that this change would negatively impact? Even so, it does not take any power away from or punish well coordinated meshes. If they are so well coordinated, it would be super easy to change one setting after a firmware upgrade. The flip side, a breakdown in communication if users are forced to scope without a proper rollout, would be punishment. Requiring 100s of repeaters to near-simultaneously configure a list of regions that in many places have yet to be defined would be punishing any significant mesh, well coordinated or otherwise. The larger meshes that I have either personal experience with or have spoken to people who have (UK and West Coast/Cascadia specifically) have tried to implement scoping and failed using the existing setup.
In one word - Scale. In the UK we have something close to a country wide mesh with 1000+ repeaters. Even with the best will in the world (and many repeater owners are active enough), trying to pull this off purely through breaking change (assuming not scoping packets will at some point be prevented in order to help adoption - otherwise you will probably get what we have now - people not bothering to scope) followed by communication will likely lead to problems, potentially big ones. Given the above, having spoken to key individuals (the ambassadors you mention) in these meshes, the near unanimous opinion I have had from them is that we need a way to do this gradually. I agree, in principle, theoretically, the restrictive argument is more attractive and makes more sense. But I sincerely doubt those arguments will hold up to the reality of meshes like the UK if/when people are forced to use scopes, or we have people continuing to not scope because it is the only way to make it work. There is also a key point that seems to be being missed. Which is better, working links that are congested or broken links? |
It is a breaking change at the project level if a feature is completely the opposite to what it was when released. That has the impact of frustrating active admins, which is a greater risk to project survival than casual users. An "in practice" breaking change is a community based one.
It absolutely punishes them by increasing the propagation of uncoordinated regions in, or near, an otherwise coordinated mesh.
I follow West Coast; from the ones I've talked to their main concern is congestion, to the point of considering drastic, total decoupling solutions, beyond region scoping even.
Still a community based scaling issue. There is always going to be scaling issues that require community coordination.
Which @liamcottle proposed solution provides to those admins.
In my opinion, broken links are better. Unpredictable breakage leads to confusion and frustration; hard breakage leads to seeking out why and to pressure to collaborate. Without the option to set *scoped rules, I agree it will be difficult to roll out. Which is why I think it is better to encourage that change without complicating it, sooner rather than later, as it will allow gradual rollout without making any breaking changes. |
|
Generally, I haven't seen anything new in your latest response that I haven't already responded to elsewhere in the thread so I won't go point by point. I will talk about the seemingly hard distinction between "community" based changes and (presumably) "firmware" based changes. In practice, there is influence in both directions and, although a line must be drawn somewhere, where it is drawn is very much up for discussion. It may be argued that this entire thread of discussion is fundamentally about where to draw that line. That you have chosen in this instance to draw it where you feel it should be is, by definition, your decision. I would, however, draw to your attention the original release and current state of region scoping. We were provided Why was this? Presumably because it was considered that it would be too big a change for the community to practically handle and so the decision was made to draw the line a few steps in the "let's help the community" direction as opposed to the "it's on you" direction. So, although it is very clear you don't agree with me on where that line should be drawn, given the above, I don't think my position can be fairly dismissed as being the wrong side of a hard empirical line. I think over several posts I have laid out, with a fair amount of detail, my thoughts and ideas about rolling out region scoping, based upon a fair amount of hard evidence and experience. However the negative responses ultimately don't seem to be much more than So for now, I think I am going to hold fire on further discussion until/unless someone who is in favour of Or one of the maintainers comes out and says that they have decided to not consider a default of |
In response to issue #1751 here is a proposed change that would provide a resolution to a catch-22 problem that is hampering region scope adoption.
Rationale
Currently if a client starts scoping packets, any repeater that doesn't explicitly have a region scope rule for that scope will simply drop the packet. There have been attempts to implement scoping (London and WAW to my knowledge but probably others), but these have failed as currently it would require a local area to get almost 100% of repeater owners to change settings simultaneously.
So clients simply don't bother setting scopes as it doesn't work for them. Repeater owners don't bother because very few clients are scoping their packets.
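The adoption problem described above can be illustrated with a small Python sketch. It models a single relay path only (a real mesh may have alternative routes), and `repeats`, `delivered`, and `default_allow` are hypothetical names chosen for this illustration rather than anything from the firmware.

```python
from typing import List, Optional, Set


def repeats(configured: Set[str], scope: Optional[str], default_allow: bool) -> bool:
    """Would one repeater rebroadcast a flood packet with this scope?"""
    if scope is None or scope in configured:
        return True  # assumption: unscoped flood traffic is always repeated
    return default_allow


def delivered(path: List[Set[str]], scope: Optional[str], default_allow: bool) -> bool:
    """A packet crosses a single path only if every repeater on it rebroadcasts."""
    return all(repeats(configured, scope, default_allow) for configured in path)


# Four repeaters in a row; the third owner never configured any scopes.
path = [{"uk"}, {"uk"}, set(), {"uk"}]

print(delivered(path, "uk", default_allow=False))  # False: one gap drops the packet
print(delivered(path, None, default_allow=False))  # True: unscoped still gets through
print(delivered(path, "uk", default_allow=True))   # True: the gap no longer matters
```

Under the deny default, the client that scopes its packet is worse off than the client that does not, which is the incentive problem this PR is trying to remove.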
So, by having a default scope that defaults to allow 'F'lood, the default behaviour for both scoped and un-scoped packets is to be repeated, whilst not reducing the ability of repeater owners to restrict traffic. This is particularly helpful for dealing with repeater owners who largely set up and then ignore their nodes.
This means, once enough repeaters have upgraded to the relevant FW version or newer, additional attempts to roll out scoping can be implemented gradually by smaller numbers of nodes (there are already working groups of "trunk" repeater owners able to co-ordinate their repeaters' scope setup simultaneously to encourage the rest of the mesh) without tanking traffic or without creating counter-productive client frustration.
Over time, when a critical mass of scoping rollout has been achieved, repeater owners can simply toggle 'any' to deny to replicate current functionality. If need be, in the future, 'any' could be changed to default to deny in a future FW release.
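As a rough model of the rollout described in this rationale, the Python sketch below treats a repeater's region settings as a table with two special entries. It is not the C++ in this PR: `RegionTable`, `wildcard_flood`, `any_flood`, and `named` are hypothetical names, per-scope deny entries are an assumption, and it assumes (as the thread suggests) that `*` governs packets that carry no scope.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RegionTable:
    wildcard_flood: bool = True   # '*': packets with no scope (assumed meaning)
    any_flood: bool = True        # 'any': scoped packets matching no named scope
    named: Dict[str, bool] = field(default_factory=dict)  # explicit per-scope rules

    def allows_flood(self, scope: Optional[str]) -> bool:
        if scope is None:
            return self.wildcard_flood
        if scope in self.named:
            return self.named[scope]  # an explicit rule always wins over 'any'
        return self.any_flood


table = RegionTable()

# Phase 1: fresh firmware, nothing configured. Scoped packets still move.
phase1 = table.allows_flood("uk")

# Phase 2: the owner adds explicit rules; an explicit deny is honoured
# even though 'any' is still permissive.
table.named["uk"] = True
table.named["fr"] = False
phase2_fr = table.allows_flood("fr")
phase2_other = table.allows_flood("de")

# Phase 3: critical mass reached, 'any' is toggled to deny,
# which reproduces the current behaviour for unmatched scopes.
table.any_flood = False
phase3_other = table.allows_flood("de")
phase3_uk = table.allows_flood("uk")

print(phase1, phase2_fr, phase2_other, phase3_other, phase3_uk)
# True False True False True
```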
Changes
This change creates a second special RegionEntry, 'any,' that is largely handled in the same way as the existing 'wildcard' RegionEntry. It takes 1 byte from the current "reserved header" for persisting the 'any' flag.
For data export purposes, 'any' is appended to the end of the list. For get/set purposes, 'any' is resolved before user defined scopes, like the * scope and cannot be deleted.
Currently, if a repeater already has a user-defined scope called any, that scope will be functionally overridden by the special 'any' entry and cannot be deleted. I think this is rare enough that it doesn't warrant extra handling.
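The resolution order and the name collision described above can be sketched in Python as follows. This is a simplified model, not the PR's code: the `Regions` class and its methods are hypothetical, only the 'any' entry is modelled (the `*` entry is left out for brevity), and the exact export format is an assumption.

```python
from typing import Dict, List, Optional, Tuple


class Regions:
    RESERVED = "any"  # special entry, resolved before user-defined scopes

    def __init__(self, user_entries: Dict[str, bool]):
        self.any_flood = True           # special 'any' entry defaults to allow flood
        self.user = dict(user_entries)  # user-defined scopes: name -> allow flood

    def get(self, name: str) -> Optional[Tuple[str, bool]]:
        if name == self.RESERVED:
            return ("special", self.any_flood)  # shadows any user entry of that name
        if name in self.user:
            return ("user", self.user[name])
        return None

    def remove(self, name: str) -> bool:
        if name == self.RESERVED:
            return False  # the special entry cannot be deleted
        return self.user.pop(name, None) is not None

    def export_names(self) -> List[str]:
        # 'any' is appended after the user-defined scopes.
        return list(self.user) + [self.RESERVED]


# A repeater upgraded from older firmware that already had a user scope called 'any'.
legacy = Regions({"uk": True, "any": False})
print(legacy.get("any"))      # ('special', True): the old user rule is unreachable
print(legacy.remove("any"))   # False: it cannot be deleted either
print("any" in legacy.user)   # True: the stale entry is still stored

clean = Regions({"uk": True})
print(clean.export_names())   # ['uk', 'any']
```

In this model the stale user entry lingers until the configuration is rebuilt, which is the gotcha the documentation update would need to mention.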