Skip to content

Ticket #5047, #5048: Separate mc.keydef file, updated syntax#5061

Open
egmontkob wants to merge 6 commits intoMidnightCommander:masterfrom
egmontkob:5047-keydef-file
Open

Ticket #5047, #5048: Separate mc.keydef file, updated syntax#5061
egmontkob wants to merge 6 commits intoMidnightCommander:masterfrom
egmontkob:5047-keydef-file

Conversation

@egmontkob
Copy link
Contributor

Proposed changes

Move the escape sequences from the global "mc.lib" and user "ini" file to new global and user "mc.keydef".

Saner escaping, nicer look.

Removed "terminal:" prefix.

Checklist

👉 Our coding style can be found here: https://midnight-commander.org/coding-style/ 👈

  • I have referenced the issue(s) resolved by this PR (if any)
  • I have signed-off my contribution with git commit --amend -s
  • Lint and unit tests pass locally with my changes (make indent && make check)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added the necessary documentation (if appropriate)

@egmontkob egmontkob requested a review from mc-worker March 10, 2026 12:28
@github-actions github-actions bot added needs triage Needs triage by maintainers prio: medium Has the potential to affect progress labels Mar 10, 2026
@github-actions github-actions bot added this to the Future Releases milestone Mar 10, 2026
@egmontkob egmontkob requested a review from zyv March 10, 2026 12:28
@egmontkob egmontkob added area: keybind Key bindings and removed needs triage Needs triage by maintainers labels Mar 10, 2026
@egmontkob egmontkob modified the milestones: Future Releases, 4.9.0 Mar 10, 2026
@egmontkob
Copy link
Contributor Author

This is a cleanup before finalizing the real work in #4632 & friends.

Documentation (manpage etc.) is yet to be updated, but I'd prefer to hear a round of feedback first.

Copy link
Contributor

@mc-worker mc-worker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick view.

s++;
}

return g_string_free (ret, FALSE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return GString itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's return GString itself.

Is there a generic guideline here? Should we always prefer GString to char* as retval? How about input parameters?

Or should the API depend on how the internals were the most convenient to implement: if we already have a GString then return that?

I was tempted to change the input of this function, as well as the retval of the decode_controls() function to GString, for the reason that theoretically a terminal's escape sequence may contain a NUL byte (although in practice I really don't think it does, except for Ctrl+@ a.k.a. Ctrl+space which is the single NUL byte, handled elsewhere in the code).

For the retval of this function, as well as the input parameter of the decoding counterpart, the semantics is C-style string: cannot contain embedded NUL byte. What's the point of changing the return type to GString? Also, for consistency, shouldn't we then change the input parameter of the decoder too, making it possibly more cumbersome for the caller? Or have 3 GStrings and 1 char*?

Would love to learn more about the preference in general, and its reasons; thanks! :)

@egmontkob
Copy link
Contributor Author

Added a new first commit, moving from keymap to mcconfig a method that I previously copy-pasted into keydef.

I haven't yet addressed the char * / GString issue for decode/encode, I'd like to see a consistent full picture which of the 2+2 input parameters and retvals should be changed, please help me here :)

@mc-worker
Copy link
Contributor

I haven't yet addressed the char * / GString issue for decode/encode,

You do use a GString object within a function. Then you destroy it in parts in two different places: the container and some its members in the function, the char array in the caller function. It would be logical to destroy the entire object at once. And if you need the length of the string in the caller function, you have to call strlen() instead of use a lenmember of GString.

@egmontkob
Copy link
Contributor Author

You do use a GString object within a function.

I happen to use it, in both the decode and encode function, for no particular reason. Each of them could have very easily been written without GString. Maybe if I had written the code a day earlier or a day later, or in a slightly different mood, I would have gone with plain char *.

I do not write the implementation first, and then tailor the interface to that, this wouldn't make sense to me. I first create the interface, with whatever types that I believe make the most sense, which in this case was char *, then I come up with an implementation, which in this case happened to use GString helpers.

Then you destroy it in parts in two different places: the container and some its members in the function, the char array in the caller function. It would be logical to destroy the entire object at once.

Destroying at two parts is a standard practice, this is explicitly what g_string_free (..., FALSE) was designed for. It's used in mc's codebase around 40 times. There's nothing wrong with it.

And if you need the length of the string in the caller function, you have to call strlen() instead of use a lenmember of GString.

I don't think I use the length in the caller. But even if I did...

I know, for some reason, mc has an extremely strong aversion against running strlen() twice on a string. We're talking about teeny-weeny strings here, very few of them, accessed very rarely (typically only at mc's startup). Maybe we can shave off a few nanoseconds. But that's not necessarily true either.

Accessing data via a GString * requires one more indirection than via a char *, possibly accessing memory at two very different places, which might be worse cache-wise than accessing one neighborhood twice. So it's not clear whether the performance would increase or decrease, having access to len and not having to measure it again. (But, again, we actually don't need the length.) But it's absolutely certainly super negligible, it's the worst use of our time trying to optimize it.

What we need to optimize for is code cleanness, consistency, and quick development flow.

For the decode function, a return value of GString * might convey the incorrect message of supporting the NUL byte. It does not support it. I upgraded all the char * to GString * that are related to learning keys, reading keys from the config file, and it still didn't work for me. (The byte 0x01 Ctrl-A, or ESC followed by this byte also didn't work for me as an "escape sequence" assigned to a key.) There's more work to be done, and presumably exactly zero need for it, so I'd rather not do it now.

And how far would you go with your logic? E.g. mc_config_get_escape_sequence_list() returns a list of strings as char **, now one question for sure is whether to rather return GString **, another question is that if I happen to internally go with GArray then should I return that, just so that the caller has immediate access to the number of strings returned?

I honestly really don't like this philosophy of adjusting the interface: the return value (and you didn't respond: what about the input parameters?) to the implementation, rather than designing the interface first. I really don't like trying to micro-optimize the most insignificant bit, let alone with no information whether it would actually speed up things or not if the caller cared about the string length, let alone the caller now doesn't even care. I really don't like being so afraid of maybe (but in fact: not) calling strlen() an additional few dozens of times per mc startup.

I'm tempted to say that if you firmly insist on returning GString * if it's created in the function body anyway, my proposed solution would be to rewrite the body of those functions not to use GString :)

It might look like I'm just spending 15 minutes to type this comment instead of changing what you asked for. Behind it is much more unpublished work actually trying to adjust the code, and if we're going with GString anyway then trying to add support for embedded NUL bytes, which would also require to change the interface of mc_config_get/set_escape_sequence_list() from char ** to GString **, for which I used GArray as an internal helper, but then again back to the question: shouldn't I return that GArray just in case the caller finds it useful...? and do it in reasonable, well-polished commits in some meaningful order... just to realize during the process that e.g. unittests became uglier, and somehow the whole code just doesn't feel as clean to me as the char * version was. (Surely, I could restart and address your particular immediate feedback only...)

I'd really like to stick to the char * version for now, and focus on more important things... there's a lot left to do with keyboard handling, this PR is only about 1/4 - 1/3 of the way there.

@egmontkob egmontkob force-pushed the 5047-keydef-file branch 4 times, most recently from 98c3c4a to a5ebd7d Compare March 12, 2026 13:51
@egmontkob
Copy link
Contributor Author

How do you like the direction of the latest push?

Since I've worked on it a lot, I thought I'd upload it.

The encode_controls() / decode_controls() methods now support embedded NUL in the raw sequence.

encode_controls() takes a char * string and a ssize_t len parameter, the latter potentially being -1 for C-terminated strings; a pretty standard practice seen in GLib and elsewhere so that you don't have to create a GString if you don't have one. Currently there are two callers of this method: mc_config_set_escape_sequence_list() which does have a GString and the unittest which doesn't. I can modify to take GString if you prefer that.

decode_controls() returns a GString * because here the return value is the one potentially containing an embedded NUL.

The retval of encode_controls() and the input param of decode_controls() are still a char * because here we don't need embedded NUL. I still don't like the idea of encode_controls() returning a GString * just because it happens to have one internally, but if you really insist then I'll do it to be able to move forward with the rest. Also please then advise about the input param of decode_controls(), should that remain a char *?

Much of the internals have been reworked to carry a possible internal NUL of the escape sequence, although it doesn't work, it would need further investigation which is not my focus right now. That being said, if we ever happen to need that, the current underlying work might become useful.

Please advise whether I should continue this latest version, or the one before (with char * everywhere), and also if you really insist on converting both methods' output params to GString *.

@mc-worker
Copy link
Contributor

Maybe if I had written the code a day earlier or a day later, or in a slightly different mood, I would have gone with plain char *.

You love the hand-made memory management, don't you?

Destroying at two parts is a standard practice, this is explicitly what g_string_free (..., FALSE) was designed for.

Possible is not mandatory.

It's used in mc's codebase around 40 times.

It's not an argument. The code written in old years is needed to be refactored from time to time.

Maybe we can shave off a few nanoseconds. But that's not necessarily true either.

First we avoid micro-optimization, then we avoid any optimization at all. As a result we have a software that works hardly on multi-core multi-GHz CPUs. Don't forget than mc runs not only on desktops but in embedded devices where CPUs are not so productive.

and you didn't respond: what about the input parameters?

Getting char * from GString is just for free. Getting GString from char * requires extra memory allocations.

shouldn't I return that GArray just in case the caller finds it useful

In mc_config_get_escape_sequence_list() you returns array and its length as two separate entities. Why you don't want return a single object of type GPtrArray that contains them both?

mc_config_set_escape_sequence_list() takes NULL-terminated list and its length. Why? One of those is enough. (Who sad GPtrArray?)


I don't want to argue anymore. I just can modify myself your code from my point of view (in the separate branch).

Move the method to get a config file's full name from keymap to mcconfig,
because in the next commit keydef will need the same method.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…o a separate file

Move the escape sequences from the global "mc.lib" and user "ini" file
to new global and user "mc.keydef".

Add command line option --keydef/--nokeydef and environment variable
MC_KEYDEF, analogously to the keymap feature.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
Repeated keys aren't supported.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
@egmontkob
Copy link
Contributor Author

Maybe if I had written the code a day earlier or a day later, or in a slightly different mood, I would have gone with plain char *.

You love the hand-made memory management, don't you?

No, I absolutely hate hand-made memory management, I truly wish we had something even higher level than GLib.

When I said plain char *, I didn't elaborate: I wouldn't have gone with manual dynamic on-demand reallocating. I would have gone with measuring the input's length, and allocating that much, or twice that much for the encoder, which is guaranteed to be enough in this particular encoder/decoder. If I needed to do the reallocation manually, I'd almost certainly find some higher level methods (e.g. GString) to do that for me.

First we avoid micro-optimization, then we avoid any optimization at all. As a result we have a software that works hardly on multi-core multi-GHz CPUs [...]

It's absolutely not obvious that GString * with its on-demand expansion (including reallocating and copying over the data) would be any more effective than measuring the size upfront and allocating a large enough buffer to begin with, something that doesn't need the GString * wrapper on top of char *. Plus, accessing data via GString * rather than char * requires one more indirection. I am absolutely not convinced that your preferred code is any more effective than mine.

What I am convinced about is that it doesn't matter at all. Performance is not going to measurably increase finding whichever of GString * or char * is more effective at a place that's executed maybe around a few dozen times per mc startup.

In mc_config_get_escape_sequence_list() you returns array and its length as two separate entities. Why you don't want return a single object of type GPtrArray that contains them both?

I wanted to stay consistent with the g_key_file_get/set_...() API and their mc_...() wrappers (which you surely may find a bad idea). Also, GPtrArray was not in my active knowledge, I used GArray internally which I honestly hate so I'd rather switch to GPtrArray or whatever else :)

Also, as I've already stated: I wouldn't want the function signature to be based on whatever internals I happened to have picked (manual array allocation or GArray or GPtrArray or whatever). I believe that for the function signature to be easy-to-use and consistent with the similar functions is more important than reflecting its internals in the hope of possibly (although not provenly) shave off a few nanoseconds. As I understand you, you disagree with me here.

mc_config_set_escape_sequence_list() takes NULL-terminated list and its length. Why?

Again, my primary reason was consistency with g_key_file_get/set_...().

One of those is enough. (Who sad GPtrArray?)

... but above you said ...

what about the input parameters?

Getting char * from GString is just for free. Getting GString from char * requires extra memory allocations.

So at one point you argue that char * as the input parameter is fine because there asking for a GString * requires extra memory allocation in the caller, and yet here you argue that the caller should wrap its data in a GPtrArray?

I don't see a consistent guideline from you.

I don't want to argue anymore. I just can modify myself your code from my point of view (in the separate branch).

I don't want to submit code that you really dislike and would jump on to refactor. But I don't yet understand the rationale behind your requests, nor do I see those requests being consistent.

Do you like that in my latest version I changed many char * to GString * where we store the raw escape sequence in memory or pass it around, in preparation of possibly supporting an embedded '\0' byte in the future? Is it okay if I continue from that latest version, rather than the previous one?

I'm happy to change the internals from GArray to GPtrArray.

For the retval: Okay, I can return GPtrArray, reflecting the (soon-to-be) internals of the getter. Surely not what I would have gone with on my own but a reasonable decision that I'm happy to accept.

As for the input parameter of the setter: GPtrArray so that the caller has to make extra allocations? I'm getting mixed messages, please clarify what you'd like to see here.

Separate entries by space rather than semicolon. Semicolons are no longer
escaped.

The escape character is represented by \e, \E or ^[, rather than \\e.

Other special characters also have more standard notation, e.g. ^H and ^?.

Change the internal variables to hold the raw value rather than the
escaped string.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
This makes us one step closer to supporting escape sequences with an
embedded NUL byte, although we probably won't need this.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
…ydef section names

Also rename [general] to [_common_], emphasizing that it's not a
terminal name and that it's always loaded.

Signed-off-by: Egmont Koblinger <egmont@gmail.com>
@egmontkob
Copy link
Contributor Author

egmontkob commented Mar 17, 2026

Changed the getter's return type to GPtrArray * – am I getting closer to what you have in mind?

Yet to address the setter's input parameter according to clarified instructions.


I'm not a 100% satisfied with the design of returning GPtrArray *, reason being that it's quite an ambiguous type.

For one, it doesn't convey whether its elements are char * or GString * or whatever else.

Then, it doesn't convey whether the array is NULL-terminated or not. This is only answered for us because NULL-terminated is a recent GLib addition and we support much older versions, too.

It also doesn't convey the message whether the array has been set up to automatically free the members, or not. Currently, for me, it's not. As far as I can see, in order to set up automatic freeing, I'd need a tiny helper function to pass that TRUE to g_string_free() because g_ptr_array_new_full()'s element_free_func only takes a function but no parameters.

Anyone who uses this method either needs to reads its documentation where all these details have to be carefully specified – which I'm yet to do!!! –, or has to check the implementation.

Returning a GString **, and having an additional int *length parameter, doesn't really have this problem. It's an absolutely well-established pattern that int *length should be optional, implying that the return value must be a NULL-terminated list, and then no more questions can be asked. A much less ambiguous design.

This direction of returning GPtrArray * does not feel clean and robust to me.

@ossilator
Copy link
Contributor

GString is sort of a pointer type. it wouldn't be unreasonable to just return it by value.

@egmontkob
Copy link
Contributor Author

GString is sort of a pointer type. it wouldn't be unreasonable to just return it by value.

Yup (and same for GPtrArray). It's not an existing pattern of mc, though. Out of the many possible solutions I'd preferably pick one that matches our existing practices.

@mc-worker
Copy link
Contributor

@egmontkob please look at branch 5047-keydef-file-2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: keybind Key bindings prio: medium Has the potential to affect progress

Development

Successfully merging this pull request may close these issues.

Simplify escaping of input escape sequences Move the input escape sequences to a separate file

3 participants