- Home: https://github.com/1Hyena/nt4c
- Issue tracker: https://github.com/1Hyena/nt4c/issues
NT4C stands for "NestedText for C" and that is exactly what this project is about.
In short, NestedText is a file format for holding structured data.
The NestedText documentation explains the format in more detail if you are unfamiliar with it.
NT4C is a NestedText parser implementation written in accordance with the C23 standard of the C programming language. It includes the following features:
- Compliance: NT4C aims to comply with the latest version of the NestedText specification. However, it is currently only compliant with the Minimal NestedText specification.
- Performance: NT4C is fast as it does not involve any heap memory allocations. It also avoids unnecessary memory copying by directly referencing the input text in the resulting graph.
- Compactness: The NT4C parser is implemented in a single header file with no dependencies other than the standard C library.
- Embedding: The NT4C parser is easily reusable in other projects with a simple API that includes a few key functions, primarily nt_parse().
- Callbacks: NT4C parses the entire document and calls a callback function provided by the application to inform it about each NestedText unit.
- Tree model: If sufficient memory is provided to the NT4C parser, it constructs a graph where each node directly references a segment from the input text.
- Portability: NT4C builds and functions on Linux. It should be relatively simple to make it run on most other platforms as long as the platform provides the C standard library.
- Encoding: NT4C expects UTF-8 encoding of the input text and does not attempt to detect Unicode encoding errors.
- Permissive license: NT4C is available under the MIT license.
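The callback and zero-copy ideas from the list above can be illustrated with a small, self-contained sketch. It does not use nt4c.h: every `demo_` name, the callback signature, and the line-based "units" are assumptions made for illustration, and only the argument order (text, text size, callback, userdata) mirrors the API summary in this README.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: receives one unit (here, a line) as a
   pointer into the original input plus its length. Nothing is copied. */
typedef void (*demo_callback)(const char *line, size_t size, void *userdata);

/* Scans `text` without allocating, reports every line to `callback`,
   and returns the number of lines reported. */
static int demo_scan(const char *text, size_t text_size,
                     demo_callback callback, void *userdata) {
    assert(text != NULL || text_size == 0);
    int count = 0;
    size_t start = 0;
    for (size_t i = 0; i <= text_size; ++i) {
        int at_end = (i == text_size);
        if (at_end && start == i) break;      /* no unterminated tail left */
        if (at_end || text[i] == '\n') {
            if (callback) callback(text + start, i - start, userdata);
            ++count;
            start = i + 1;
        }
    }
    return count;
}

/* Application state handed through the opaque userdata pointer. */
typedef struct {
    const char *first;  /* address of the first reported line */
    size_t lines;       /* number of callback invocations */
    size_t bytes;       /* total bytes across all lines, newlines excluded */
} DEMO_STATS;

/* Example callback: tallies lines and remembers where the first one starts. */
static void demo_tally(const char *line, size_t size, void *userdata) {
    DEMO_STATS *stats = userdata;
    if (stats->lines == 0) stats->first = line;
    stats->lines += 1;
    stats->bytes += size;
}
```

Calling `demo_scan(text, size, demo_tally, &stats)` fills `stats` as a side effect; the pointer stored in `stats.first` is the address of the input buffer itself, which is the sense in which such a design avoids copying.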
To parse a NestedText document, you can include the nt4c.h header file directly in your codebase. The parser is implemented in a single C header file for easy integration.
The main functions to use are nt_parse() and nt_parser_parse(). The former
is a convenience function for simple callback based parsing whereas the latter
takes a pointer to the NT_PARSER structure as its first argument and is to be
used for customized parsing.
The NT_PARSER structure stores parsing configuration and the parsing process
state. By default, it can handle up to NT_PARSER_NCOUNT nodes in its internal
memory. However, you can use the nt_parser_set_memory function to work with a
custom array of NT_NODE structures.
When you call nt_parser_parse(), the parser populates the document graph with
nodes. It continues processing even if the output buffer reaches its capacity.
After a successful parsing operation, both nt_parse() and nt_parser_parse()
return the number of nodes in the input text. This information can help you to
determine the memory required for storing the full graph of the document. If
parsing fails, the function returns a negative value.
The graph of the document is considered fully stored when the value returned by
nt_parser_parse() is non-negative and does not exceed the output buffer's
capacity.
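The return-value convention described above (keep counting after the buffer is full, report the total, signal failure with a negative value) can be sketched independently of NT4C. The code below is an illustration under stated assumptions: `DEMO_NODE`, `demo_collect`, `demo_fully_stored`, and `demo_last_node_size` are hypothetical names, and a "node" is simplified to one line of text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: a view into the input text, not a copy. */
typedef struct { const char *data; size_t size; } DEMO_NODE;

/* Stores at most `capacity` nodes but keeps counting past that limit,
   so the return value is the total the input requires.
   Returns -1 on invalid arguments. */
static int demo_collect(const char *text, size_t text_size,
                        DEMO_NODE *nodes, size_t capacity) {
    if (text == NULL || (nodes == NULL && capacity > 0)) return -1;
    int total = 0;
    size_t start = 0;
    for (size_t i = 0; i <= text_size; ++i) {
        int at_end = (i == text_size);
        if (at_end && start == i) break;
        if (at_end || text[i] == '\n') {
            if ((size_t)total < capacity) {   /* store only while room remains */
                nodes[total].data = text + start;
                nodes[total].size = i - start;
            }
            ++total;
            start = i + 1;
        }
    }
    return total;
}

/* The "fully stored" condition: non-negative and within capacity. */
static int demo_fully_stored(int result, size_t capacity) {
    return result >= 0 && (size_t)result <= capacity;
}

/* Two-pass usage: measure with an empty buffer, then collect again into
   a variable-length array of exactly the required size.
   Returns the size of the last node, or 0 if there is none. */
static size_t demo_last_node_size(const char *text, size_t text_size) {
    int needed = demo_collect(text, text_size, NULL, 0);
    if (needed <= 0) return 0;
    DEMO_NODE nodes[needed];                  /* sized by the first pass */
    int stored = demo_collect(text, text_size, nodes, (size_t)needed);
    assert(demo_fully_stored(stored, (size_t)needed));
    return nodes[stored - 1].size;
}
```

The first pass costs one extra scan of the input but lets the caller size the buffer exactly, with no heap allocation and no guessing. Variable-length arrays are an optional language feature in some C standards, so a fixed-size or caller-supplied array is the portable alternative.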
The ex_hello example demonstrates how to use the NT4C parser to generate the text "hello world" and display it on the screen.
The ex_callback example demonstrates how to make the NT4C parser call a user-specified function each time it parses the next logical portion of the input document.
Source: nt4c/examples/src/ex_callback.c, lines 7 to 38 at commit 490be86.
This example demonstrates how to use the NT4C parser to parse a NestedText document and display it on the screen. The input document is parsed twice: the first pass determines how many nodes the document contains, and the second pass stores the Document Object Model (DOM) in a variable-length array of that size.
This example shows how to use the NT4C parser to pretty-print a NestedText document. It reformats the input text and adds syntax highlighting.
Here is a NestedText document before and after pretty-printing, as shown in the screenshot below:
This example shows how to use the NT4C parser to print the structure of a NestedText document on the screen.
Here is a screenshot showing the structure of the parsed NestedText document:
- nt_make_parser () → NT_PARSER
- nt_parser_init (&parser)
- nt_parse (text, text size, callback, userdata) → int
- nt_parser_parse (&parser, text, text size) → int
- nt_parser_set_memory (&parser, &nodes, node count)
- nt_parser_set_recursion (&parser, depth)
- nt_parser_set_blacklist (&parser, banned types)
- nt_parser_set_whitelist (&parser, allowed types)
- nt_parser_set_userdata (&parser, &userdata)
- nt_parser_set_callback (&parser, &callback)
- nt_type_code (type) → const char *
- nt_type_type (type) → NT_TYPE
Specify the size of the integrated memory buffer of the NT_PARSER structure by
defining the NT_PARSER_NCOUNT macro before including the nt4c.h header. The
integrated memory was added to increase the API usage convenience in cases where
the size of the input document is always known to be small (see
ex_pretty).
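A minimal, self-contained sketch of this compile-time sizing pattern follows. The `DEMO_` names and both capacity values (8 and 32) are invented for illustration and are not taken from nt4c.h; with NT4C itself, the corresponding step is defining NT_PARSER_NCOUNT before the include of nt4c.h, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Application side: override the capacity. With a real header this
   line would come before the #include directive. */
#define DEMO_NCOUNT 8

/* Header side (hypothetical): use a default only when not overridden. */
#ifndef DEMO_NCOUNT
#define DEMO_NCOUNT 32
#endif

static_assert(DEMO_NCOUNT > 0, "DEMO_NCOUNT must be positive");

typedef struct { const char *data; size_t size; } DEMO_NODE;

typedef struct {
    DEMO_NODE  nodes[DEMO_NCOUNT]; /* integrated buffer, sized at compile time */
    DEMO_NODE *memory;             /* buffer currently in use */
    size_t     capacity;           /* number of nodes `memory` can hold */
} DEMO_PARSER;

/* Points the parser at its own integrated buffer. */
static void demo_parser_init(DEMO_PARSER *parser) {
    parser->memory = parser->nodes;
    parser->capacity = DEMO_NCOUNT;
}

/* Switches the parser to a caller-supplied array instead. */
static void demo_parser_set_memory(DEMO_PARSER *parser,
                                   DEMO_NODE *nodes, size_t count) {
    parser->memory = nodes;
    parser->capacity = count;
}
```

An integrated buffer keeps the common small-document case to a single stack object, while the setter leaves room for larger inputs without changing the structure's size.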
Examples: ex_echo, ex_hello, ex_pretty, ex_callback
NT4C has been authored by Erich Erstu and is released under the MIT license.




