title
	ContNet Markup specification, version 0.4 (2017-09-07)

content
	section Overview
		text
			CNM is a lightweight markup language primarily meant to be used as the hypertext document markup format for ContNet. It is a line-based Unicode text markup format with indentation-delimited blocks. The primary goals of CNM are simple parsing and composition, as well as being readable and writable by humans.

			CNM contains semantic content of hypertext pages. It does not include layout, styles or scripts, as all of that is supposed to be handled by the rendering application. As such, it aims to avoid obfuscating content behind presentation and supports responsive design, as every device can render the content to fit its screen and interface.

	
	section Syntax
		text fmt
			All parts of CNM use the UTF-8 encoding. Any invalid UTF-8 sequence is replaced with the ``U+FFFD`` replacement character.

			A CNM document is mainly composed of blocks defined by indentation. The core structure of the document consists of nested blocks containing other blocks, with the leaves being either blocks with no child blocks or some form of text that does not contain any blocks.

			Each line in the document ends in a line feed character. All raw (not provided as an escape sequence) carriage return or null characters in the document are ignored. If the document does not end with a line feed character, it is parsed as if it had ended with one.

			The parsing method for the contents of a block depends on which block it is. If the block is not known, it should be ignored and all of its contents skipped by advancing until the next nonempty line with less indentation than the unknown block's contents.

			When whitespace is mentioned in the specification, it refers to the following ASCII whitespace characters: tab (``U+0009``), line feed (``U+000A``), form feed (``U+000C``) and space (``U+0020``) in their raw Unicode character form, not as an escape sequence. All other Unicode whitespace characters stand for themselves and are not collapsed or used to split fields.

			An empty line is a line consisting of at most as much indentation as the parent block's contents and nothing else. Such lines implicitly belong to the last parsed block regardless of the amount of indentation and act the same as if the indentation depth was the same as the block's contents.

		text fmt
			__**TL;DR:** Encoded in UTF-8, line-based. LF is line terminator, CR is ignored. Unknown blocks' contents are skipped.
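
		text fmt
			A minimal Python sketch of the document-level rules above, assuming the document arrives as raw bytes. The function name ``normalize_document`` is hypothetical and not defined by this specification.

		raw python
			def normalize_document(data: bytes) -> str:
			    """Decode a CNM document and apply the global text rules."""
			    # Invalid UTF-8 sequences are replaced with U+FFFD.
			    text = data.decode("utf-8", errors="replace")
			    # Raw carriage return and null characters are ignored.
			    text = text.replace("\r", "").replace("\x00", "")
			    # A document without a final line feed is parsed as if it had one.
			    if not text.endswith("\n"):
			        text += "\n"
			    return text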

		text
			The following general syntactic contexts are commonly used:


		section Block mode
			text fmt
				In block mode, every nonempty line is parsed as a block name line. 

				The block name line consists of a list of whitespace-delimited simple text tokens. The line is first split on each sequence of one or more whitespace characters that are not a part of a simple text escape sequence (specifically, not ``"\\ "``). If there's any leading or trailing whitespace, the first or last token is an empty string after splitting. If the splitting ends with a single empty token (the entire line was just whitespace), the line is treated the same as an empty line and is skipped.

				The first token in the block name line is the block name. It defines the meaning of the block and how its contents are parsed. The remaining tokens, if any, represent the block's arguments. All empty tokens in the arguments should be ignored. Some blocks might use the arguments as one single value; in that case, the arguments are joined together with spaces.

				Note that excess tabs or space indentation will result in a block with an empty name. This will usually result in an unknown block, which will then be skipped.

				All lines following the block name line that are indented at least one level more than the block name or are empty are parsed as the contents of the named block. For every such line, the initial indentation equal to one level more than the block name's is removed and the remainder of the line parsed according to the named block's mode (the inner block keeps any tab characters in excess of the indentation). Block mode parsing in the current block resumes on the first nonempty line that has less indentation than the contents of the last named block.

			text fmt
				__**TL;DR:** Block mode contains blocks. Each block starts with a line containing a simple text name and optional arguments, split by non-escaped whitespace. All lines indented over the indentation of the block name line are contents of that block.
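
			text fmt
				One possible way to split a block name line, sketched in Python. The names ``split_block_name_line`` and ``parse_block_name_line`` are hypothetical. Escape sequences inside the tokens are left unresolved here, and a line that yields only empty tokens is reported as ``None``, which is this sketch's way of treating it as an empty line.

			raw python
				from typing import List, Optional, Tuple

				WHITESPACE = "\t\n\f "

				def split_block_name_line(line: str) -> List[str]:
				    """Split on runs of raw whitespace, keeping escaped spaces in tokens."""
				    tokens, current, i = [], "", 0
				    while i < len(line):
				        ch = line[i]
				        if ch == "\\" and line[i + 1:i + 2] in ("\\", " "):
				            # An escaped backslash or escaped space stays inside the token.
				            current += line[i:i + 2]
				            i += 2
				        elif ch in WHITESPACE:
				            # A run of raw whitespace ends the current token.
				            tokens.append(current)
				            current = ""
				            while i < len(line) and line[i] in WHITESPACE:
				                i += 1
				        else:
				            current += ch
				            i += 1
				    tokens.append(current)
				    return tokens

				def parse_block_name_line(line: str) -> Optional[Tuple[str, List[str]]]:
				    """Return (block name, arguments), or None for a whitespace-only line."""
				    tokens = split_block_name_line(line)
				    if not any(tokens):
				        return None
				    # Empty argument tokens are ignored; an empty name is kept as-is.
				    return tokens[0], [t for t in tokens[1:] if t]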


		section Simple text mode
			text
				Simple text is parsed by collapsing all raw (not provided as an escape sequence) whitespace into a single space and removing any leading or trailing spaces, then resolving escape sequences.

				Simple text can contain escape sequences. These are C-style sequences of two or more characters that begin with a backslash and are parsed as the single character they represent. The following escape sequences are currently defined (without quotes):

			raw
				"\b"          ->  U+0008      backspace
				"\t"          ->  U+0009      tab
				"\n"          ->  U+000A      line feed
				"\v"          ->  U+000B      vertical tab
				"\f"          ->  U+000C      form feed
				"\r"          ->  U+000D      carriage return
				"\ "          ->  U+0020      space
				"\\"          ->  U+005C      backslash
				"\x##"        ->  U+00##      8-bit Unicode character
				"\u####"      ->  U+####      16-bit Unicode character
				"\U########"  ->  U+########  32-bit Unicode character

			text fmt
				The ``#`` characters in ``\\x##``, ``\\u####`` and ``\\U########`` escape sequences are arbitrary hexadecimal digits ``[0-9a-fA-F]``. In ``\\U########``, the first two digits should generally be zero, since Unicode only supports 21-bit characters. Invalid codepoints are unescaped into the ``U+FFFD`` replacement character.

				Any other sequence starting with a backslash that is not in the above table, as well as any ``\\x``, ``\\u`` or ``\\U`` sequence with too few hex digits, is parsed as if the backslash itself were escaped: the sequence is left in the text unchanged, with the backslash remaining present.

				Simple text mode is mostly used in block mode block names and arguments or as a part of other formats in specific blocks.

			text fmt
				__**TL;DR:** Collapse and trim whitespace. Handle C-style escape sequences. Invalid escape sequences are parsed as normal text.
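
			text fmt
				The rules in this section can be sketched in Python as a single pass that collapses raw whitespace while resolving escape sequences. The names ``parse_simple_text`` and ``resolve_escape`` are hypothetical. The sketch assumes that after an unrecognized backslash the following characters are processed normally, and that surrogate code points count as invalid.

			raw python
				ESCAPES = {"b": "\b", "t": "\t", "n": "\n", "v": "\v", "f": "\f",
				           "r": "\r", " ": " ", "\\": "\\"}
				HEX_LENGTHS = {"x": 2, "u": 4, "U": 8}
				HEX_DIGITS = "0123456789abcdefABCDEF"
				WHITESPACE = "\t\n\f "

				def resolve_escape(raw: str, i: int):
				    """Resolve the escape starting at raw[i]; return (text, chars consumed)."""
				    kind = raw[i + 1:i + 2]
				    if kind in ESCAPES:
				        return ESCAPES[kind], 2
				    if kind in HEX_LENGTHS:
				        n = HEX_LENGTHS[kind]
				        digits = raw[i + 2:i + 2 + n]
				        if len(digits) == n and all(d in HEX_DIGITS for d in digits):
				            code = int(digits, 16)
				            if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
				                return "\ufffd", 2 + n  # invalid codepoint
				            return chr(code), 2 + n
				    # Unknown or incomplete sequence: keep the backslash as-is.
				    return "\\", 1

				def parse_simple_text(raw: str) -> str:
				    out = []
				    pending_space = False  # raw whitespace seen since the last piece
				    i = 0
				    while i < len(raw):
				        ch = raw[i]
				        if ch in WHITESPACE:
				            pending_space = True
				            i += 1
				            continue
				        if ch == "\\":
				            piece, consumed = resolve_escape(raw, i)
				        else:
				            piece, consumed = ch, 1
				        # Emit one space per whitespace run, never at the start.
				        if pending_space and out:
				            out.append(" ")
				        pending_space = False
				        out.append(piece)
				        i += consumed
				    # Trailing raw whitespace is never flushed, so it is trimmed.
				    return "".join(out)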


		section Raw text mode
			text
				In raw text mode, all data is parsed as a literal text blob. Whitespace is preserved exactly as-is, including any leading tabs (tabs that are a part of the block's indentation do not count as a part of the block content in block mode) and empty lines inside the content, excluding any leading or trailing empty lines, which are removed. Global text parsing rules (ignoring carriage returns, UTF-8) still apply. Each raw text line also retains its line feed character.

				Raw mode is mostly used for the raw block and for the initial parsing of other blocks with their own syntax. In essence, every block could first be parsed in raw mode, with the result then parsed using the block's own parsing mode.

			text fmt
				__**TL;DR:** Lines are kept unmodified for later processing.
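
			text fmt
				A sketch, in Python, of the two-step approach mentioned above: a block mode context is first split into block name lines with raw contents, which can then be parsed again according to each block's mode. The name ``split_blocks`` is hypothetical. As a simplification, indented lines before the first block name line are dropped; they would form a block with an empty name, which is unknown and skipped anyway.

			raw python
				def split_blocks(text: str):
				    """Split block-mode text into (name line, raw contents) pairs."""
				    blocks = []  # each entry: [name_line, list of content lines]
				    for line in text.split("\n"):
				        if line.startswith("\t") or line == "":
				            # Indented and empty lines belong to the last parsed block;
				            # one level of indentation is removed, excess tabs are kept.
				            if blocks:
				                blocks[-1][1].append(line[1:])
				        else:
				            blocks.append([line, []])
				    result = []
				    for name_line, lines in blocks:
				        # Leading and trailing empty lines are removed from raw contents.
				        while lines and lines[0] == "":
				            lines.pop(0)
				        while lines and lines[-1] == "":
				            lines.pop()
				        # Each remaining raw line retains its line feed.
				        result.append((name_line, "".join(l + "\n" for l in lines)))
				    return result

			text fmt
				Calling ``split_blocks`` again on the raw contents of a block mode block (such as ``content``) yields its child blocks.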


	section Structure
		text fmt
			The top level of a CNM document is parsed in block mode. It contains blocks containing metadata and the content itself.

			None of the top-level blocks in CNM have any arguments.
			
			An empty top-level block is equivalent to an absent one.

			If the same top-level block appears multiple times in the document, the contents of all instances are merged together. The content merging happens after parsing, so all child blocks end with the end of each instance of a top-level block. This means that a child block of one of multiple instances of container blocks (``content``, ``site`` and ``links``) is fully contained in its parent top-level block and cannot extend into the next one. Simple text blocks (``title``) can just merge their contents as if all of their lines belonged to a single block, since simple text collapses whitespace anyway.

			The following blocks are defined on the top level:


		section title
			text
				Contains the document title. The contents of the block are parsed as simple text.

				Note that the title can be of arbitrary length or even absent and may contain characters like line feed and various control codes. Implementations are not required to display them as such and may instead prefer to display the title, or its prefix up to a certain length if it's too long, as a single line with all whitespace collapsed even after resolving escape sequences.

				While a title is recommended, a document is not required to have one. Implementations may display that as an empty title (or not show a title at all) or an implementation-defined placeholder or content excerpt of their choice.

			text fmt
				**Example:**

			raw text/cnm
				title
					This is a document title.

			text fmt
				__**TL;DR:** Simple text. May be very long or not present at all. Make sure to handle e.g. newlines.


		section links
			text fmt
				The ``links`` block can contain an arbitrary number of hyperlinks, which are intended to be a page-wide list of links to relevant parts of the website or other websites.

				The block contents are parsed in block mode.

				Each block inside the contents of the ``links`` block should have a URL as the block name and the hyperlink text as the block arguments joined with spaces. If the argument is not present or empty, the hyperlink name is set to the hyperlink URL. The contents of the URL block are parsed as simple text and represent a link description, which may be optionally displayed by the interactive client (for example, as a title that appears on mouse-over or a footnote), but may as well be hidden.

				Links with a missing URL (blank block name) are skipped.

			text fmt
				**Example:**

			raw text/cnm
				links
					/example Clicking this link leads to /example.
					/test
						The above link has no explicit title,
						so "/test" is used instead.

						However, it has a description.
						Despite the empty line,
						it's displayed as a single line.
					cnp://example.com/ Links can also be absolute URLs.

			text fmt
				__**TL;DR:** Block mode. Contains nested blocks with URL in name, link text in argument and description in simple text contents.


		section site
			text fmt
				The ``site`` block represents a sitemap. It is used to show a hierarchical tree of the current site. The block contents are parsed in block mode.

				Each block inside the site block should have a filename or filepath as the block name, which represents the path on the current site. The arguments, joined together with spaces, are an optional name of the path that is used as the hyperlink text; if not provided, then the path should be used as the name. The contents of each block are parsed in block mode and recursively contain other path blocks.

				The path blocks represent an absolute hierarchical filepath within the current site. Each block represents a hyperlink to a certain page. To construct the entire filepath for a specific path block, prepend a slash to its name and the name of every parent block all the way to the site block itself, then join them together into a single string. If a block path contains slashes, it represents several levels of directories; path composition rules are unchanged. If a block path has a trailing slash, it should be preserved in the filepath. The final filepath represents a relative URL based on the document root of the current site.

				The client should display these as a list or tree of hyperlinks for navigating the current site. It may assume that a node whose path matches the current page's location is the current page (e.g. shows it in a different color, or shows all other nodes collapsed, etc.). The order of nodes should not be changed and nodes with duplicate path or name should be kept as-is.

				Sitemap entries with a missing path (blank block name) are skipped.

			text fmt
				**Example:**

			raw text/cnm
				site
					foo This is a link to /foo
						bar And this to /foo/bar
						baz/quux This one leads to /foo/baz/quux
							test And this to /foo/baz/quux/test
						baz
							quux Above link uses "baz" as the name.
								test2 This leads to /foo/baz/quux/test2
					cnp://example.com/ This leads to /cnp:/example.com/

			text fmt
				__**TL;DR:** Block mode. Contains recursive block mode blocks with paths as names and hyperlink text as arguments. Join the names from the root site block to the selected child node into a filepath.
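
			text fmt
				The filepath composition can be sketched in Python as follows, assuming the sitemap has already been parsed into nested ``(name, text, children)`` tuples. Both that representation and the name ``flatten_site`` are hypothetical. The sketch joins names literally and does not normalize repeated slashes.

			raw python
				def flatten_site(nodes, prefix=""):
				    """Return (filepath, hyperlink text) pairs in document order."""
				    links = []
				    for name, text, children in nodes:
				        if not name:
				            continue  # entries with a missing path are skipped
				        # Prepend a slash to the name and append it to the parent path.
				        path = prefix + "/" + name
				        # Without an explicit text, the path block's name is used.
				        links.append((path, text or name))
				        links.extend(flatten_site(children, path))
				    return links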


		section content
			text fmt
				The ``content`` top-level block contains the entire body of the document. All of content's child blocks represent the document content.

				The block contents are parsed in block mode. The meaning of each child block depends on its name. The following content blocks are currently defined:


			section section
				text fmt
					The ``section`` block represents a division of the contents with an optional title.

					The contents of the section block are parsed in block mode and can be arbitrary content blocks.

					If the block has arguments, they are joined together with spaces and represent the section title. The section title is displayed as a heading and can be used as a content selector inside the document. Nested sections with titles represent subsections.

					A section without a title groups the child blocks together without counting as a section (e.g. no table of contents entry). An example use of that is putting multiple text blocks into a list item. As a direct child of the ``content`` or ``section`` block, a title-less section does nothing and is equivalent to a document that has its child blocks directly inside the parent block in the place of the section block.

				text fmt
					**Example:**

				raw text/cnm
					content
						section Section name goes here.

				text fmt
					__**TL;DR:** Group of content blocks with a heading.


			section text
				text fmt
					The ``text`` block represents text contents.

					It is parsed in raw text mode, with additional formatting being applied on top depending on the block arguments.

					The ``text`` block can be specified with a text format mode as the first argument. The format may be used to add rich text formatting.

					Currently, there are three text format modes defined: ``plain``, ``pre`` and ``fmt``. If the block argument is empty, the ``plain`` format is used. Contents of blocks with unknown format modes can be parsed as if they were ``raw`` blocks.

				text fmt
					__**TL;DR:** Contains text. Formatting depends on argument.


				section text plain
					text fmt
						The ``text plain`` block represents plain text content. It consists of a sequence of paragraphs of simple text. Since it's the default mode for the ``text`` block, using the ``plain`` argument is not necessary.

						A paragraph is a sequence of consecutive nonempty lines of simple text. A paragraph ends with an empty line or the end of the text block. When displaying paragraphs, spacing should be added between them (such as some padding or a blank line). Escaped line feeds in the text itself do not have this spacing.

					text fmt
						**Example:**

					raw text/cnm
						content
							text
								This is a paragraph of text.
								This sentence is in the same line as the above.

								This one, however, is a new paragraph.\n
								And the escaped line break above splits this
								sentence into a new line, but not a new paragraph.

								This   is   joined   by   single   spaces.

					text fmt
						__**TL;DR:** Contains paragraphs of simple text and escape sequences.


				section text pre
					text fmt
						The ``text pre`` block represents preformatted plain text content.

						The ``text pre`` block contents are parsed the same way as a ``raw`` block's, except that simple text escape sequences are still resolved and no syntax highlighting should be done. Whitespace is left untouched and the whole text block is just a single paragraph regardless of blank lines (which are simply literal line feeds).

					text fmt
						**Example:**

					raw text/cnm
						content
							text pre
								This is the first line.
								This is on a new line.
								This sentence is\non two lines.

								The above line is empty, but not a paragraph.
								This   line   contains   triple   spaces.

					text fmt
						__**TL;DR:** Contains preformatted raw text and escape sequences.


				section text fmt
					text fmt
						The ``text fmt`` block represents text that contains simple inline formatting.

						First, the text block is split into paragraphs the same way as a plain text block, with whitespace collapsed as in simple text. After that, the CNMfmt formatting is applied to each paragraph. Finally, escape sequences (including CNMfmt specific ones) are resolved.

						See the @@#/The\ CNMfmt\ inline\ formatting\ submarkup CNMfmt@@ section below for more information.

					text fmt
						**Example:**

					raw text/cnm
						content
							text fmt
								This is **emphasized**, __alternate__, ``code``,
								""quoted"" and @@/ a hyperlink to /@@.

								**emphasized __emphasized+alternate **alternate ""alternate+quoted
								still alternate+quoted **alternate+quoted+emphasized

								This is no longer emphasized, alternate, or quoted.
								It is also a new paragraph containing a single
								line without formatting.

								@@# This link contains **emphasized** text.

								**@@# This hyperlink is emphasized,**@@ but this text isn't.

					text fmt
						__**TL;DR:** Contains paragraphs of text containing inline CNMfmt formatting.


			section raw
				text fmt
					The ``raw`` block represents preformatted text contents.

					The block contents are parsed in raw mode. When possible, the contents should be displayed with a monospaced font with all whitespace preserved.

					If present, the first block argument represents the type of the contents. That should generally be the MIME type of the data or lowercased name of the language/syntax in the contents of the ``raw`` block (for example, ``text/html`` or ``html``, ``text/javascript`` or ``application/javascript`` or ``javascript``). When rendering the block contents, the type may be used to perform syntax highlighting.

					Note that, as in all other blocks, it's not possible to include leading or trailing blank lines in the ``raw`` block's contents.

				text fmt
					**Example:**

				raw text/cnm
					content
						raw
							this is not **emphasized**
							this is on a new line
							this line is \n all in one line
							above line contains characters "\" and "n"

							the above line was empty

				text fmt
					__**TL;DR:** Raw preformatted text. Argument is type name for optional syntax highlighting.


			section list
				text fmt
					The ``list`` block represents a list of items.

					The block contents are parsed in block mode and can contain arbitrary content blocks. Each child block represents one list item; several blocks can be grouped into a single item using a section block.

					The first block argument represents the list type. Currently, two list types are defined: ordered and unordered. Unordered lists are simple lists of items with e.g. bullet points. Ordered lists use Arabic numbers by default; currently, choosing alternate numbering style is not possible, but it may be added in the future. Nested unordered lists may use different bullet style, but are not required to. Nested ordered lists use the same style of numbering as the parent one; nested numbering style may be configurable in future versions of CNM. Ordered lists always start with 1.

				text fmt
					**Example:**

				raw text/cnm
					content
						list
							text
								This is the first item.
							text
								Second item.
							section
								text
									Third item.
								text
									Still third item.
							list
								text
									Nested list, item 4.1.

				text fmt
					__**TL;DR:** List of content blocks. Argument: ordered or unordered. Ordered always starts with 1.


			section table
				text fmt
					The ``table`` block represents two-dimensional tabular data.

					The contents are parsed in block mode. A table can contain two different types of blocks: ``header`` and ``row``. The ``header`` and ``row`` blocks both act like a section block without an argument: they can contain arbitrary content blocks. Each of their child blocks represents one table cell; to group multiple blocks into one cell, a ``section`` block without a title can be used.

					The width of the table depends on the longest header or row. Any headers or rows with fewer cells than that are padded with empty cells on the right side.

					Currently, there is no support for multi-column or multi-row cells.
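
				text fmt
					The padding rule can be sketched in Python, assuming each header or row has already been parsed into a list of cells. The name ``pad_rows`` and the use of ``None`` for an empty cell are hypothetical choices.

				raw python
					def pad_rows(rows):
					    """Pad every header or row on the right to the table width."""
					    # The table width is the length of the longest header or row.
					    width = max((len(row) for row in rows), default=0)
					    return [row + [None] * (width - len(row)) for row in rows]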


				section header
					text fmt
						The ``header`` block represents a table header row.

						It is parsed the same way as a ``section`` block without a title and can contain arbitrary content blocks. Each child block represents a column header cell.

						The ``header`` block represents a row with table headers. It should be displayed in a more emphasized manner and, optionally, allow sorting all follow-up rows until the next header or the end of the table by columns. A table is not required to start with a header, nor to include one at all.


				section row
					text fmt
						The ``row`` block represents a table data row.

						It is parsed the same way as a ``section`` block without a title and can contain arbitrary content blocks. Each child block represents a table body cell.

						The ``row`` block represents a row of the table contents.


				text fmt
					**Example:**

				raw text/cnm
					content
						table
							header
								text
									Header of column 1
								text
									Header of column 2
								text
									Header of column 3
							row
								text
									Row 1 column 1
								text
									Row 1 column 2
							row
								text
									Row 2 column 1
								text
									Row 2 column 2
							row
								section
									text
										Row 3 column 1
									text
										Still Row 3 column 1
								text
									Row 3 column 2
								text
									Row 3 column 3

									Row 1 column 3 and row 2 column 3 are empty.

				text fmt
					__**TL;DR:** Contains headers and rows. Child blocks of these are cells.


			section embed
				text fmt
					The ``embed`` block is used to embed external content into the document.

					The first block argument represents the MIME type of the embedded content. It can be used by the user agent to decide how to handle it. Graphical browsers are recommended to display at least common image types (e.g. ``image/png``, ``image/jpeg``, ``image/webp`` and ``image/svg+xml``) inside the page by default. An empty argument or invalid MIME type can be treated as an ``application/octet-stream`` type and not be embedded.

					The second argument is the URL pointing to the embedded content. An embed block without a URL should be ignored. The URL may also be a data URI.

					The contents of the block are parsed in simple text mode and represent the description of the embedded content. If present, the description can be displayed as e.g. a caption, mouse-over title, placeholder when the content cannot be embedded, etc., but may as well be hidden.

					If the content type is unknown or cannot be embedded within the page, the embedded content should be presented as a hyperlink instead.

				text fmt
					**Example:**

				raw text/cnm
					content
						embed image/png /static/example.png
							This is an embedded image's caption/title/hover text.

				text fmt
					__**TL;DR:** Arguments are MIME type and URL, contents are description. Embed inside page if possible, otherwise provide hyperlink.


	section Selectors
		text fmt
			CNM selector queries can be used to identify specific sections in a CNM document.

			Selectors can be used to select a section in the document (e.g. to move an open document so that it's visible) or filter a document to only show certain sections and their content.


		section Section selector
			text fmt
				A section selector query identifies a specific section in the document. It's usually used in the hash fragment part of a URL to move the visible document to the named section. Section title selectors are case-sensitive.

				Section selectors can select sections either by a section title, a path of section titles or a path of section indices. A section without a title does not count as a section and cannot be selected by section selectors; any mention of sections in the specification of selectors refers exclusively to sections with non-empty titles. A section with an empty title can essentially be regarded as a generic container block.

			section Title selector
				raw
					#{title}

				text fmt
					The title selector selects the first section with the given title (``{title}``) in the document. The section order is defined by their vertical position; block depth is irrelevant. If multiple sections in the document have the same title, this selector only selects the first one. The title must use URL percent-encoding where at least the slash character (``U+002F``) is encoded into ``%2F`` or ``%2f``.

					An empty title matches the top of the document contents.

					Note that the ``#`` character (``U+0023``) in the selector is not the same as the one separating the URL hash fragment. An example URL with a title selector is @@cnp://example.com/file.cnm##title@@.

			section Title path selector
				raw
					/{path}

				text fmt
					The title path selector selects a section based on a path of section titles. The ``{path}`` part of the query consists of zero or more section titles (escaped just like in the title selector) separated by a single slash character.

					Each title in the path selects a section using the same method as the title selector, but only considers sections that aren't a child block of another section in the current context (are accessible from the current context without passing through another section). The initial context is the top-level ``content`` block. Each time a section in the path is matched, the new context becomes this section's contents.

					If any part of the path fails to find a matching section, the query does not match anything.

					An empty path matches the top of the document contents. An empty title in a non-empty path does not match anything.

			section Index path selector
				raw
					${indices}

				text fmt
					The index path selector selects a section based on a path of section indices. The ``{indices}`` part of the query is a dot-separated path of zero or more section indices represented by decimal numbers.

					Each index in the path selects a section within the current context (as in the title path selector). The first section has the index 1.

					If any index in the path is zero or higher than the number of sections in its context, the query does not match anything.

					An empty path matches the top of the document contents.
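
			text fmt
				The three section selectors can be sketched in Python over a simplified document tree. The tree representation, ``(name, argument, children)`` tuples where ``children`` is a list for container blocks and a string for text, is an assumption of this sketch, as are the names ``sections_in``, ``all_sections`` and ``select_section``. The top of the document contents is returned as a pseudo-block named ``content``.

			raw python
				from urllib.parse import unquote

				def sections_in(blocks):
				    """Yield titled sections reachable without entering another one."""
				    for name, arg, children in blocks:
				        if name == "section" and arg:
				            yield (name, arg, children)
				        elif isinstance(children, list):
				            # Untitled sections, lists and similar containers are transparent.
				            yield from sections_in(children)

				def all_sections(blocks):
				    """Yield every titled section in vertical (document) order."""
				    for section in sections_in(blocks):
				        yield section
				        yield from all_sections(section[2])

				def select_section(content, selector):
				    """Return the selected section, the top pseudo-block, or None."""
				    top = ("content", "", content)
				    kind, rest = selector[:1], selector[1:]
				    if kind == "#":  # title selector
				        if not rest:
				            return top
				        title = unquote(rest)
				        return next((s for s in all_sections(content) if s[1] == title), None)
				    if kind == "/":  # title path selector
				        current = top
				        for part in (rest.split("/") if rest else []):
				            title = unquote(part)
				            current = next(
				                (s for s in sections_in(current[2]) if s[1] == title), None)
				            if current is None:
				                return None
				        return current
				    if kind == "$":  # index path selector
				        current = top
				        for part in (rest.split(".") if rest else []):
				            if not (part.isascii() and part.isdigit()):
				                return None
				            found = list(sections_in(current[2]))
				            index = int(part)
				            if index < 1 or index > len(found):
				                return None
				            current = found[index - 1]
				        return current
				    return None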


		section Content selector
			text fmt
				A content selector is a selector that selects a subset of the document contents based on a section.

				The content selectors have the same syntax as the section selectors, but may be optionally prefixed with an exclamation mark (``U+0021``) for a shallow selector.

				Using a content selector query on a document returns a new document consisting of only the named section, all of its contents and all parent block names up to the top-level without any of their sibling blocks or other contents.

				A shallow selector selects a similar document, but excludes the contents of any child sections of the selected section (the section block name lines and any non-section blocks with their contents are kept).

				For the cases where a specific selector selects the top of the document contents, the entire ``content`` block with all of its contents is selected (or, in the case of a shallow selector, without child section contents).

				An empty content selector selects the entire document with all of its contents, including non-``content`` top-level blocks, unmodified (though the actual document may be recomposed, as long as the contents aren't changed). A content selector consisting only of the shallow selector modifier ``!`` selects the same document, but without the contents of any sections.


		section Examples
			text fmt
				Example CNM document:

			raw text/cnm
				title
					Test
				content
					section A
						text
							T1
						section B
							text
								T2
							list
								text
									T3
								section C
									text
										T4
						section C
							text
								T5
					list
						section
							text
								T6
						section E
							text
								T7
						text
							T8
					section E
						text
							T9

			text fmt
				Section selectors:
			list
				text fmt
					``#A`` selects the section ""A"" containing the text ""T1"", section ""B"" and section ""C"".
				text fmt
					``#C`` selects the section ""C"" containing the text ""T4"".
				text fmt
					``#F`` does not select anything.
				text fmt
					``/A`` selects the section ""A"" containing the text ""T1"", section ""B"" and section ""C"".
				text fmt
					``/A/B/C`` selects the section ""C"" containing the text ""T4"".
				text fmt
					``/A/C`` selects the section ""C"" containing the text ""T5"".
				text fmt
					``/E`` selects the section ""E"" containing the text ""T7"".
				text fmt
					``/B`` does not select anything.
				text fmt
					``$1`` selects the section ""A"" containing the text ""T1"", section ""B"" and section ""C"".
				text fmt
					``$2`` selects the section ""E"" containing the text ""T7"".
				text fmt
					``$3`` selects the section ""E"" containing the text ""T9"".
				text fmt
					``$1.1.1`` selects the section ""C"" containing the text ""T4"".
				text fmt
					``$1.3`` does not select anything.

			text fmt
				Content selectors:
			list
				section
					text fmt
						``#C`` selects the following document:
					raw text/cnm
						content
							section A
								section B
									list
										section C
											text
												T4
				section
					text fmt
						``!/A`` selects the following document:
					raw text/cnm
						content
							section A
								text
									T1
								section B
								section C
				section
					text fmt
						``!/`` selects the following document:
					raw text/cnm
						content
							section A
							list
								section
									text
										T6
								section E
								text
									T8
							section E
				section
					text fmt
						``!`` selects the following document:
					raw text/cnm
						title
							Test
						content
							section A
							list
								section
									text
										T6
								section E
								text
									T8
							section E
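
			text fmt
				As a further sketch derived from the rules above (it is an illustration, not one of the normative examples), the shallow selector ``!/A/B`` would be expected to select the following document, keeping the list and the name line of section ""C"" but not the contents of that section:

			raw text/cnm
				content
					section A
						section B
							text
								T2
							list
								text
									T3
								section C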


	section The CNMfmt inline formatting submarkup
		text fmt
			The CNMfmt markup is used within ``text fmt`` content blocks to provide inline formatting of text.

			CNMfmt extends the CNM ``text plain`` block by introducing toggles of various format options. Each toggle consists of two symbol characters. If the format of the toggle is not currently in effect, the toggle enables it; otherwise, the toggle disables it. Formats do **not** have to be toggled in LIFO order. All formats are implicitly closed at the end of the paragraph.

			The following toggles and formats are currently defined:

		raw
			**  emphasized
			__  alternate
			``  code
			""  quotation
			@@  hyperlink
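
		text fmt
			As an illustration of the toggle rules above, the following sketch uses sample sentences made up for this example. The first paragraph closes its formats out of LIFO order. The second paragraph leaves the code format open, so it is closed implicitly at the end of the paragraph and the third paragraph starts with no format in effect:

		raw text/cnm
			text fmt
				Plain, **emphasized, __both, **alternate only,__ plain again.

				This ``code format is closed by the end of the paragraph.

				This paragraph starts with no format in effect.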


		section Emphasized
			text fmt
				The **\*\*emphasized\*\*** format indicates emphasized text. It uses two asterisks (``\*\*``) as the toggle. The usual way to style emphasized text is with a bold font, but implementations may choose to use a different style.

		section Alternate
			text fmt
				The __\_\_alternate\_\___ format indicates text in an alternate voice that is offset from the normal text. It uses two underscores (``\_\_``) as the toggle. The usual way to style alternate text is with an italic font, but implementations may choose to use a different style.

		section Code
			text fmt
				The contents of the ``\`\`code\`\``` format represent computer code or similar text that is usually not in a spoken language. It uses two grave accents (``\`\```) as the toggle. Note that whitespace in this format is **not** preserved; it is collapsed the same way as in the rest of the ``text fmt`` block. Code should be displayed in a monospaced font, if possible.

		section Quote
			text fmt
				The ""\"\"quote\"\""" format represents a quotation. It uses two quote marks (``\"\"``) as the toggle. The usual way to style quoted text is to include quote marks on the beginning and end and/or frame it, but implementations may choose a different style.

		section Hyperlink
			text fmt
				The @@cnp://example.com/ \@\@cnp://example.com/ hyperlink\@\@@@ format represents an inline hyperlink. It uses two at signs (``\@\@``) as the toggle.

				The hyperlink consists of two parts: the URL and the link text.

				The URL is the first non-whitespace word inside the formatted text. No CNMfmt toggles are recognized inside the URL except ``\@\@``, which ends the entire hyperlink format (for example, if a ``\_\_`` appears inside the URL, it does not toggle the alternate format). Note that the URL can still contain CNM simple text and CNMfmt escape sequences; these can be used to supply Unicode characters and spaces instead of manually percent-encoding the URL.

				If the hyperlink format consists of more than one word, the remainder of the content is used as the hyperlink text. It may contain arbitrary CNMfmt formatting. If the link text is blank, the URL is used as link text instead.
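
			text fmt
				As a sketch of the rules above (the URLs and link texts are placeholders chosen for this example), the three hyperlinks below show a link without link text, a link whose text contains further formatting, and a URL containing two underscores that do not toggle the alternate format:

			raw text/cnm
				text fmt
					@@cnp://example.com/@@

					@@cnp://example.com/docs the **full** documentation@@

					@@cnp://example.com/a__b a link with underscores in its URL@@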


		text fmt
			Any other sequence of two symbols stands for itself as text.

			The CNMfmt markup also includes several new escapes alongside the standard CNM ones to allow including the toggle characters as text:

		raw
			"\*"  ->  U+002A  asterisk
			"\_"  ->  U+005F  underscore
			"\`"  ->  U+0060  grave accent
			"\""  ->  U+0022  quotation mark
			"\@"  ->  U+0040  at sign