author clsr <clsr@clsr.net> 2017-09-01 17:07:00 +0200
committer clsr <clsr@clsr.net> 2017-09-01 17:07:00 +0200
commit c0374c5b4a12c1b1f937d151775f87268bed8360
tree 2e8f17bb714441ed543393c13be5d7e8b0468e56
parent d753b91063d7d388be4b9a8cef3e4fef2f4ffe9d
Rework CNMfmt to define semantics instead of styles
CNMfmt now defines semantic formatting instead of adding style to the content (such as emphasized instead of bold and alternate voice or mood instead of italic). Additionally, a new quotation formatting has been added.
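The toggle semantics introduced by this commit can be sketched in Python as follows. This is only an illustration, not the reference parser: the names `TOGGLES` and `split_spans` are hypothetical, the `@@` hyperlink format and whitespace collapsing are deliberately left out, and only the backslash escapes for the five toggle characters are handled. Each toggle flips one semantic format on or off independently of the others, so the result is a list of text spans paired with the set of formats active for that span.

```python
# Hypothetical mapping from CNMfmt toggles to the semantic names used in the diff.
TOGGLES = {
    "**": "emphasized",
    "__": "alternate",
    "``": "code",
    '""': "quotation",
}


def split_spans(text):
    """Split one paragraph into (text, active formats) pairs.

    Assumption: hyperlinks (@@) are not interpreted and stay as literal text.
    """
    spans = []
    active = set()
    buf = []
    i = 0

    def flush():
        # Emit the buffered text together with the formats active so far.
        if buf:
            spans.append(("".join(buf), frozenset(active)))
            buf.clear()

    while i < len(text):
        pair = text[i:i + 2]
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in '*_`"@':
            # Escaped toggle character: keep it as literal text.
            buf.append(text[i + 1])
            i += 2
        elif pair in TOGGLES:
            # A toggle ends the current span and flips exactly one format.
            flush()
            active ^= {TOGGLES[pair]}
            i += 2
        else:
            buf.append(text[i])
            i += 1
    flush()
    return spans
```

A renderer is then free to pick a style per semantic name (for example bold for `emphasized` and italic for `alternate`), which is the separation of meaning from presentation that the commit message describes.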
-rw-r--r-- cnm-specification.cnm 86
1 file changed, 43 insertions, 43 deletions
diff --git a/cnm-specification.cnm b/cnm-specification.cnm
index 618207f..c599a61 100644
--- a/cnm-specification.cnm
+++ b/cnm-specification.cnm
@@ -1,5 +1,5 @@
title
- ContNet Markup specification, version 0.3.1 (2017-08-23)
+ ContNet Markup specification, version 0.4 draft (2017-09-01)
content
section Overview
@@ -17,14 +17,14 @@ content
Each line in the document ends in a line feed character. All raw (not provided as an escape sequence) carriage return or null characters in the document are ignored. If the document does not end with a line feed character, it is parsed as if it had ended with one.
- The contents of each block are parsed according to that block's parsing mode. If the block is not known, it can be parsed as a raw text block or skipped entirely.
+ The parsing method for the contents of a block depends on which block it is. If the block is not known, it should be ignored and all of its contents skipped by advancing until the next nonempty line with less indentation than the unknown block's contents.
When whitespace is mentioned in the specification, it refers to the following ASCII whitespace characters: tab (``U+0009``), line feed (``U+000A``), form feed (``U+000C``) and space (``U+0020``) in their raw Unicode character form, not as an escape sequence. All other Unicode whitespace characters stand for themselves and are not collapsed or used to split fields.
An empty line is a line consisting of at most as much indentation as the parent block's contents and nothing else. Such lines implicitly belong to the last parsed block regardless of the amount of indentation and act the same as if the indentation depth was the same as the block's contents.
text fmt
- //**TL;DR:** Encoded in UTF-8, line-based. LF is line terminator, CR is ignored. Unknown blocks' contents are skipped.
+ __**TL;DR:** Encoded in UTF-8, line-based. LF is line terminator, CR is ignored. Unknown blocks' contents are skipped.
text
The following general syntactic contexts are commonly used:
@@ -43,7 +43,7 @@ content
All lines following the block name line that are indented at least one level more than the block name or are empty are parsed as the contents of the named block. For every such line, the initial indentation equal to one level more than the block name's is removed and the remainder of the line parsed according to the named block's mode (the inner block keeps any tab characters in excess of the indentation). Block mode parsing in the current block resumes on the first nonempty line that has less indentation than the contents of the last named block.
text fmt
- //**TL;DR:** Block mode contains blocks. Each block starts with line containing simple text name and optional arguments, split by non-escaped whitespace. All lines indented over the indentation of the block name line are contents of that block.
+ __**TL;DR:** Block mode contains blocks. Each block starts with line containing simple text name and optional arguments, split by non-escaped whitespace. All lines indented over the indentation of the block name line are contents of that block.
section Simple text mode
@@ -73,7 +73,7 @@ content
Simple text mode is mostly used in block mode block names and arguments or as a part of other formats in specific blocks.
text fmt
- //**TL;DR:** Collapse and trim whitespace. Handle C-style escape sequences. Invalid escape sequences are parsed as normal text.
+ __**TL;DR:** Collapse and trim whitespace. Handle C-style escape sequences. Invalid escape sequences are parsed as normal text.
section Raw text mode
@@ -83,7 +83,7 @@ content
Raw mode is mostly used for the raw block and for the initial parsing of other blocks with their own syntax. In essence, every block could first be parsed in raw mode, then the results of that using the block's parsing mode.
text fmt
- //**TL;DR:** Lines are kept unmodified for later processing.
+ __**TL;DR:** Lines are kept unmodified for later processing.
section Structure
@@ -94,7 +94,7 @@ content
An empty top-level block is equivalent to an absent one.
- If the same top-level block appears multiple times, the contents are merged together, with all child blocks of the first instance ending with it if it is a block mode block (``content``, ``site`` and ``links``). Non-content blocks (``title``) are merged as if their contents were concatenated with empty lines in between.
+ If the same top-level block appears multiple times in the document, the contents of all instances are merged together. The content merging happens after parsing, so all child blocks end with the end of each instance of a top-level block. This means that a child block of one of multiple instances of container blocks (``content``, ``site`` and ``links``) is fully contained in its parent top-level block and cannot extend into the next one. Simple text blocks (``title``) can just merge their contents as if all of their lines belonged to a single block, since simple text collapses whitespace anyway.
The following blocks are defined on the top level:
@@ -115,7 +115,7 @@ content
This is a document title.
text fmt
- //**TL;DR:** Simple text. May be very long or not present at all. Make sure to handle e.g. newlines.
+ __**TL;DR:** Simple text. May be very long or not present at all. Make sure to handle e.g. newlines.
section links
@@ -144,7 +144,7 @@ content
cnp://example.com/ Links can also be absolute URLs.
text fmt
- //**TL;DR:** Block mode. Contains nested blocks with URL in name, link text in argument and description in simple text contents.
+ __**TL;DR:** Block mode. Contains nested blocks with URL in name, link text in argument and description in simple text contents.
section site
@@ -174,7 +174,7 @@ content
cnp://example.com/ This leads to /cnp:/example.com/
text fmt
- //**TL;DR:** Block mode. Contains recursive block mode blocks with paths as names and hyperlink text as descriptions. Join the names from the root site block to the selected child node into a filepath.
+ __**TL;DR:** Block mode. Contains recursive block mode blocks with paths as names and hyperlink text as descriptions. Join the names from the root site block to the selected child node into a filepath.
section content
@@ -202,7 +202,7 @@ content
section Section name goes here.
text fmt
- //**TL;DR:** Group of content blocks with a heading.
+ __**TL;DR:** Group of content blocks with a heading.
section text
@@ -216,7 +216,7 @@ content
Currently, there are three text format modes defined: ``plain``, ``pre`` and ``fmt``. If the block argument is empty, the ``plain`` format is used. Contents of blocks with unknown format modes can be parsed as if they were ``raw`` blocks.
text fmt
- //**TL;DR:** Contains text. Formatting depends on argument.
+ __**TL;DR:** Contains text. Formatting depends on argument.
section text plain
@@ -241,7 +241,7 @@ content
This is joined by single spaces.
text fmt
- //**TL;DR:** Contains paragraphs of simple text and escape sequences.
+ __**TL;DR:** Contains paragraphs of simple text and escape sequences.
section text pre
@@ -264,7 +264,7 @@ content
This line contains triple spaces.
text fmt
- //**TL;DR:** Contains preformatted raw text and escape sequences.
+ __**TL;DR:** Contains preformatted raw text and escape sequences.
section text fmt
@@ -281,22 +281,22 @@ content
raw text/cnm
content
text fmt
- This is **bold**, //italic//, __underlined__,
- ``monospaced`` and @@/ a hyperlink to /@@.
+ This is **emphasized**, __alternate__, ``code``,
+ ""quoted"" and @@/ a hyperlink to /@@.
- **bold //bold+italic **italic __italic+underlined
- still italic+underlined **italic+underlined+bold
+ **emphasized __emphasized+alternate **alternate ""alternate+quoted
+ still alternate+quoted **alternate+quoted+emphasized
- This is no longer bold, italic, or underlined.
+ This is no longer emphasized, alternate, or quoted.
It is also a new paragraph containing a single
line without formatting.
- @@# This link contains **bold** text.
+ @@# This link contains **emphasized** text.
- **@@# This hyperlink is bold,**@@ but this isn't.
+ **@@# This hyperlink is emphasized,**@@ but this text isn't.
text fmt
- //**TL;DR:** Contains paragraphs of text containing inline CNMfmt formatting.
+ __**TL;DR:** Contains paragraphs of text containing inline CNMfmt formatting.
section raw
@@ -315,7 +315,7 @@ content
raw text/cnm
content
raw
- this is not **bold**
+ this is not **emphasized**
this is on a new line
this line is \n all in one line
above line contains characters "\" and "n"
@@ -323,7 +323,7 @@ content
the above line was empty
text fmt
- //**TL;DR:** Raw preformatted text. Argument is type name for optional syntax highlighting.
+ __**TL;DR:** Raw preformatted text. Argument is type name for optional syntax highlighting.
section list
@@ -354,7 +354,7 @@ content
Nested list, item 4.1.
text fmt
- //**TL;DR:** List of content blocks. Argument: ordered or unordered. Ordered always starts with 1.
+ __**TL;DR:** List of content blocks. Argument: ordered or unordered. Ordered always starts with 1.
section table
@@ -423,14 +423,14 @@ content
Row 1 column 3 and row 2 column 3 are empty.
text fmt
- //**TL;DR:** Contains headers and rows. Child blocks of these are cells.
+ __**TL;DR:** Contains headers and rows. Child blocks of these are cells.
section embed
text fmt
The ``embed`` block is used to embed external content into the document.
- The first block argument represents the MIME type of the embedded content. It can be used by the user agent to decide how to handle it. Graphical browsers are recommended to display at least common image types (e.g. image/png, image/jpeg and image/webp) inside the page by default. An empty argument or invalid MIME type can be treated as an application/octet-stream type and not be embedded.
+ The first block argument represents the MIME type of the embedded content. It can be used by the user agent to decide how to handle it. Graphical browsers are recommended to display at least common image types (e.g. ``image/png``, ``image/jpeg``, ``image/webp`` and ``image/svg+xml``) inside the page by default. An empty argument or invalid MIME type can be treated as an application/octet-stream type and not be embedded.
The second argument is the URL pointing to the embedded content. An embed block without a URL should be ignored. The URL may also be a data URI.
@@ -447,7 +447,7 @@ content
This is an embedded image's caption/title/hover text.
text fmt
- //**TL;DR:** Argument is MIME type and URL, contents are description. Embed inside page if possible, otherwise provide hyperlink.
+ __**TL;DR:** Argument is MIME type and URL, contents are description. Embed inside page if possible, otherwise provide hyperlink.
section The CNMfmt inline formatting submarkup
@@ -459,36 +459,36 @@ content
The following toggles and formats are currently defined:
raw
- ** bold
- // italic
- __ underlined
- `` monospaced
+ ** emphasized
+ __ alternate
+ `` code
+ "" quotation
@@ hyperlink
- section Bold
+ section Emphasized
text fmt
- The **\*\*bold\*\*** format makes all text inside it bold. It uses two asterisks (``\*\*``) as the toggle.
+ The **\*\*emphasized\*\*** format indicates emphasized text. It uses two asterisks (``\*\*``) as the toggle. The usual way to style emphasized text is with a bold font, but implementations may choose to use a different style.
- section Italic
+ section Alternate
text fmt
- The //\/\/italic\/\/// format makes all text inside it italic. It uses two slashes (``\/\/``) as the toggle.
+ The __\_\_alternate\_\___ format indicates text in an alternate voice that is offset from the normal text. It uses two underscores (``\_\_``) as the toggle. The usual way to style alternate text is with an italic font, but implementations may choose to use a different style.
- section Underlined
+ section Code
text fmt
- The __\_\_underlined\_\___ format makes all text inside it underlined. It uses two underscores (``\_\_``) as the toggle.
+ The contents of the ``\`\`code\`\``` format represent computer code or similar text that is usually not in a spoken language. It uses two grave accents (``\`\```) as the toggle. Note that whitespace in this tag is **not** preserved; it is collapsed the same way as in the rest of the ``text fmt`` block. Code should be displayed in a monospaced font, if possible.
- section Monospaced
+ section Quote
text fmt
- The contents of the ``\`\`monospaced\`\``` format should be rendered using a monospaced font, if possible. Whitespace is **not** preserved; it is collapsed the same way as in the rest of the ``text fmt`` block. It uses two grave accents (``\`\```) as the toggle.
+ The ""\"\"quote\"\""" format represents a quotation. It uses two quote marks (``\"\"``) as the toggle. The usual way to style quoted text is to include quote marks on the beginning and end and/or frame it, but implementations may choose a different style.
section Hyperlink
text fmt
- The @@cnp://example.com/ \@\@cnp:\/\/example.com\/ hyperlink\@\@@@ format represents an inline hyperlink. It uses two at signs (``\@\@``) as the toggle.
+ The @@cnp://example.com/ \@\@cnp://example.com/ hyperlink\@\@@@ format represents an inline hyperlink. It uses two at signs (``\@\@``) as the toggle.
The hyperlink consists of two parts: the URL and the link text.
- The URL is the first non-whitespace word inside the formatted text. The URL does not contain any CNMfmt toggles excluding ``\@\@``, which ends the entire hyperlink format (for example, the ``\/\/`` inside the URL does not toggle the italic format). Note that the URL can still contain CNM simple text and CNMfmt escape sequences; these can be used to supply Unicode characters and spaces instead of manually percent-encoding the URL.
+ The URL is the first non-whitespace word inside the formatted text. The URL does not contain any CNMfmt toggles excluding ``\@\@``, which ends the entire hyperlink format (for example, if a ``\_\_`` appears inside the URL, it does not toggle the alternate format). Note that the URL can still contain CNM simple text and CNMfmt escape sequences; these can be used to supply Unicode characters and spaces instead of manually percent-encoding the URL.
If the hyperlink format consists of more than one word, the remainder of the content is used as the hyperlink text. It may contain arbitrary CNMfmt formatting. If the link text is blank, the URL is used as link text instead.
@@ -500,7 +500,7 @@ content
raw
"\*" -> U+002A asterisk
- "\/" -> U+002F slash
"\_" -> U+005F underscore
"\`" -> U+0060 grave accent
+ "\"" -> U+0022 quotation mark
"\@" -> U+0040 at sign
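The first hunk of this diff changes how unknown blocks are handled: their contents are skipped by advancing to the next nonempty line with less indentation than the unknown block's contents. A minimal Python sketch of that rule is shown below. It assumes that indentation is made of tab characters, one tab per level, and that the document has already been split into lines without their line feeds. The function name `skip_unknown_block` is hypothetical.

```python
def skip_unknown_block(lines, name_index):
    """Return the index of the first line that follows an unknown block.

    lines      -- document lines without trailing line feeds
    name_index -- index of the unknown block's name line
    Assumption: indentation is one tab character per level.
    """
    def depth(line):
        # Number of leading tab characters.
        return len(line) - len(line.lstrip("\t"))

    # The block's contents are indented one level deeper than its name line.
    content_depth = depth(lines[name_index]) + 1

    i = name_index + 1
    while i < len(lines):
        line = lines[i]
        # Lines containing only tabs count as empty and stay in the block.
        is_empty = line.strip("\t") == ""
        if not is_empty and depth(line) < content_depth:
            break
        i += 1
    return i
```

Parsing of the enclosing block would resume at the returned index. If the unknown block runs to the end of the document, the function returns `len(lines)`.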