diff --git a/docs/en/docs/dev/reference/confdb/index.md b/docs/en/docs/dev/reference/confdb/index.md
index a29c7b9f2a57d4bdfd70b8e0f95c87e4c462f552..dc1e90f23c66b1a1a479dbc9688ea354c3c9d114 100644
--- a/docs/en/docs/dev/reference/confdb/index.md
+++ b/docs/en/docs/dev/reference/confdb/index.md
@@ -1,35 +1,35 @@
# ConfDB Overview
`ConfDB` is a common term for complex of measures for high-level device configuration
-processing within NOC. Vendors tend to invent own configuration
+processing within NOC. Vendors tend to invent their own configuration
formats, often mimicking each other, and to introduce incompatibilities
between releases and platforms. Besides simple change tracking and
-regular expression search, the multi-vendor configuration processing
-may be very challenging task.
+regular expression search, multi-vendor configuration processing
+may be a very challenging task.
-NOC addresses the challenge with following approach:
+NOC addresses the challenge with the following approach:
-* `Decomposition` - complex task may be split to simple steps
+* `Decomposition` - a complex task may be split into simple steps
* `Reusability` - all tools may be reused whenever possible
-* `Pipelining` - each steps combined together into configuration processing pipelines
-* `Clean contract between steps` - each step performs own task. Steps accept
-  predictable result from previous steps and pass predictable result to following steps
+* `Pipelining` - steps are combined into configuration processing pipelines
+* `Clean contract between steps` - each step performs its own task. Steps accept
+  a predictable result from previous steps and pass a predictable result to the following steps
* `Clean API` - each step must be understandable and easy to implement
-* `Quick result` - First result must be reached easily and quickly. Then you can became to
+* `Quick result` - the first result must be reached easily and quickly. Then you can move on to
  implement more complex things
To better understand the concept of `ConfDB` one should refer
-to widely-used concept in programming languages - `Virtual Machines`.
-`Virtual Machine` (VM) is the fictional computer with own `native assembly`
-language (or machine codes). Its sometimes easier to break the
+to a widely-used concept in programming languages - `Virtual Machines`.
+`Virtual Machine` (VM) is a fictional computer with its own `native assembly`
+language (or machine codes). It's sometimes easier to break the
task of compiling the program from programming language to target
-processor to two steps: to compile to fictional machine codes and
-to compile from fictional codes to target ones. The benefits in clear
+processor into two steps: to compile to fictional machine codes and
+to compile from fictional codes to target ones. The benefit is a clear
separation of common functions, suitable for all target platforms,
-and of specific functions, addressed for single platform. Common functions
+and of specific functions, tailored to a single platform. Common functions
moved to the left (Code -> VM translation), while specific moved to the
right (VM -> Target platform translation). Hence VM represents
-clean contract between hardware-dependent and hardware independentent functions.
+a clean contract between hardware-dependent and hardware-independent functions.
Device configuration is the programming language of target platform.
So we can split the task of configuration analysis by applying
@@ -40,7 +40,7 @@ parts are moved to the left.
-All hardware-independentent parts are moved to the right and may be reused.
+All hardware-independent parts are moved to the right and may be reused.
## ConfDB pipeline stages
-Config processing pipeline and stages are represented on chart below
+The config processing pipeline and its stages are shown on the chart below
```mermaid
graph TD
diff --git a/docs/en/docs/dev/reference/confdb/tokenizer.md b/docs/en/docs/dev/reference/confdb/tokenizer.md
index e66202b1c8436a57b3a852e9526f2e06e38e9c56..b0afffcaae96da0c0f0c78138fab3f4efac9da94 100644
--- a/docs/en/docs/dev/reference/confdb/tokenizer.md
+++ b/docs/en/docs/dev/reference/confdb/tokenizer.md
@@ -2,38 +2,40 @@
`Tokenizing` is the process of transforming input device configuration
to a stream of the `tokens`. Tokenizer accepts raw config and yields
-lines of parsed `tokens`. For example, raw config::
+lines of parsed `tokens`. For example, raw config:
+```
interface Fa0/1
 description Some interface
 ip address 10.0.0.1 255.255.255.0
+```
-converted into::
+is converted into:
+```
["interface", "Fa0/1"]
["interface", "Fa0/1", "description", "Some", "interface"]
["interface", "Fa0/1", "ip", "address", "10.0.0.1", "255.255.255.0"]
+```
-Tokenizer must fulfill following requirements:
+A tokenizer must fulfill the following requirements:
* Knows nothing about the meaning of config
-* Low memory usage. Output tokens must be yield whenever ready
-* Backward references should be avoided. Tokenizer should operate current window
-  like a tape. Forward and backward rewinds must be avoided.
+* Low memory usage. Output tokens must be yielded as soon as they are ready
+* Backward references should be avoided. The tokenizer should operate on the current window
+  just like a tape. Forward and backward rewinds must be avoided.
* Output tokens should be grouped and analyzed easy
-* Original context should be preserved whenever possible. See at expanding `interface Fa0/1` in following lines
-* Each line of tokens should be further processed independentently of each other
+* Original context should be preserved whenever possible. See how `interface Fa0/1` is expanded into the following lines in the example above
+* Each line of tokens should be further processed independently of the others
-It may seems that you need separate tokenizer per each platform. Luckily you are not.
-Though various configuration format have different meaning, almost all
-them maintains some `code style`. Like some languages are indent-based (Python)
+It may seem that you need a separate tokenizer for each platform. Luckily, you do not.
+Though various configuration formats have different meanings, almost all
+of them maintain some `code style`. Just as some languages are indent-based (Python)
and some are curly-bracket-based (C, PHP), and some even all-parenthesis (LISP),
there are well distinguishable groups of syntaxes. So the real device configurations
-are groupped in large syntax families with very few exceptions. Usually you can
-choose one of existing tokenizers and apply some configuration rather than
-create own tokenizer for a new platform from zero ground.
-
-
+are grouped into large syntax families with very few exceptions. Usually, you can
+choose one of the existing tokenizers and configure it rather than
+create your own tokenizer for a new platform from scratch.
## Tokenizers
@@ -56,11 +58,11 @@ graph TD
**line**(eol="\n", tab_width=0, line_comment=None, inline_comment=None,
keep_indent=False, string_quote=None, rewrite=None)
-    Basic tokenizer, converting line of config into line of tokens,
+    Basic tokenizer, converting a line of config into a line of tokens,
    separating by spaces and grouping strings together into single tokens
    and removing comments.
    Line tokenizer is suitable when each line of configuration is
-    completely self-sufficient and does not depends on previous or
+    completely self-sufficient and does not depend on previous or
    following lines. Though usable by itself, usually used as base
    class for more advanced tokenizers.
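
The `Pipelining` and `Clean contract between steps` points in the overview can be pictured as a chain of generators: each stage consumes the previous stage's output and yields a predictable shape for the next one. The sketch below is purely illustrative; the stage names (`tokenize`, `drop_comments`) and data shapes are assumptions made for this example, not NOC's actual pipeline API.

```python
# Illustrative sketch of the "pipelining" idea: stages are generators chained
# left to right, each one knowing only its own input/output contract.
from typing import Iterable, Iterator, Tuple

TokenLine = Tuple[str, ...]


def tokenize(config: str) -> Iterator[TokenLine]:
    """Stage 1: raw config text -> lines of tokens."""
    for line in config.splitlines():
        if line.strip():
            yield tuple(line.split())


def drop_comments(lines: Iterable[TokenLine], marker: str = "!") -> Iterator[TokenLine]:
    """Stage 2: lines of tokens -> lines of tokens, with comment lines removed."""
    for tokens in lines:
        if tokens and not tokens[0].startswith(marker):
            yield tokens


# Stages compose without knowing anything about each other's internals
pipeline = drop_comments(tokenize("! header\nhostname sw1\n"))
print(list(pipeline))  # [('hostname', 'sw1')]
```

Because each stage only sees a predictable result from the previous one, a stage can be swapped out (for example, a different tokenizer per syntax family) without touching the rest of the chain.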
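
The `interface Fa0/1` example from the tokenizer description can be reproduced with a minimal sketch. This is a toy indent-style tokenizer written only to illustrate the contract (it is not one of NOC's tokenizer classes): tokens are yielded line by line with no rewinds, and the parent line's tokens are prefixed to indented children so the original context is preserved.

```python
# Toy tokenizer illustrating the requirements above: low memory, no rewinds,
# context preserved by prefixing the parent tokens to indented child lines.
from typing import Iterator, List, Tuple


def toy_indent_tokenizer(config: str) -> Iterator[Tuple[str, ...]]:
    context: List[str] = []  # tokens of the current top-level command
    for raw in config.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        tokens = raw.split()
        if raw.startswith(" "):
            # Indented child line: keep the original context
            yield tuple(context + tokens)
        else:
            # New top-level command becomes the new context
            context = tokens
            yield tuple(tokens)


CONFIG = """interface Fa0/1
 description Some interface
 ip address 10.0.0.1 255.255.255.0
"""

for line in toy_indent_tokenizer(CONFIG):
    print(list(line))
# ['interface', 'Fa0/1']
# ['interface', 'Fa0/1', 'description', 'Some', 'interface']
# ['interface', 'Fa0/1', 'ip', 'address', '10.0.0.1', '255.255.255.0']
```

The printed output matches the three token lines shown in the documentation, and each emitted line can be processed independently of the others.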
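
The documented `line` tokenizer parameters (`eol`, `line_comment`, `inline_comment`, `string_quote`, ...) suggest roughly the behavior sketched below. This is a rough, assumption-based approximation for illustration only; it is not NOC's implementation and it ignores `tab_width`, `keep_indent` and `rewrite`.

```python
# Approximation of what the documented `line` tokenizer parameters imply:
# split each line on whitespace, strip comments, group quoted strings.
import shlex
from typing import Iterator, Optional, Tuple


def toy_line_tokenizer(
    config: str,
    eol: str = "\n",
    line_comment: Optional[str] = None,    # drop lines starting with this marker
    inline_comment: Optional[str] = None,  # strip everything after this marker
    string_quote: Optional[str] = None,    # group quoted strings into one token
) -> Iterator[Tuple[str, ...]]:
    for line in config.split(eol):
        if line_comment and line.lstrip().startswith(line_comment):
            continue
        if inline_comment and inline_comment in line:
            line = line.split(inline_comment, 1)[0]
        if not line.strip():
            continue
        if string_quote:
            lex = shlex.shlex(line, posix=True)
            lex.whitespace_split = True
            lex.quotes = string_quote
            lex.commenters = ""  # comments are already handled above
            yield tuple(lex)
        else:
            yield tuple(line.split())


# Example: skip "!" comment lines and keep quoted descriptions as single tokens
cfg = '!\nhostname sw1\nsnmp-server location "Main office"\n'
for tokens in toy_line_tokenizer(cfg, line_comment="!", string_quote='"'):
    print(list(tokens))
# ['hostname', 'sw1']
# ['snmp-server', 'location', 'Main office']
```

As the tokenizer overview notes, a new platform would normally reuse one of the shipped tokenizers and only adjust such options, rather than reimplement the splitting logic.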