first commit
15 node_modules/leac/CHANGELOG.md generated vendored Normal file
@@ -0,0 +1,15 @@
# Changelog

## Version 0.6.0

- Targeting Node.js version 14 and ES2020;
- Now should be discoverable with [denoify](https://github.com/garronej/denoify).

## Version 0.5.1

- Documentation updates.

## Version 0.5.0

- Initial release;
- Aiming at Node.js version 12 and up.
21 node_modules/leac/LICENSE generated vendored Normal file
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021-2022 KillyMXI <killy@mxii.eu.org>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
119 node_modules/leac/README.md generated vendored Normal file
@@ -0,0 +1,119 @@
# leac

[](https://github.com/mxxii/leac/blob/main/LICENSE)
[](https://www.npmjs.com/package/leac)
[](https://deno.land/x/leac)

Lexer / tokenizer.


## Features

- **Lightweight**. Zero dependencies. Not a lot of code.

- **Well tested** - comes with tests for everything, including the examples.

- **Compact syntax** - less boilerplate. The rule name alone is enough when it is the same as the lookup string.

- **No failures** - it just stops when no rule matches and returns, in addition to the tokens array, the information about whether it completed and where it stopped.

- **Composable lexers** - instead of states within a lexer (see the sketch after this list).

- **Stateless lexers** - all inputs are passed as arguments, all outputs are returned in a result object.

- **No streaming** - accepts one string at a time.

- **Only text tokens, no arbitrary values**. It seems to be a good habit to have tokens that are *trivially* serializable back into a valid input string. Don't do the parser's job. There are a couple of convenience features, such as the ability to discard matches or to use string replacements in regular expression rules, but they have to be used mindfully (more on this below).
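
A minimal sketch of composing lexers, based on the `push` / `pop` rule properties declared in `lib/leac.d.ts` below (the rule names and grammar here are made up for illustration):

```ts
import { createLexer } from 'leac';

// Child lexer for double-quoted string contents.
// `pop: true` returns control to the parent lexer after the closing quote.
const lexString = createLexer([
  { name: 'closeQuote', str: '"', pop: true },
  { name: 'chars', regex: /[^"\\]+/ },
], 'string');

// Parent lexer. `push: lexString` switches to the child lexer after an
// opening quote and continues from wherever the child stopped.
const lexMain = createLexer([
  { name: 'openQuote', str: '"', push: lexString },
  { name: 'ws', regex: /\s+/, discard: true },
  { name: 'word', regex: /[a-z]+/ },
], 'main');

const { tokens, complete } = lexMain('foo "bar" baz');
// Each token records the name of the lexer that produced it
// in its `state` property ('main' or 'string' here).
```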
## Install

### Node

```shell
> npm i leac
> yarn add leac
```

```ts
import { createLexer, Token } from 'leac';
```

### Deno

```ts
import { createLexer, Token } from 'https://deno.land/x/leac@.../leac.ts';
```

## Examples

- [JSON](https://github.com/mxxii/leac/blob/main/examples/json.ts) ([output snapshot](https://github.com/mxxii/leac/blob/main/test/snapshots/examples.ts.md#json));
- [Calc](https://github.com/mxxii/leac/blob/main/examples/calc.ts) ([output snapshot](https://github.com/mxxii/leac/blob/main/test/snapshots/examples.ts.md#calc)).

```typescript
const lex = createLexer([
  { name: '-', str: '-' },
  { name: '+' },
  { name: 'ws', regex: /\s+/, discard: true },
  { name: 'number', regex: /[0-9]|[1-9][0-9]+/ },
]);

const { tokens, offset, complete } = lex('2 + 2');
```
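
To illustrate the result shape - a sketch continuing the snippet above, with field behavior per the bundled type declarations in `lib/leac.d.ts` (`line` and `column` stay zero here because `lineNumbers` is not enabled):

```ts
// Continuing the snippet above:
console.log(tokens.map((t) => `${t.name}:${t.text}`).join(' '));
// -> "number:2 +:+ number:2"  (the 'ws' rule discards the spaces)

console.log(offset, complete);
// -> 5 true  (the whole input was consumed)
```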

## API

- [docs/index.md](https://github.com/mxxii/leac/blob/main/docs/index.md)

## A word of caution

It is often really tempting to rewrite tokens on the go. But it can be dangerous unless you are absolutely mindful of all the edge cases.

For example, who needs to carry string quotes around, right? The parser will only need the string content...

We'll have to consider the following things:

- Regular expressions. Sometimes we want to match strings that can have a length *from zero* and up.

- Tokens are not produced without changing the offset. If something is missing - there is no token.

  If we allowed a token of zero length - it would cause an infinite loop, as the same rule would match at the same offset, again and again.

- Discardable tokens - a convenience feature that may seem harmless at first glance.

When put together, these things plus some intuition traps can lead to a broken array of tokens.

Strings can be empty, which means the token can be absent. With no content and no quotes the tokens array will most likely make no sense to a parser.

How to avoid potential issues:

- Don't discard anything that you may need to insert back when you try to immediately serialize the tokens array to a string. This means whitespace is usually safe to discard while string quotes are not (what can be considered safe will heavily depend on the grammar - you may have a language with significant spaces and insignificant quotes...);

- You can introduce a higher-priority rule to capture an empty string (an opening quote immediately followed by a closing quote) and emit a special token for that. This way an empty string between quotes can't occur down the line;

- Match the whole string (content and quotes) with a single regular expression and let the parser deal with it. This can actually lead to a cleaner design than trying to be clever and removing "unnecessary" parts early;

- Match the whole string (content and quotes) with a single regular expression, using capture groups and the [replace](https://github.com/mxxii/leac/blob/main/docs/interfaces/RegexRule.md#replace) property. This can produce a non-zero-length token with empty text (sketched below).
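
A minimal sketch of that last approach (the rule names are illustrative):

```ts
import { createLexer } from 'leac';

const lex = createLexer([
  // Match content and quotes as one token; keep only the content as text.
  { name: 'string', regex: /"([^"\\]*)"/, replace: '$1' },
  { name: 'ws', regex: /\s+/, discard: true },
]);

const { tokens } = lex('"" "abc"');
// tokens[0]: { name: 'string', text: '',    offset: 0, len: 2, ... }
// tokens[1]: { name: 'string', text: 'abc', offset: 3, len: 5, ... }
// The empty string still yields a token (len 2), so nothing breaks downstream.
```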

Another note about quotes: if the grammar allows for different kinds of quotes and you're still willing to get rid of them early - think about how you're going to unescape the string later. Make sure you carry the information about the exact string kind at least in the token name - you will need it later.

## What about ...?

- performance - The code is very simple, but I won't put any unverified assumptions here. I'd be grateful to anyone who can provide a good benchmark project to compare different lexers.

- stable release - The current release is well thought out and tested. I leave a chance that some changes might be needed based on feedback. Before version 1.0.0 they will be made without a deprecation cycle.


## Some other lexer / tokenizer packages

- [moo](https://github.com/no-context/moo);
- [doken](https://github.com/yishn/doken);
- [tokenizr](https://github.com/rse/tokenizr);
- [flex-js](https://github.com/sormy/flex-js);
- *and more, with varying levels of maintenance.*
1 node_modules/leac/lib/leac.cjs generated vendored Normal file
@@ -0,0 +1 @@
"use strict";Object.defineProperty(exports,"__esModule",{value:!0});const e=/\n/g;function t(t){const o=[...t.matchAll(e)].map((e=>e.index||0));o.unshift(-1);const s=n(o,0,o.length);return e=>r(s,e)}function n(e,t,r){if(r-t==1)return{offset:e[t],index:t+1};const o=Math.ceil((t+r)/2),s=n(e,t,o),l=n(e,o,r);return{offset:s.offset,low:s,high:l}}function r(e,t){return function(e){return Object.prototype.hasOwnProperty.call(e,"index")}(e)?{line:e.index,column:t-e.offset}:r(e.high.offset<t?e.high:e.low,t)}function o(e,t){return{...e,regex:s(e,t)}}function s(e,t){if(0===e.name.length)throw new Error(`Rule #${t} has empty name, which is not allowed.`);if(function(e){return Object.prototype.hasOwnProperty.call(e,"regex")}(e))return function(e){if(e.global)throw new Error(`Regular expression /${e.source}/${e.flags} contains the global flag, which is not allowed.`);return e.sticky?e:new RegExp(e.source,e.flags+"y")}(e.regex);if(function(e){return Object.prototype.hasOwnProperty.call(e,"str")}(e)){if(0===e.str.length)throw new Error(`Rule #${t} ("${e.name}") has empty "str" property, which is not allowed.`);return new RegExp(l(e.str),"y")}return new RegExp(l(e.name),"y")}function l(e){return e.replace(/[-[\]{}()*+!<=:?./\\^$|#\s,]/g,"\\$&")}exports.createLexer=function(e,n="",r={}){const s="string"!=typeof n?n:r,l="string"==typeof n?n:"",c=e.map(o),i=!!s.lineNumbers;return function(e,n=0){const r=i?t(e):()=>({line:0,column:0});let o=n;const s=[];e:for(;o<e.length;){let t=!1;for(const n of c){n.regex.lastIndex=o;const c=n.regex.exec(e);if(c&&c[0].length>0){if(!n.discard){const e=r(o),t="string"==typeof n.replace?c[0].replace(new RegExp(n.regex.source,n.regex.flags),n.replace):c[0];s.push({state:l,name:n.name,text:t,offset:o,len:c[0].length,line:e.line,column:e.column})}if(o=n.regex.lastIndex,t=!0,n.push){const t=n.push(e,o);s.push(...t.tokens),o=t.offset}if(n.pop)break e;break}}if(!t)break}return{tokens:s,offset:o,complete:e.length<=o}}};
165 node_modules/leac/lib/leac.d.ts generated vendored Normal file
@@ -0,0 +1,165 @@
/** Lexer options (not many so far). */
export declare type Options = {
    /**
     * Enable line and column numbers computation.
     */
    lineNumbers?: boolean;
};
/** Result returned by a lexer function. */
export declare type LexerResult = {
    /** Array of tokens. */
    tokens: Token[];
    /** Final offset. */
    offset: number;
    /**
     * True if the whole input string was processed.
     *
     * Check this to see whether some input was left untokenized.
     */
    complete: boolean;
};
/**
 * Lexer function.
 *
 * @param str - A string to tokenize.
 * @param offset - Initial offset. Used when composing lexers.
 */
export declare type Lexer = (str: string, offset?: number) => LexerResult;
/** Token object, a result of matching an individual lexing rule. */
export declare type Token = {
    /** Name of the lexer containing the rule that produced this token. */
    state: string;
    /** Name of the rule that produced this token. */
    name: string;
    /** Text matched by the rule. _(Unless a replace value was used by a RegexRule.)_ */
    text: string;
    /** Start index of the match in the input string. */
    offset: number;
    /**
     * The length of the matched substring.
     *
     * _(Might be different from the text length in case a replace value
     * was used in a RegexRule.)_
     */
    len: number;
    /**
     * Line number in the source string (1-based).
     *
     * _(Always zero if not enabled in the lexer options.)_
     */
    line: number;
    /**
     * Column number within the line in the source string (1-based).
     *
     * _(Always zero if line numbers are not enabled in the lexer options.)_
     */
    column: number;
};
/**
 * Lexing rule.
 *
 * A base rule looks for an exact match by its name.
 *
 * If the name and the lookup string have to be different
 * then specify the `str` property as defined in {@link StringRule}.
 */
export interface Rule {
    /** The name of the rule, also the name of tokens produced by this rule. */
    name: string;
    /**
     * The matched token won't be added to the output array if this is set to `true`.
     *
     * (_Think twice before using this._)
     */
    discard?: boolean;
    /**
     * Switch to another lexer function after this match,
     * concatenate its results and continue from where it stopped.
     */
    push?: Lexer;
    /**
     * Stop after this match and return.
     *
     * If there is a parent lexer - it will continue from this point.
     */
    pop?: boolean;
}
/**
 * String rule - looks for an exact string match that
 * can be different from the name of the rule.
 */
export interface StringRule extends Rule {
    /**
     * Specify the exact string to match
     * if it is different from the name of the rule.
     */
    str: string;
}
/**
 * Regex rule - looks for a regular expression match.
 */
export interface RegexRule extends Rule {
    /**
     * Regular expression to match.
     *
     * - Can't have the global flag.
     *
     * - All regular expressions are used as sticky,
     *   you don't have to specify the sticky flag.
     *
     * - Empty matches are considered as non-matches -
     *   no token will be emitted in that case.
     */
    regex: RegExp;
    /**
     * The replacement string can include patterns,
     * the same as [String.prototype.replace()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace#specifying_a_string_as_a_parameter).
     *
     * This will only affect the text property of an output token, not its offset or length.
     *
     * Note: the regex has to be able to match the matched substring when taken out of context
     * in order for replace to work - boundary/neighborhood conditions may prevent this.
     */
    replace?: string;
}
/**
 * Non-empty array of rules.
 *
 * Rules are processed in the provided order, the first match is taken.
 *
 * Rules can have the same name. For example, you can have
 * separate rules for various keywords and use the same name "keyword".
 */
export declare type Rules = [
    (Rule | StringRule | RegexRule),
    ...(Rule | StringRule | RegexRule)[]
];
/**
 * Create a lexer function.
 *
 * @param rules - Non-empty array of lexing rules.
 *
 * Rules are processed in the provided order, the first match is taken.
 *
 * Rules can have the same name - you can have separate rules
 * for keywords and use the same name "keyword", for example.
 *
 * @param state - The name of this lexer. Used when composing lexers.
 * Empty string by default.
 *
 * @param options - Lexer options object.
 */
export declare function createLexer(rules: Rules, state?: string, options?: Options): Lexer;
/**
 * Create a lexer function.
 *
 * @param rules - Non-empty array of lexing rules.
 *
 * Rules are processed in the provided order, the first match is taken.
 *
 * Rules can have the same name - you can have separate rules
 * for keywords and use the same name "keyword", for example.
 *
 * @param options - Lexer options object.
 */
export declare function createLexer(rules: Rules, options?: Options): Lexer;
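
The declarations above show two `createLexer` call shapes; a brief sketch of the options-only overload with `lineNumbers` enabled (rule names are illustrative):

```ts
import { createLexer } from 'leac';

// Options may be passed as the second argument when no state name is needed.
const lex = createLexer(
  [
    { name: 'word', regex: /\w+/ },
    { name: 'ws', regex: /\s+/, discard: true },
  ],
  { lineNumbers: true },
);

const { tokens } = lex('one\ntwo');
// With lineNumbers enabled, line and column are 1-based:
// tokens[0] -> { text: 'one', line: 1, column: 1, ... }
// tokens[1] -> { text: 'two', line: 2, column: 1, ... }
```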
1 node_modules/leac/lib/leac.mjs generated vendored Normal file
@@ -0,0 +1 @@
const e=/\n/g;function n(n){const o=[...n.matchAll(e)].map((e=>e.index||0));o.unshift(-1);const s=t(o,0,o.length);return e=>r(s,e)}function t(e,n,r){if(r-n==1)return{offset:e[n],index:n+1};const o=Math.ceil((n+r)/2),s=t(e,n,o),l=t(e,o,r);return{offset:s.offset,low:s,high:l}}function r(e,n){return function(e){return Object.prototype.hasOwnProperty.call(e,"index")}(e)?{line:e.index,column:n-e.offset}:r(e.high.offset<n?e.high:e.low,n)}function o(e,t="",r={}){const o="string"!=typeof t?t:r,l="string"==typeof t?t:"",c=e.map(s),f=!!o.lineNumbers;return function(e,t=0){const r=f?n(e):()=>({line:0,column:0});let o=t;const s=[];e:for(;o<e.length;){let n=!1;for(const t of c){t.regex.lastIndex=o;const c=t.regex.exec(e);if(c&&c[0].length>0){if(!t.discard){const e=r(o),n="string"==typeof t.replace?c[0].replace(new RegExp(t.regex.source,t.regex.flags),t.replace):c[0];s.push({state:l,name:t.name,text:n,offset:o,len:c[0].length,line:e.line,column:e.column})}if(o=t.regex.lastIndex,n=!0,t.push){const n=t.push(e,o);s.push(...n.tokens),o=n.offset}if(t.pop)break e;break}}if(!n)break}return{tokens:s,offset:o,complete:e.length<=o}}}function s(e,n){return{...e,regex:l(e,n)}}function l(e,n){if(0===e.name.length)throw new Error(`Rule #${n} has empty name, which is not allowed.`);if(function(e){return Object.prototype.hasOwnProperty.call(e,"regex")}(e))return function(e){if(e.global)throw new Error(`Regular expression /${e.source}/${e.flags} contains the global flag, which is not allowed.`);return e.sticky?e:new RegExp(e.source,e.flags+"y")}(e.regex);if(function(e){return Object.prototype.hasOwnProperty.call(e,"str")}(e)){if(0===e.str.length)throw new Error(`Rule #${n} ("${e.name}") has empty "str" property, which is not allowed.`);return new RegExp(c(e.str),"y")}return new RegExp(c(e.name),"y")}function c(e){return e.replace(/[-[\]{}()*+!<=:?./\\^$|#\s,]/g,"\\$&")}export{o as createLexer};
89 node_modules/leac/package.json generated vendored Normal file
@@ -0,0 +1,89 @@
{
  "name": "leac",
  "version": "0.6.0",
  "description": "Lexer / tokenizer",
  "keywords": [
    "lexer",
    "tokenizer",
    "lex",
    "token"
  ],
  "repository": {
    "type": "git",
    "url": "git+https://github.com/mxxii/leac.git"
  },
  "bugs": {
    "url": "https://github.com/mxxii/leac/issues"
  },
  "homepage": "https://github.com/mxxii/leac",
  "author": "KillyMXI",
  "funding": "https://ko-fi.com/killymxi",
  "license": "MIT",
  "exports": {
    "import": "./lib/leac.mjs",
    "require": "./lib/leac.cjs"
  },
  "type": "module",
  "main": "./lib/leac.cjs",
  "module": "./lib/leac.mjs",
  "types": "./lib/leac.d.ts",
  "files": [
    "lib"
  ],
  "scripts": {
    "build:docs": "typedoc",
    "build:deno": "denoify",
    "build:rollup": "rollup -c",
    "build:types": "tsc --declaration --emitDeclarationOnly && rimraf lib/!(leac).d.ts",
    "build": "npm run clean && concurrently npm:build:*",
    "checkAll": "npm run lint && npm test",
    "clean": "rimraf lib && rimraf docs && rimraf deno",
    "example:calc": "npm run ts -- ./examples/calc.ts",
    "example:json": "npm run ts -- ./examples/json.ts",
    "lint:eslint": "eslint .",
    "lint:md": "markdownlint-cli2",
    "lint": "concurrently npm:lint:*",
    "prepublishOnly": "npm run build && npm run checkAll",
    "test": "ava",
    "ts": "node --experimental-specifier-resolution=node --loader ts-node/esm"
  },
  "dependencies": {},
  "devDependencies": {
    "@rollup/plugin-typescript": "^8.3.4",
    "@tsconfig/node14": "^1.0.3",
    "@types/node": "14.18.23",
    "@typescript-eslint/eslint-plugin": "^5.33.1",
    "@typescript-eslint/parser": "^5.33.1",
    "ava": "^4.3.1",
    "concurrently": "^7.3.0",
    "denoify": "^1.0.0",
    "eslint": "^8.22.0",
    "eslint-plugin-jsonc": "^2.4.0",
    "eslint-plugin-tsdoc": "^0.2.16",
    "markdownlint-cli2": "^0.5.1",
    "rimraf": "^3.0.2",
    "rollup": "^2.78.0",
    "rollup-plugin-terser": "^7.0.2",
    "ts-node": "^10.9.1",
    "tslib": "^2.4.0",
    "typedoc": "~0.22.18",
    "typedoc-plugin-markdown": "~3.12.1",
    "typescript": "~4.7.4"
  },
  "ava": {
    "extensions": {
      "ts": "module"
    },
    "files": [
      "test/**/*"
    ],
    "nodeArguments": [
      "--loader=ts-node/esm",
      "--experimental-specifier-resolution=node"
    ],
    "verbose": true
  },
  "denoify": {
    "out": "./deno"
  }
}