- No regexp
- Fast
- As few dependencies as possible
- Real tree structure
- Idempotency: applying an operation and then its reverse returns the original state
When I started building the Org-roam web implementation, I ran into one big problem: no existing parser satisfied all of my requirements.
I also looked into a few existing implementations and found that they are very complex and mostly based on regexps, which are a poor fit for parsing, hurt readability, and perform poorly.
This project completely ignores the Emacs data tree implementation, because I have been developing this code for my own purposes. I wanted to control every node and property in my tree. Moreover, I wanted idempotent conversion from AST to text and vice versa. Right now, many parsers lose some characters, such as newlines or extra spaces.
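To make the lossless round trip concrete, here is a minimal sketch, not this project's actual code. The names `Token`, `tokenize`, and `stringify` are hypothetical, and the flat token list stands in for a real tree. It scans character by character (no regexp) and keeps every space and newline, so joining the tokens back together reproduces the input exactly.

```typescript
// Hypothetical token shape, for illustration only.
type TokenType = "text" | "space" | "newline";

interface Token {
  type: TokenType;
  value: string;
}

// Classify a single character without using regexps.
function classify(ch: string): TokenType {
  if (ch === "\n") return "newline";
  if (ch === " " || ch === "\t") return "space";
  return "text";
}

// Split the source into tokens; no character is ever dropped.
function tokenize(src: string): Token[] {
  const tokens: Token[] = [];
  for (let i = 0; i < src.length; i++) {
    const ch = src.charAt(i);
    const type = classify(ch);
    const last = tokens[tokens.length - 1];
    // Merge runs of the same kind, but keep each newline as its own token.
    if (last && last.type === type && type !== "newline") {
      last.value += ch;
    } else {
      tokens.push({ type, value: ch });
    }
  }
  return tokens;
}

// Reverse operation: concatenate the token values in order.
function stringify(tokens: Token[]): string {
  return tokens.map((t) => t.value).join("");
}

const sampleText = "* Headline  \n\n  some *bold* text \n";
const roundTrip = stringify(tokenize(sampleText));
console.log(roundTrip === sampleText);
```

Because every character ends up in exactly one token, trailing spaces and blank lines survive the conversion in both directions.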
Given these requirements, I want the following from this parser:
- Full test coverage of all possible nodes.
- Development according to TDD principles.
- Easy to read.
- Small handlers for each node type.
- The structure of the tree closely mirrors the visible result (including nested folding).
- Each element of the tree has its own range with begin and end positions.
- All operations on the tree, such as formatting, auto-insert, etc., must be implemented as separate logic in their own classes.
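The last two requirements can be sketched as follows. This is an illustration under assumptions, not the project's real API: the `OrgNode` shape, its `begin`/`end` fields (with `end` treated as exclusive), and the `NodeFinder` class are all hypothetical. The tree is built by hand for the text `Hello *bold*`, and the lookup logic lives in its own class, separate from the tree.

```typescript
// Hypothetical node shape: every node carries its own range.
interface OrgNode {
  type: string;
  begin: number; // inclusive offset into the source text
  end: number; // exclusive offset into the source text
  value?: string;
  children?: OrgNode[];
}

const orgSource = "Hello *bold*";

// Hand-built tree for the text above (a real parser would produce this).
const tree: OrgNode = {
  type: "root",
  begin: 0,
  end: 12,
  children: [
    { type: "text", begin: 0, end: 6, value: "Hello " },
    {
      type: "bold",
      begin: 6,
      end: 12,
      children: [
        { type: "operator", begin: 6, end: 7, value: "*" },
        { type: "text", begin: 7, end: 11, value: "bold" },
        { type: "operator", begin: 11, end: 12, value: "*" },
      ],
    },
  ],
};

// An operation on the tree kept as separate logic in its own class.
class NodeFinder {
  private readonly root: OrgNode;

  constructor(root: OrgNode) {
    this.root = root;
  }

  // Returns the deepest node whose range contains the offset.
  findAt(offset: number): OrgNode | undefined {
    return this.search(this.root, offset);
  }

  private search(node: OrgNode, offset: number): OrgNode | undefined {
    if (offset < node.begin || offset >= node.end) return undefined;
    for (const child of node.children ?? []) {
      const hit = this.search(child, offset);
      if (hit) return hit;
    }
    return node;
  }
}

const finder = new NodeFinder(tree);
const underCursor = finder.findAt(8);
console.log(underCursor?.type, underCursor?.value);
```

With ranges on every node, the original text of any element can be recovered with `orgSource.slice(node.begin, node.end)`, and further operations such as formatting can be added as new classes without touching the node types.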
- [X] Bold/italic/strikethrough text
- [X] Lists
- [ ] List progress
- [X] Headlines
- [X] Inline code
- [X] Inline quotes
- [ ] Tables
- [ ] Links
- [X] Src blocks
- [ ] LaTeX blocks
- [ ] HTML/quote blocks
- [ ] Inline HTML
- [ ] HTML attributes
- [X] Comments
- [ ] Properties
- [ ] Date and time
- [ ] Keywords
- [ ] Cross links
- [X] Tags
Am I missing something? Please let me know.
- [ ] Metadata collection
Any help is greatly appreciated.