Parser: Allow block nesting
In this patch we're opening up a new avenue for allowing nested blocks
in the data structure.

For each block:
 - Nested blocks appear as `innerBlocks` as a sequential list
 - The contained HTML _without_ the nested blocks appears as a string
   property `innerHTML` which replaces `rawContent`
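
As a rough illustration of the shape described above, a block with two nested blocks might come back from the parser looking something like this. The block names, the sample HTML, and the `countBlocks` helper are hypothetical and chosen only for illustration; only the property names (`blockName`, `attrs`, `innerBlocks`, `innerHTML`) come from this patch.

```javascript
// Hypothetical parse result: one outer block holding two nested blocks.
// The values here are made up; only the property names follow the patch.
var parsed = [
	{
		blockName: 'core/columns',
		attrs: {},
		innerBlocks: [
			{ blockName: 'core/paragraph', attrs: {}, innerBlocks: [], innerHTML: '<p>One</p>' },
			{ blockName: 'core/paragraph', attrs: {}, innerBlocks: [], innerHTML: '<p>Two</p>' }
		],
		// HTML of the outer block with the nested blocks taken out
		innerHTML: '<div class="columns"></div>'
	}
];

// Hypothetical helper: count every block in the tree, nested ones included.
// Freeform HTML blocks carry no `innerBlocks`, hence the `|| []` fallback.
function countBlocks( blocks ) {
	return blocks.reduce( function( total, block ) {
		return total + 1 + countBlocks( block.innerBlocks || [] );
	}, 0 );
}
```

Because `innerBlocks` is a plain sequential list, any consumer that needs to visit nested blocks can recurse the same way.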

Also:
 - Remove the `WP_` prefix on grammar terms. It is not needed; it dates
   from the earliest iterations, where different parts were prefixed
   by the spec they implemented, such as HTML_, URL_, etc.

Regenerates fixtures based on updated parser

Disable eslint for line so tests will run

Fix rebase issue

Update based on PR feedback

I'm breaking my own rules here by introducing more code into the
parser but I'm also not sure how we can escape this without
placing higher demands on some post-processing after the parse.

Changes:
 - Do away with non-supported language features
 - Abstract `joinBlocks( pre, tokens, post )` into a function; it
   basically just needs to join non-empty items into a flat list
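
A condensed sketch of that joining step, for readers who want to see it in isolation. It mirrors the helper added in this patch but is not a copy of it, and the sample input at the bottom is invented for illustration.

```javascript
// Wrap a run of plain HTML as a freeform block (one with no blockName).
function freeform( html ) {
	return { attrs: {}, innerHTML: html };
}

// pre:    HTML found before the first token
// tokens: list of [ block, trailingHtml ] pairs
// post:   HTML left over after the last token
// Empty strings are skipped so the result is a flat list of real items.
function joinBlocks( pre, tokens, post ) {
	var blocks = [];

	if ( pre.length ) {
		blocks.push( freeform( pre ) );
	}

	tokens.forEach( function( pair ) {
		blocks.push( pair[ 0 ] );
		if ( pair[ 1 ].length ) {
			blocks.push( freeform( pair[ 1 ] ) );
		}
	} );

	if ( post.length ) {
		blocks.push( freeform( post ) );
	}

	return blocks;
}

// Made-up input: some leading HTML followed by a single token.
var joined = joinBlocks(
	'<p>intro</p>',
	[ [ { blockName: 'core/more', attrs: {}, innerHTML: '' }, '' ] ],
	''
);
```

Here `joined` holds two items: the freeform block for the leading HTML, then the `core/more` token. The empty trailing HTML and empty `post` contribute nothing.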

Tiny fix and big header comment

Added comment to start of rules

Actually return blocks from peg_join_blocks()

Minor adjustments to parser to preserve existing behavior

Update parser and fix bug in updates

When the `Balanced_Block` was rebuilt to be defined as a starting block,
some number of tokens and non-closing HTML, finished by a closing block,
I used a `+` to indicate that we needed at least _some_ content inside
of the block to be valid.

In some regards this is true because empty blocks should be void blocks.

On the other hand, it's very likely that we'll receive empty non-void
blocks in practice and the parser should not invalidate one because it
has chosen the wrong syntax.

This update replaces the `+` with a `*` such that we can have empty
blocks and they will be treated as normal.
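
The difference between the two repetition operators can be sketched with regular expressions. This is an analogy only: the `<open>` and `<close>` markers and both patterns are stand-ins invented for illustration, not the actual `Block_Balanced` rule.

```javascript
// Regex stand-ins for "opener, some content, closer". NOT the real grammar.
// Content is any character that does not begin the closer.
var atLeastOne = /^<open>(?:(?!<close>).)+<close>$/; // behaves like `+`
var zeroOrMore = /^<open>(?:(?!<close>).)*<close>$/; // behaves like `*`

var emptyBlock = '<open><close>';
var filledBlock = '<open>text<close>';

var results = {
	plusEmpty: atLeastOne.test( emptyBlock ),   // rejected: no content at all
	starEmpty: zeroOrMore.test( emptyBlock ),   // accepted: empty is allowed
	plusFilled: atLeastOne.test( filledBlock ), // accepted
	starFilled: zeroOrMore.test( filledBlock )  // accepted
};
```

With `+`, the empty pair fails to match at all; with `*`, it matches and simply has no content, which is the behavior this update is after.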

Remove unnecessary semicolon

Blocks: Access block content by innerHTML

Parser: Drop callable type hint

The `callable` type hint is supported only in PHP 5.4+, which is above our minimum supported version. Its only use is in `call_user_func`, which works with the expected string callable input prior to 5.4.
dmsnell authored and aduth committed Nov 9, 2017
1 parent 1cbe5e6 commit 4e0b589
Showing 71 changed files with 904 additions and 492 deletions.
2 changes: 1 addition & 1 deletion blocks/api/parser.js
@@ -225,7 +225,7 @@ export function createBlockWithFallback( name, rawContent, attributes ) {
*/
export function parseWithGrammar( content ) {
return grammarParse( content ).reduce( ( memo, blockNode ) => {
const { blockName, rawContent, attrs } = blockNode;
const { blockName, innerHTML: rawContent, attrs } = blockNode;
const block = createBlockWithFallback( blockName, rawContent.trim(), attrs );
if ( block ) {
memo.push( block );
251 changes: 191 additions & 60 deletions blocks/api/post.pegjs
@@ -1,33 +1,179 @@
{

/*
*
* _____ _ _
* / ____| | | | |
* | | __ _ _| |_ ___ _ __ | |__ ___ _ __ __ _
* | | |_ | | | | __/ _ \ '_ \| '_ \ / _ \ '__/ _` |
* | |__| | |_| | || __/ | | | |_) | __/ | | (_| |
* \_____|\__,_|\__\___|_| |_|_.__/ \___|_| \__, |
* __/ |
* GRAMMAR |___/
*
*
* Welcome to the grammar file for Gutenberg posts!
*
* Please don't be distracted by the functions at the top
* here - they're just helpers for the grammar below. We
* try to keep them as minimal and simple as possible,
* but the parser generator forces us to declare them at
* the beginning of the file.
*
* What follows is the official specification grammar for
* documents created or edited in Gutenberg. It starts at
* the top-level rule `Block_List`
*
* The grammar is defined by a series of _rules_ and ways
* to return matches on those rules. It's a _PEG_, a
* parsing expression grammar, which simply means that for
* each of our rules we have a set of sub-rules to match
* on and the generated parser will try them in order
* until it finds the first match.
*
* This grammar is a _specification_ (with as little actual
* code as we can get away with) which is used by the
* parser generator to generate the actual _parser_ which
* is used by Gutenberg. We generate two parsers: one in
* JavaScript for use in the browser and one in PHP for
* WordPress itself. PEG parser generators are available
* in many languages, though different libraries may require
* some translation of this grammar into their syntax.
*
* For more information:
* @see https://pegjs.org
* @see https://en.wikipedia.org/wiki/Parsing_expression_grammar
*
*/

/** <?php
// The `maybeJSON` function is not needed in PHP because its return semantics
// are the same as `json_decode`
// array arguments are backwards because of PHP
if ( ! function_exists( 'peg_array_partition' ) ) {
function peg_array_partition( $array, $predicate ) {
$truthy = array();
$falsey = array();
foreach ( $array as $item ) {
call_user_func( $predicate, $item )
? $truthy[] = $item
: $falsey[] = $item;
}
return array( $truthy, $falsey );
}
}
if ( ! function_exists( 'peg_join_blocks' ) ) {
function peg_join_blocks( $pre, $tokens, $post ) {
$blocks = array();
if ( ! empty( $pre ) ) {
$blocks[] = array( 'attrs' => array(), 'innerHTML' => $pre );
}
foreach ( $tokens as $token ) {
list( $token, $html ) = $token;
$blocks[] = $token;
if ( ! empty( $html ) ) {
$blocks[] = array( 'attrs' => array(), 'innerHTML' => $html );
}
}
if ( ! empty( $post ) ) {
$blocks[] = array( 'attrs' => array(), 'innerHTML' => $post );
}
return $blocks;
}
}
?> **/

function freeform( s ) {
return s.length && {
attrs: {},
innerHTML: s
};
}

function joinBlocks( pre, tokens, post ) {
var blocks = [], i, l, html, item, token;

if ( pre.length ) {
blocks.push( freeform( pre ) );
}

for ( i = 0, l = tokens.length; i < l; i++ ) {
item = tokens[ i ];
token = item[ 0 ];
html = item[ 1 ];

blocks.push( token );
if ( html.length ) {
blocks.push( freeform( html ) );
}
}

if ( post.length ) {
blocks.push( freeform( post ) );
}

return blocks;
}

function maybeJSON( s ) {
try {
return JSON.parse( s );
} catch (e) {
return null;
}
try {
return JSON.parse( s );
} catch (e) {
return null;
}
}

function partition( predicate, list ) {
var i, l, item;
var truthy = [];
var falsey = [];

// nod to performance over a simpler reduce
// and clone model we could have taken here
for ( i = 0, l = list.length; i < l; i++ ) {
item = list[ i ];

predicate( item )
? truthy.push( item )
: falsey.push( item )
};

return [ truthy, falsey ];
}

}

Document
= WP_Block_List
//////////////////////////////////////////////////////
//
// Here starts the grammar proper!
//
//////////////////////////////////////////////////////

WP_Block_List
= WP_Block*
Block_List
= pre:$(!Token .)*
ts:(t:Token html:$((!Token .)*) { /** <?php return array( $t, $html ); ?> **/ return [ t, html ] })*
post:$(.*)
{ /** <?php return peg_join_blocks( $pre, $ts, $post ); ?> **/
return joinBlocks( pre, ts, post );
}

WP_Block
= WP_Tag_More
/ WP_Block_Void
/ WP_Block_Balanced
/ WP_Block_Html
Token
= Tag_More
/ Block_Void
/ Block_Balanced

WP_Tag_More
Tag_More
= "<!--" WS* "more" customText:(WS+ text:$((!(WS* "-->") .)+) { /** <?php return $text; ?> **/ return text })? WS* "-->" noTeaser:(WS* "<!--noteaser-->")?
{ /** <?php
$attrs = array( 'noTeaser' => (bool) $noTeaser );
@@ -37,7 +183,7 @@ WP_Tag_More
return array(
'blockName' => 'core/more',
'attrs' => $attrs,
'rawContent' => ''
'innerHTML' => ''
);
?> **/
return {
@@ -46,12 +192,12 @@ WP_Tag_More
customText: customText || undefined,
noTeaser: !! noTeaser
},
rawContent: ''
innerHTML: ''
}
}

WP_Block_Void
= "<!--" WS+ "wp:" blockName:WP_Block_Name WS+ attrs:(a:WP_Block_Attributes WS+ {
Block_Void
= "<!--" WS+ "wp:" blockName:Block_Name WS+ attrs:(a:Block_Attributes WS+ {
/** <?php return $a; ?> **/
return a;
})? "/-->"
@@ -60,62 +206,47 @@ WP_Block_Void
return array(
'blockName' => $blockName,
'attrs' => $attrs,
'rawContent' => '',
'innerBlocks' => array(),
'innerHTML' => '',
);
?> **/

return {
blockName: blockName,
attrs: attrs,
rawContent: ''
innerBlocks: [],
innerHTML: ''
};
}

WP_Block_Balanced
= s:WP_Block_Start ts:(!WP_Block_End c:Any {
/** <?php return $c; ?> **/
return c;
})* e:WP_Block_End & {
/** <?php return $s['blockName'] === $e['blockName']; ?> **/
return s.blockName === e.blockName;
}
Block_Balanced
= s:Block_Start children:(Token / $(!Block_End .))* e:Block_End
{
/** <?php
list( $innerHTML, $innerBlocks ) = peg_array_partition( $children, 'is_string' );
return array(
'blockName' => $s['blockName'],
'attrs' => $s['attrs'],
'rawContent' => implode( '', $ts ),
'innerBlocks' => $innerBlocks,
'innerHTML' => implode( '', $innerHTML ),
);
?> **/

var innerContent = partition( function( a ) { return 'string' === typeof a }, children );
var innerHTML = innerContent[ 0 ];
var innerBlocks = innerContent[ 1 ];

return {
blockName: s.blockName,
attrs: s.attrs,
rawContent: ts.join( '' )
innerBlocks: innerBlocks,
innerHTML: innerHTML.join( '' )
};
}

WP_Block_Html
= ts:(!WP_Block_Balanced !WP_Block_Void !WP_Tag_More c:Any {
/** <?php return $c; ?> **/
return c;
})+
{
/** <?php
return array(
'attrs' => array(),
'rawContent' => implode( '', $ts ),
);
?> **/

return {
attrs: {},
rawContent: ts.join( '' )
}
}

WP_Block_Start
= "<!--" WS+ "wp:" blockName:WP_Block_Name WS+ attrs:(a:WP_Block_Attributes WS+ {
Block_Start
= "<!--" WS+ "wp:" blockName:Block_Name WS+ attrs:(a:Block_Attributes WS+ {
/** <?php return $a; ?> **/
return a;
})? "-->"
@@ -133,8 +264,8 @@ WP_Block_Start
};
}

WP_Block_End
= "<!--" WS+ "/wp:" blockName:WP_Block_Name WS+ "-->"
Block_End
= "<!--" WS+ "/wp:" blockName:Block_Name WS+ "-->"
{
/** <?php
return array(
@@ -147,21 +278,21 @@ WP_Block_End
};
}

WP_Block_Name
= WP_Namespaced_Block_Name
/ WP_Core_Block_Name
Block_Name
= Namespaced_Block_Name
/ Core_Block_Name

WP_Namespaced_Block_Name
Namespaced_Block_Name
= $(ASCII_Letter ASCII_AlphaNumeric* "/" ASCII_Letter ASCII_AlphaNumeric*)

WP_Core_Block_Name
Core_Block_Name
= type:$(ASCII_Letter ASCII_AlphaNumeric*)
{
/** <?php return "core/$type"; ?> **/
return 'core/' + type;
}

WP_Block_Attributes
Block_Attributes
= attrs:$("{" (!("}" WS+ """/"? "-->") .)* "}")
{
/** <?php return json_decode( $attrs, true ); ?> **/
5 changes: 3 additions & 2 deletions blocks/test/fixtures/core-embed__animoto.parsed.json
@@ -4,10 +4,11 @@
"attrs": {
"url": "https://animoto.com/"
},
"rawContent": "\n<figure class=\"wp-block-embed-animoto\">\n https://animoto.com/\n <figcaption>Embedded content from animoto</figcaption>\n</figure>\n"
"innerBlocks": [],
"innerHTML": "\n<figure class=\"wp-block-embed-animoto\">\n https://animoto.com/\n <figcaption>Embedded content from animoto</figcaption>\n</figure>\n"
},
{
"attrs": {},
"rawContent": "\n"
"innerHTML": "\n"
}
]
5 changes: 3 additions & 2 deletions blocks/test/fixtures/core-embed__cloudup.parsed.json
@@ -4,10 +4,11 @@
"attrs": {
"url": "https://cloudup.com/"
},
"rawContent": "\n<figure class=\"wp-block-embed-cloudup\">\n https://cloudup.com/\n <figcaption>Embedded content from cloudup</figcaption>\n</figure>\n"
"innerBlocks": [],
"innerHTML": "\n<figure class=\"wp-block-embed-cloudup\">\n https://cloudup.com/\n <figcaption>Embedded content from cloudup</figcaption>\n</figure>\n"
},
{
"attrs": {},
"rawContent": "\n"
"innerHTML": "\n"
}
]
5 changes: 3 additions & 2 deletions blocks/test/fixtures/core-embed__collegehumor.parsed.json
@@ -4,10 +4,11 @@
"attrs": {
"url": "https://collegehumor.com/"
},
"rawContent": "\n<figure class=\"wp-block-embed-collegehumor\">\n https://collegehumor.com/\n <figcaption>Embedded content from collegehumor</figcaption>\n</figure>\n"
"innerBlocks": [],
"innerHTML": "\n<figure class=\"wp-block-embed-collegehumor\">\n https://collegehumor.com/\n <figcaption>Embedded content from collegehumor</figcaption>\n</figure>\n"
},
{
"attrs": {},
"rawContent": "\n"
"innerHTML": "\n"
}
]