Diggy is a simple wrapper around the PHP DOM extension that allows finding elements using simple query selectors and fail-proof chaining.
- PHP 7.4 or PHP 8.0
Diggy includes a simple web client that uses Guzzle under the hood to download a page and return a NodeCollection object. However, you can use any web client you prefer and pass a DOMNode or DOMNodeList object to the NodeCollection constructor.
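As a rough sketch of the second option, the snippet below parses an HTML string with PHP's native DOM extension and hands the resulting document to the constructor. The HTML string is made up for illustration, and the fully qualified class name `Jerodev\Diggy\NodeCollection` is an assumption (it simply mirrors the namespace of `WebClient`), so the Diggy part only runs when that class can be autoloaded.

```php
<?php

// Illustrative markup; in practice this would come from your own HTTP client.
$html = '<nav id="social"><a href="https://github.com"><span>GitHub</span></a></nav>';

$document = new DOMDocument();
// Collect libxml warnings internally instead of emitting them for HTML5 tags such as <nav>.
libxml_use_internal_errors(true);
$document->loadHTML($html);
libxml_clear_errors();

// Assumption: NodeCollection lives in the same namespace as WebClient.
$collectionClass = 'Jerodev\\Diggy\\NodeCollection';
if (class_exists($collectionClass)) {
    // DOMDocument extends DOMNode, so it can be passed to the constructor directly.
    $page = new $collectionClass($document);
    var_dump($page->first('#social')->querySelector('a span')->texts());
}
```

Because the collection is built from a plain DOMNode, the same approach should work for HTML read from disk or a cache.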
$client = new \Jerodev\Diggy\WebClient();
$page = $client->get('https://www.deviaene.eu/');
var_dump($page->first('#social')->querySelector('a span')->texts());
// [
// 'GitHub',
// 'Twitter',
// 'Email',
// 'LinkedIn',
// ]
These are the available functions on a NodeCollection object. All functions that do not return a native value can be chained without having to worry about whether there are nodes in the collection or not.
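To illustrate what this chaining guarantee saves, here is a sketch of a lookup written against the native DOM extension only, where every step needs its own null check. The markup is invented for the example; the Diggy chain in the final comment uses only functions listed in this document and is not executed here.

```php
<?php

$document = new DOMDocument();
libxml_use_internal_errors(true);
$document->loadHTML('<div><p>Hello</p></div>');
libxml_clear_errors();

$xpath = new DOMXPath($document);

// There is no <footer> in the markup, so item(0) returns null.
$footer = $xpath->query('//footer')->item(0);

// Every further step needs an explicit guard to avoid calling a method on null.
$linkText = null;
if ($footer !== null) {
    $link = $xpath->query('.//a', $footer)->item(0);
    if ($link !== null) {
        $linkText = $link->textContent;
    }
}

var_dump($linkText); // NULL

// With a NodeCollection, the same lookup can be written as a single chain:
// $nodes->first('footer')->querySelector('a')->text();
```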
Returns the value of the attribute of the first element in the collection if available.
$nodes->attribute('href');
Returns the number of elements in the current node collection.
$nodes->count();
Loops over all DOM elements in the current collection and executes a closure for each element. The return value of this function is an array of the values returned from the closure.
$nodes->each('a', static function (NodeFilter $node) {
    return $node->attribute('href');
});
Indicates if an element exists in the collection. If a selector is given, the current nodes will first be filtered.
$nodes->exists('a.active');
Returns the first element of the node collection. If a selector is given, the current nodes will first be filtered.
$nodes->first('a.active');
Indicates if the first element in the current collection has a specified tag name.
$nodes->is('div');
Returns the last element of the node collection. If a selector is given, the current nodes will first be filtered.
$nodes->last('a.active');
Returns the tag name of the first element in the current node collection.
$nodes->nodeName();
Returns the nth element of the node collection, starting at 0. If a selector is given, the current nodes will first be filtered.
$nodes->nth(1, 'a.active');
Finds all elements in the current node collection matching this CSS query selector.
$nodes->querySelector('a.active');
Returns the inner text of the first element in the node collection. If a selector is given, the current nodes will first be filtered.
$nodes->text('p.description');
Returns an array containing the inner text of every root element in the collection.
$nodes->texts('nav > a');
Filters the current node collection based on a given closure.
$nodes->whereHas(static function (NodeFilter $node) {
    return $node->text() === 'foo';
});
Filters the current node collection by the existence of a specific attribute. If a value is given the collection is also filtered by the value of this attribute.
$nodes->whereHasAttribute('href');
Filters the current node collection by the existence of inner text. Setting a value will also filter the nodes by the actual inner text based on $trim and $exact.
option | function |
---|---|
$trim | Indicates the inner text value should be trimmed before it is matched against $value. |
$exact | Indicates the inner text value should match $value exactly. |
$nodes->whereHasText('foo');
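The interaction between the two options can be sketched in plain PHP. `matchesText` is a hypothetical helper, not part of Diggy, and it assumes that a non-exact match means the value occurs somewhere in the inner text; the library's actual comparison may differ.

```php
<?php

// Hypothetical helper, not part of Diggy: mirrors how $trim and $exact
// are described in the table above.
function matchesText(string $innerText, string $value, bool $trim, bool $exact): bool
{
    if ($trim) {
        $innerText = trim($innerText);
    }

    // Assumption: a non-exact match means "$value occurs somewhere in the text".
    return $exact
        ? $innerText === $value
        : strpos($innerText, $value) !== false;
}

var_dump(matchesText("  foo \n", 'foo', true, true));  // bool(true)
var_dump(matchesText("  foo \n", 'foo', false, true)); // bool(false)
var_dump(matchesText('foo bar', 'foo', true, false));  // bool(true)
```

Under these assumptions, trimming mainly matters for exact matches, since surrounding whitespace alone is enough to make an exact comparison fail.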
Finds all elements in the current node collection matching this XPath query selector.
$nodes->xPath('//nav/a[@href]');