Data Pickaxe for XML

Library and tool for extracting and checking parts of an XML document.

Quick Start

dpickx lets you use XPaths to select nodes in an XML document and then do the following:

- Express your expectations (make assertions) about the selected nodes
- Process the selected nodes
- Capture the values of selected nodes

The following snippets give an idea of what you can do.

For an XML document like this:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<RootElement>
  <SomethingUnique/>
  <Repeated/>
  <Repeated/>
  <ElementWithSizeAttribute size="15"/>
  <ContainsSeventeen>17</ContainsSeventeen>
  <ContainsAttributeWithEighteen attrOf18="18">blah blah</ContainsAttributeWithEighteen>
  <Duplicate>123</Duplicate>
  <Duplicate>123</Duplicate>
  <DuplicateEleDiffContent>123</DuplicateEleDiffContent>
  <DuplicateEleDiffContent>456</DuplicateEleDiffContent>
  <AlwaysTrue>true</AlwaysTrue>
</RootElement>

Convert it to a DOM; you can then make assertions like this:

Node eg = topLevelDocumentOrSomeNodeFromXml();
XmlDocumentChecker.check(eg)
        .andExpect(xpath("/RootElement/SomethingUnique").exists())
        .andExpect(xpath("//NeverExisting").doesNotExist())
        .andExpect(xpath("//Repeated").exists())
        .andExpect(xpath("//Repeated").nodeCount(2))
        .andExpect(xpath("/NeverExistingTopLevel").doesNotExist())
        .andExpect(xpath("//ContainsSeventeen").value(is("17")))
        .andExpect(xpath("//ContainsSeventeen").value(containsString("7")))
        .andExpect(xpath("//ContainsSeventeen").value(startsWith("1")))
        .andExpect(xpath("//ContainsSeventeen").value(is(numberOfValue(17.0))))
        .andExpect(xpath("//ContainsSeventeen").value(is(numberOfValue(17))))
        .andExpect(xpath("//ElementWithSizeAttribute/@size").exists())
        .andExpect(xpath("//ElementWithSizeAttribute/@size").value(is("15")))
        ;
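
The `eg` node above has to come from a parsed document. As a minimal sketch, assuming you only have the JDK available, the XML text can be turned into a DOM with `DocumentBuilderFactory`. The class name `DomLoader` and its `parse` method are hypothetical and are not part of dpickx (the project has its own helper, XmlUnmarshaller, whose API is not shown here).

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper (not part of dpickx): parses XML text into a DOM. */
final class DomLoader {

    private DomLoader() {
    }

    /** Parses the given XML text and returns the resulting DOM Document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Ask the parser to apply its secure-processing limits.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrap checked parser/IO exceptions so callers need no throws clause.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

A `Document` is itself a `Node`, so the result of `DomLoader.parse(xmlText)` could be used wherever the snippets in this README use `eg`.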

And you can select and process nodes like this:

List<Node> fakeConsumer = new ArrayList<>();
XmlDocumentChecker.check(eg)
        .andExpect(xpath("//Repeated").nodeCount(2))
        .andDo(xpath("//Repeated").processEach(node -> fakeConsumer.add(node)))
        .andDo(xpath("//Repeated").processEach(node -> {
            // do something complicated with node
        }))
        ;

And you can capture a node's value like this:

XmlDocumentChecker checker = new XmlDocumentChecker(eg);
String drivingAge = checker.captureSoleRequired(xpath("//ContainsSeventeen"));

// do something with captured value...
assertThat(drivingAge, is("17"));

You can also capture optional values, like this:

String drivingAge = checker.captureSoleOptional(xpath("//ContainsSeventeen")).orElse("21");
assertThat(drivingAge, is("17"));

String votingAge = checker.captureSoleOptional(xpath("//MissingVotingAgeEntry")).orElse("18");
assertThat(votingAge, is("18"));
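
To make the idea of a "sole" value concrete, here is a plain-JDK sketch of what required and optional capture could look like. It is not dpickx's implementation: the class `SoleValueSketch`, its method names, and in particular the choice to throw when more than one node matches are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical plain-JDK sketch of "sole value" capture; not dpickx code. */
final class SoleValueSketch {

    /** Parses XML text into a DOM node (the document). */
    static Node parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    /** Returns the text of the single matching node, or empty if none match. */
    static Optional<String> soleOptional(Node context, String expression) {
        NodeList nodes;
        try {
            nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, context, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
        // Assumption: more than one match is an error, as the value must be "sole".
        if (nodes.getLength() > 1) {
            throw new IllegalStateException(
                    "Expected at most one node for " + expression
                            + " but found " + nodes.getLength());
        }
        return nodes.getLength() == 0
                ? Optional.empty()
                : Optional.of(nodes.item(0).getTextContent());
    }

    /** Like soleOptional, but a missing node is an error. */
    static String soleRequired(Node context, String expression) {
        return soleOptional(context, expression).orElseThrow(
                () -> new IllegalStateException("No node matches " + expression));
    }
}
```

Returning an `Optional` is what lets the caller supply a default with `orElse`, as in the `votingAge` example above.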

For more examples, see Examples.java.

Tool

dpickx-app is an early-stage command-line tool that lets you apply dpickx to files. At the moment, it only applies captureSoleRequired and writes the result to stdout. You can build and run it like this:

mvn clean package && pushd dpickx-app && java -jar target/dpickx-app-0.10.0-SNAPSHOT-jar-with-dependencies.jar example.xml ///ContainsSeventeen && popd

Applying XPaths

XPaths are applied to the node passed to XmlDocumentChecker.check() in the conventional way: relative XPaths are evaluated relative to that node, while absolute XPaths still apply to the whole document. See Examples.java for more details.
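
The same rule can be seen with the JDK's own XPath API, without dpickx. The sketch below uses a made-up sample document and a hypothetical class, `XPathContextDemo`. It evaluates relative and absolute expressions with a nested element as the context node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Plain-JDK demonstration of XPath context rules; does not use dpickx. */
final class XPathContextDemo {

    // Hypothetical sample document, made up for this sketch.
    static final String XML =
            "<Root><Item>outer</Item><Group><Item>inner</Item></Group></Root>";

    /** Parses XML and returns the nested Group element as the context node. */
    static Node groupNode() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return doc.getElementsByTagName("Group").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    /** Counts the nodes the expression selects when evaluated from context. */
    static int count(String expression, Node context) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, context, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        Node group = groupNode();
        System.out.println(count("Item", group));    // relative: children of Group
        System.out.println(count(".//Item", group)); // relative: descendants of Group
        System.out.println(count("//Item", group));  // absolute: whole document
    }
}
```

With `Group` as the context node, the two relative expressions each select only the inner `Item` (count 1), while `//Item` still searches from the document root and selects both (count 2).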

TODO

Library `dpickx`

Tool `dpickx-app`

Create a tool! - Make a command-line utility that can extract a value from an XML file (using a given XPath).
Use the added helper class XmlUnmarshaller to convert the XML file into a DOM.
Rearrange code to be in a folder called dpickx-app, not icm-dpickx-...
Rename artifact to be dpickx-app, not icm-dpickx-app, as the icm- is handled by the groupId.
Update this README to mention (and explain how to use) the app.
Spring Boot was used only to "simplify" creating a runnable .jar (and to specify a set of package versions). That produced a 12 MB jar, which is excessive. Investigate a cleaner way of doing this, with a smaller (set of) jar(s). Switching to the Apache Maven Assembly Plugin reduced the jar size to 101 kB.
De-duplicate POM settings by depending on a Parent POM, once there is one.
Replace the test file included with the app - example.xml - with a way of generating it from the code in XmlExampleFixture. The file was originally created by hand, capturing it from the console output when the library tests were run.
Add more features: flags to control what to do, separate logging (to file)

Origin

This code was extracted from a personal project with a view to one day publishing it as a separate module. It sat untouched for years, but perhaps now is the time to do that.

It was originally inspired by some of the XPath matching in Spring MVC Test, but I needed something more general for XML (and JSON), and I wanted something that would eventually allow me to capture required or optional values from the node too.

Coding Standard

Basic standard is icm-java-style, with the following notes:

Currently uses the "Eclipse [Built-in]" settings in Eclipse/Spring STS, for Java > Code Style's Clean Up and Formatter.

Exception: use 4 spaces for indentation, not tabs.
These settings include the following maximum line lengths:
- 120 for code
- 80 for comments, but "from comment's starting position". The starting position bit is nice because it means that comment blocks don't need to be reformatted when the commented code's indentation level changes, for example when it is refactored to move it into or out of nested classes.

Don't (usually) modify method parameters. But don't (usually) mark every method parameter final to enforce this: we think it makes the code unhelpfully verbose. Instead, turn on an IDE rule or code linter that warns on reassigned parameters. Use final on method parameters only in special cases: for example, when the method is so long that it is hard to see at a glance that the parameters are not changed (although such long methods are best avoided), or when some of the parameters are reassigned, in which case use final to mark the ones that are not.
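
As an illustration of the parameter guideline, the sketch below shows the usual case and the "some parameters are reassigned" special case. The class `ParameterStyleExample` and its methods are hypothetical, written only to show the style; they are not taken from the dpickx code.

```java
import java.util.Locale;

/** Hypothetical example of the parameter guideline; not dpickx code. */
final class ParameterStyleExample {

    private ParameterStyleExample() {
    }

    // Usual case: the parameter is not modified and not marked final;
    // the derived value gets its own local variable.
    static String normalise(String name) {
        String trimmed = name.trim();
        return trimmed.toLowerCase(Locale.ROOT);
    }

    // Special case: 'count' is reassigned, so 'prefix' is marked final to
    // flag it as the parameter that is not.
    static String repeat(final String prefix, int count) {
        StringBuilder out = new StringBuilder();
        while (count > 0) {
            out.append(prefix);
            count--;
        }
        return out.toString();
    }
}
```

In practice a version of `repeat` that counts with a local loop variable would avoid the reassignment altogether; the second method exists only to show where final earns its place.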

Coding Standard TODOs

Look for final variables that can be removed by inlining the variable
