Saturday, April 5, 2014

Extracting Data from XML files using XSL and XPath...6

Removing the white space from the XML file

Sometimes when we shorten an XML file (see the earlier discussion on that), the result contains a lot of white space where tags were removed. The new XML file can look something like this:

Such white space can be removed by adding the following template construct:
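
One common way to do this is an empty template that matches text nodes made up only of white space. The following is a minimal sketch, not necessarily the exact construct used for this report. The removed tag name (platforms) is only an assumed example, and the identity template stands in for whatever copying logic your stylesheet already has.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy everything that no other template handles -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumed example of a tag that is removed to shorten the file -->
  <xsl:template match="platforms"/>

  <!-- Drop text nodes that consist only of white space;
       placed last, just before the closing stylesheet tag -->
  <xsl:template match="text()[not(normalize-space())]"/>
</xsl:stylesheet>
```

Because the predicate gives the last template a higher default priority than the identity template, white-space-only text nodes are dropped while text with real content is copied unchanged.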

Add it to your stylesheet just before the closing </xsl:stylesheet> tag. The new XML file looks as follows:

This removes at least some of the white space, if not all of it. See this link.

Conditional reports (based on group)

In point 4, I stated that we needed an exception for one of the groups. Extending that idea, we can also produce a separate report for a single group. The code runs as follows:
Notice that because we are taking counts at the testplan level, we use ‘../’ twice. The above code can also be written as:
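
As a sketch of the two forms described here, assume a hypothetical layout in which a datasource tag carries a group attribute and contains repository tags, which in turn contain testplan tags with testcase children. All of these names and the value 'GroupA' are assumptions for illustration; adjust them to your XML structure.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Form 1: the condition is tested from the testplan, so it climbs
         two levels (testplan -> repository -> datasource) -->
    <xsl:for-each select="//testplan[../../@group = 'GroupA']">
      <xsl:value-of select="@name"/>
      <xsl:text>,</xsl:text>
      <!-- Count of testcases below the current testplan -->
      <xsl:value-of select="count(.//testcase)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- Form 2: selects the same testplans, but the condition is tested
         at the repository level, so one step up is enough:
         select="//repository[../@group = 'GroupA']/testplan" -->
  </xsl:template>
</xsl:stylesheet>
```
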
Notice that now we have just one ‘../’, because the condition is at the repository level. If we are selecting all testcases, we can restrict the output by putting conditions at several levels (datasource, repository, testplan, build, platform and so on, depending on the XML structure). E.g.:
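
A minimal sketch of such a multi-level restriction follows. It assumes a hypothetical nesting of datasource, repository, testplan, build, platform and testcase tags, each with a name attribute; the attribute values are invented for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- One predicate per navigation level; a testcase is listed only
         if every level on the way down satisfies its condition -->
    <xsl:for-each select="//datasource[@name = 'DS1']
                          /repository[@name = 'Core']
                          /testplan[@name = 'Regression']
                          /build[@name = 'B42']
                          /platform[@name = 'Linux']
                          /testcase">
      <xsl:value-of select="@name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```
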
In the above code, the restrictive conditions at the datasource, repository, testplan, build and platform levels limit the number of testcases that appear in our output file. This is essentially an advanced XPath expression, because it introduces additional conditions at different navigation levels.

A typical problem

Recently, I faced a problem when I had to access two parts of an XML file. I needed to put a condition on the second part of the XML, which was not possible directly because two for-each loops were involved. The XML structure was something like this:

The platforms, testplans, and execution_testplan tags were all at the same level. The outer ‘for-each’ loop accesses all the ‘execution_build’ tags. The inner ‘for-each’ loop accesses, based on the conditions, the details of the ‘testplan’ tags (within ‘testplans’). To see how we can access the ‘testplan’, please see ‘Handling XML with non-regular structure’.

A condition needed to be put on the Phase value (a custom attribute within the testplan tag) for a testplan. That looked impossible, because Phase was available only in the inner loop: a condition placed there left the data elements of the inner element blank, while the execution_build tag still produced values. The condition was necessary because we needed to segregate the report by Phase value.

The problem can be tackled easily. I created a new intermediate XML file from the original file. The intermediate file contained all the data values from both the inner and the outer loop. Since it had no conditions, it contained all the data. The code for creating the intermediate XML file looked as follows:

The xsl:template match tag creates a template, with the rules defined inside it, that applies to all execution_build tags. The xsl:output method is ‘xml’ because the output is an XML file, which our next XSL code will use to generate the final output file. Since we are generating the XML file ourselves, we can also add our own tags. For example, an <Instance> tag has been added, and it gives the value of the instance for the current execution_build (note the repeated use of ../). Also see how a tag is given a value by placing an ‘xsl:value-of’ tag inside it. The lower part of the code accesses the details of the execution_build.

The xsl:copy then copies all the other tags within execution_build. The xsl:apply-templates then applies that template to the XML file to create the new XML file. Also note that the platforms and testplans tags and their contents are all ignored (see ‘Handling a big XML file’ for details).
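
The description above can be sketched roughly as follows. This is an illustration under assumptions, not the original listing: it supposes a hypothetical layout in which an instance tag (with a name attribute) contains platforms, testplans and execution_testplan as siblings, each testplan has id and Phase attributes, and each execution_testplan refers to its testplan through a testplan_id attribute. The added <Phase> tag and the lookup by id are likewise assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy every tag that has no rule of its own -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Ignore the platforms and testplans tags and their contents -->
  <xsl:template match="platforms|testplans"/>

  <!-- Enrich every execution_build with values found elsewhere -->
  <xsl:template match="execution_build">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Added tag: instance name, two levels up (assumed layout) -->
      <Instance>
        <xsl:value-of select="../../@name"/>
      </Instance>
      <!-- Added tag: Phase of the testplan that the parent
           execution_testplan refers to (assumed id lookup) -->
      <Phase>
        <xsl:value-of select="../../testplans/testplan[@id = current()/../@testplan_id]/@Phase"/>
      </Phase>
      <!-- Copy the remaining contents of execution_build -->
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The testplans tags are dropped from the output, but they are still present in the source tree, which is why the Phase lookup inside the execution_build template can read them.
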
The XML file that was created had the data as follows:

As you can see, the execution_build contains all the details that we need in our output file. Hence the XSL code to generate the output file looked as follows:


Note that for an execution_build, this data sits just above it in the XML file, in its parent, hence the use of ../. This can be changed if required. Putting a condition now became fairly easy, because the Phase value lies close (in terms of hierarchy) to execution_build.
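
A minimal sketch of such a second-stage stylesheet follows. It assumes a hypothetical intermediate file in which each execution_build has Instance, Phase, build_name and passed child tags and sits inside an execution_testplan parent with a testplan_id attribute. The tag names, the attribute and the Phase value 'Alpha' are assumptions for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Header row of the comma-separated report -->
    <xsl:text>Instance,Testplan,Build,Passed&#10;</xsl:text>
    <!-- A single loop is enough: the Phase condition is now a plain
         predicate on execution_build -->
    <xsl:for-each select="//execution_build[Phase = 'Alpha']">
      <xsl:value-of select="Instance"/>
      <xsl:text>,</xsl:text>
      <!-- Data held by the parent tag, hence ../ -->
      <xsl:value-of select="../@testplan_id"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="build_name"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="passed"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Changing the value in the predicate, or running the stylesheet once per Phase value, gives one report per Phase.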
