Wednesday, April 2, 2014

Extracting Data from XML files using XSL and XPath...2

XML Starlet

At <company> we use XML Starlet, which is available on OLEX, to convert the data (an XML file) from the LTS Next team. Our XSLT code turns that file into a data dump in CSV format.
The following tools are needed to develop XSLT code:
1.    Notepad++ - This is a free tool available on OLEX. It can open and edit XML files of about 500 MB (>8628161 lines); a file of about 712 MB did not open in it. It is a must if you need to handle large XML files, and it is also useful for editing XSLT code.
2.    Command line - This can be launched from Windows → Run → cmd → Enter key. Basic knowledge of the Windows command prompt helps. Basic commands include:
a)    cd <name> - change to the directory <name>
b)    cd .. - move to the parent directory
c)    help - show a summary of the commands available at the command prompt
The XML Starlet folder must be placed on the C drive (c:) or within c:\users\<alias>. Once XML Starlet is unzipped at one of these locations, it is ready for use. The folder contains ‘xml.exe’, the program that processes the code and data.

Command for transformation

At the command prompt, run the following command:
C:\users\u2p9>xml tr code.xsl xmlfile.xml >> outputcsv.csv

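In the above command, ‘tr’ is the transformation command of ‘xml’; ‘code.xsl’ is the XSLT code; ‘xmlfile.xml’ is the XML file that contains the source data; and ‘outputcsv.csv’ receives the output records, row by row as specified in the code. Because ‘>>’ appends to an existing file, delete any old ‘outputcsv.csv’ first (or use ‘>’ instead) so that the output of several runs does not get mixed.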
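
To make the role of ‘code.xsl’ concrete, here is a minimal sketch of what such a stylesheet could look like. It is not the code we actually used. The input layout (a ‘records’ root holding ‘record’ elements with ‘id’ and ‘name’ children) and the column names are assumptions made up for illustration, so adjust the XPath expressions to match your own XML file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input shape (an assumption, not the real data):
     <records>
       <record><id>1</id><name>Alpha</name></record>
       <record><id>2</id><name>Beta</name></record>
     </records> -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit plain text instead of XML, since the target is a CSV file -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Ignore whitespace-only text nodes in the source document -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <!-- Header row; &#10; is a line feed -->
    <xsl:text>id,name&#10;</xsl:text>

    <!-- One CSV row per record element selected by the XPath expression -->
    <xsl:for-each select="/records/record">
      <xsl:value-of select="id"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Saved as ‘code.xsl’, this would be run with the command shown above. Note that the sketch does not quote values, so a field that itself contains a comma or a double quote would need extra handling. Also, since the command appends to the output file, running it twice would write the header row twice.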
