Practical Assignment for the Course "XML Techniques for E-Commerce"

by Judit Kapinya & Zoltán Gera

October 2004


Download: assignment.zip

Short description

To make our assignment easier to use, we provide an Ant script that executes the individual tasks. The script is located in the root directory of the assignment's zip file. Below we outline how the script works:


Detailed description

First step: Converting Java Source Code to XML

To convert Java source code to XML we used BeautyJ, which is available as a free download from beautyj.berlios.de. BeautyJ normalizes Java source code into a clean, structured format, and its Sourclet API provides a convenient way to apply further customizations. Although BeautyJ offers many more features (most of which are not mentioned here), we used only its ability to convert Java source code to XJava XML.

Note that you do not have to install BeautyJ, because it is provided in the lib/beautyJ directory of our assignment's zip file.

There are several ways to start BeautyJ (see its documentation). We used the script beautyj.bat, which can be found in the bin directory of the BeautyJ distribution. Note that BeautyJ is not compatible with Java 5.0, so JRE 1.4 must be installed on your computer. If you also have Java 5.0 installed, you have to modify this script by inserting the path of JRE 1.4 before the java command.

To convert a Java source file to XML, use the following command:
beautyj -xml.out <xml-file> <source-file>

BeautyJ processes syntactically valid, compilable Java source code files only. Any syntax error in class member declarations or javadoc information will stop beautification. However, errors inside method-bodies (the actual program code) should not confuse BeautyJ. This is due to the fact that BeautyJ's standard Sourclet does not format the instruction-code inside method-bodies. In contrast, it cares about the overall organization of fields, constructors, methods and their javadoc information inside a Java source code file.

Note that BeautyJ can only beautify classes whose dependencies are on the classpath. For example, source code that uses the HttpServlet class cannot be processed unless the Servlet API is on the classpath (because it is part of J2EE, not J2SE). Likewise, if a class extends or implements other user-defined classes, those classes need to be on the classpath as well.

Second step: Refining the XML file

In addition to the parsing done by BeautyJ, we applied some refinements inside method bodies. To this end we wrote a Java program (postParse.jar) that inspects the beautified XML file and introduces four new types of tags: <comment>, <qualifiedFunction>, <string> and <char>. This program can be found in the lib directory of our assignment's zip file. Its source code is available here.

Note that this program is not a full lexical parser; for instance, it does not recognize keywords. Detecting keywords would considerably increase the amount of code (every context in which a keyword may appear would have to be inspected), yet it is not the main focus of our work: on the XSL side, all this effort would amount to a single additional template match, for a <keyword> tag. Moreover, not only keywords but also the type names (classes) from imported packages would have to be recognized, which is an even more difficult task.
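To illustrate the kind of refinement described above, here is a minimal sketch in Java. It is not the actual postParse source: the class name BodyTagger and its methods are hypothetical, it works on plain text rather than on the XJava file, and it covers only <comment>, <string> and <char> (not <qualifiedFunction>). Like the program described above, it does not recognize keywords.

```java
final class BodyTagger {

    /** Escapes the characters that are special in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps comments, string literals and char literals of a method body in tags. */
    static String tag(String body) {
        StringBuilder out = new StringBuilder();
        int n = body.length();
        int i = 0;
        while (i < n) {
            char c = body.charAt(i);
            int end;      // index just past the recognized token
            String name;  // tag name used for the token
            if (body.startsWith("//", i)) {
                // line comment: runs to the end of the line (newline excluded)
                end = body.indexOf('\n', i);
                if (end < 0) {
                    end = n;
                }
                name = "comment";
            } else if (body.startsWith("/*", i)) {
                // block comment: runs to the closing */ (or to the end if unterminated)
                end = body.indexOf("*/", i + 2);
                end = (end < 0) ? n : end + 2;
                name = "comment";
            } else if (c == '"' || c == '\'') {
                // string or char literal: scan to the matching quote
                end = i + 1;
                while (end < n && body.charAt(end) != c) {
                    // a backslash escapes the character that follows it
                    end += (body.charAt(end) == '\\') ? 2 : 1;
                }
                end = Math.min(end + 1, n);
                name = (c == '"') ? "string" : "char";
            } else {
                // ordinary code: copy through, escaped for XML
                out.append(escape(String.valueOf(c)));
                i++;
                continue;
            }
            out.append('<').append(name).append('>')
               .append(escape(body.substring(i, end)))
               .append("</").append(name).append('>');
            i = end;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(tag("s = \"a//b\"; // note"));
    }
}
```

Run as is, the sketch prints s = <string>"a//b"</string>; <comment>// note</comment>. Note that the // inside the string literal is not mistaken for a comment, because the literal is consumed as a whole before the scan continues.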

This Java program – besides the new tags inside method-bodies – has two more important tasks:

Creating HTML

To create an HTML page from the XML file, we need a stylesheet file (XSL). Two technical points apply: the stylesheet must be placed in the xsl directory, and a reference to it must be inserted into the XML file (this is done by the postParse application).
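The stylesheet reference that a browser looks for is an xml-stylesheet processing instruction placed before the root element. The following Java sketch shows one way such a reference could be inserted with the standard DOM and JAXP APIs. It is an illustration only, not the postParse code: the class name StylesheetLinker and the stylesheet path xsl/java.xsl are assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

final class StylesheetLinker {

    /** Returns the given XML with an xml-stylesheet instruction pointing at href. */
    static String link(String xml, String href) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            ProcessingInstruction pi = doc.createProcessingInstruction(
                    "xml-stylesheet", "type=\"text/xsl\" href=\"" + href + "\"");
            // the instruction has to come before the root element
            doc.insertBefore(pi, doc.getDocumentElement());
            // serialize the modified tree with an identity transformation
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not link stylesheet", e);
        }
    }

    public static void main(String[] args) {
        // "xsl/java.xsl" is a hypothetical file name inside the xsl directory
        System.out.println(link("<xjava/>", "xsl/java.xsl"));
    }
}
```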

A transformation expressed in XSLT describes rules for transforming a source tree into a result tree. The result tree is constructed by finding the template rule for the root node and instantiating its template; in our stylesheet this is <xsl:template match="/">. Inside this template we create the HTML page itself, including the header and the footer. The header contains the name of the source file, which is obtained with the following expression: "xjava/class[@public='yes'][1]/@name".
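The header expression can be tried outside a stylesheet with the XPath API of the JDK. The sketch below evaluates it against a small hand-written document. The sample XML is an assumption: it contains only the elements and attributes that the expression itself implies, not real BeautyJ output, and the class name HeaderName is hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class HeaderName {

    /** Evaluates the header expression with the document node as context. */
    static String publicClassName(String xjavaXml) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(
                    "xjava/class[@public='yes'][1]/@name",
                    new InputSource(new StringReader(xjavaXml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate expression", e);
        }
    }

    public static void main(String[] args) {
        // hand-written sample, not real BeautyJ output
        String xml = "<xjava>"
                + "<class name='Helper' packageprivate='yes'/>"
                + "<class name='Main' public='yes'/>"
                + "<class name='Base' public='yes'/>"
                + "</xjava>";
        System.out.println(publicClassName(xml));
    }
}
```

For this sample the program prints Main: the predicate [@public='yes'] first keeps only the public classes, and [1] then selects the first of them in document order.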

Next we discuss which kinds of classes can appear in the XML file. There is only one public class, but there can be several embedded classes. We apply the <xsl:template match="class[@public='yes']"> template to the public class, and the <xsl:template match="class[@packageprivate='yes']"> template to embedded classes.

Note that if the main class extends or implements a user-defined class, BeautyJ beautifies those classes as well. We have to be careful at this stage, because the descriptions of those classes must be omitted from the output. This is done with the empty template <xsl:template match="class[@public='yes'][position() > 1]"/> (we can be sure that the main class is the first one, i.e. any extended or implemented classes appear at a later position).

Further refinements of the styling are not detailed here; we give just one example. For a method, the XML file contains a <method> tag whose private, public and protected attributes denote the access level. If a method is declared public, the other attributes (private, protected) are not present. So we have to use an if-like structure to decide which modifier keyword to put before the method name. This is done as follows:


<xsl:choose>
	<xsl:when test='@private="yes"'>
		private
	</xsl:when>
	<xsl:when test='@protected="yes"'>
		protected
	</xsl:when>
	<xsl:when test='@public="yes"'>
		public
	</xsl:when>
	<!-- package-private access has no modifier keyword in Java, so nothing is emitted -->
	<xsl:when test='@packageprivate="yes"'/>
</xsl:choose>

Note that other modifiers (static, final) need similar treatment.

Creating PDF

We used Apache FOP 0.20.5 to create a PDF from our XML file. FOP is also part of our project tree: a FOP distribution is placed in the lib/fop directory.

It can be used from the command line as follows:
fop -xml <source-xml-file> -xsl <xsl-file> -pdf <pdf-file-to-create>

This command does two things at once: it transforms the XML file with the supplied XSL into a temporary XSL-FO representation, and then it renders the XSL-FO file into the destination PDF. Both steps can report errors. To separate the two for debugging, we can perform only the first transformation step with the following command:
xalan -in <source-xml-file> -xsl <xsl-file> -out <xmlfo-file-to-create>

This is intended for emergencies only. Our Ant script uses the two-in-one method and does not support debugging with Xalan.
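The first transformation step can also be reproduced from Java through the standard JAXP API, which may help when neither command-line tool is at hand. The following is a sketch under assumptions, not part of our Ant script: the class name FoStep is hypothetical, and the transformation is performed by whatever XSLT processor the installed JDK provides rather than by the Xalan distribution mentioned above.

```java
import java.io.File;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class FoStep {

    /** Applies the stylesheet to the XML source and writes the XSL-FO result. */
    static void transform(Source xml, Source xsl, Result out) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer(xsl);
        transformer.transform(xml, out);
    }

    /** In-memory variant, convenient for experiments. */
    static String toFo(String xml, String xsl) {
        try {
            StringWriter out = new StringWriter();
            transform(new StreamSource(new StringReader(xml)),
                      new StreamSource(new StringReader(xsl)),
                      new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) throws TransformerException {
        if (args.length != 3) {
            System.err.println("usage: FoStep <source-xml-file> <xsl-file> <xmlfo-file-to-create>");
            return;
        }
        transform(new StreamSource(new File(args[0])),
                  new StreamSource(new File(args[1])),
                  new StreamResult(new File(args[2])));
    }
}
```

Any error reported at this point comes from the stylesheet or the source XML. If this step succeeds and the PDF still cannot be created, the problem lies in the rendering of the XSL-FO file.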

To create a PDF from the XML file, we need a second stylesheet file, one that produces XSL-FO. It needs to be placed in the xsl-fo directory, but, unlike in the HTML case, no reference to it has to be inserted into the XML file. When a browser loads the XML file, it looks inside the file for a stylesheet link, because that is the only way a stylesheet can be passed to it. FOP, in contrast, receives the stylesheet as a command-line parameter, so the original XML file (and the stylesheet link inside it) does not have to be changed.

The logical structure of the XSL file for PDF creation is basically the same as that of the HTML stylesheet. The only difference is that it uses FO-specific XML tags in its templates, which is why it is not discussed here in detail.

The XSL file for PDF creation describes only a simple layout-set and a simple page. It defines a static header and footer which will be visible on every page of the resulting PDF. The body of the pages contains only the formatted source code in exactly the same fashion as on the HTML page.