3) On calling DOMHelloWorld
The main method of DOMHelloWorld.java will check that the filename of the XML file has been provided as an argument.
public static void main(String[] args) {
    // args[0] is the name of the XML file given on the command line
    String xmlfilename = args[0];
}
4) Create an instance of the parser.
First we must create an instance of the vendor-specific parser. This is the same parser we imported earlier in step 2.
DOMParser xmlparser = new DOMParser();
5) Parse the xml file.
This is really easy because the parser does it for you.
All you have to do is call the parse method with the name of the xml file.
xmlparser.parse(xmlfilename);
If you look at the API documentation that comes with the Xerces parser and search for the parse method you will notice something special.
DOMParser and SAXParser are subclasses of XMLParser.
The parse method throws two exceptions, SAXException and java.io.IOException.
If you try to compile the source code without catching the exception, an error will occur (java.io.IOException must be caught).
Under the previous import statements add the imports for the IOException and SAXException classes.
import java.io.IOException;
import org.xml.sax.SAXException;
So now we put a try/catch block around the parse method of DOMParser.
try {
    xmlparser.parse(xmlfilename);
} catch (IOException e) {
    System.out.println("Error reading xml file: " + e.getMessage());
} catch (SAXException e) {
    System.out.println("Error in parsing: " + e.getMessage());
}
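The two exceptions correspond to two different kinds of failure: the file could not be read, or it was read but is not well-formed XML. The following is a minimal sketch of that distinction, not the tutorial's own code. It assumes the JAXP DocumentBuilder bundled with current JDKs instead of the Xerces DOMParser, so that it runs without extra jars; the class name ParseErrors and the method describe are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class ParseErrors {
    // Parses the source and reports which kind of failure, if any, occurred.
    static String describe(InputSource source) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(source);
            return "Parsed OK";
        } catch (IOException e) {
            return "Error reading xml file: " + e.getMessage();
        } catch (SAXException e) {
            return "Error in parsing: " + e.getMessage();
        } catch (ParserConfigurationException e) {
            return "Could not create parser: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // hello.xml is only a fallback name for this sketch
        String xmlfilename = args.length > 0 ? args[0] : "hello.xml";
        InputSource source = new InputSource(new File(xmlfilename).toURI().toString());
        System.out.println(describe(source));
    }
}
```

A missing file ends up in the IOException branch, while a file with a missing end tag ends up in the SAXException branch.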
6) Accessing the DOM tree
Because DOM creates a tree structure from the XML file, we need to access the information stored in that tree.
To access the tree, call getDocument(), which returns a Document.
Document doc = xmlparser.getDocument();
This Document represents the entire XML document. The data we need to access is stored in nodes. These nodes can have child nodes.
Therefore what you need to do next is walk through the Tree structure (Document) and display the data stored in the Nodes.
Now the Fun starts!!!
7) Walking the nodes
Next we will write a method to display the data in a node. The method will be called displayNode and will take one parameter, the start node.
This method will start from the first node and walk through all the nodes in the XML using recursion.
The nodes in the XML document are of different node types.
The node types can be divided into two broad categories: structural nodes and content nodes.
Structural nodes are not actually part of the content in the document but are used to provide syntax structure.
The following is a list of the different types of nodes:
The main nodes we are interested in are DOCUMENT_NODE, ELEMENT_NODE and ATTRIBUTE_NODE.
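A recursive walk over those three node types could look like the sketch below. It is an illustration under stated assumptions rather than the tutorial's final displayNode: it collects the output in a StringBuilder (a second parameter that the description above does not have) so the result can be inspected, it also reports non-empty text nodes, and it parses with the JDK's built-in JAXP DocumentBuilder instead of the Xerces DOMParser. The class name NodeWalker and the helper walk are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeWalker {
    // Describes the given node, then recurses into each of its children.
    static void displayNode(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:
                out.append("Document\n");
                break;
            case Node.ELEMENT_NODE:
                out.append("Element: ").append(node.getNodeName()).append('\n');
                // Attributes are not child nodes; they are read from the attribute map.
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append("Attribute: ").append(attr.getNodeName())
                       .append('=').append(attr.getNodeValue()).append('\n');
                }
                break;
            case Node.TEXT_NODE:
                String text = node.getNodeValue().trim();
                if (!text.isEmpty()) {
                    out.append("Text: ").append(text).append('\n');
                }
                break;
            default:
                break; // other node types are skipped in this sketch
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            displayNode(children.item(i), out);
        }
    }

    // Parses an XML string and returns the description of every node in it.
    static String walk(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            displayNode(doc, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(walk("<greeting lang=\"en\">Hello World</greeting>"));
    }
}
```

Starting the walk at the Document itself means the root element is reached as an ordinary child, so no special case is needed for it.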
Should I use DOM or SAX?
If your document is really large, or you only need to extract a few elements from the whole document, then using DOM is not practical.
SAX looks for startElement events in which the element name is verse, then looks at each character event for the name. SAX is more efficient because we only care about a few elements in the document.
SAX is good when:
1) You only need to go through a document once
2) You’re only looking for a few items
3) You’re not too concerned with document structure or the context of the elements you’re looking for
4) You don’t have much memory
DOM builds Java objects for everything in the document, then walks through the tree looking for the name. Using DOM, we create a lot of Java objects we never use, and we go through the document twice (once to parse it and once to search it).
DOM is good when:
1) You need to go through a document more than once
2) You need to manipulate lots of things in the document
3) The structure of the document and the context of the elements is a concern
4) Memory is not an issue
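The verse search described above can be sketched both ways to make the trade-off concrete. This is an illustration only: the class VerseSearch, its method names and the sample document layout are assumptions, nested verse elements are not handled, and both versions use the JAXP parsers bundled with the JDK rather than Xerces directly.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class VerseSearch {
    // SAX: a single pass; only the text of the current <verse> is kept in memory.
    static int countWithSax(String xml, String name) {
        final int[] hits = {0};
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside a <verse>

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (qName.equals("verse")) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (qName.equals("verse") && text != null) {
                    if (text.indexOf(name) >= 0) {
                        hits[0]++;
                    }
                    text = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return hits[0];
    }

    // DOM: the whole tree is built first, then searched in a second pass.
    static int countWithDom(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList verses = doc.getElementsByTagName("verse");
            int hits = 0;
            for (int i = 0; i < verses.getLength(); i++) {
                if (verses.item(i).getTextContent().contains(name)) {
                    hits++;
                }
            }
            return hits;
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<chapter><verse>Moses spoke</verse><verse>They left</verse></chapter>";
        System.out.println("SAX hits: " + countWithSax(xml, "Moses"));
        System.out.println("DOM hits: " + countWithDom(xml, "Moses"));
    }
}
```

Both methods return the same count; the difference is that the DOM version holds every node of the document in memory before the search starts.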
JAVA XML API
The three main APIs that we shall focus on are:
1. JAXP – Parsing API.
2. JAXM – Messaging API.
3. JAXB – Binding API.
Java API for XML Processing
As a developer, you would program to a special interface. This interface isolates you from specific parsers and coding changes. SAX and DOM are language independent interfaces.
SAX and DOM have two different APIs to access the information from the XML parser. The different APIs use different approaches to access the information in the XML document.
SAX is a low level API and DOM is a high level API. The next sections will cover SAX and DOM in more detail.
An XML application will create a parser object, throw some XML at the parser and then process the results.
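That create, parse, process cycle can be sketched with the JAXP factory classes, which hide the specific parser behind a standard interface. The sketch assumes the javax.xml.parsers package that ships with current JDKs; the class name JaxpDemo and the method rootSummary are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class JaxpDemo {
    // Create a parser object, throw some XML at it, then process the result.
    static String rootSummary(String xml) {
        try {
            // The factory picks the parser implementation; no vendor class is named here.
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            return root.getNodeName() + ": " + root.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootSummary("<greeting>Hello World</greeting>"));
    }
}
```

Because only the factory decides which parser is used, swapping the parser implementation does not require changes to this code.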
Simple API for XML (SAX)
SAX is a standard interface for event based XML parsing.
SAX defines a number of events. It’s up to you to listen for them and respond to them.
The documents are accessed serially and an event is triggered at different parts of the document.
The common events are:
1. start of the document
2. start elements
3. characters
4. end elements
5. end of the document.
Basically you would write a program that has event handlers.
SAX is fast and has a low memory requirement.
SAX parsing is harder to set up.
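A program with event handlers for the five common events might look like the following sketch. It assumes the SAX classes bundled with the JDK (org.xml.sax and javax.xml.parsers); the class name SaxEvents and the method trace are hypothetical, and the handler simply records each event in the order the parser reports it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxEvents {
    // Parses the XML string and returns the events in the order they were triggered.
    static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                events.add("startElement " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver one run of text in several characters() calls.
                events.add("characters " + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                events.add("endElement " + qName);
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        for (String event : trace("<greeting>Hello World</greeting>")) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept once an event has been handled, which is where the low memory requirement comes from.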
Document Object Model (DOM)
DOM is designed to be a portable interface for manipulating document structures.
Using DOM, the application builds a tree structure of the XML document in memory. The different parts of the XML file are stored in nodes in the DOM document. The application then walks back and forth through the nodes in the tree.
<site>
Basics of programming using DOM
After you have made sure that your environment has been set up correctly (see Setting up the environment for XML and Java section), you may write your first Java and XML example.
For this example we will use the DOM API discussed in the previous section.
This is a simple example that will read the text “Hello World” from an XML file called “hello.xml”.
1) Import package org.w3c.dom
The Java interfaces have been defined by W3C and are contained in the package org.w3c.dom.
import org.w3c.dom.*;
2) Import Vendor dependent Parser.
The next step is to import a vendor dependent XML parser. In our case it will be the Xerces DOM parser that we configured.
import org.apache.xerces.parsers.DOMParser;
Setting up the environment for XML and Java:
To use XML you will need an XML parser, but before downloading an XML parser you must make sure you have Java (JDK).
Setting up Java
Download JDK 1.3 from the following URL:
http://java.sun.com/j2se/1.3/
For Windows 95/98/ME you edit the AUTOEXEC.BAT file with the new PATH and CLASSPATH settings and reboot your machine.
For Windows NT/2000 you edit the environment settings.
Both of these changes are described in the Java installation instructions.
Setting up XML
This tutorial will use the Xerces XML parser found on the Apache XML site.
1. Download the latest version of Xerces from the following URL:
http://xml.apache.org/dist/xerces-j/
If you are a Windows user, you will need the following download (Xerces-J 1.4.3 at the time of writing):
http://xml.apache.org/dist/xerces-j/Xerces-J-bin.1.4.3.zip
This tutorial assumes you copied the file to c:\
2. Extract the contents of the zip file. This will copy the files and create all the subdirectories. If you go to your Xerces-J 1.4.3 directory you should see xerces.jar and xercesSamples.jar.
The next step is to edit the CLASSPATH in your AUTOEXEC.BAT file.
You need to tell Java where it can find xerces.jar and xercesSamples.jar.
Add the two files to your CLASSPATH.
For example,
set CLASSPATH=%CLASSPATH%;C:\Xerces-J-bin.1.4.3\xerces-1_4_3\xerces.jar;C:\Xerces-J-bin.1.4.3\xerces-1_4_3\xercesSamples.jar
In order to test your install, try one of the included samples, SAXCount.
Go to the directory where you have the two jar files, for example:
CD C:\Xerces-J-bin.1.4.3\xerces-1_4_3
(USEFUL TIP - in Windows 98/ME/2000 and NT you can drag and drop a directory into a command prompt window. This saves you from having to type in long directory names.)
Type the following to execute the SAXCount application:
java sax.SAXCount data/personal.xml
You should get the output of the application.
data/personal.xml: 280 ms (37 elems, 18 attrs, 140 spaces, 128 chars)
This is a breakdown of the personal.xml file in the data directory.
If you do not get this output then you are either in the wrong directory or most probably your CLASSPATH is incorrect.
Check your AUTOEXEC.BAT file (Windows 95/98/ME) or your environment settings.
If this is all working, then you have correctly set up the environment for Java and XML.
Well Done
XML Basics
In order to use XML, the document must comply with certain rules to be Well Formed.
The Well Formed rules for XML documents are:
1. Tags must be nested correctly.
2. You cannot omit end tags.
In HTML many developers leave out </br> and </p> tags.
Most browsers will handle this correctly.
3. New syntax for end tags.
<car engine="3000"></car>
<car engine="3000"/>
These are the same.
4. All documents must be contained in the root element.
Any document is well formed if it agrees with the above rules.
If the document is to be checked for validity, the document uses a Document Type Definition (DTD). The document must begin with a document type declaration (<!DOCTYPE ...>) and agree with the above rules.
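One practical way to test the Well Formed rules is to hand the text to a parser and see whether it reports a fatal error. The sketch below assumes the SAX parser bundled with the JDK; the class name WellFormedCheck and the method isWellFormed are hypothetical. It checks well-formedness only, not validity against a DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the parser accepts the text as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        try {
            // DefaultHandler ignores the content but rethrows fatal parse errors.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false; // a rule was broken
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Parser could not run", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<car engine=\"3000\"/>"));  // self-closing tag
        System.out.println(isWellFormed("<a><b></a></b>"));           // bad nesting
        System.out.println(isWellFormed("<p>text"));                  // omitted end tag
        System.out.println(isWellFormed("<a/><b/>"));                 // two root elements
    }
}
```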
Why use XML in Java?
XML and Java work very well together.
1. Portability – Java is a platform independent development language. XML is an architecture and language independent data format. Neither Java nor XML cares about the platform.
Application areas of XML
People are using XML for conducting business-to-business transactions and as a standard way for different computers to communicate with each other.
1. Presentation Oriented Publishing (POP) – the same data in a web browser, mobile phone and a PDA.
2. Message Oriented Middleware (MOM) – B2B.
3. XML used for exchanging database contents.
1) The owner of the online shop wants to provide an online service so that it also works on mobile phones and PDAs.
Data in the store database is used to generate an XML document. This same XML document can be transformed using XSL (covered in a later section) into HTML, WML or any other format. This would allow the information to be displayed on mobile phones and PDAs.
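As a rough illustration of that transformation step, the sketch below applies a stylesheet to a small product document using the javax.xml.transform API that ships with the JDK. The product data, the stylesheet and the names ShopTransform, PRODUCT_XML and TO_HTML are all made up for this example; a second stylesheet could target WML in the same way.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ShopTransform {
    // Hypothetical product data, standing in for XML generated from the store database.
    static final String PRODUCT_XML =
            "<product><name>Kettle</name><price>19.99</price></product>";

    // Hypothetical stylesheet that renders the product as an HTML fragment.
    static final String TO_HTML =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/product\">"
          + "<p><b><xsl:value-of select=\"name\"/></b>: <xsl:value-of select=\"price\"/></p>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the transformed text.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(PRODUCT_XML, TO_HTML));
    }
}
```

The data stays in one XML document; only the stylesheet changes per device.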
2) The owner wants to automate order processing and fulfillment.
Since the 1970s, systems have used Electronic Data Interchange for computer-to-computer communication. This allowed orders and invoices to be sent using a messaging standard.
The problem with EDI is that it is very expensive to buy an EDI system, therefore only large companies would use it.
Two companies agreeing on a DTD can send messages over the Internet using XML. Problems occur regarding security and reliability but these are being addressed.
Going back to our example, when a customer orders a product from the online shop, the shop sends a standard message to the delivery agent. The delivery computer system automatically updates itself with the latest orders and automatically sends an acknowledgement back to the shop.
Messages could also be sent from the suppliers to the online shop to update the inventory on the specific products.
3) The marketing department would like to extract the data from the online shop so they can organize product promotions and sales.
However, the marketing database is in MS Access and the shop uses Oracle. There is no specific standard for exchanging data from one database to another.
XML allows all the tables to be totally described by using custom tags.
<table>
Standards
There are several standard bodies involved in Java and XML.
XML specifications
From a specifications perspective, the World Wide Web Consortium (W3C) provides the base specifications for XML.
http://www.w3.org/
The Apache XML project provides open source XML implementation solutions.
You can find the following at the Apache XML site: http://xml.apache.org/
Listed below are the Apache projects related to using XML in Java:
Xerces - XML parsers in Java, C (with Perl and COM bindings)
Xalan - XSLT stylesheet processors, in Java and C
Cocoon - XML-based web publishing, in Java
FOP - XSL formatting objects, in Java
Xang - Rapid development of dynamic server pages, in JavaScript
SOAP - Simple Object Access Protocol
Batik - A Java based toolkit for Scalable Vector Graphics (SVG)
Crimson - A Java XML parser derived from the Sun Project X Parser.
The Java Community Process (JCP) has also developed a comprehensive set of application programming interfaces (API) for developing XML applications in Java.
In this tutorial you will see how to develop XML applications in Java using different methods. All the methods are based on downloadable and free tools and technologies.
Table of Contents:
1. What is XML?
2. HTML vs. XML
3. XML Basics
4. Why use XML in Java?
5. Application areas of XML
6. Standards
7. Setting up the environment for XML and Java
8. JAVA XML API
9. Simple API for XML (SAX)
10. Document Object Model (DOM)
11. Basics of programming using DOM
12. Should I use DOM or SAX?
What is XML?
The eXtensible Markup Language (XML) is a universal way of structuring documents and other data.
Markup languages existed for many years before the start of the World Wide Web. WordPerfect and Rich Text Format (RTF) have used markup tags to provide special formatting commands that apply to specific words and text. Hyper Text Markup Language (HTML) is the markup language used for web pages.
HTML has gained widespread use and is easy to understand. Both HTML and XML are derived from the Standard Generalized Markup Language (SGML).
1986 SGML (document markup language)
1992 HTML (web page specific markup language)
1997 XML (web page and general documents markup language)
2001 XML 1.1
Previously anyone who wanted to create web pages would have to learn HTML syntax and make the page using simple text editors.
More advanced HTML specific editors appeared that checked the web pages and HTML tags. When applications such as MS FrontPage appeared, people could author web pages without learning all the HTML tags. Many thousands of web pages were created daily, mostly showing personal homepages or company marketing information. As the use of websites became more sophisticated, the limitations of HTML have become apparent. The next section covers the similarities and differences between HTML and XML.
HTML vs. XML
Developers who are familiar with HTML will recognize the syntax used in XML; however, XML describes the data better than HTML.
Similarities with HTML:
Firewalls do not need to be reconfigured
Known system for security (same web server, firewall, protocols).
Differences between XML and HTML:
HTML has a fixed set of tags, whereas XML allows you to define your own custom tags.
HTML was designed for rendering information from computer to human.
XML has a large overhead in tags used to define the document elements.
To see how XML separates data from the presentation format, the following example is provided.
After reading the HTML displayed above, you can see that it is not exactly clear what is being displayed. We can guess that it is a web site and an email address. A computer program will have great difficulty understanding what this text is in a reliable way.
Below is the XML equivalent to represent the same text and data.
You can figure out what this means, but the main benefit is that a computer program can make use of it. XML takes more space but it defines the information more precisely and robustly.