What does a DTD look like?

Consider this XML file:

<document>
	<database>
		<authors>
			<authors-ROW>
				<isbn_CODE>abc123</isbn_CODE>
				<author_NAME>Karl Peter</author_NAME>
			</authors-ROW>
			<authors-ROW>
				<isbn_CODE>xyz123</isbn_CODE>
				<author_NAME>John Smith</author_NAME>
			</authors-ROW>
		</authors>
		<publishers>
			<publishers-ROW>
				<pub_CODE>aaa123</pub_CODE>
				<pub_NAME>Martin Hall</pub_NAME>
			</publishers-ROW>
			<publishers-ROW>
				<pub_CODE>bbb123</pub_CODE>
				<pub_NAME>Mark Robinson</pub_NAME>
			</publishers-ROW>
		</publishers>
	</database>
</document>

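It is perfectly legal to have XML documents without DTDs. Think of them as XML files without a pre-defined format. A document that follows the XML syntax rules is called well-formed, whether or not it has a DTD. A document that has a DTD and also conforms to it is called well-formed and valid.
To see what a DTD looks like, let’s create one for our XML document above. Note that XML names are case-sensitive: the DTD below writes the element names in upper case, so to validate the sample with a real parser the names would have to be spelled exactly as they are in the document (for example isbn_CODE).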

<?xml version="1.0" standalone="no"?>
<!DOCTYPE DOCUMENT [
<!ELEMENT DOCUMENT (DATABASE)>
<!ELEMENT DATABASE (AUTHORS, PUBLISHERS)>
<!ELEMENT AUTHORS (AUTHORS-ROW)*>
<!ELEMENT AUTHORS-ROW (ISBN_CODE, AUTHOR_NAME)>
<!ELEMENT ISBN_CODE (#PCDATA)>
<!ELEMENT AUTHOR_NAME (#PCDATA)>
<!ELEMENT PUBLISHERS (PUBLISHERS-ROW)*>
<!ELEMENT PUBLISHERS-ROW (PUB_CODE, PUB_NAME)>
<!ELEMENT PUB_CODE (#PCDATA)>
<!ELEMENT PUB_NAME (#PCDATA)>
]>

Explanation of this DTD:

<!ELEMENT DOCUMENT (DATABASE)>

This means that the “DOCUMENT” element must contain exactly one “DATABASE” element and nothing else.

<!ELEMENT DATABASE (AUTHORS, PUBLISHERS)>

This means that the “DATABASE” element must contain exactly two elements, “AUTHORS” followed by “PUBLISHERS”, in that order.

<!ELEMENT AUTHORS (AUTHORS-ROW)*>

This means that the “AUTHORS” element may contain zero or more “AUTHORS-ROW” elements.
(The * signifies the “zero or more” relationship.)
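
For comparison, * is one of three occurrence indicators that a DTD can attach to a content model; the other two are + (one or more) and ? (zero or one). The sketch below shows how the AUTHORS declaration would read with each of them. These are alternatives for illustration only, since a DTD may declare a given element just once.

```xml
<!-- zero or more rows: an empty AUTHORS element is allowed -->
<!ELEMENT AUTHORS (AUTHORS-ROW)*>

<!-- alternative: one or more rows, so AUTHORS may not be empty -->
<!ELEMENT AUTHORS (AUTHORS-ROW)+>

<!-- alternative: zero or one row, never more than one -->
<!ELEMENT AUTHORS (AUTHORS-ROW)?>
```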

<!ELEMENT AUTHORS-ROW (ISBN_CODE, AUTHOR_NAME)>

This means that the “AUTHORS-ROW” element must contain exactly two elements, “ISBN_CODE” followed by “AUTHOR_NAME”, in that order.

<!ELEMENT ISBN_CODE (#PCDATA)>

This means that the “ISBN_CODE” element contains only text and no child elements (PCDATA stands for Parsed Character DATA).
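
“Parsed” means that the parser still looks for markup inside the text, so a literal & or < has to be written as an entity reference. A small sketch, using a made-up ISBN value that is not part of the sample data:

```xml
<!-- the application receives the text abc&123 -->
<isbn_CODE>abc&amp;123</isbn_CODE>
```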

<!ELEMENT AUTHOR_NAME (#PCDATA)>

This means that the “AUTHOR_NAME” element likewise contains only text (#PCDATA).

<!ELEMENT PUBLISHERS (PUBLISHERS-ROW)*>

This means that the “PUBLISHERS” element may contain zero or more “PUBLISHERS-ROW” elements. (The * signifies the “zero or more” relationship.)

<!ELEMENT PUBLISHERS-ROW (PUB_CODE, PUB_NAME)>

This means that the “PUBLISHERS-ROW” element must contain exactly two elements, “PUB_CODE” followed by “PUB_NAME”, in that order.

<!ELEMENT PUB_CODE (#PCDATA)>

This means that the “PUB_CODE” element contains only text (#PCDATA).

<!ELEMENT PUB_NAME (#PCDATA)>

This means that the “PUB_NAME” element contains only text (#PCDATA).
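
Putting the pieces together, here is a sketch of one complete file. It rests on two assumptions that go beyond the text above: the DTD is embedded in the document as an internal subset, and the declared names are spelled exactly as in the sample document (lower case with upper-case suffixes), because XML names are case-sensitive. The data is cut down to one row per table.

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE document [
<!-- names match the sample document's spelling exactly -->
<!ELEMENT document (database)>
<!ELEMENT database (authors, publishers)>
<!ELEMENT authors (authors-ROW)*>
<!ELEMENT authors-ROW (isbn_CODE, author_NAME)>
<!ELEMENT isbn_CODE (#PCDATA)>
<!ELEMENT author_NAME (#PCDATA)>
<!ELEMENT publishers (publishers-ROW)*>
<!ELEMENT publishers-ROW (pub_CODE, pub_NAME)>
<!ELEMENT pub_CODE (#PCDATA)>
<!ELEMENT pub_NAME (#PCDATA)>
]>
<document>
	<database>
		<authors>
			<authors-ROW>
				<isbn_CODE>abc123</isbn_CODE>
				<author_NAME>Karl Peter</author_NAME>
			</authors-ROW>
		</authors>
		<publishers>
			<publishers-ROW>
				<pub_CODE>aaa123</pub_CODE>
				<pub_NAME>Martin Hall</pub_NAME>
			</publishers-ROW>
		</publishers>
	</database>
</document>
```

A validating parser should accept this file. Removing the publishers element, or placing it before authors, would leave the file well-formed but no longer valid. If libxml2's xmllint happens to be installed, one way to check is xmllint --valid --noout followed by the file name.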

About cuppajavamattiz
Matty Jacob - Avid technical blogger with interests in J2EE, Web Application Servers, Web frameworks, Open source libraries, Relational Databases, Web Services, Source control repositories, ETL, IDE Tools and related technologies.
