Saturday, July 17, 2010

XMLCheatSheet

URI:
Uniform Resource Identifiers.
Identify all points (locations) in the information space, even
those that do not have a physical presence. Consist of:
  • scheme:
    http, ftp, file, mailto, imap, https
  • scheme-specific part:
    everything that follows the colon
URL:
Uniform Resource Locators.
Type of URI. Consist of:
  • scheme:
    protocol, e.g., http
  • server:
    e.g., www.w3.org
  • path:
    e.g., projects/project1
URN:
Uniform Resource Names.
Type of URI. They are pointers to resources, but without reference to
particular locations.
  • scheme:
    urn
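
A minimal sketch of how a URL and a URN split into the parts listed above, using Python's standard urllib.parse module. The full URL is assembled from the example server and path in this sheet, and the ISBN URN is only an illustrative value; neither is claimed to be a live resource.

```python
from urllib.parse import urlsplit

# URL: scheme + server + path (assembled from the examples above)
url = urlsplit("http://www.w3.org/projects/project1")
print(url.scheme)   # http
print(url.netloc)   # www.w3.org
print(url.path)     # /projects/project1

# URN: the scheme is "urn"; everything after the first colon is the
# scheme-specific part (illustrative value, names a resource without locating it)
urn = urlsplit("urn:isbn:0451450523")
print(urn.scheme)   # urn
print(urn.path)     # isbn:0451450523
```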
XML:
Extensible Markup Language.
Framework for defining markup languages. Inherently
internationalized: all XML documents are written in the Unicode character set.
Root node: the conceptual node at the top of the tree. It represents the whole document and does not correspond to any tag.
Root element: the single outermost element of the document. It is a child of the root node and contains all other elements.
Text node: node that contains only text. It has no children (it’s the text itself, not the tags it’s between). This text is also called character data.
Attribute node: pair of name and value associated with an element node (not only tags are nodes).
Element: logical grouping of the information represented by its descendants.
XML parser: tool that constructs a tree representation of a textual XML document.
XML serializer: tool that constructs an XML document from a tree.
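
A small sketch of the parser/serializer round trip, using Python's standard xml.etree.ElementTree. The card document and its names are made up for illustration. Note that ElementTree exposes text and attributes as properties of elements rather than as separate node objects.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, written on one line so no whitespace text appears.
text = '<card id="c1"><name>John Doe</name><email>john@example.org</email></card>'

# Parser: textual XML -> tree
root = ET.fromstring(text)
print(root.tag)                 # card (the root element)
print(root.get("id"))           # c1 (an attribute of the root element)
print(root.find("name").text)   # John Doe (character data)

# Manipulate the tree, then serialize: tree -> textual XML
root.set("lang", "en")
out = ET.tostring(root, encoding="unicode")
print(out)
```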
Namespaces: used for solving name clashes. Each namespace is identified by a URI.
A namespace can be bound to a shorter prefix using a namespace declaration, e.g.:
<… xmlns:foo="http://www.w3.org/pjts">
URLs are used because they are unique (and only the owner of a domain is expected to define names under it).
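
To see how namespaces resolve a clash, here is a sketch with two elements that are both called title but belong to different namespaces. The second namespace URI and the document content are assumptions made for the example. ElementTree shows the expanded name as {URI}local-name, which makes it clear that the prefix is only a shorthand.

```python
import xml.etree.ElementTree as ET

doc = """
<root xmlns:foo="http://www.w3.org/pjts" xmlns:bar="http://example.org/other">
  <foo:title>Project title</foo:title>
  <bar:title>Dr.</bar:title>
</root>
"""
root = ET.fromstring(doc)

# The two "title" elements get different expanded names.
tags = [child.tag for child in root]
print(tags)

# When searching, any prefix can be used as long as it maps to the same URI.
ns = {"p": "http://www.w3.org/pjts"}
project_title = root.find("p:title", ns).text
print(project_title)   # Project title
```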
XPath:
language for navigating XML trees.
An XPath location path (expression) evaluates to a sequence of nodes of a specific tree. It is built as a sequence of location steps, each step separated from the previous one by /. Each step consists of:
  • Axis:
    keyword that indicates the direction in which to look for nodes,
    relative to the context node:
    child, descendant, parent, attribute, …
  • Node test:
    specifies a node by name or a type of nodes by their
    properties.
    • text()
    • comment()
    • the name of the node (the tag)
    • node()
  • Predicates (optional):
    Boolean conditions for selecting, or not, the
    nodes. Written in […].
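
The steps above can be tried out with a short sketch. Python's standard xml.etree.ElementTree only supports a limited subset of XPath (abbreviated child and descendant steps, name tests, simple predicates; no text() or comment() tests), so this is an approximation of the idea rather than a full XPath engine. The library document is invented for the example.

```python
import xml.etree.ElementTree as ET

library = ET.fromstring("""
<library>
  <book year="2010"><title>XML Basics</title></book>
  <book year="2005"><title>Old Markup</title></book>
  <shelf><book year="2010"><title>Nested Trees</title></book></shelf>
</library>
""")

# Two child steps; the first one carries a predicate on the year attribute.
# In unabbreviated XPath: child::book[attribute::year='2010']/child::title
recent = [t.text for t in library.findall("./book[@year='2010']/title")]
print(recent)       # ['XML Basics']

# ".//" abbreviates the descendant axis: titles at any depth, in document order.
all_titles = [t.text for t in library.findall(".//title")]
print(all_titles)   # ['XML Basics', 'Old Markup', 'Nested Trees']
```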
Schema:
formal definition of the syntax of an XML-based language.
Schema language: formal language for expressing schemas.
Schema processor: implementation of a schema language that checks if a document is valid (syntactically correct according to the schema).
Schema languages:
  • DTD:
    Document Type Definition.
    First schema language. Not written in XML. Does not support namespaces.
    • Document type declarations: reference to the DTD schema:
      <!DOCTYPE root SYSTEM "URI">
    • Element declarations:
      <!ELEMENT element-name content-model>
      Content models: EMPTY, ANY, (#PCDATA)
    • Attribute list declarations:
      <!ATTLIST element-name attribute-definitions>
      Each attribute definition has the form:
      att-name att-type att-declaration
      Attribute types: CDATA, NMTOKEN, ID, IDREF.
      Declaration types: #REQUIRED, #IMPLIED, "value", #FIXED "value".
  • XML Schema: official schema language, written in XML. Main constructs:
    • Simple type definition: describes text without markup (character
      data and attribute values).
    • Complex type definition: describes text that may contain markup (elements + attributes + character data).
    • Element declaration: associates an element name with a simple
      type or with a complex type.
    • Attribute declaration: associates an attribute name with a simple
      type.
    The root element of a schema contains an attribute targetNamespace that indicates the namespace being described.
    A document can point to the schema with a schemaLocation attribute in its root element.
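
To make the XML Schema constructs concrete, here is a sketch of a tiny schema and an instance document that points to it. Python's standard library has no schema processor, so nothing is validated here: the code only parses both documents as ordinary XML (possible because XML Schema is itself written in XML) and reads targetNamespace and schemaLocation. The namespace http://example.org/cards, the file name cards.xsd, and all element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

schema_text = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:c="http://example.org/cards"
           targetNamespace="http://example.org/cards"
           elementFormDefault="qualified">
  <!-- simple type definition: character data restricted by a pattern -->
  <xs:simpleType name="zipType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{5}"/>
    </xs:restriction>
  </xs:simpleType>
  <!-- complex type definition: child elements plus an attribute declaration -->
  <xs:complexType name="cardType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="zip" type="c:zipType"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string"/>
  </xs:complexType>
  <!-- element declaration: associates the name "card" with a complex type -->
  <xs:element name="card" type="c:cardType"/>
</xs:schema>
"""

instance_text = (
    '<card xmlns="http://example.org/cards" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:schemaLocation="http://example.org/cards cards.xsd" id="c1">'
    '<name>John Doe</name><zip>12345</zip></card>'
)

schema = ET.fromstring(schema_text)
target = schema.get("targetNamespace")
# Top-level constructs as (kind, name) pairs; comments are skipped by the parser.
constructs = [(child.tag.split("}")[1], child.get("name")) for child in schema]

instance = ET.fromstring(instance_text)
# schemaLocation holds pairs of "namespace location".
location = instance.get("{" + XSI + "}schemaLocation")

print(target)
print(constructs)
print(location)
```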

XSL:
Extensible Stylesheet Language.
Language for specifying presentations of XML documents. Components:
  • XSLT:
    XSL Transformations. Language
    for specifying transformations between XML languages.
  • XSL-FO:
    XSL Formatting Objects.
    Target language for specifying physical layout.

XQuery:
language for querying XML documents, in a way similar to how SQL queries relational data. It’s an extension of XPath.

DOM:
Document Object Model. API
that allows us to parse, navigate, manipulate, and serialize an XML document. It’s language-neutral (the same interfaces are defined for all programming languages) and thus very general and complex.
Properties and methods:
parentNode,
previousSibling, nextSibling, firstChild, childNodes,
getAttributeNode, attributes
Interfaces:
Node, Element, Attr, Text, DocumentType,
Notation, Entity, EntityReference, CharacterData,
ProcessingInstruction, CDATASection, Comment, NodeList,
NamedNodeMap
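
A short navigation sketch using xml.dom.minidom, a lightweight DOM implementation in Python's standard library (chosen here only because it needs no installation; other DOM implementations expose the same names). The card document is made up and written on one line, so there are no whitespace-only text nodes between the elements.

```python
from xml.dom.minidom import parseString

doc = parseString(
    '<card id="c1"><name>John Doe</name><email>john@example.org</email></card>'
)

card = doc.documentElement     # root element (Element)
name = card.firstChild         # <name>
email = name.nextSibling       # <email>
text = name.firstChild         # the text node inside <name> (Text)

print(card.parentNode is doc)                  # True: the document node is the parent
print(email.previousSibling is name)           # True
print(len(card.childNodes))                    # 2
print(text.nodeType == text.TEXT_NODE)         # True
print(text.data)                               # John Doe
print(card.getAttributeNode("id").value)       # c1 (Attr)
print(card.attributes.length)                  # 1 (NamedNodeMap)
```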

JDOM:
Tree API for XML, similar in purpose to DOM but specific to Java, and thus easier to use.
Interfaces: Parent.
Abstract classes: Content.
Classes: Comment, DocType, Element, EntityRef, ProcessingInstruction, Text, Document, CDATA.
Methods:
getContent, getNamespace, getDescendants, getAttributeValue,
setAttribute, getChildren, getName
JDOM has built-in support for evaluating XPath expressions and for performing XSLT transformations.
JDOM doesn’t have its own parser: it builds its tree from the events of a SAX parser or from an existing DOM tree.
SAX:
Simple API for XML. Framework
for streaming XML. It views an XML document as a stream of events.
The parser calls the appropriate handler method when it encounters an event (document starts, start tag encountered…).
The DefaultHandler class provides empty implementations of all possible event handlers. We must extend it and override the methods for the events we care about.
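
A sketch of the event model using Python's standard xml.sax module. DefaultHandler is a Java class; in this Python sketch xml.sax.ContentHandler plays a similar role, since it also provides do-nothing implementations that we override. The EventLogger class and the card document are hypothetical names for the example.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Records the events the parser reports while streaming the document."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.text = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append("start:" + name)

    def endElement(self, name):
        self.events.append("end:" + name)

    def characters(self, content):
        # Character data may arrive in several chunks, so collect the pieces.
        self.text.append(content)

    def endDocument(self):
        self.events.append("endDocument")

logger = EventLogger()
xml.sax.parseString(b"<card><name>John Doe</name></card>", logger)
print(logger.events)
print("".join(logger.text))
```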
SAX Filters: event handlers that may act upon the various events and then pass them on to the next handler in a chain (similar to UNIX pipes).
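
A sketch of a filter chain, again in Python's xml.sax. XMLFilterBase from xml.sax.saxutils sits between the parser and the final handler and forwards every event by default; here it rewrites element names on the way through. UppercaseNames and NameCollector are hypothetical classes written for this example.

```python
import io
from xml.sax import ContentHandler, make_parser
from xml.sax.saxutils import XMLFilterBase

class UppercaseNames(XMLFilterBase):
    """Filter: changes element names, then passes the events on."""

    def startElement(self, name, attrs):
        super().startElement(name.upper(), attrs)

    def endElement(self, name):
        super().endElement(name.upper())

class NameCollector(ContentHandler):
    """Final handler: only sees what the filter lets through."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)

collector = NameCollector()
chain = UppercaseNames(make_parser())   # parser -> filter
chain.setContentHandler(collector)      # filter -> collector
chain.parse(io.BytesIO(b"<card><name>John Doe</name></card>"))
print(collector.names)   # ['CARD', 'NAME']
```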
XML Data Binding: mapping schemas into collections of classes, i.e., generating a group of classes from a schema.
JAXB:
Java Architecture for XML Binding.
