Institution: Slovak University of Technology
Technologies used: Java
Inputs: abstract model of input and output data, abstract description of the transformation
Outputs: executable data transformation in the XSLT language (Extensible Stylesheet Language Transformations)
Addressed problem
Data transformations are a crucial part of integrating software systems and services. Extensible Stylesheet Language Transformations (XSLT) is currently a significant technology for transforming XML documents. Developing XSLT artifacts is often difficult and error-prone, which increases the time and effort they require. Many developers find XSLT syntax hard to read and write, which raises the likelihood of mistakes when creating or maintaining a data-transforming software component. Moreover, processed XML documents do not always have a defined schema, and most, if not all, XSLT editors do not work with such schemas. As a result, subtle bugs, e.g. confusing an attribute with a subelement (person/@id vs. person/id), can easily creep into the code. Further problems arise when the schema of the source or target documents evolves and the transformations must be changed accordingly.
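The attribute-versus-subelement confusion mentioned above can be reproduced in a few lines. The following is a minimal sketch using the standard Java XPath API; the sample document and the class and method names (`AttributeVsElement`, `eval`) are assumptions made for illustration only. It shows that the wrong path does not raise an error but silently yields an empty string, which is why such bugs are easy to miss.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class AttributeVsElement {
    // Hypothetical sample document: the identifier is stored as an attribute.
    static final String XML = "<person id=\"42\"><name>Ann</name></person>";

    // Evaluates an XPath expression against the sample document and
    // returns the string value of the result.
    static String eval(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Correct path: the id is an attribute.
        System.out.println("[" + eval("/person/@id") + "]"); // prints [42]
        // Wrong path: there is no <id> subelement, yet no error is reported;
        // the result is simply an empty string.
        System.out.println("[" + eval("/person/id") + "]");  // prints []
    }
}
```

The same silent behaviour applies to a `select` expression inside an XSLT template, so the mistake typically surfaces only as missing data in the output document.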
Description
Our approach to streamlining the creation of XSLT documents is to shift the developer's attention to a higher level of abstraction: instead of working with XML logical entities, i.e. elements and their attributes, we concentrate on the conceptual schema of the data to be processed, as elaborated in the works of the Czech researcher Martin Necasky and his colleagues. This conceptual schema captures data content at the level of abstract entities: classes and their relationships. All the details of the concrete XML representation are abstracted away: the concrete names of XML elements and attributes and their structure, e.g. whether data is represented as a subelement or as an attribute, or whether referenced data is nested within the parent element as its subelement or stored independently and referenced using an identifier, and so on.
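To illustrate the idea of separating the conceptual level from the XML representation, here is a small sketch. It is not the tool's actual data model: the names `ConceptualMapping`, `Binding`, `Kind`, `FORMAT_A`, `FORMAT_B` and the key `"Person.id"` are all hypothetical. The sketch assumes that each XML format supplies a binding from a conceptual attribute to its physical representation, so that a transformation can refer to `Person.id` without knowing whether it is an attribute or a subelement.

```java
import java.util.Map;

public class ConceptualMapping {
    // How a conceptual attribute is physically stored in XML (hypothetical).
    enum Kind { ATTRIBUTE, SUBELEMENT }

    // Binding of one conceptual attribute to a concrete XML location (hypothetical).
    static final class Binding {
        final String element;
        final Kind kind;
        final String name;

        Binding(String element, Kind kind, String name) {
            this.element = element;
            this.kind = kind;
            this.name = name;
        }

        // Builds the XPath step that reaches the value in this representation.
        String toXPath() {
            return kind == Kind.ATTRIBUTE
                    ? element + "/@" + name
                    : element + "/" + name;
        }
    }

    // Two XML formats that represent the same conceptual attribute differently.
    static final Map<String, Binding> FORMAT_A =
            Map.of("Person.id", new Binding("person", Kind.ATTRIBUTE, "id"));
    static final Map<String, Binding> FORMAT_B =
            Map.of("Person.id", new Binding("person", Kind.SUBELEMENT, "id"));

    // Resolves a conceptual attribute to the XPath valid for the given format.
    static String resolve(Map<String, Binding> format, String conceptualAttribute) {
        Binding binding = format.get(conceptualAttribute);
        if (binding == null) {
            throw new IllegalArgumentException("No binding for " + conceptualAttribute);
        }
        return binding.toXPath();
    }

    public static void main(String[] args) {
        System.out.println(resolve(FORMAT_A, "Person.id")); // prints person/@id
        System.out.println(resolve(FORMAT_B, "Person.id")); // prints person/id
    }
}
```

Under this assumption, a change in the XML schema would only require updating the binding, while the transformation written against the conceptual attribute stays the same.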
Along with this shift from concrete to abstract representation, we have simplified the syntax of XSLT to work with these abstract entities, creating a kind of “abstract-XSLT” language. In this way, we lose some of the flexibility of XSLT and XML, which are designed to work well with both highly structured and unstructured documents. By concentrating on the highly structured case, however, we expect to gain much more in developer productivity.