on the topic of modeling

from my point of view, the problem with many logical data models is fourfold:

  1. overall, it’s conceptual but has bits and pieces of all these other disciplines – i.e. it’s not a consistent representation of any single discipline.
  2. it offers no method for validation beyond the subjective – the vast majority of classes (read as: All) come from a single individual and the relationships are born out of real-world observations. any material disagreement/objection with the model can only be subjective because the model itself is subjective. non-subjective commentary inherently forces the comparison of the LDM to a focused discipline such as Semantic/Canonical/Ontological.
  3. it’s fluid and changes as the domain/landscape changes – this makes it a poor tool for directly creating an instantiated thing (model, schema, blueprint) for implementation.
  4. it has no way to represent, define or specify interoperability – granted an LDM is not supposed to do this but that expectation seems to always linger in the primitive brain of most audiences when they view the model.

what the LDM does do effectively:

  1. introduces the novice to a very complex domain in an easy manner – by representing real-world things as class objects, providing some basic data concepts as attributes and how they all relate to one another.
  2. it allows for what-if scenarios – creating ad hoc diagrams on the fly once the classes are enumerated allows one to see how objects would relate to each other if one were to use them in non-standard ways (i.e. what if i wanted a class, that would ordinarily belong to one geospatial location, in another).
  3. it provides a heads-up of what’s to come – the attributes and relationships paint a very good picture of what type of data will need to be captured, stored and/or transformed.
  4. it allows one to see the scope in a visual way – i challenge anyone to produce a document that conveys the same information without ambiguity.
  5. most LDMs are created in the UML – it is the correct notation.

after the LDM, what’s needed is the next level of model – one that builds on the LDM (one can always update the LDM as new things are uncovered – issues, classes, relationships, etc. – that’s why it’s fluid) and codifies the fuzziness an LDM cannot inherently escape. start with the priority items and expand based on the business requirements.

Logical data model

A logical data model (LDM) in systems engineering is a representation of an organization’s data, organized in terms of entities and relationships, and is independent of any particular data management technology.

Why Build a Logical Data Model

  • Helps common understanding of business data elements and requirements
  • Provides foundation for designing a database
  • Facilitates avoidance of data redundancy and thus prevents data and business transaction inconsistency
  • Facilitates data re-use and sharing
  • Decreases development and maintenance time and cost
  • Confirms a logical process model and helps impact analysis.

Logical Modeling Benefits

  • Clarifies functional specifications and avoids assumption
  • Confirms business requirements
  • Facilitates business process improvement
  • Focuses on requirements independent of technology
  • Decreases system development time and cost
  • Becomes a template for the enterprise
  • Facilitates data re-use and sharing
  • Faster ROI
  • Gathers metadata
  • Fosters seamless communication between applications
  • Focuses communication for data analysis and project team members
  • Establishes a consistent naming scheme

Conceptual Schema

A conceptual schema or conceptual data model is a map of concepts and their relationships. This describes the semantics of an organization and represents a series of assertions about its nature. Specifically, it describes the things of significance to an organization (entity classes), about which it is inclined to collect information, and characteristics of (attributes) and associations between pairs of those things of significance (relationships).

Because a conceptual schema represents the semantics of an organization, and not a database design, it may exist on various levels of abstraction. The original ANSI four-schema architecture began with the set of external schemas that each represent one person’s view of the world around him or her. These are consolidated into a single conceptual schema that is the superset of all of those external views. A data model can be as concrete as each person’s perspective, but this tends to make it inflexible. If that person’s world changes, the model must change. Conceptual data models take a more abstract perspective, identifying the fundamental things, of which the things an individual deals with are just examples.

The model does allow for what is called inheritance in object oriented terms. The set of instances of an entity class may be subdivided into entity classes in their own right. Thus, each instance of a sub-type entity class is also an instance of the entity class’s super-type. Conversely, where the sub-types are exhaustive, each instance of the super-type entity class is also an instance of one of the sub-type entity classes.

Super-type/sub-type relationships may be exclusive or not. A methodology may require that each instance of a super-type may only be an instance of one sub-type. Similarly, a super-type/sub-type relationship may be exhaustive or not. It is exhaustive if the methodology requires that each instance of a super-type must be an instance of a sub-type.
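The two constraints above can be checked mechanically once instances are assigned to sub-types. The following is a minimal sketch, not part of any particular methodology; the Party, Person and Organization names and the instance identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Set


def is_exclusive(subtypes: Dict[str, Set[str]]) -> bool:
    """True if no instance belongs to more than one sub-type."""
    seen: Set[str] = set()
    for members in subtypes.values():
        if seen & members:
            return False
        seen |= members
    return True


def is_exhaustive(supertype: Set[str], subtypes: Dict[str, Set[str]]) -> bool:
    """True if every super-type instance belongs to at least one sub-type."""
    covered: Set[str] = set().union(*subtypes.values())
    return supertype <= covered


# Hypothetical data: a Party super-type with two sub-types.
party = {"alice", "bob", "acme", "unclassified-1"}
party_subtypes = {
    "Person": {"alice", "bob"},
    "Organization": {"acme"},
}

exclusive = is_exclusive(party_subtypes)          # no instance sits in two sub-types
exhaustive = is_exhaustive(party, party_subtypes)  # "unclassified-1" has no sub-type
```

Here the relationship is exclusive but not exhaustive, because one Party instance has not been assigned to any sub-type.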

Ontology [Model]

In computer science and information science, an ontology is a formal representation of knowledge as a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to describe the domain.

In theory, an ontology is a “formal, explicit specification of a shared conceptualisation”. An ontology provides a shared vocabulary, which can be used to model a domain — that is, the type of objects and/or concepts that exist, and their properties and relations.

[It] is the representation of entities, ideas, and events, along with their properties and relations, according to a system of categories.

Common components of ontologies include:

  • Individuals: instances or objects (the basic or “ground level” objects)
  • Classes: sets, collections, concepts, classes in programming, types of objects, or kinds of things.
  • Attributes: aspects, properties, features, characteristics, or parameters that objects (and classes) can have
  • Relations: ways in which classes and individuals can be related to one another
  • Function terms: complex structures formed from certain relations that can be used in place of an individual term in a statement
  • Restrictions: formally stated descriptions of what must be true in order for some assertion to be accepted as input
  • Rules: statements in the form of an if-then (antecedent-consequent) sentence that describe the logical inferences that can be drawn from an assertion in a particular form
  • Axioms: assertions (including rules) in a logical form that together comprise the overall theory that the ontology describes in its domain of application. This definition differs from that of “axioms” in generative grammar and formal logic. In those disciplines, axioms include only statements asserted as a priori knowledge. As used here, “axioms” also include the theory derived from axiomatic statements.
  • Events: the changing of attributes or relations
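A few of the components listed above (classes, individuals, relations, a restriction and a rule) can be sketched in a handful of lines. This is an illustrative toy, not an ontology language such as OWL; the `Ontology` class, its method names and the plant/pump/valve data are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    classes: set = field(default_factory=set)
    individuals: dict = field(default_factory=dict)  # individual -> class
    relations: set = field(default_factory=set)      # (subject, relation, object)

    def add_individual(self, name: str, cls: str) -> None:
        # Restriction: an individual is only accepted for a known class.
        if cls not in self.classes:
            raise ValueError(f"unknown class: {cls}")
        self.individuals[name] = cls

    def relate(self, subject: str, relation: str, obj: str) -> None:
        # Restriction: both ends of a relation must be known individuals.
        for name in (subject, obj):
            if name not in self.individuals:
                raise ValueError(f"unknown individual: {name}")
        self.relations.add((subject, relation, obj))

    def apply_transitive_rule(self, relation: str) -> None:
        """Rule: if (a r b) and (b r c) then (a r c); repeat until stable."""
        changed = True
        while changed:
            changed = False
            current = [t for t in self.relations if t[1] == relation]
            for a, _, b in current:
                for b2, _, c in current:
                    if b == b2 and (a, relation, c) not in self.relations:
                        self.relations.add((a, relation, c))
                        changed = True


# Hypothetical domain data.
onto = Ontology(classes={"Plant", "Pump", "Valve"})
onto.add_individual("Plant A", "Plant")
onto.add_individual("Pump P1", "Pump")
onto.add_individual("Valve V1", "Valve")
onto.relate("Valve V1", "is part of", "Pump P1")
onto.relate("Pump P1", "is part of", "Plant A")
onto.apply_transitive_rule("is part of")
```

After the rule is applied, the ontology also holds the inferred fact that Valve V1 is part of Plant A, which was never asserted directly.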

Semantic data model

1. It is a conceptual data model in which semantic information is included. This means that the model describes the meaning of its instances. Such a semantic data model is an abstraction that defines how the stored symbols (the instance data) relate to the real world.

2. It is a conceptual data model that includes the capability to express information that makes it possible to interpret meaning (semantics) from the instances, without the need to know the meta-model. Such semantic models are fact oriented (as opposed to object oriented). Facts are typically expressed by binary relations between data elements, whereas higher order relations are expressed as collections of binary relations. Typically binary relations have the form of triples: Object-RelationType-Object. For example: the Eiffel Tower <is located in> Paris.

Typically the instance data of semantic data models explicitly include the kinds of relationships between the various data elements, such as <is located in>. To interpret the meaning of the facts from the instances, the meaning of the kinds of relations (relation types) must be known. Therefore, semantic data models typically standardize such relation types. This means that the second kind of semantic data model enables the instances to express facts that include their own meaning. The second kind of semantic data model is usually meant for creating semantic databases. The ability to include meaning in semantic databases facilitates building distributed databases that enable applications to interpret the meaning from the content. This implies that semantic databases can be integrated when they use the same (standard) relation types. This also implies that in general they have a wider applicability than relational or object oriented databases.
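The triple form and the role of standardized relation types can be illustrated with a small sketch. The `Fact` tuple, the `RELATION_TYPES` set and the helper functions are hypothetical names introduced here; a real semantic database would typically rely on a published vocabulary rather than a hard-coded set.

```python
from typing import List, NamedTuple


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str


# Assumed shared vocabulary of relation types (illustrative only).
RELATION_TYPES = {"is located in", "is part of"}


def add_fact(store: List[Fact], subject: str, relation: str, obj: str) -> Fact:
    """Append a triple, rejecting relation types outside the shared vocabulary."""
    if relation not in RELATION_TYPES:
        raise ValueError(f"non-standard relation type: {relation}")
    fact = Fact(subject, relation, obj)
    store.append(fact)
    return fact


def merge(*stores: List[Fact]) -> List[Fact]:
    """Integrate stores that use the same relation types, dropping duplicates."""
    merged: List[Fact] = []
    for store in stores:
        for fact in store:
            if fact not in merged:
                merged.append(fact)
    return merged


def subjects(store: List[Fact], relation: str, obj: str) -> List[str]:
    """Return every subject that stands in `relation` to `obj`."""
    return [f.subject for f in store if f.relation == relation and f.obj == obj]


db_a: List[Fact] = []
db_b: List[Fact] = []
add_fact(db_a, "the Eiffel Tower", "is located in", "Paris")
add_fact(db_b, "the Louvre", "is located in", "Paris")

combined = merge(db_a, db_b)
in_paris = subjects(combined, "is located in", "Paris")
```

Because both stores use the same relation type, the merged store can answer a single query across them without knowing anything about either store's schema.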

The logical data structure of a database management system (DBMS), whether hierarchical, network, or relational, cannot totally satisfy the requirements for a conceptual definition of data, because it is limited in scope and biased toward the implementation strategy employed by the DBMS. Therefore, the need to define data from a conceptual view has led to the development of semantic data modeling techniques, that is, techniques to define the meaning of data within the context of its interrelationships with other data. As illustrated in the figure, the real world, in terms of resources, ideas, events, etc., is symbolically defined within physical data stores. A semantic data model is an abstraction which defines how the stored symbols relate to the real world. Thus, the model must be a true representation of the real world.

“…overall goal of semantic data models is to capture more meaning of data by integrating relational concepts with more powerful abstraction concepts known from the Artificial Intelligence field. The idea is to provide high level modeling primitives as integral part of a data model in order to facilitate the representation of real world situations”.

Canonical Model

Canonical Model is a design pattern used to communicate between different data formats. A form of Enterprise Application Integration, it is intended to reduce costs and standardize on agreed data definitions associated with integrating business systems. A Canonical Model is any model that is canonical in nature, i.e. a model which is in the simplest form possible based on a standard, common view within a given context. Often the term Canonical Model is used interchangeably with Canonical Data Model, an enterprise design pattern which provides common data naming, definition and values within a generalized data framework.

Adoption of a comprehensive enterprise integration strategy often entails a move to a message based integration methodology. A typical migration from point-to-point interfacing to message based integration begins with a decision on the middleware to be used to transport messages between endpoints. Often this decision results in the adoption of an Enterprise Service Bus (ESB) or Enterprise Application Integration (EAI) solution. Most organizations also adopt a set of standards for message structure and content (message payload). The desire for consistent message payload results in the construction of an enterprise or business domain Canonical Model or adoption of an XML message standard used as the basis for message objects.

The goal of the Canonical Model is to provide a dictionary of reusable common objects and definitions at an enterprise or business domain level to enhance system interoperability. “A Canonical Data Model allows developers and business users to discuss the integration solution in terms of the company’s business domain, not a specific package implementation. For example, packaged applications may represent the common concept of a customer in many different internal formats, such as ‘account’, ‘payer’, and ‘contact’. Defining a Canonical Data Model is often the first step to resolving cases of semantic dissonance between applications.” Enterprise integration models provide a foundation for a decoupled, consistent, reusable integration methodology which can be implemented using messaging supported by middleware products. Message payloads (business data content) in the form of XML schema are built from the common model objects, thus providing the desired consistency and re-usability while ensuring data integrity.
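The ‘account’, ‘payer’ and ‘contact’ example from the quotation can be sketched as a set of translators into one canonical object. Everything here is an assumption made for illustration: the `CanonicalCustomer` shape, the source system names and the field names in each record are hypothetical, and a real integration would usually build the canonical payload from an agreed message schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CanonicalCustomer:
    """Assumed canonical form shared by all integrated systems."""
    customer_id: str
    name: str


def from_billing_account(record: dict) -> CanonicalCustomer:
    # Hypothetical billing system: customers are stored as 'accounts'.
    return CanonicalCustomer(str(record["account_no"]), record["account_name"].strip())


def from_payments_payer(record: dict) -> CanonicalCustomer:
    # Hypothetical payments system: customers are stored as 'payers'.
    return CanonicalCustomer(str(record["payer_ref"]), record["payer"].strip())


def from_crm_contact(record: dict) -> CanonicalCustomer:
    # Hypothetical CRM: customers are stored as 'contacts' with split names.
    full_name = f"{record['first']} {record['last']}"
    return CanonicalCustomer(str(record["contact_id"]), full_name)


TRANSLATORS: Dict[str, Callable[[dict], CanonicalCustomer]] = {
    "billing": from_billing_account,
    "payments": from_payments_payer,
    "crm": from_crm_contact,
}


def to_canonical(source_system: str, record: dict) -> CanonicalCustomer:
    """Translate a source-specific record into the canonical customer."""
    return TRANSLATORS[source_system](record)


a = to_canonical("billing", {"account_no": 1001, "account_name": " Ada Lovelace "})
b = to_canonical("payments", {"payer_ref": "1001", "payer": "Ada Lovelace"})
c = to_canonical("crm", {"contact_id": 1001, "first": "Ada", "last": "Lovelace"})
```

Each system needs only one translator to and from the canonical form, rather than one per peer system, which is where the reduction in point-to-point interfaces comes from.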
