High performance
multi-model database

Data Modeling – Relational or Object Access

Early in the process of designing a new application, developers must decide upon their approach towards data modeling. For most, this comes down to a choice between the traditional modeling of data as relational tables and the newer approach of modeling as objects. Faced with the need to handle complex data, many developers believe that modeling with objects is a more effective approach.

Of course, when moving an existing application to Caché, the first step is to migrate the existing data model. There are easy ways to import data models from various relational or object representations so that the result is a standard Caché data definition. Once migrated to Caché, data can be simultaneously accessed as objects, relational tables, and multidimensional arrays.

Caché supports both SQL and object data access, and at times each is appropriate. To understand the uses of each and why data modeling with objects is generally preferred by modern-day developers, it is useful to understand how and why each has developed.


Relational Technology

In the early days of computing, information processing was done on huge mainframe systems and data access was, for the most part, limited to IT professionals. Databases tended to be home grown, and retrieving data effectively required a thorough knowledge of the database. If users wanted a special report, they usually had to ask an overworked central staff to write it, and it usually wasn’t available in time to influence decisions.

Although relational technology was originally developed in the 1970s on the mainframe, it remained largely a research project until it began to appear in the 1980s on mini-computers. With the advent of PCs, the world entered a more “user-centric” era of computing with more user-friendly report writers based on SQL – the query language introduced by relational technology. Users could now produce their own reports and ad hoc queries of the database, and relational usage exploded.

SQL allows a consistent language to be used to ask questions of a wide variety of data. SQL works by viewing all data in a very simple and standardized format – a two-dimensional table with rows and columns. While this simple data model allowed the construction of an elegant query language with which to ask questions, it came at a severe price. The inherent complexity of real-world data relationships doesn’t fit naturally into simple rows and columns, so data is often fragmented into multiple tables that must be “joined” in order to complete even simple tasks. This results in two problems: a) queries can become very difficult to write due to the need to “join” many tables (often with complex outer joins); and b) the processing overhead required when relational databases have to deal with complex data can be enormous.
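As a rough illustration of the join problem, the sketch below uses Python's built-in sqlite3 module rather than Caché. The customer, invoice, and line_item tables and all of their values are hypothetical. It shows that even a simple question ("list every customer with any invoice lines") needs two outer joins once the data has been split across tables.

```python
import sqlite3

# Hypothetical schema: the data behind one invoice is spread over three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE TABLE line_item (invoice_id INTEGER, description TEXT, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO customer VALUES (2, 'Globex');
    INSERT INTO invoice VALUES (123456, 1, 150.0);
    INSERT INTO line_item VALUES (123456, 'Widget', 100.0);
    INSERT INTO line_item VALUES (123456, 'Gadget', 50.0);
""")

# Outer joins are needed because a customer may have no invoices at all,
# and such customers should still appear in the result.
rows = conn.execute("""
    SELECT c.name, i.id, li.description, li.amount
    FROM customer AS c
    LEFT OUTER JOIN invoice AS i ON i.customer_id = c.id
    LEFT OUTER JOIN line_item AS li ON li.invoice_id = i.id
    ORDER BY c.id, li.description
""").fetchall()

for row in rows:
    print(row)
```

The customer without invoices comes back as a row padded with NULLs, which the calling program then has to recognize and handle.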

SQL has become a standard for database interoperability and reporting tools. However, it is important to understand that while SQL grew out of relational databases, it need not be constrained by them. Caché supports standard SQL as a query and update language, runs it on a much stronger multidimensional database technology, and extends it with object capabilities.

Object Technology and Object Databases

Object programming and object databases are a practical result of work to simulate complex activities of the brain. It was observed that the brain is able to store very complex and different types of data and yet still manipulate such seemingly different information in common ways. To support that simulation, very complex behavior needed to be implemented in programs while hiding that complexity – supporting simpler, more generalized and understandable logic with adaptable, reusable functionality. Clearly, these characteristics are also true of today’s leading-edge applications, and a technology that lets developers work in a natural manner that mimics how humans think is a huge advantage.

Object vs. Relational Access

In object technology, the complexity of the data is contained within the object, and the data is accessed by a simple consistent interface. In contrast, relational technology also provides a simple consistent interface, but because it does nothing to manage real-world data complexity – the data is scattered among multiple tables – the user or programmer is responsible for constantly dealing with that complexity.

Because objects can model complex data simply, object programming is the best choice for programming complex applications. Similarly, object access of the database is the best choice for inserting and updating the database (i.e., for transaction processing).

Caché complements object access with an object-extended SQL query language. SQL is a powerful language for searching a database and is widely used by reporting tools. However, we believe SQL is best suited for that purpose – queries and reports – rather than for transaction processing (for which it is cumbersome and often inefficient). Caché SQL object extensions eliminate much of the cumbersome join syntax, making SQL even easier to use.

Overview of the Caché Object Data Model and Object Programming

The Caché object model is based upon the ODMG (Object Database Management Group) standard and supports many advanced features, including multiple inheritance.

Object technology attempts to mirror the way that humans actually think about and use information. Unlike relational tables, objects bundle together both data and code. For example, an Invoice object might have data, such as an invoice number and a total amount, and code, such as Print().

Conceptually, an object is a package that includes that object’s data values (“properties”) and a copy of all of its code (“methods”). An object’s methods send messages to communicate with other objects. To reduce storage, it is common for objects of the same class to share the same copy of code (e.g., it would be unrealistic for each Invoice object to have its own private copy of code). Also, in Caché, method calls typically result in efficient function calls rather than incurring the overhead of passing messages. However, these implementation techniques are hidden from the programmer; it is always accurate to think in terms of objects passing messages.

What is the difference between an object and a class? A class is the definitional structure and code provided by the programmer. It includes a description of the nature of the data (its “type”) and how it is stored as well as all of the code, but it does not contain any data. An object is a particular “instance” of a class. For example, invoice #123456 is an object of the Invoice class.
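The distinction can be sketched in a few lines of Python, used here only for illustration (it is not Caché ObjectScript). The Invoice class, its property names, and the values are assumptions taken from the examples in the text. The class supplies the structure and the code; each object supplies its own data and shares the class's single copy of the code.

```python
class Invoice:
    """The class: structure and code, but no data of its own."""

    def __init__(self, number: int, total: float) -> None:
        self.number = number   # property
        self.total = total     # property

    def print(self) -> str:
        # Method: one copy, shared by every Invoice object.
        text = f"Invoice #{self.number}: total {self.total:.2f}"
        print(text)
        return text


# Two objects (instances) of the same class, each with its own data.
first = Invoice(123456, 150.0)
second = Invoice(123457, 80.5)

first.print()
second.print()
```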

The following notes describe the properties of an example Customer class:

  • Data is stored using a Name datatype.
  • Data might be a simple data type, such as an integer, or a more complex programmer-defined data type, such as a nine-digit string that matches the pattern NNN-NN-NNNN.
  • Objects can be embedded within other objects. In this example, Address is an embedded object that contains the properties Street and City.
  • A Customer has a collection of Invoices, each of which is a complex object stored separately with its own database ID. In this example, there is a one-to-many relationship between Customers and Invoices (one Customer to many Invoices) using a parent-child relationship (Invoices cannot exist without a Customer, but a Customer can exist without Invoices). A collection of embedded objects is also possible.
  • AccountRep is a property that connects a Customer to a SalesPerson object in a many-to-one relationship (many Customers to one SalesPerson). Unlike an embedded object, the related object has its own database ID and is stored separately using that ID. That ID can be used to directly access that SalesPerson without accessing the Customer. In Caché, the syntax for accessing an embedded or related object is the same (e.g., Customer.Address.City and Customer.AccountRep.Name use the same “dot syntax”).
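The shape of this example can be approximated with Python dataclasses. This is only a sketch of the structure, not of Caché's class definitions or storage: the field names, the sample values, and the sales_people dictionary standing in for the database are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Address:
    # Embedded object: no ID of its own, it lives inside its Customer.
    street: str
    city: str


@dataclass
class SalesPerson:
    # Related object: has its own ID and exists independently of any Customer.
    id: int
    name: str


@dataclass
class Invoice:
    # Child object: has its own ID but belongs to exactly one Customer.
    id: int
    total: float


@dataclass
class Customer:
    id: int
    name: str
    address: Address
    account_rep: SalesPerson
    invoices: List[Invoice] = field(default_factory=list)


# Stand-in for the database: related objects can be reached directly by ID.
sales_people = {7: SalesPerson(7, "Pat Lee")}

acme = Customer(1, "Acme", Address("1 Main St", "Boston"), sales_people[7])
acme.invoices.append(Invoice(123456, 150.0))
globex = Customer(2, "Globex", Address("9 Elm St", "Denver"), sales_people[7])

print(acme.address.city)       # embedded object
print(acme.account_rep.name)   # related object, same dot syntax
print(sales_people[7].name)    # reached by ID, without going through a Customer
```

Both customers point at the same SalesPerson (many-to-one), while each Address belongs to a single Customer, and a Customer with no Invoices is still valid.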

Key Object Concepts

Inheritance is the ability to derive one class of objects from another. The new class (a subclass) contains all of the properties and methods of its superclass, as well as additional properties and methods unique to it. Objects of the subclass can be thought of as having an “is a” relationship to the superclass. For example, a dog “is a” mammal, so it makes sense for the Dog class to inherit all the properties and methods of the Mammal class plus have additional properties and methods such as a DogTagNumber. A subclass may also override an inherited definition (e.g., the Print() method for a subclass of the Invoice class may be different from the Print() method of Invoice). Inheritance promotes reusability of code and makes it easier to introduce major improvements.
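A minimal Python sketch of the Dog and Mammal example follows. The describe() method and the sample values are hypothetical; they are there only to show a subclass adding a property and overriding an inherited method.

```python
class Mammal:
    def __init__(self, name: str) -> None:
        self.name = name

    def describe(self) -> str:
        return f"{self.name} is a mammal"


class Dog(Mammal):
    def __init__(self, name: str, dog_tag_number: int) -> None:
        super().__init__(name)                 # reuse the inherited setup
        self.dog_tag_number = dog_tag_number   # property unique to Dog

    def describe(self) -> str:
        # Override: extend the inherited behavior instead of rewriting it.
        return f"{super().describe()} with dog tag {self.dog_tag_number}"


rex = Dog("Rex", 42)
print(rex.describe())
```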

Multiple inheritance means a subclass can be derived from more than one superclass. For example, a dog “is a” mammal and “is a” pet, so the object class “Dog” can inherit the properties and methods of both the Mammal class and the Pet class.
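Python also supports multiple inheritance, so the same idea can be sketched there. The method names below are invented for the example, and how conflicts between superclasses are resolved differs from language to language; this shows only Python's behavior.

```python
class Mammal:
    def is_warm_blooded(self) -> bool:
        return True


class Pet:
    def __init__(self, owner: str) -> None:
        self.owner = owner

    def owner_label(self) -> str:
        return f"Owned by {self.owner}"


class Dog(Mammal, Pet):
    # Inherits the methods of both superclasses without adding anything.
    pass


rex = Dog("Sam")
print(rex.is_warm_blooded())   # from Mammal
print(rex.owner_label())       # from Pet
```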

Encapsulation means that objects can be viewed as a sort of “black box”. Public properties and methods can be accessed by methods of any class, whereas private properties and methods can only be accessed by methods of the same class. Thus, the application doesn’t need to know the internal workings of an object – it deals only with the public properties and methods. The power of encapsulation is that programmers can improve the inner workings of a class without affecting the rest of the application.
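The sketch below illustrates the black-box idea in Python. Note one assumption: Python marks private members only by convention (a leading underscore) and does not enforce the restriction, unlike the access rules described above. The Invoice class and its methods are hypothetical.

```python
class Invoice:
    def __init__(self) -> None:
        # Private by convention: callers are not expected to touch this list.
        self._amounts = []

    def add_line(self, amount: float) -> None:
        """Public method: record one line item."""
        self._amounts.append(amount)

    def total(self) -> float:
        """Public method: the only way callers read the invoice total."""
        return sum(self._amounts)


invoice = Invoice()
invoice.add_line(100.0)
invoice.add_line(50.0)
print(invoice.total())
```

Because callers use only add_line() and total(), the class could later keep a running total instead of a list without any change to the calling code.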

Polymorphism refers to the fact that methods used in multiple classes can share a common interface, even if the underlying implementation is different. For example, suppose the classes Letter, Mailing Label, and ID Badge all contain a method called Print(). To print, an application doesn’t need to know which type of object it is accessing; it merely calls the object’s Print() method.
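In Python, the printing example might look like the sketch below. The class names are written as identifiers (MailingLabel, IDBadge), and the returned strings and the print_all() helper are invented for illustration.

```python
class Letter:
    def print(self) -> str:
        return "Printing a letter"


class MailingLabel:
    def print(self) -> str:
        return "Printing a mailing label"


class IDBadge:
    def print(self) -> str:
        return "Printing an ID badge"


def print_all(items) -> list:
    # The caller never checks the type; it relies only on the shared interface.
    return [item.print() for item in items]


for line in print_all([Letter(), MailingLabel(), IDBadge()]):
    print(line)
```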

The Caché Advantage

Caché is fully object-enabled, providing all the power of object technology to developers of high-performance transaction processing applications.

Intuitive Data Modeling: Object technology lets developers think about and use information – even extremely complex information – in simple and realistic ways, thus speeding the application development process.

Rapid Application Development: The object concepts of encapsulation, inheritance, and polymorphism allow classes to be reused, repurposed, and shared between applications, enabling developers to leverage their work over many projects.

Why Choose Objects for Your Data Model?

For new database applications, most developers choose to use object technology because they can develop complex applications more rapidly and more easily modify them later. Object technology provides many benefits:

  • Objects support a richer data structure that more naturally describes real-world data.
  • Programming is simpler – it is easier to keep track of what you are doing and what you are manipulating.
  • Customized versions of classes can easily replace standard ones, making it easier to customize an application.
  • The black box approach of encapsulation means programmers can improve the internal workings of objects without affecting the rest of the application.
  • Objects provide a simple way to connect different technologies and different applications.
  • Object technology is a natural match with graphical user interfaces.
  • Many new tools assume object technology.
  • Objects provide a good insulation between the user interface and the rest of the application. Thus, when it becomes necessary to adopt a new user interface technology (perhaps some currently unforeseen future technology), you can reuse most of your code.

Object Data Storage…

Unfortunately, although many applications are now being written with object programming languages, they often try to force object data into flat relational tables. This undermines many of the advantages of object technology.

Caché provides a multidimensional data structure that naturally stores rich object data. The result is faster data access and faster programming.

…plus Relational Access

Of course, many tools (such as report writers) use SQL, not object technology, for accessing data.

A unique feature of Caché is that whenever a database object class is defined, Caché automatically provides full SQL access to that data. Thus, with no additional work, SQL-based tools will immediately work with Caché data, and they too will benefit from the high performance of the Caché multidimensional data server.

The reverse is also true. When a DDL definition of a relational database is imported, Caché automatically generates an object description of the data, enabling immediate access as objects, as well as through SQL.

The Caché Unified Data Architecture keeps these access paths synchronized; there is only one data description to edit.

Not Only SQL

Caché also allows direct access to its multidimensional data structures. This enables Caché to be used as a NoSQL or “Not Only SQL” database in situations where that is desirable.
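To give a feel for what direct multidimensional access means, the following Python sketch models a sparse array whose nodes are addressed by a list of subscripts. It is only an analogy: the set_node(), get_node(), and children() helpers, the subscript layout, and the values are all hypothetical and do not represent Caché's actual API or storage format.

```python
from typing import Any, Dict, Tuple

# Sparse multidimensional array: only nodes that were set take up space.
store: Dict[Tuple[Any, ...], Any] = {}


def set_node(value: Any, *subscripts: Any) -> None:
    """Store a value at the node addressed by the given subscripts."""
    store[subscripts] = value


def get_node(*subscripts: Any, default: Any = None) -> Any:
    """Read one node directly, with no query layer in between."""
    return store.get(subscripts, default)


def children(*prefix: Any) -> list:
    """Return the sorted next-level subscripts that exist under a prefix."""
    depth = len(prefix)
    return sorted({key[depth] for key in store
                   if len(key) > depth and key[:depth] == prefix})


set_node("Acme", "Customer", 1, "Name")
set_node(150.0, "Customer", 1, "Invoices", 123456)
set_node(80.5, "Customer", 1, "Invoices", 123457)

print(get_node("Customer", 1, "Name"))
print(children("Customer", 1, "Invoices"))
```

A customer and its invoices sit together under one subscript path, so reading them requires no join.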