Performance Analysis of CORBA-based Distributed Data Handling Systems
Aleksey Burdakov
Developing distributed data handling systems (DDHS) requires analyzing the performance of the future system. Existing methods for evaluating performance characteristics, which are mostly based on queuing theory (QT), have a number of disadvantages that make them difficult to use at the earliest stages of a project. This paper surveys existing methods of analysis and proposes some new ones. Directions for future research are given as well.
1. Introduction
Developing distributed data handling systems (DDHS) requires analyzing the performance of the future system. Existing methods for evaluating performance characteristics, which are mostly based on queuing theory (QT), have a number of disadvantages that make them difficult to use at the earliest stages of a project.
It is becoming increasingly common to build complex distributed systems on top of special software (so-called “middleware”), which “glues” the different parts of a system into one whole. CORBA is regarded as the most mature solution in this area; it allows a DDHS to be built in heterogeneous environments using object-oriented technologies. A database management system (DBMS) is an essential component of any DDHS.
This article gives an overview of CORBA-based DDHS development methods and proposes several methods for evaluating the performance of such systems and their components (relational RDBMS, object-relational ORDBMS and object-oriented OODBMS).
2. CORBA-based DDHS Performance Evaluation
CORBA (Common Object Request Broker Architecture) allows distributed systems to be developed from heterogeneous components (objects) by tying them together, and provides a wide range of services: naming, transaction, persistence, etc. The ORB (Object Request Broker) supports seamless communication between remote objects and provides independence from physical location, platform features (software and hardware) and the language in which a system was implemented. By means of IIOP, the ORB also provides independence from a particular ORB vendor. These features make CORBA a universal layer that allows a DDHS to be built from different components, e.g. DBMSs, clients, application servers, etc.
2.1. Performance evaluation
CORBA objects communicate via remote method calls and via the activation of remote objects on remote servers (the term “remote” is used here for the general case; an object can sometimes reside on the same node and be “local”).
A call of a remote object includes the following steps:
An object activation process includes the following steps:
2.2. CORBA and OO/ORDBMS integration methods
The DBMS is the major component of any distributed data handling system. There are a few ways to integrate CORBA with an OO/ORDBMS:
In the first case, the query execution time is approximately equal to the sum of the query execution time in the DBMS and the query transportation time through the ORB.
In the second case, the query execution time depends on various factors, among them whether the object is present in an object cache and how long it takes to search the cache. It is necessary to know the object cache hit probability and the object management time in the cache. In this case, the total execution time is the sum of the data transportation time through the ORB, the data processing time in the DBMS, and the object storage and search time in the cache.
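The two estimates above can be written down as simple additive formulas. The following is a minimal sketch only: the function and parameter names are hypothetical, and it assumes (beyond what the text states) that the cache is always searched and that the DBMS is accessed only on a cache miss.

```python
def direct_query_time(t_orb: float, t_dbms: float) -> float:
    """First case: ORB transport time plus DBMS query execution time."""
    return t_orb + t_dbms


def cached_query_time(t_orb: float, t_dbms: float,
                      t_cache: float, p_hit: float) -> float:
    """Second case: expected time when an object cache is present.

    Assumption (not from the original text): the cache is searched on
    every request (cost t_cache), and the DBMS is consulted only on a
    miss, which happens with probability 1 - p_hit.
    """
    if not 0.0 <= p_hit <= 1.0:
        raise ValueError("p_hit must be a probability in [0, 1]")
    return t_orb + t_cache + (1.0 - p_hit) * t_dbms
```

With a hit probability of 1 the DBMS term vanishes, and with a hit probability of 0 the estimate reduces to the first case plus the cache overhead.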
3. DBMS Performance Evaluation
The main part of the processing in a DDHS is done on DBMS servers, where huge tables are usually reduced to small result sets that are then transferred to the calling client as the result of its query. This makes it important to analyze the performance of this part of a DDHS.
There are three major DBMS data models: hierarchical (IMS), network (CODASYL) and relational. Although the relational model is semantically weak, almost all modern DBMSs are based on the relational, object-relational and object-oriented models. The last of these DBMS types is the most promising. According to a number of publications, the situation will remain unchanged in the near future, with a small increase in the OODBMS share.
Data processing in the DBMS types mentioned above is performed by means of the non-procedural (declarative) language SQL (Structured Query Language) and its variations. SQL was proposed soon after Codd’s publication on the relational model in 1970. Since then SQL has become both the de facto and the de jure standard for relational and even for non-relational DBMSs. There are two major standards nowadays: SQL-1999, by the ANSI and ISO committees, for ORDBMS, and ODMG 3.0, proposed by the ODMG (Object Data Management Group).
3.1. DBMS Query Processing and its Performance Evaluation
Given knowledge of query processing principles, statistics on the data in a database, and the performance characteristics of the hardware and software platform, it is possible to predict query execution time. However, all the query optimization stages must be performed to obtain the final query execution plan that the DBMS executes. Much research has recently been carried out in the field of RDBMS, ORDBMS and OODBMS query optimization. Despite their differences, these approaches are all based on the algorithms and principles described by Codd in 1972. These algorithms rely on relational algebra, by means of which a query can be transformed into an equivalent form with minimal execution time.
Query processing involves the following steps: query parsing and validation, view resolution, query optimization, plan compilation, and query execution. The resulting execution plan is determined mostly by the optimization phase. Optimization consists of two steps: heuristic optimization and cost-based optimization.
Heuristic (or algebraic) optimization transforms a query independently of any system-dependent cost model. It is based on rules (e.g. pushing selections into products) that are used to transform a query into a more efficient form.
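The rule mentioned above, applying a selection to one operand of a product instead of to the product's result, can be sketched as a small tree rewrite. This is an illustrative sketch only: the `Relation`, `Product` and `Select` classes and the `push_selections` function are hypothetical names, and predicates are kept as opaque strings together with the set of attributes they reference.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Relation:
    name: str
    attrs: FrozenSet[str]


@dataclass(frozen=True)
class Product:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Select:
    attrs: FrozenSet[str]  # attributes referenced by the predicate
    predicate: str         # predicate text, treated as opaque
    child: "Expr"


Expr = Union[Relation, Product, Select]


def attributes(e: Expr) -> FrozenSet[str]:
    """Attributes available in the output of an expression."""
    if isinstance(e, Relation):
        return e.attrs
    if isinstance(e, Product):
        return attributes(e.left) | attributes(e.right)
    return attributes(e.child)


def push_selections(e: Expr) -> Expr:
    """Move each selection below a product when one operand covers it."""
    if isinstance(e, Relation):
        return e
    if isinstance(e, Product):
        return Product(push_selections(e.left), push_selections(e.right))
    child = push_selections(e.child)
    if isinstance(child, Product):
        if e.attrs <= attributes(child.left):
            pushed = push_selections(Select(e.attrs, e.predicate, child.left))
            return Product(pushed, child.right)
        if e.attrs <= attributes(child.right):
            pushed = push_selections(Select(e.attrs, e.predicate, child.right))
            return Product(child.left, pushed)
    # Predicate spans both operands (e.g. a join condition): leave it in place.
    return Select(e.attrs, e.predicate, child)
```

A selection that refers to attributes of both operands, such as a join condition, is left above the product, since neither operand alone can evaluate it.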
Cost-based optimization differs from heuristic optimization in that it relies on specific knowledge of the database and the DBMS (e.g. query execution methods, physical storage structures, access methods, indices, etc.).
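As an illustration of how such knowledge enters the decision, the sketch below compares two access methods by an estimated number of page reads and picks the cheaper one. The cost formulas are deliberately simplified textbook-style assumptions, not the model of any particular DBMS, and all names (`TableStats`, `choose_access_method`, etc.) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TableStats:
    rows: int           # number of tuples in the table
    rows_per_page: int  # tuples stored per disk page
    index_height: int   # levels in the index tree


def full_scan_cost(s: TableStats) -> int:
    """Pages read by a sequential scan of the whole table."""
    return (s.rows + s.rows_per_page - 1) // s.rows_per_page


def index_scan_cost(s: TableStats, matching_rows: int) -> int:
    """Pages read through a non-clustered index.

    Pessimistic assumption: one index descent, then one page read
    for every matching tuple.
    """
    return s.index_height + matching_rows


def choose_access_method(s: TableStats,
                         matching_rows: int) -> Tuple[str, Dict[str, int]]:
    """Return the cheaper access method and the estimated costs."""
    costs = {
        "full_scan": full_scan_cost(s),
        "index_scan": index_scan_cost(s, matching_rows),
    }
    return min(costs, key=costs.get), costs
```

Under these assumptions a highly selective predicate favors the index, while a predicate matching a large share of the table favors the sequential scan; the statistics on the database are what allow the optimizer to tell the two situations apart.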
In the work, a mathematical method of query execution time evaluation in relational DBMS (based on the knowledge of RDBMS query optimization methods, conceptual scheme parameters, and statistics on a database) is proposed.
3.2. Query Processing Features in OO/ORDBMS
Query processing in OODBMS and ORDBMS differs from RDBMS due to the following features of the former:
4. Conclusion
Within this research, it is planned to develop DDHS performance evaluation methods and to carry out experiments on a real information system with the help of “KISP” (Automated Systems Project Development Lifecycle and Decision Support System). It is also planned to validate the proposed methods against the experimental results.
© Aleksey Burdakov, 1999-2000