OLAP architectures

There is no single, ideal way of storing or processing multidimensional data

You can contact Nigel Pendse, the author of this section, by e-mail on NigelP@olapreport.com if you have any comments or observations. Last updated on June 27, 2006.

Contents

Data staging
Storing active OLAP data
Processing OLAP data
The OLAP architectural matrix
Is there an ideal choice?

Introduction

Much confusion, some of it deliberate, abounds about OLAP architectures, with terms like ROLAP, HOLAP, MOLAP and DOLAP (with more than one definition) proliferating at one stage, though the last of these is used less these days. In fact, there are a number of options for where OLAP data could be stored, and where it could be processed. Most vendors only offer a subset of these, and some then go on to attempt to 'prove' that their approach is the only sensible one. This is, of course, nonsense. However, quite a few products can operate in more than one mode, and vendors of such products tend to be less strident in their architectural arguments.

There are many subtle variations, but in principle, there are only three places where the data can be stored, and three where the majority of the multidimensional calculations can be performed. This means that, in theory, there are a possible nine basic architectures, although only six make any sense.

Data staging

Most data in OLAP applications originates in other systems. However, in some applications (such as planning and budgeting), the data might be captured directly by the OLAP application. When the data comes from other applications, it is usually necessary for the active data to be stored in a separate, duplicated, form for the OLAP application. This may be referred to as a data warehouse or, more commonly today, as a data mart. For those not familiar with the reasons for this duplication, this is a summary of the main reasons:

Performance
OLAP applications are often large, but are nevertheless used for unpredictable interactive analysis. This requires that the data be accessed very rapidly, which usually dictates that it be kept in a separate, optimized structure which can be accessed without damaging the response from the operational systems.

Multiple data sources
Most OLAP applications require data sourced from multiple feeder systems, possibly including external sources and even desktop applications. The process of merging these multiple data feeds can be very complex, because the underlying systems probably use different coding systems and may also have different periodicities. For example, in a multinational company, it is rare for subsidiaries in different countries to use the same coding system for suppliers and customers, and they may well also use different ERP systems, particularly if the group has grown by acquisition.

Cleansing data
It is depressingly common for transaction systems to be full of erroneous data which needs to be 'cleansed' before it is ready to be analyzed. Apart from the small percentage of accidentally mis-coded data, there will also be examples of optional fields that have not been completed. For example, many companies would like to analyze their business in terms of their customers' vertical markets. This requires that each customer (or even each sale) be assigned an industry code; however, this takes a certain amount of effort on the part of those entering the data, for which they get little return, so they are likely, at the very least, to cut corners. There may even be deliberate distortion of the data if sales people are rewarded more for some sales than others: they will certainly respond to this direct temptation by 'adjusting' (ie distorting) the data to their own advantage if they think they can get away with it.

Adjusting data
There are many reasons why data may need adjusting before it can be used for analysis. In order that this can be done without affecting the transaction systems, the OLAP data needs to be kept separate. Examples of reasons for adjusting the data include:

    • Foreign subsidiaries may operate under different accounting conventions or have different year-ends, so the data may need modifying before it can be used.
    • The source data may be in multiple currencies that must be translated.
    • The management, operational and legal structures of a company may be different.
    • The source applications may use different codes for products and customers.
    • Inter-company trading effects may need to be eliminated, perhaps to measure true added value at each stage of trading.
    • Some data may need obscuring or changing for reasons of confidentiality.
    • There may be analysis dimensions that are not part of the operational data (such as vertical markets, television advertising regions or demographic characteristics).

Timing
If the data in an OLAP application comes from multiple feeder systems, it is very likely that they are updated on different cycles. At any one time, therefore, the feeder applications may be at different stages of update. For example, the month-end updates may be complete in one system but not in another, and a third system may be updated on a weekly cycle. In order that the analysis is based on consistent data, the data needs to be staged, within a data warehouse or directly in an OLAP database.

History
The majority of OLAP applications include time as a dimension, and many useful results are obtained from time series analysis. But for this to be useful it may be necessary to hold several years' data on-line in this way — something that the operational systems feeding the OLAP application are very unlikely to do. This requires an initial effort to locate the historical data, and usually to adjust it because of changes in organizational and product structures. The resulting data is then held in the OLAP database.

Summaries
Operational data is necessarily very detailed, but most decision-making activities require a much higher level view. In the interests of efficiency, it is usually necessary to store merged, adjusted information at summary level, and this would not be feasible in a transaction processing system.

Data Updating
If the application allows users to alter or input data, it is obviously essential that the application has its own separate database that does not over-write the 'official' operational data.

Storing active OLAP data

Given the necessity to store active OLAP data in an efficient, duplicated form, there are essentially three options. Many products can use more than one of these, sometimes simultaneously. Note that 'store' in this context means holding the data in a persistent form (for at least the duration of a session, and often shared between users), not simply for the time required to process a single query.

Relational database
This is an obvious choice, particularly if the data is sourced from an RDBMS (either because a data warehouse has been implemented using an RDBMS or because the operational systems themselves hold their data in an RDBMS). In most cases, the data would be stored in a denormalized structure such as a star schema, or one of its variants, such as snowflake; a normalized database would not be appropriate for performance and other reasons. Often, summary data will be held in aggregate tables.

Multidimensional database
In this case, the active data is stored in a multidimensional database on a server. It may include data extracted and summarized from legacy systems or relational databases and from end-users. In most cases, the database is stored on disk, but some products allow RAM based multidimensional data structures for greater performance. It is usually possible (and sometimes compulsory) for aggregates and other calculated items to be pre-computed and the results stored in some form of array structure. In a few cases, the multidimensional database allows concurrent multi-user read-write access, but this is unusual; many products allow single-write/multi-read access, while the rest are limited to read-only access.

Client-based files
In this case, relatively small extracts of data are held on client machines. They may be distributed in advance, or created on demand (possibly via the Web). As with multidimensional databases on the server, active data may be held on disk or in RAM, and some products allow only read access.

These three locations differ in capacity, and they are listed above in descending order. They also have different performance characteristics, with relational databases being a great deal slower than the other two options.

Processing OLAP data

Just as there are three possible locations for OLAP data, exactly the same three options are available for processing the data. As will be seen, the multidimensional calculations do not need to occur in the place where the data is stored.

SQL
This is far from being an obvious choice to perform complex multidimensional calculations, even if the live OLAP data is stored in an RDBMS. SQL does not have the ability to perform multidimensional calculations in single statements, and complex multi-pass SQL is necessary to achieve more than the most trivial multidimensional functionality. Nevertheless, this has not stopped vendors from trying. In most cases, they do a limited range of suitable calculations in SQL, with the results then being used as input by a multidimensional engine, which does most of the work, either on the client or in a mid-tier server. There may also be a RAM resident cache which can hold data used in more than one query: this improves response dramatically.

Multidimensional server engine
This is an obvious and popular place to perform multidimensional calculations in client/server OLAP applications, and it is used in many products. Performance is usually good, because the engine and the database can be optimized to work together, and the availability of plenty of memory on a server can mean that large scale array calculations can be performed very efficiently.

Client multidimensional engine
On the assumption that most users have relatively powerful PCs, many vendors aim to take advantage of this power to perform some, or most, of the multidimensional calculations. With the expected rise in popularity of thin clients, vendors with this architecture are having to move most of the client based processing to new Web application servers.

The OLAP architectural matrix

Three places to store multidimensional data, and the same three locations for multidimensional engines: combining these gives nine possible storage/processing options. But some of these are nonsensical: it would be absurd to store data in a multidimensional database, but do multidimensional processing in an RDBMS, so only the six options on or below the diagonal make sense. In the interests of completeness, some older products are included in this chart even if they are no longer on sale.


The matrix pairs the three multidimensional data storage options (RDBMS, multidimensional database server, client files) with the three multidimensional processing options (multi-pass SQL, multidimensional server engine, client multidimensional engine). The six viable squares, with representative products, are:

Square 1 (multi-pass SQL processing, RDBMS storage): Cartesis Magnitude, MicroStrategy

Square 2 (multidimensional server engine, RDBMS storage): Extensity MPC, Hyperion Essbase, Longview Khalix, Microsoft Analysis Services, Mondrian, Oracle Express (ROLAP mode), Oracle OLAP Option (ROLAP mode), Pilot Analysis Server, WhiteLight

Square 3 (client multidimensional engine, RDBMS storage): Oracle Discoverer

Square 4 (multidimensional server engine, multidimensional database storage): Hyperion Essbase, Oracle Express, Oracle OLAP Option AW, Microsoft Analysis Services, PowerPlay Enterprise Server, Pilot Analysis Server, Applix TM1

Square 5 (client multidimensional engine, multidimensional database storage): Comshare FDC, Dimensional Insight, Hyperion Enterprise, Hyperion Pillar

Square 6 (client multidimensional engine, client-file storage): Hyperion Intelligence, BusinessObjects, Cognos PowerPlay, Personal Express, TM1 Perspectives

 The widely used (and misused) nomenclature is not particularly helpful, but roughly speaking:

Relational OLAP (ROLAP) products are in squares 1, 2 and 3

MDB (also known as MOLAP) products are in squares 4 and 5

Desktop OLAP products are in square 6

Hybrid OLAP products are those that appear in both squares 2 and 4

The fact that several products are in the same square, and therefore have similar architectures, does not mean that they are necessarily very similar products. For instance, DB2 OLAP Server and Eureka are quite different products that just happen to share certain storage and processing characteristics.

Is there an ideal choice?

Each of these options has its own strengths and weaknesses, and there is no single optimum choice. It is perfectly reasonable for sites to use products from more than one of the squares, and even more than one from a single square if they are specialized products used for different applications. As might be expected, the squares containing the most products are also the most widely used architectures, and vice versa. The choice of architecture does affect the performance, capacity, functionality and particularly the scalability of an OLAP solution, and this is discussed elsewhere in The OLAP Report.



Is your BI system firing duds?

If you want useful answers, you need to ask the right questions. It's no good blaming the software.

You can contact David Harvey, the author of this section, by e-mail on dharvey@olapreport.com if you have any comments, observations or user experiences to add. Last updated on February 11, 2006.

Looking at the recently launched web analytics software from Google, Andy experienced a child-in-a-toyshop rush of excitement. See what you can do with this, he enthused, careering from one goody to another: geo-maps of your visitors' locations, data visualization tools, time-series charts. All in all, a rich array of slickly presented information.

A cooler voice was heard asking: "Yes, it's impressive. But what are you going to do with all this information?" It is the killer question for every enthusiast who gets carried away with the latest software release and a vision of information El Dorado. Unless you are able to pose well-focused questions, all the tools and data under the sun will not give you useful answers.

Paul Strassmann, the information technology measurement expert, was once asked at a seminar by an eager IT manager how he should measure the value of information in his organization. Strassmann's response was that it was a meaningless question. Any information sitting in a data warehouse only acquires value when it is applied.

Software and data provide the tools and raw material, but they depend on a whole chain of processes, disciplines and working practices to be turned to good account. It is not sufficient that some piece of business analytical software can tell you more about your data than you had ever dreamed was possible.

Throwing sophisticated business intelligence tools at an organization in the hope that they will help people make sense of a mass of information is not the way to spend time and money. There are at least three conditions that need to be satisfied for analytical tools to earn their keep:

  • There must be an actionable business decision that can be supported with relevant data. Without that follow-through, you are just staring at 'so what?' data.
  • There must be a managerial process into which any insights can be fed. For example, you might see that you are falling short of, or exceeding, a target, but what action does this trigger?
  • There should be a virtuous feedback loop that enables questions to be refined and supplementary analysis produced, leading to even better questions and smarter actions.

Obvious stuff? Perhaps. But the majority of organizations miss these targets by a mile. A number of characteristics single out the minority of effective business intelligence users, including:

  • Well-defined business aims and a sound performance management framework and culture
  • Structured opportunities to drive performance and meet strategic goals
  • A strong link between the vision and strategy of the organization and the practical provision of information.

Get these things right and business intelligence and performance management systems may be worthy objects of wonder.

To find out more about winning strategies for business intelligence and performance management systems, see the article Applying the SPM Maturity Model in the Analyses section.




Business Intelligence Competency Centers

You can email Barney Finucane, the author of this section, if you have any comments, observations or user experiences to add. Last updated on July 12, 2007.

The human factor

The results of The OLAP Survey, which is the largest independent survey of customer satisfaction in the area of BI, show that difficulties involving people are one of the main factors causing delays and failure in BI projects. In fact, people-related problems cause many more difficulties than bad data and buggy software put together.

Communication between IT and business users

The main fault line in any BI project lies between the business users and the IT department. For people caught up in this situation, it is tempting to blame the problems on the intransigence of the other side, or on management issues that can only be dealt with on the top level.

In fact, the real issue is that business intelligence solutions are a unique challenge to any IT environment. To work effectively, at least some of the business users need to get their hands on the data and actively manipulate it in a way that goes far beyond the scope of standard business practice.

Business intelligence is a balancing act between complexity and flexibility, and business intelligence vendors are accustomed to their customers saying they want complete control and flexibility without any complexity. From the point of view of an IT department, however, this demand means that a small minority of the users are unreasonably trying to take control of processes that they are unlikely to be able to handle, such as database administration. For the business users, IT seems intransigent, preventing them from getting the data they need just because it can.

There is no simple black and white solution to the problem. The question is: where do you draw the line? Where do the responsibilities of IT end, and where do those of the business users begin? The answer varies from company to company, and depends upon the skills and interests of the individuals involved. Finding the right solution requires close cooperation, patience and good will on both sides.

Specifications and the learning curve

Another common problem is deciding what needs to be done; a large minority of BI projects run into trouble precisely because the project team is unsure what that is. Again, in a situation like this it is tempting to point the finger at certain individuals, but this fails to explain why the problem occurs so often.

There are several reasons why planning a BI project is so difficult. First, the people planning the project tend to have a relatively weak grasp of what is possible and what the true costs of implementing individual features will be. This is made worse by the vendor strategy of saying "we can do anything", which is usually more or less true, but tends to ignore the difficulties of delivering some capabilities; vendors prefer to discuss their products' strengths. Second, it is hard to judge the results of a BI project in advance, because good BI software changes the way people work, so planners have no firm ground to stand on. Third, particularly in large projects, there may be widely different priorities among the project sponsors.

Project drift

As the project ages, priorities begin to shift. Key players enter or leave the scene, company reorganizations change the working assumptions, and impatience may lead users to adopt stopgap measures that take on a life of their own as rival solutions to the problem. This can be mitigated to a certain extent by clearly defining goals from the start and executing as quickly as possible, but it cannot be eliminated entirely. Project drift can lead to inefficiency, but in some cases, changing your plans part way through a project is unavoidable.

Business Intelligence Competency Centers

One increasingly popular way of managing the complexity surrounding business intelligence is setting up a business intelligence competency center. A BI competency center can be a dedicated organization that deals with these questions on a full-time basis, but usually only the largest companies have enough projects to justify the expense involved. More commonly, the BI competency center is a virtual cross-departmental organization that meets regularly to coordinate policy. It is important to have members from various departments involved so that all key issues receive the consideration they deserve.

A BI competency center may address a variety of issues. These may include:

  • Identifying and prioritizing potential projects
  • Selecting appropriate software for specific projects
  • Setting company-wide standards and policies on software usage
  • Setting standards for support levels
  • Supporting the specification process
  • Reviewing the progress of projects
  • Coordinating data quality management initiatives

However, BI competency centers are a relatively new development that parallels the standardization of BI software and the move towards suites, so the details vary from company to company. In some cases, BI competency centers also take on responsibility for supporting users of BI solutions, or even play a role in implementation, but usually their role is advisory or coordinating.



Why you need a proof of concept before buying BI products

You can email Barney Finucane, the author of this section, if you have any comments, observations or user experiences to add. Last updated on May 29, 2007.

When implementing complex solutions, particularly in the area of BI, customers often run into the 'tool assumption' problem. Although they are not usually discussed in these terms, advanced BI products can be seen as rapid development tools that minimize the need for technical know-how. They work by limiting themselves to a specific type of application. This involves making assumptions about what kind of data will be processed and how it will be processed.

The difficulty that arises is that the assumptions made by the product designer are aimed at a fairly general market, which means that the products can do a lot of impressive things very well, but do not necessarily fill the specific requirements of each individual customer.

Unfortunately, the very features that make this kind of product so convenient to use in the way it was designed to be used tend to make it difficult to use in any other way. In other words, when the salesman says in his presentation "All you need to do to define this type of report is press this button", it often implies that there is no convenient way to define reports that are somewhat different.

BI software has come a long way, but the trade-off between convenience and flexibility still exists. A typical scenario is that a customer sees a presentation of a product and buys it because he is convinced that it basically fulfils his requirements. During the project, it turns out that the product does indeed fulfil nearly all of the requirements. But as the project proceeds, one detail after another is discovered where the product does not quite work the way the customer expected it to.

Aside from performance, most of the quality problems and cost overruns in BI projects are the result of seemingly small issues that the software platform fails to address and that have to be dealt with using complex workarounds. The customer will have to either accept these cost overruns or compromise his requirements.

To avoid this problem it is advisable to carry out a formal proof of concept workshop, in which a shortlist of vendors is given a set of scenarios based on real data and asked to implement a small project.

A proof of concept should include the following:

  • A list of requirements for the vendors
  • A set of scenarios that the vendors can implement in a day or two, covering the requirements.
  • A final presentation by the vendors to the end users
  • A questionnaire for the end users to allow them to give their opinion of the results.

When carrying out a proof of concept, don't forget the basics. For example:

  • About two thirds of the time on a typical BI project is spent on data import. Make sure the vendors can import your data.
  • BI projects often run into difficulties because the product hits technical or security issues at the customer site that the vendor did not anticipate. Allow half a day or more for installation.


If the vendor or other external implementers will be carrying out the project, then it also makes sense to check their technical and communication abilities. This is vital to the success of the project.

Most importantly, make sure the proof of concept offers you some kind of closure: when it is over, no matter what the results are, you should be in a better position to make a final decision about which vendors to remove from the list, and which to keep.



Triggering a Cognos 8 BI event externally

Republished from IBM Cognos Proven Practices

The Cognos 8 BI scheduler allows report execution and other events to run on a regular basis, for example every hour, day or week. However, instead of an event being executed on a schedule, it may be assigned to a trigger. A triggered event is a task within Cognos 8 that is executed by an external process. Any content that can be scheduled can be executed by a trigger instead, for example a single report or a job.

Before creating an event that can be triggered, trigger support must be enabled. See the Disable Support for Trigger-based Scheduling section in the Administration and Security Guide for a step-by-step guide on how to enable / disable trigger support.

In order to allow a Cognos 8 event to be executed by an external process, it must be scheduled as it would be if the internal scheduler were being used. However, instead of setting a frequency of execution, you designate a trigger name.

Setting a trigger in IBM Cognos 8 BI

Trigger.bat

Once a trigger has been created with Cognos 8, it can be executed from an external process. Environments that have the Cognos 8 BI Software Development Kit (SDK) installed can execute the trigger from a custom application using the documented APIs. For environments without the SDK, a convenient execution method is included with the standard Cognos 8 BI distribution: a utility called trigger.bat is installed under <cognos8>\webapps\utilities\trigger. This is a batch file which calls a Java program that in turn executes a Cognos 8 trigger; in a UNIX environment, you can use trigger.sh instead.

The syntax for using trigger.bat is:

trigger.bat <gateway_URL> <username> <password> <SecurityNamespaceID> <triggername>
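
For example, with a hypothetical gateway at http://cognosserver/cognos8/cgi-bin/cognos.cgi, a user 'admin' (password 'passw0rd') defined in a security namespace whose ID is 'LDAP', and a trigger named 'ETL_complete' (all of these values are illustrative, not taken from the article), the call would look something like this:

trigger.bat http://cognosserver/cognos8/cgi-bin/cognos.cgi admin passw0rd LDAP ETL_complete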

Being able to execute a trigger this way provides a lot of flexibility, as the .bat or .sh script can be incorporated into a larger process, such as the last step of an ETL job that performs a database update.

The provided trigger method does not have the capability to check that the tasks in the given trigger complete successfully – it simply initiates them. For more sophisticated status checks, a custom SDK application would need to be written.

Running trigger.bat remotely

In many cases it is more desirable to run the trigger.bat file from a server that has no Cognos 8 BI components.  For example, a server may be dedicated to performing ETL jobs and needs to be able to trigger a Cognos 8 event.
 
In order to do this, a Java runtime environment must be installed on the server from which trigger.bat is to be executed, and the following .jar files must be copied from the <cognos8>\webapps\p2pd\WEB-INF\lib directory:

  • axis.jar
  • jaxrpc.jar
  • axisCrnpClient.jar
  • saaj.jar
  • commons-discovery.jar
  • xml-apis.jar
  • commons-logging.jar
  • xercesImpl.jar

Additionally, Trigger.class needs to be copied from the <cognos8>\webapps\utilities directory.
 
It is also possible to copy the trigger.bat file across, but it is recommended that a modified version be used, with the appropriate environment variables adjusted for the new server. This version of trigger.bat assumes that Trigger.class and the .jar files have all been placed in the same directory.
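
The modified script itself is not reproduced here, so the following is only a minimal sketch of what such a standalone wrapper might look like. It assumes that java.exe is on the PATH and that Trigger.class and the .jar files listed above sit in the same directory as the script; it illustrates the approach and is not the Cognos-supplied file.

@echo off
rem Illustrative sketch only; not the Cognos-supplied trigger.bat.
rem Assumes java.exe is on the PATH and that Trigger.class and the .jar files
rem listed above are in the same directory as this script.
set TRIGGER_HOME=%~dp0
set CP=%TRIGGER_HOME%;%TRIGGER_HOME%axis.jar;%TRIGGER_HOME%jaxrpc.jar;%TRIGGER_HOME%axisCrnpClient.jar;%TRIGGER_HOME%saaj.jar;%TRIGGER_HOME%commons-discovery.jar;%TRIGGER_HOME%xml-apis.jar;%TRIGGER_HOME%commons-logging.jar;%TRIGGER_HOME%xercesImpl.jar
rem Arguments: gateway URL, username, password, security namespace ID, trigger name
java -cp "%CP%" Trigger %1 %2 %3 %4 %5

Once in place, the wrapper can be called from the ETL server with the same five arguments shown earlier.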