The Cash Free Way to Increase Your Cash Flow -- e-book!

I am pleased to welcome you to my first blog post about my upcoming book, "The Cash Free Way to Increase Your Cash Flow". It will be available as an e-book on this site in December 2011. In the book, we cover revenue-generating options that will increase your monthly cash flow. The ideas are easy to implement, and I provide a step-by-step example to get you up and running in no time.


So if you need to increase your monthly cash flow, you will benefit from my e-book.


Watch this blog for "The Cash Free Way to Increase Your Cash Flow" e-book, coming soon.

Cheers


-- WT Paige

More BI Packages Add Collaboration

Tibco is now squeezing on board, joining Actuate, Adaptive Planning, Birst, LogiXML, QlikTech, and others that have added collaborative features in 2011. I'm not saying BI vendors shouldn't be adding these features, but as we learned at last week's Enterprise 2.0 Conference, it's hard enough for enterprises to figure out their general-purpose collaboration strategies, as they try to decide how to mimic the influence of social networks inside companies. SharePoint isn't going to easily operate as a social platform for most companies without help from other software, conclude analysts Tony Byrne of the Real Story Group and Rob Koplowitz at Forrester Research. Facebook and Google are likely to launch business social networks, says tech blogger Robert Scoble, which would further confuse the collaboration market.
If enterprises are still wrestling to sort out these choices, are they really ready for another layer of collaborative tools around their BI? And what about collaborative tools proposed by supply chain management vendors, or talent management vendors, or performance management vendors, or customer relationship management vendors? Just how many tools and interfaces are you prepared to support? Continuity, usability, training, and licensing efficiencies will surely suffer if you offer too many options.
To its credit, Tibco has tried to avoid a siloed approach by integrating with Microsoft SharePoint, as well as Tibco's own enterprise-focused Tibbr social networking option. The SharePoint part is a smart choice given that more than 100 million people are licensed to use that software, according to Microsoft stats. By letting users post Spotfire dashboards, reports and data visualizations alongside Word documents, spreadsheets and team-collaboration environments in SharePoint, Tibco is promoting what it calls contextual collaboration.
But when it comes to enterprise social networking, Tibco insists the capabilities built into SharePoint and Microsoft Outlook aren't adequate. That's where Tibco's Tibbr social platform comes in. Tibbr is aimed at supporting collaboration across multiple applications and environments, not just BI and analytics. These characteristics also apply to SharePoint and Outlook, in my book, but Tibco Spotfire's Lou Bajuk, senior director of product management, insists Tibbr "brings collaboration into one environment and ties into multiple systems more effectively than you can do with SharePoint."

Tibbr was released in 2009, and Tibco claims it has more than 40,000 individual users. The company couldn't come up with any customers using both Spotfire and Tibbr, but it did refer me to Spotfire customer Wildhorse Resources, a small, Texas-based oil and gas company that's considering using SharePoint and Tibbr together.
With just 70 employees, most of whom are on a single floor in an office building in Houston, Wildhorse doesn't have a big collaboration challenge today. But Steve Habachy, vice president of operations, says the firm is looking at SharePoint to provide a single place -- portal, content repository, he's not hung up on the term -- where all information about particular oil fields and wells can be accessed. That might include documents, spreadsheets, Spotfire analyses and so on. Tibbr, meanwhile, would provide a kind of corporate Facebook, a familiar and user-friendly interface with big advantages over email.
"Tone is hard to get across in e-mail and I think people are more comfortable when they communicate through a Facebook-type interface," Habachy explains. "You see a picture, the person is typically smiling, and people don't assume the worst and think people are being critical as they often do when you're communicating by email."

I'm guessing Wildhorse will soon find a lot of overlapping capabilities between SharePoint, Outlook, and Tibbr. Microsoft builds a lot of collaborative functionality -- including presence awareness, personal profiles, and Facebook-style social networking -- into the combination of SharePoint and Outlook.
Collaboration strategy goes well beyond BI and analytics, and business leaders have plenty of broad choices to figure out before going after niche needs. There may well be a place for BI-specific collaboration options, particularly when it gets down into details that can't be addressed by the general-purpose tools, such as drilling down into root-cause data analysis below the top-layer dashboards and reports. If it's free functionality that vendors are simply adding to their existing BI systems, so much the better. But even then, you have to watch out for collaborative feature overlaps, and collaboration overload.

DB2 best practices: Implementing DB2 workload management in a data warehouse

Using a staged approach, this article guides you through the steps needed to implement the best practices workload management configuration on IBM® DB2® for Linux®, UNIX®, and Windows® with sufficient controls to help ensure a stable, predictable system for most data warehouse environments. This initial configuration is intended to be a good base for implementing additional tuning and configuration changes as needed, in order for you to achieve your specific workload management objectives.
In this article
This article presents a set of definitions representing the different stages of maturity for a workload management configuration in a DB2 for Linux, UNIX, and Windows database. These stages range from stage 0 through to the advanced stage 3 configuration. A specific configuration template and process is provided as part of these best practices to enable customers to progress from a stage 0 configuration to a stage 2 configuration. General descriptions and advice are also given about common stage 3 scenarios.
The article assumes a novice user and describes the individual steps and mechanisms at each point. A more experienced user can condense many of the listed steps to move from stage 1 to stage 2, making the transition in days of elapsed time rather than the weeks suggested by the timeline in a later section.
The steps outlined in this document are focused on the efficiency of the system as a whole, regardless of where the work itself comes from. It is important to note that achieving the goal of a stable system might not necessarily also result in the achievement of any individual application service-level agreement (SLA) or specific performance objectives for queries. These more granular objectives might require subsequent changes to the workload management configuration, such as outlined in the section on stage 3 scenarios, which is outside the main scope of this document.
This article is not a tutorial on DB2 workload management capabilities and does not attempt to provide comprehensive guidance in addressing all possible scenarios where DB2 workload management might be employed. It also does not cover all features within the DB2 product that might be of use in controlling resource consumption. The scope of this article is focused on describing the system stabilization approach in some detail and provides some general guidance for common advanced scenarios.
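The configuration template itself is provided with the best practices (see the sample scripts download below), but to give a flavor of the kind of DDL a staged workload management configuration builds on, here is a minimal sketch. The object names and limits are illustrative assumptions for this sketch, not the article's template.

-- Illustrative only: names and limits are assumptions, not the template.
-- A service class to isolate user query work from other activity.
CREATE SERVICE CLASS USERQUERY_SC;

-- Map sessions from a user group into that service class.
CREATE WORKLOAD USERQUERY_WL
    SESSION_USER GROUP ('REPORT_USERS')
    SERVICE CLASS USERQUERY_SC;
GRANT USAGE ON WORKLOAD USERQUERY_WL TO PUBLIC;

-- Cap concurrently executing coordinator activities in the service class
-- so that a storm of heavy queries cannot destabilize the system.
CREATE THRESHOLD USERQUERY_CONC
    FOR SERVICE CLASS USERQUERY_SC ACTIVITIES
    ENFORCEMENT DATABASE
    WHEN CONCURRENTDBCOORDACTIVITIES > 10
    STOP EXECUTION;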

Downloads
Description             Name                                 Size     Download method
Article in PDF format   DB2BP_Workload_Management_1111.pdf   1.21MB   HTTP
Sample scripts          DB2BP_WLM_Supplements_1111.zip       446KB    HTTP

Biographies
Paul Bird is a senior technical staff member (STSM) within the IBM Software Group development organization, sharing his time between the Optim and DB2 development organizations. Since 1991, he has worked on the inside of the DB2 for Linux, UNIX, and Windows product as a lead developer and architect with a focus on diverse areas such as workload management, monitoring, security, and general SQL processing. He recently became a member of the Optim development organization to expand his experiences. You can reach him at pbird@ca.ibm.com.
Rimas Kalesnykas is a technical writer for DB2 for Linux, UNIX, and Windows. In the last 5 years, he was the documentation owner for a variety of DB2 subject areas, including the Command and API references, the Partitioning and Clustering Guide, Troubleshooting, and Workload Management. In 2008, he was a co-author of a best practices paper that describes how to improve data server utilization and management through virtualization. You can reach him at rimask@ca.ibm.com.

Transform and model your DB2 data using WebSphere Transformation Extender

An application takes input data for business logic processing in a specific format, such as text, XML, EDIFACT, X12, or flat files. The database contains the raw data, which needs to be modeled and converted into a specific format so that the application can use it. For example, a web services-based application requires data in XML format for processing. To meet this business requirement, you can use WebSphere Transformation Extender for data modeling and data transformation.
Figure 1 shows how WebSphere Transformation Extender performs transformation and routing of data in any format from source system to target system in real-time environments.

Figure 1. Working functionality of WebSphere Transformation Extender
The source system can include files and databases. After retrieving the data from its source, WebSphere Transformation Extender transforms the data and routes it to any number of target systems where it is needed, for example, legacy, J2EE, or web services applications, providing the appropriate content and format for each target system.
WebSphere Transformation Extender uses type trees and maps for defining input data, output data, and transformation logic. A type tree describes the hierarchical structure of a data format. There are two type trees, input and output. The input type tree describes input data in a hierarchical structure, while the output type tree describes output data in a hierarchical structure.
A map is a hierarchical structure that contains input and output cards and encapsulates the rules for data transformation from one data format to another. These cards represent data objects. The input card of a map is associated with a single input terminal of a map, and the output card of a map is associated with a single output terminal of a map. Each input type tree is configured with an input card, and each output type tree is configured with an output card with adapter settings. Each input type tree is mapped to an output type tree using transformation rules. The adapter setting contains the information (location, type, platform, and so on) of a source system and a target system. After running a map successfully, data from a source system is transformed and routed to a target system.
WebSphere Transformation Extender uses load, update, and delete operations during data modeling.
  • In a load operation, WebSphere Transformation Extender exports data from a source system, transforms the data, and writes the data to a target system using transformation rules.
  • In an update operation, WebSphere Transformation Extender exports data from a source system, transforms the data, and updates the data in a target system using transformation rules.
  • In a delete operation, WebSphere Transformation Extender exports data from a source system, transforms the data, and deletes the data from a target system using transformation rules.
This article describes the complete process of transforming DB2 data using WebSphere Transformation Extender with the following three scenarios:
  • Database to file system: WebSphere Transformation Extender performs transformation and routing of data from database tables to files. This article describes transformation from an IBM DB2 database to a file.
  • Database to database (DB2, Oracle, Sybase, and so on): WebSphere Transformation Extender performs transformation and routing of data from any database table to any other database table. In this article, we describe transformation from a DB2 database to another DB2 database.
  • File system to database: WebSphere Transformation Extender performs transformation and routing of data from a file to any database. In this article, we describe transformation from a file to a DB2 database.
Software requirements
This article uses the following software.
  • WebSphere Transformation Extender V8.3
  • IBM DB2 V9.7
However, these instructions also work with all versions of WebSphere Transformation Extender V8.x and with other databases such as Oracle, Sybase, and Informix.

Database to file system
The following sections demonstrate how to use the WebSphere Transformation Extender framework to export data from a DB2 database, perform data modeling, and route the data to a file system. In this example, the source is the EMPLOYEE table and the target is a text file called output.txt.
Creating the database object
Use the following SQL statements to create a database SAMPLE with table EMPLOYEE, as shown in Listing 1.

Listing 1. Creating the database and employee table
CREATE DATABASE SAMPLE AUTOMATIC STORAGE YES  
ON 'C:\' DBPATH ON 'C:\' USING CODESET IBM-1252 TERRITORY US
COLLATE USING SYSTEM PAGESIZE 4096;
CREATE TABLE ADMINISTRATOR.EMPLOYEE
(EMPLOYEE_ID CHARACTER (10) NOT NULL,
FIRST_NAME VARCHAR (40),
LAST_NAME VARCHAR (40),
MANAGER_ID CHARACTER (10) NOT NULL,
MANAGER VARCHAR (50),
CONSTRAINT CC1317720073859 PRIMARY KEY ( EMPLOYEE_ID, MANAGER_ID) ) ;

The Employee table with the data is shown in Table 1.

Table 1. EMPLOYEE table with data

EMPLOYEE_ID   FIRST_NAME   LAST_NAME   MANAGER_ID   MANAGER
071D          F_Name1      L_Name1     072E         Manager1
072D          F_Name2      L_Name3     073E         Manager2
073D          F_Name3      L_Name3     074E         Manager3
074D          F_Name4      L_Name4     074E         Manager3
075D          F_Name5      L_Name5     072E         Manager1
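
Listing 1 creates the table, but the article does not show how these rows were inserted. Assuming plain SQL, a minimal sketch that produces the data in Table 1 is:

-- Populate the EMPLOYEE table with the sample rows from Table 1.
INSERT INTO ADMINISTRATOR.EMPLOYEE
    (EMPLOYEE_ID, FIRST_NAME, LAST_NAME, MANAGER_ID, MANAGER)
VALUES ('071D', 'F_Name1', 'L_Name1', '072E', 'Manager1'),
       ('072D', 'F_Name2', 'L_Name3', '073E', 'Manager2'),
       ('073D', 'F_Name3', 'L_Name3', '074E', 'Manager3'),
       ('074D', 'F_Name4', 'L_Name4', '074E', 'Manager3'),
       ('075D', 'F_Name5', 'L_Name5', '072E', 'Manager1');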

Attaching the database
To attach the database to WebSphere Transformation Extender using Database Interface Designer, perform the following steps.
  1. Open Database Interface Designer by clicking Start > WebSphere Transformation Extender V8.3 > Design Studio > Database Interface Designer, as shown in Figure 2.

    Figure 2. Database Interface Designer

  2. Right-click Database/Query Files in the Database Interface Designer navigator and select New Database/Query File, as shown in Figure 3. Name it DB2_txt.

    Figure 3. New Database/Query File

  3. Right-click Databases and select New as shown in Figure 4.

    Figure 4. Select Database

  4. In the Database Definition dialog, type SAMPLE in the Database Name field as shown in Figure 5.

    Figure 5. Database Adapter List

  5. In the Adapter field, select DB2 as the database type and Microsoft Windows as the platform. WebSphere Transformation Extender offers a number of database adapters, including Oracle, MS SQL Server, Sybase, and Informix. In this article, DB2 is selected as the database adapter. The drop-down list of database adapters is shown previously in Figure 5.
  6. Expand Data Source as shown in Figure 6.

    Figure 6. Data Source and Security fields

  7. The Data Source option identifies the database you want to access during development. Select SAMPLE from the drop-down list as shown in Figure 6.
  8. The Runtime option identifies the database to access at run time from the Map Designer, Command Server, or Launcher. The runtime and development data sources are the same for this article, so select SAMPLE as shown previously in Figure 6.
  9. Expand the Security options. These options are used to specify the user ID and password to connect to the database instance. Type your user ID and password for DB2. Click OK to save the database connection information as shown previously in Figure 6.
Creating the input type tree for the database table
The input type tree describes the hierarchical structure of the input data format. You can create the input type tree manually, or automatically using Database Interface Designer, which can generate type trees from databases, queries, stored procedures, or views. The database connection was established in the previous section. The following steps create an input type tree from a table.
  1. In the Database Interface Designer, right-click Tables in the navigator, and select Generate Tree as shown in Figure 7.

    Figure 7. Generate Tree from Tables
  2. Select the EMPLOYEE table from the Tables dialog as shown in Figure 8.

    Figure 8. Configuration of Generate Tree from Table

  3. The File Name field specifies the name of the type tree file to create. Type Emp_Query.mtt in the File Name field as shown in Figure 8.
  4. Also, under Type options, select the Override type check box, as shown previously in Figure 8.
  5. When the type tree is generated for a database table, a group named Row is automatically defined by the Database Interface Designer. The Row group drop-down list lets you specify the format of the Row group as delimited or fixed. For this example, keep the default values as shown previously in Figure 8.
  6. Specify Group options such as the delimiter between each field of the record, the terminator for each record, and the release character, as shown previously in Figure 8.
  7. For this example, keep the default values for the Represent date/time columns as text items option and for the National and Data Language fields, as shown previously in Figure 8.
  8. Click the Generate button. As shown in Figure 9, the Database Interface Designer will produce a type tree that corresponds to the EMPLOYEE table and the “Command file completed successfully” notification message will be displayed.

    Figure 9. Create Input Type Tree from Employee table

Generating type trees for queries
In the previous section you established a connection with the database. Now the same connection is used to generate a type tree from a query. The query can be simple or complex, including joins, which means that a type tree can be generated that references multiple tables (see the sketch below). In this section, the type tree is created using a simple query referencing the EMPLOYEE table in the SAMPLE database.
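
For example, a query that joins the EMPLOYEE table to a hypothetical DEPARTMENT table (neither that table nor an EMPLOYEE.DEPT_ID column exists in this article's schema; they are shown only to illustrate a multi-table query) might look like this:

-- Hypothetical multi-table query for type tree generation; the
-- DEPARTMENT table and DEPT_ID column are assumed for illustration.
SELECT E.EMPLOYEE_ID, E.FIRST_NAME, E.LAST_NAME, D.DEPT_NAME
    FROM ADMINISTRATOR.EMPLOYEE E
    JOIN ADMINISTRATOR.DEPARTMENT D
        ON E.DEPT_ID = D.DEPT_ID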
  1. In the Database Interface Designer, right-click Queries in the Navigator and select New as shown in Figure 10.

    Figure 10. Create new query
  2. The query name uniquely identifies the query. Type Emp_Query in the name field as shown in Figure 11.

    Figure 11. New Query screen
  3. Type select * from EMPLOYEE as the SQL statement in the Query window, which will select all columns in the EMPLOYEE table as shown previously in Figure 11.
  4. Click OK. The query name appears in the Navigator under the Queries subheading, as shown previously in Figure 11.
  5. Select Generate Tree, as shown in Figure 12. The Generate Tree dialog is the same as when generating a type tree from a table.

    Figure 12. Generating type tree for a query
    This figure is used to generate Type Tree for a query

  6. Generate the type tree and save it as Emp_Query.mtt. This type tree will be used by the map in the next section.
  7. Right-click DB2_txt.mdq and save it. DB2_txt.mdq is an XML file that contains the database connection information. Figure 13 shows the content of the DB2_txt.mdq file.

    Figure 13. XML format of Database_QueryFile1.mdq
Output type tree
The output type tree describes the hierarchical structure of the output data format. In this example, Emp_Query.mtt, created in the previous section, is used as the output type tree.
Transformation logic implementation in WTX Design Studio using map
The following steps describe the map development for this scenario.
  1. Start WebSphere TX Design Studio.
  2. To launch it, select Start > IBM WebSphere Transformation Extender > Design Studio. Select a workspace and close the Welcome view.
  3. Create an Extender project as shown in Figure 14.

    Figure 14. Create an Extender Project

  4. Enter the name as TestProject, and then click Finish. A project will be added to your workspace.
  5. Develop the map with the following steps. The map source file is created first, and the executable map is created as a node within it. Then configure the input type tree in the input card with the adapter settings, configure the output type tree in the output card with the adapter settings, define the transformation rule, and build and run the map to check the results.
    1. Right-click the Map Files folder in your TestProject and select New > Map Source, as shown in Figure 15.

      Figure 15. Create new Map Source file
    2. In the outline view of the Design Studio, right-click the map source file and create a new map source named EmpMap.mms, as shown in Figure 16. Then click Finish.

      Figure 16. Map Node Creation

    3. Create a map called DB2Totxt by selecting New Map from the context menu of the Map source.
  6. Add the input card with the following steps.
    1. Add an input card called InputDBCard in CardName field as shown in Figure 17.

      Figure 17. Input Card Settings

    2. Select Emp_Query.mtt in TypeTree and select DBTable in Type field because this group represents the entire table and not a single record as shown previously in Figure 17.
    3. Identify the type of data being used as the data source. When you change the Source setting to Database, the DatabaseQueryFile settings appear in the input card as shown previously in Figure 17.
    4. The File setting identifies the database/query file (.mdq) that contains the definitions for the table and query. Select DB2_txt.mdq as the File setting, as shown previously in Figure 17.
    5. The Database drop-down list is automatically updated to display all databases defined in the selected file. SAMPLE is the only database defined in DB2_txt.mdq, so it is automatically selected in the Database field as shown previously in Figure 17.
    6. The Query setting identifies the query to use as the data source as shown in Figure 17. If more than one query is defined in the selected database, a drop-down list of all queries is displayed.
    7. Click OK to save the settings as shown previously in Figure 17.
  7. Add the output card with the following steps.
    1. Add an output card named OutputTXTCard as shown in Figure 18.

      Figure 18. Output Card Setting
    2. Select Emp_Query.mtt in TypeTree and select DBTable in Type field as shown previously in Figure 18.
    3. Change the Target setting to the File adapter because the output will be written to a text file, and keep c:/output.txt in the path field as shown previously in Figure 18.
    4. Click OK to save.
Building and running the map
After successfully configuring the map input and output cards, drag and drop the group InputDBCard from the input card to the output card group rule column as shown in Figure 19.

Figure 19. Mapping between input card and output card
Right-click the map in the outline view and click Build to compile the map; then right-click the map again and click Run. You should get the message "Map completed successfully" as shown in Figure 20.

Figure 20. Run the map
Verifying the result
Right-click the map in the outline view and click Run results. Choose your result file and click OK. As shown in Figure 21, you will be able to see output.txt in the Design Studio with your records from the EMPLOYEE table that have been transformed into a text file.

Figure 21. Output File
In this example, simple mapping is used, so the same data from the EMPLOYEE table is transformed to a text file. However, you can manipulate data using the features and functions of WebSphere Transformation Extender according to your requirements.

Database to database, such as DB2, Informix, Oracle, or Sybase
The following sections demonstrate how to use the WebSphere Transformation Extender framework to export data from a DB2 database, perform data modeling, and route the data to another database table. In this example, the source is the EMPLOYEE table and the target is the EMP_OUTPUT table in the SAMPLE database.
Creating the database objects
Use the following command to create table EMP_OUTPUT in the SAMPLE database as shown in Listing 2.

Listing 2. Creating the output table in the sample database
CREATE TABLE ADMINISTRATOR.EMP_OUTPUT 
(EMPLOYEE_ID CHARACTER (10) NOT NULL ,
FIRST_NAME VARCHAR (40) ,
LAST_NAME VARCHAR (40) ,
MANAGER_ID CHARACTER (10) NOT NULL ,
MANAGER VARCHAR (50) ,
CONSTRAINT CC1317720073860 PRIMARY KEY ( EMPLOYEE_ID, MANAGER_ID) ) ;

Attach to database
Use the instructions in the previous section to attach the database SAMPLE with the tables EMPLOYEE and EMP_OUTPUT.
Create the input type tree of database table (EMPLOYEE)
The Database Interface Designer produces a type tree from the EMPLOYEE table as shown previously. A message is displayed indicating that employee.mtt was created successfully.
Create the output type tree of database table (EMP_OUTPUT)
The output type tree can be created from the database SAMPLE and the table EMP_OUTPUT with the steps mentioned previously. In this example, you use employee.mtt, generated in the previous section, as the output type tree.
Transformation logic implementation in WTX Design Studio using map
In this section, a map named DB2ToDB2 is created with the detailed steps described in the Transformation logic implementation in WTX Design Studio using map section. You will develop the map, configure the input type tree in the input card with the adapter settings, configure the output type tree in the output card with the adapter settings, define the transformation rule, then build and run the map to check the results.
  1. Add the input card as follows. The input card setting is similar to what is shown in Figure 17 previously.
    1. Add an input card named InputDBCard.
    2. Select employee.mtt, created previously.
    3. Select Type in the DBSelect field.
    4. Change the Source setting to Database; the DatabaseQueryFile settings appear in the input card.
    5. The File setting identifies the database/query file (.mdq) that contains the definition for the query. Select db2_db2.mdq as the File setting.
    6. The Database drop-down list is automatically updated to display all databases defined in the selected file. SAMPLE is the only database defined in db2_db2.mdq, so it is automatically selected as the database.
    7. The Query setting identifies the query to use as the data source. If more than one query is defined in the selected database, a drop-down list of all queries is displayed.
    8. Click OK to save the settings.
  2. Add the output card as follows.
    1. Add an output card named OutputDBCard as shown in Figure 22.
    2. Select employee.mtt, as shown in Figure 22.
    3. Select Type in DBSelect field as shown in Figure 22.
    4. Change the Source setting to the adapter Database as shown in Figure 22.

      Figure 22. Output Card Setting

    5. The File setting identifies the database/query file (.mdq) that contains the definition for the query. Keep db2_db2.mdq in the File Path as shown previously in Figure 22.
    6. The Database drop-down list is automatically updated to display all databases defined in the selected file. SAMPLE is the only database defined in db2_db2.mdq, so it is automatically selected as the database.
    7. Enter EMP_OUTPUT as the table name, as shown previously in Figure 22.
    8. Click OK to save the settings. The output card setting is shown previously in Figure 22.
Building and running the map
After successfully configuring the input card and output card of the map, drag and drop the group InputDBCard from the input card to the output card group rule column. The mapping between input and output card is similar to what is shown in Figure 19.
Right-click the map in the outline view, click Build to compile the map, then click Run to execute it. You should get the message "Map completed successfully", similar to Figure 20.
Verifying the result
The EMP_OUTPUT table is empty before running the map, as shown in Figure 23.

Figure 23. EMP_OUTPUT Table is empty
After the map runs successfully, the data appears in the EMP_OUTPUT table, as shown in Figure 24.

Figure 24. EMP_OUTPUT table with modeled data
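You can also confirm the result from a DB2 command line. Assuming the connection and schema from Listing 2, a quick check might be:

-- Confirm that the map populated the target table.
CONNECT TO SAMPLE;
SELECT COUNT(*) AS ROW_COUNT FROM ADMINISTRATOR.EMP_OUTPUT;
SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME, MANAGER_ID, MANAGER
    FROM ADMINISTRATOR.EMP_OUTPUT;

If the map ran successfully, the row count matches the five rows of the EMPLOYEE source table.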
In this example, simple mapping is used, so the same data from the EMPLOYEE table is modeled into the EMP_OUTPUT table at run time. However, you can manipulate data using the features and functions of WebSphere Transformation Extender according to your requirements.

File system to database
This section describes how to use the WebSphere Transformation Extender framework to read data from an input file, perform data modeling, and store the data in a database table. In this example, the source is the input.txt file and the target is the EMPLOYEE table in the SAMPLE database.
Creating the input file
The input data is read from a text file called input.txt as shown in Listing 3.

Listing 3. input.txt
071D      |F_Name1|L_Name1|072E      |Manager1
072D      |F_Name2|L_Name2|073E      |Manager2
073D      |F_Name3|L_Name3|074E      |Manager3
074D      |F_Name4|L_Name4|074E      |Manager3
075D      |F_Name5|L_Name5|072E      |Manager1

Create the target database and table
The following commands are used to create the EMPLOYEE table in the SAMPLE database, as shown in Listing 4.

Listing 4. Creating the target database and table
DROP TABLE ADMINISTRATOR.EMPLOYEE;
CREATE TABLE ADMINISTRATOR.EMPLOYEE
( EMPLOYEE_ID CHARACTER (10) NOT NULL ,
FIRST_NAME VARCHAR (40) ,
LAST_NAME VARCHAR (40) ,
MANAGER_ID CHARACTER (10) NOT NULL ,
MANAGER VARCHAR (50) , CONSTRAINT CC1317720073859 PRIMARY KEY
( EMPLOYEE_ID, MANAGER_ID));

Attach to database
Use the details from the first section to attach database SAMPLE with table EMPLOYEE.
Create the output type tree of database table (EMPLOYEE)
The following steps create the output type tree; they are similar to the steps shown previously in the first section.
  1. In the Database Interface Designer, right-click Tables in the navigator and select Generate Tree.
  2. Select the EMPLOYEE table from the Tables dialog.
  3. The File Name field specifies the name of the type tree file to create. Type employee.mtt in File Name field.
  4. Change the Type option to Override type.
  5. The Database Interface Designer produces a type tree that corresponds to the EMPLOYEE table, and a message is displayed indicating that employee.mtt was created successfully.
Create the input type tree
For the input file, use the same type tree (employee.mtt) generated in the previous section.
Transformation logic implementation in WTX Design Studio using map
In this section, a map named TxtToDB2 is created with the detailed steps described in the first section. You will develop the map, configure the input type tree in the input card with the adapter settings, configure the output type tree in the output card with the adapter settings, define the transformation rule, then build and run the map to check the results.
  1. Add the input card. The input card setting is similar to what is previously shown in Figure 17.
    1. Add an input card named InputTXTCard.
    2. Select employee.mtt, which you created previously.
    3. Select Type as DBTable.
    4. Change the Source setting to the File adapter.
    5. Enter the file location, such as input.txt.
    6. Click OK to save the settings.
  2. Add the output card. The output card setting is similar to what was previously shown in Figure 18.
    1. Add an output card named OutputDBCard.
    2. Select employee.mtt.
    3. Change the Source setting to the adapter Database.
    4. Identify the type of data being used as the data source. When you change the Source setting to Database, the DatabaseQueryFile settings appear in the output card.
    5. The File setting identifies the database/query file (.mdq) that contains the definition for the query. Select txtToDB2.mdq.
    6. The Database drop-down list is automatically updated to display all databases defined in the selected file. SAMPLE is the only database defined in the file txtToDB2.mdq, so it is automatically selected as the database.
    7. The Query setting identifies the query to use as the data source. If more than one query is defined in the selected database, a drop-down list of all queries is displayed. Select Employee as the Query.
    8. Click OK to save the settings.
Building and running the map
After successfully configuring the input and output cards, drag and drop the group InputTXTCard from the input card to the output card group rule column. The mapping between the input and output cards is similar to what was previously shown in Figure 19.
Right-click the map in the outline view, click Build to compile the map, then click Run to execute the map. You should get the message "Map completed successfully" which is similar to what was shown in Figure 20.
Verifying the result
Before running the map, the EMPLOYEE table contained zero records. After running the map successfully, the data from the text file is routed to the EMPLOYEE table as shown in Figure 25.

Figure 25. EMPLOYEE table with modeled data
In this example, simple mapping was used, so the same data from the text file was transformed into the EMPLOYEE table, but you can manipulate data using the features and functions of WebSphere Transformation Extender according to your requirements.

Conclusion
This article has shown how WebSphere Transformation Extender can be used with a DB2 database to generate data in different formats. After reading this article, you should be able to use a database with WebSphere Transformation Extender for data modeling to quickly and simply transform data into a specific format.

About the author
Anuruddha Kumar Pandey works for IBM India Software Labs as a Software Engineer. He has been working on the DB2 Tools Continuing Engineering team. He completed a Master of Technology degree at the Indian Institute of Information Technology and Management, Gwalior.

Integrate InfoSphere Guardium Data Redaction with IBM Classification Module

InfoSphere Guardium Data Redaction is a product aimed at achieving a balance between openness and privacy. Regulations often require organizations to share their documents with regulators, business partners, or customers, and at the same time to protect sensitive information that may be buried in those documents. With thousands of documents in Enterprise Content Management systems such as IBM FileNet® and IBM Content Manager®, automation combined with a well-structured workflow is essential for practically controlling access to private information in documents at a fine grain.
For example, in eDiscovery, lawyers must share documents with opposing counsel, but they do not want to release any information they don't need to, and attorney-client privileged information must be carefully protected. Similarly, the Freedom of Information Act (FOIA) is intended to hold government organizations more accountable for their actions by making information about those actions available on demand. At the same time, individuals are not entitled to access sensitive personal information: the same regulation requires that those requesting documents must not see any sensitive personal or national security information embedded in documents that might be made public.
The InfoSphere Guardium Data Redaction product automatically finds and deletes sensitive text within a document, redacting the document, and then outputs the redacted document in a format such as PDF. Alternatively, the product includes a web-based Secure Viewer for even more control over the release of private information. Each user sees just what they are allowed to see. In some cases, even if a user is allowed to see some information, it is withheld unless they ask for it, specifying the reason for their need to know.
Within an organization, not all documents contain sensitive data, so for redaction to be effective, it is critical that the relevant documents be identified. InfoSphere Guardium Data Redaction is capable of identifying and redacting many types of personally identifiable information, but not all occurrences constitute sensitive data; sensitivity often depends on context. For example, names of medical procedures in an administrative document catalog are not sensitive, but in patient records they are. IBM Classification Module is capable of identifying the sensitive documents containing data that requires redaction.
The level of sensitivity varies across documents of different types. A group of documents from one department within the organization may require a customized redaction policy. Other groups of documents may have been created for public consumption, and it can be assumed that these documents contain no sensitive data. These document groupings may or may not be part of a formalized classification system.
Below is an example of a sensitive document and its redacted version. Personal names, addresses, and account and telephone numbers have been removed.

Figure 1. An overview of the redaction process
Different formats are available for the redacted version of the document. In addition to the usual formats (PDF, Microsoft Word document, TIFF, text, and so on), a proprietary format is available that can be viewed in the Secure Viewer (an application shipped with InfoSphere Guardium Data Redaction).
IBM Classification Module is capable of identifying documents according to a large range of criteria, including statistical classification and rule-based decisions. The implementation involves these stages:
  1. Create a knowledge base and train it using user-defined groups of sample documents.
  2. Create a decision plan that will:
    • Categorize new documents based on the knowledge base results.
    • Move documents to relevant folders.
  3. Run the Classification Module Classification Center using the created decision plan. Documents are moved to relevant folders.
  4. Run redaction batch processes on the repository folders. Redacted versions of the document are created; original copies are kept.
The implementation described here involves documents stored in a file system. Both Classification Module and InfoSphere Guardium Data Redaction are capable of accessing and processing documents on IBM FileNet and IBM Content Manager systems.
The workflow described here uses IBM Classification Module's Classification Center to classify documents into a taxonomy tree.
For information on how to create a knowledge base and decision plan, and to set up Classification Center for classifying documents into folders, see the IBM Classification Module Information Center.
Guardium Data Redaction uses a specific folder structure (repository folders) that serves as the basis for its data processors. It then redacts documents in two different category folders, nested within the repository folders, according to two different redaction policies.
The workflow described here involves these steps:
  1. Set the configuration for redaction: Configure two processors in InfoSphere Guardium Data Redaction.
  2. Start the Data Redaction server in order to create the relevant processors and their repository folders.
  3. Create the Classification Module knowledge base and decision plan.
  4. Run the Classification Module Classification Center to move the documents to the redaction in folders.
  5. Restart the InfoSphere Guardium Data Redaction server to redact documents and move them to the appropriate folders for further processing.
Set the configuration for redaction
Before running the Classification Module Classification Center or InfoSphere Guardium Data Redaction, the processors should be set up.
Configure two repositories
Two separate processors (Legal and IBM Global Financing) are defined in two processor configuration files found in the IBM\GuardiumDataRedaction\server\conf folder.
Each processor has one configuration file named in the IBM\GuardiumDataRedaction\server\conf\plugins.xml file:

Listing 1. Sample processor setup in plugins.xml

<!-- Illustrative reconstruction: the element names below are assumptions;
     only the class names and configuration file names are from the
     original listing. -->
<plugins>
    <processor>
        <class>com.ibm.nex.redaction.docrepository.SimpleFilesDocumentRepository</class>
        <configurationFile>batchFileSystemProcessorIBM_Legal.xml</configurationFile>
    </processor>
    <processor>
        <class>com.ibm.nex.redaction.docrepository.SimpleFilesDocumentRepository</class>
        <configurationFile>batchFileSystemProcessorIBM_Finance.xml</configurationFile>
    </processor>
</plugins>

Each XML configuration file contains the following settings:
  • The base folder for the repository. This folder should match the directory used by Classification Center, for example: c:/data/IBM Products CC Output Folder
  • The repository folder name. The folder name should match exactly the associated category name in the Classification Module knowledge base.

Setting different data policies
We will set two policies:
  • Legal role: US dollar amounts are redacted.
  • Financial role: Organization names are redacted.
These profiles are configured in the XmlPolicyModel.xml file in IBM\GuardiumDataRedaction\server\conf.
Each ns21:permission element maps one user role to one category, and the ns21:redact element sets that category as redacted. The user role (userRoleID) and category (semanticCategoryId) are configured elsewhere in the same file. In the listings below, each user role has one redacted category.

Listing 2. Legal role

<!-- Illustrative sketch: element names follow the description above;
     the ID values are assumptions. -->
<ns21:permission userRoleID="Legal" semanticCategoryId="DollarAmounts">
    <ns21:redact/>
</ns21:permission>
Listing 3. Financial role

<!-- Illustrative sketch: the ID values are assumptions. -->
<ns21:permission userRoleID="Financial" semanticCategoryId="OrganizationNames">
    <ns21:redact/>
</ns21:permission>

Start the InfoSphere Guardium Data Redaction server
From the IBM InfoSphere Guardium Data Redaction Windows menu, choose Start server. This will start the server and create the configured repositories. You can optionally stop the server in order to prevent it from processing the files created by the Classification Center before you have checked them. If the in folder becomes populated while the Data Redaction server is running, these files will be picked up for processing.

Create the Classification Module knowledge base and decision plan
Classification Module Classification Center is capable of copying or moving files within a file system and of reading and modifying metadata associated with a document within a full content management system. These actions are based on a series of decisions made within a decision plan running on the Classification Module server. The decision plan takes actions based on rules, and these rules can consider results from statistical analysis of the document content returned by the knowledge base (also running on the server). The knowledge base typically assigns a category to the document based on statistical similarities.
For details on how to create a knowledge base and decision plan, see the Workbench topic in the Classification Module Information Center.
Create the knowledge base
Classification Module Workbench is shipped with a project called IBM Products. This project contains the basis for the knowledge base used here. The following figure shows the list of categories.

Figure 2. The IBM Products knowledge base
The knowledge base structure mimics the target folder structure. The following figure shows the folder structure, each folder named after a category.

Figure 3. The folder structure
Create the decision plan
The decision plan includes a set of rules. Below is an example of a rule that moves documents to the target folders based on the highest category match (for an example of such rules, see the Rules for File System project in Classification Module Workbench).

Figure 4. The decision plan (first rule): matching the document against the knowledge base
The folders that will be redacted are a special case. The figure below shows an action for moving the document to the in subfolder within a redaction repository:

Figure 5. The decision plan (second rule): moving files to the correct repository folder

Run Classification Module Classification Center
For details on how to set up Classification Center for classifying documents into folders, see InfoSphere Classification Module InfoCenter Classification Center topic.
Once Classification Center is run, the documents for redaction should be moved to the redaction in folders; non-redacted documents should be moved to the Products subcategories within this structure. The figure below shows the in folder for two repositories and other non-repository folders named after categories.

Figure 6. The redaction repository file structure; the Classification Center inserts documents into the input directories
Check to see that the above folders were populated by Classification Center.
The following figure shows two folders (Financial and Legal) that will serve also as Data Repository folders:

Figure 7. The Financial and Legal repository folders
Here, Classification Center moves files to the subfolder in of each repository folder.

Restart the InfoSphere Guardium Data Redaction server
From the IBM InfoSphere Guardium Data Redaction Windows menu, choose Start server. Since the in folders of the two new repositories now contain the documents created by the Classification Center, Data Redaction will now process these files.
The figure below shows the orig and out folders within each repository structure.

Figure 8. The out folder now contains redacted documents. The orig folder contains the original copies.
Data redaction processes documents from the in folder and creates redacted and non-redacted versions in the respective folders:
  • orig folders: original documents
  • out folders: redacted copies
The percentage of files that are sent for review depends on the value set in the relevant repository configuration file (such as batchFileSystemProcessorIBM_Legal.xml above); in this example the setting is 0, so no files are sent for review.
We now have various versions, redacted and non-redacted, of our original documents classified into folders. There are various aspects of this model that can be adapted according to business needs.

Some ideas for varying the model
Finding sensitive documents for redaction without subject classification
Where the only goal is to locate sensitive data, there is no need for conventional content classification. In this case, a Classification Module knowledge base can be created that recognizes the nature of the sensitive documents, and the decision plan can be used to move only those documents to the Redaction repository folder. There is no need for a folder dedicated to Classification Center output. Because the two-category knowledge base is often used for finding a few relevant items within a large content set, this method is often called "pinpointing." However, it can also be used for finding a large group of similar documents among non-relevant documents.

Figure 9. The pinpointing knowledge base
To create such a knowledge base, choose a number of sensitive documents and an equal number of non-sensitive documents.
Adding manual review of the Classification Center output before and/or after redaction
The Classification Center can be used to manually review documents before they are sent for redaction.
This method can be used early on, when the system is first put into production and knowledge base confidence may be low. In addition, feedback can be submitted to improve the knowledge base.
The Redaction Manager can be used to review documents after they are classified and redacted. The document redaction can be edited, or removed and sent to another repository folder for redaction according to a different policy.
Using multiple pinpointing knowledge bases
Multiple knowledge bases could be set up for pinpointing specific documents for redaction. One or more processes could be implemented consecutively according to need, until all documents are moved to a folder for redaction. This would be helpful, for example, in the case where new sensitive documents of a different nature need to be located for redaction, or where the nature of new documents changes.

About the author
Jane Singer is on the QA teams for both InfoSphere Guardium Data Redaction and InfoSphere Classification Module at the IBM Israel Software Lab. In addition, she leads L3 and presales support for InfoSphere Classification Module.

Do you use mobile apps for your business intelligence solutions?



The dramatic growth of smartphones, such as iPhones and BlackBerrys, over the last three years has created numerous opportunities for independent developers to express their creativity and talents. Now, the growth trend is expected to continue with the current success of BlackBerry's App World storefront and the anticipated release of RIM's new PlayBook mobile device. RIM has opened the doors for "weekend warrior" developers to develop mobile apps and, best of all, make money!



I have personally been successful in developing and selling BlackBerry applications in RIM's BlackBerry App World for the last several months. I've had days where over 500 individual downloads of one of my BlackBerry apps put a smile on my face. I find it extremely rewarding to develop an app that is of interest to someone else. Therefore, I'd like to share my experience and my personal strategy to help you get started developing and making money selling $0.99 BlackBerry apps.



My e-book will give you greater insight into what tools are required to get started and how the app submission and payment process works, along with help in determining what apps to develop, BlackBerry code samples written in Java, and tips on how to manage and market your BlackBerry app development services. I will also let you in on my biggest BlackBerry app development tip: how I have been making money selling BlackBerry apps with no overhead charges.



That's right! Everything that I have produced from BlackBerry App World is 100% profit! I'd like to share this experience and knowledge with you, and I will even stick to my own $0.99 strategy.



If you would like to learn more, please download my e-book today. It may well be the best $0.99 you have spent yet!



Oh, and I should also tell you that the formula I'm using for BlackBerry app development sales isn't a strategy I just created. I've been using the exact same strategy for the last 5 years with other software, and I'm consistently ranked on the first page, in the top 5 search results.

Gold

One hundred thirty-three analysts have projected that gold will hit $2,500 an ounce, and 90 of them say the precious metal will hit $5,000, including the original gold bug, James Dines. Still others, like analyst Peter Schiff, are calling for $10,000-an-ounce gold!



What does that mean for the average investor, and should we take out our grandparents' and Ludacris' gold fillings?


Debit cards truth

Some debit card issuers offer zero-liability protection against fraud and theft.



What you may not know is that to reap those benefits, you may have to use the card with a signature instead of a PIN, says Linda Sherry, director of national priorities for Consumer Action, a national consumer education and advocacy group based in San Francisco.



Federal law limits personal liability for unauthorized transactions to $50 for credit cards, but offers more limited fraud protection for debit cards.



How to protect yourself: Find out if your bank offers theft and fraud protection. Get specific. Under what circumstances is it honored? How do you have to use the card? What's your timetable for reporting the loss?



"Most of these promises have limits and asterisks," says Ed Mierzwinski, consumer program director with U.S. Public Interest Research Groups.



As for disputed funds, some banks will put them back in your account, provisionally, while they investigate. Others will wait until their inquiries are completed.



"We still like to tell people if they're ordering things online or over the phone, they might want to use a credit card because they have superior charge-back protection," says Sherry. "When something goes wrong with a credit card, you're not out the money."
