Sunday 29 June 2014

How to import Metadata using DB2 Connector from IMAM (InfoSphere Metadata Asset Manager) Window (or Information Server DataStage Client) on Unix OS



For any Connector Metadata import user needs to restart the Node agent after setting all the required parameters of a connector. For DB2 Connector we can do MDI (Metadata Import) using below steps
Steps to setup DB2 environment prior to import (Please use root user for below steps)
1)    Install DB2 Client/Server on Information Server machine

2)    Set the DB2 related environments like DB2INSTANCE, LD_LIBRARY-PATH/LIBPATH, PATH using one of the below steps
Source db2profile, it can be done by below command
 source /home/db2inst1/sqllib/db2profile   (Where /home/db2inst1 is DB2 Home)
OR
Add the below parameters in dsenv
            PATH=/home/db2inst1/sqllib/bin:$PATH export PATH

LD_LIBRARY_PATH=/home/db2inst1/sqllib/lib:$LD_LIBRARY_PATH                            export LD_LIBRARY_PATH  
           
DB2INSTANCE=db2inst1 export DB2INSTANCE
            Now source dsenv and restart the DSEngine.

*db2inst1- Instance name of DB2 Client/Server
*LIBPATH needs to use for AIX instead of LD_LIBRARY_PATH

3)    Source dsenv, it can be done by below command
 source /opt/IBM/InformationServer/Server/DSEngine/dsenv

4)    Goto /opt/IBM/InformationServer/ASBNode/bin/ and run restart the Node agent as  ./NodeAgents.sh restart

5)    Goto DataStage Client or IMAM GUI Window and run the DB2 Connector Metadata import steps.



Steps to Import MDI using IMAM GUI
1) Connect to IMAM GUI by using below link
https://localhost:9443/ibm/imam/console/loginForm.jsp  (where as localhost is your machine hostname or IP)
2) Goto Administration tab and create Metadata Interchange Server.

3) Goto Import Tab and Click on New Import Area

3) Select IBM InfoSphere  DB2 Connector From the Bridge/Connector Selection after selecting Metadata Interchange Server and give the Import area name and Click on Next button

4) Select the DataConnection


5) Select Asset to Import  and click on Next button

6) Provide the Host System name and click on next button
7) In next page Select Express Import  and click on Import button
8) You can see the imported assets Shared Import Tabs of IMAM Window

Summary: We can import Metadata using DB2 Connector from IMAM GUI (or DataStage Client) after restarting Node agent (after setting all the DB2 client related environments) from root user.

Disclaimer: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”

Saturday 14 June 2014

Extract Unstructured Data using Information Server Connectivity Stage – Unstructured Connector

Abstract

Unstructured data is information that does not have a predefined data model or does not fit well into relational tables. Unstructured data can be text from books, journals, metadata, audio, video files, the body of word processor documents, web pages, and presentation charts. This blog describes about the stage that handles the unstructured data.

Introduction:

With the release of Information Server 9.1, it supports the Unstructured Data with Microsoft Excel files as data source.In this blog,we are going to create a job to extract the unstructured data and explain about the properties with an example.


Configure to extract data

Create a stage in information server with the following two components:

  1. Unstructured data stage
  2. Output sequential file

Following figure illustrates how to extract data by using unstructured data stage. Double click the unstructured data stage icon to enter or modify the options. Options for the respective stage need to be specified at stage level appropriately and in configure window to make the job run.


Configure Window

Following figure describes the Configure window of unstructured stage which is displayed when configure button in the Stage properties is clicked. Configure data source,import,map sections appropriately and click OK .Respective stage properties and column schema is populated.


 

Configure Window Properties

Specify filename,Password if file is password protected in Data source Section and specify template file for design time when wild cards specified in the input excel files. Select Use template file for design time and specify a template file name. Template file is used for subsequent configuration steps, and not used at runtime.Specify a value for Template password if the specified template file is password protected.

Specify range expression and click load button. Appropriate excel columns will be displayed in Import Section.

From the Column header field, select First row of data ranges. When First row of data ranges is selected, first row is regarded as the header and the Unstructured Data stage starts extracting from the second row.

Different types of data like File/Document properties , Sheet/Row/Cell Information values can be extracted using property tab and Custom properties can be specified in Advanced tab in Import Section.

Click Import button to get the data stage columns. Other cell information like formula,comment,author of the comment,Hyperlink label,type,address can be extracted by specifying the appropriate value for Import option in Map section. Click OK 

Stage Level Properties

Unstructured data stage can be configured to skip null values in the cells that are being extracted by specifying skip null rows option in stage properties

Run time column propagation should be enabled and appropriate property should be set for hidden columns option to extract hidden column data

When extracting data from the file, specify how you want to handle errors. Select Fail to abort the job. Select Skip to skip the data and continue to run the job

Click OK , compile and execute the job to get the required output.

Example:

In this example there are two input excel file names (Employee1.xls and Employee2.xls). To retrieve the data from both the files mention the File name as Employee*.xls. To load the excel columns in Import section template file ie., original file name either employee1.xls or employee2.xls should be specified in the Template File option in Configure window.

Input Excel Sheet Data (Employee1.xls)
 

Input Excel Sheet Data (Employee2.xls)


 Output Data (sequential File)

Column_A – Column_G contains actual data.RowNumber Column the details of the row from where the data is extracted. Sheet Name columns provides the details from which sheet the data is extracted. Cell_D7 is the value extracted for the property specified in the particular cell property in Advanced tab of Import section.


Conclusion

Information Server 9.1 release supports single output link which has single schema. Upcoming release supports multiple output links to extract multiple data structures.


Disclaimer: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”