Load Excel File Dynamically Into Database Using SQLBulkCopy and GetOleDbSchemaTable in C#

Previously, I have posted some SSIS solutions which would handle dynamic column mapping in data loading. This time, I was wondering how to handle dynamic column mapping when the source is an excel file. I was quick to realize that it would be entirely a C# solution and the program should be able to handle any change in the number or name of columns. In simple words, you give any excel file and the program will load it into database. I used GetOleDbSchemaTable in order to fetch the schema information from the excel file, then built a similar table in the database. Excel data types and SQL data types do not match. Sometimes, they make me panic while loading files using SSIS. In my program also, I had to do some change in the data types; like WChar becomes NVARCHAR, Double becomes INT. Yes, I could use an XML file/database table to save a detailed mapping of this conversion of excel data types to SQL data types to make my program more capable and versatile. But, the conversion is hard-coded at the moment. Another challenge was to get all the data loaded to SQL table without using any column names or hard coded column mapping. SQLBulkCopy came to my rescue. By the way, I had to use a dataview over the table returned by GetOleDbSchemaTable because it was returning the column names in alphabetical order. So, I had to sort the rows using “ORDINAL_POSITION”

using System;

using System.Data.OleDb;

using System.Data.SqlClient;

using System.Data;

using System.Windows;

namespace ConsoleApplication2

{

class Program

{

           public static void Main()

{

try

{

///*****************Establish Excel Connection*****************///

String strExcelConn = “Provider=Microsoft.ACE.OLEDB.12.0;”

+ “Data Source=C:\\files\\Book1.xlsx;”

+ “Extended Properties=’Excel 8.0;HDR=Yes'”;

OleDbConnection connExcel = new OleDbConnection(strExcelConn);

OleDbCommand cmdExcel = new OleDbCommand();

cmdExcel.Connection = connExcel;

///*****Get schema info and Read Sheet Name and save it as TableName *******//

connExcel.Open();

DataTable dtExcelSchema;

dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);

string TableName = dtExcelSchema.Rows[0][“TABLE_NAME”].ToString();

string Query = “CREATE TABLE “ + TableName + “(“;

/*****Get schema info and read the column names and build Query to Create similar SQL table*****/

//

dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, new object[] { null, null, TableName, null });

connExcel.Close();

string FieldNames = “”;

DataView dv = new DataView(dtExcelSchema);

dv.Sort = “ORDINAL_POSITION”;

foreach (DataRowView rowView in dv)

{

DataRow row = rowView.Row;

Query += row[“COLUMN_NAME”].ToString() + ” “ + ((OleDbType)row[“DATA_TYPE”]).ToString() + “,”;

FieldNames += row[“COLUMN_NAME”].ToString() + “,”;

}

/////******************Replace data types************/////

Query = Query.TrimEnd(“,”.ToCharArray()) + “)”;

Query = Query.Replace(“WChar”, “NVARCHAR(500)”);

Query = Query.Replace(“Double”, “INT”);

FieldNames = FieldNames.TrimEnd(“,”.ToCharArray()) + “”;

/******************read records from excel file************/

cmdExcel.CommandText = “Select “ + FieldNames + ” from [“ + TableName + “]”;

DataSet ds = new DataSet();

OleDbDataAdapter da = new OleDbDataAdapter();

connExcel.Open();

da.SelectCommand = cmdExcel;

da.Fill(ds);

connExcel.Close();

//******** Establish SQL Server connection and create table*****//

SqlConnection conn = new SqlConnection();

string sqlConn = “Data Source=YourServerName;Initial Catalog = YourDBName; Integrated Security = SSPI;Persist Security Info=False;”;

conn.ConnectionString = sqlConn;

SqlCommand cmd = new SqlCommand(Query);

cmd.Connection = conn;

conn.Open();

cmd.ExecuteNonQuery();

conn.Close();

//*****************Do bulk copy to table*****************//

DataTable dt = new DataTable();

dt = ds.Tables[0];

SqlBulkCopy sbc = new SqlBulkCopy(conn);

sbc.DestinationTableName = TableName;

conn.Open();

sbc.WriteToServer(dt);

conn.Close();

}

catch (Exception ex)

{

Console.WriteLine(ex.Message);

}

}

}

}

 

Posted in MS Visual Studio, C#.NET, SQL, Database, SSIS (SQL Server Integration Service) | Tagged , , , , , | 13 Comments

Access SSIS variables in script task and script component

You have to write codes to access the variables in these two components in a slight different way.

For Script Task:

DTS.Variables[“VariableName”].Value

For Script Component:

Variables.VariableName

Don’t forget to add these variables in task editor first. 🙂

Posted in SSIS (SQL Server Integration Service) | Tagged , , | Leave a comment

Find out anagram using SQL

Any word or phrase that exactly reproduces the letters in another order is an anagram. For example: “Hamlet” is an anagram of the word “Amleth”. We will show u a query today which can determine if two given words are anagram or not. We have used a Tally table for parsing the given strings. It has only one column called ‘N’ and the values of this column are like 1,2,3,4…….100

DROP TABLE #a
DROP TABLE #b

DECLARE @str1 VARCHAR(30)
DECLARE @str2 VARCHAR(30)

SET @str1 = ‘ramy’
SET @str2 = ‘mary’

SELECT N, SUBSTRING(@str1,N,1)letters
INTO #a
FROM dbo.Tally  WHERE N <= LEN(@str1)
ORDER BY N

SELECT N, SUBSTRING(@str2,N,1)letters
INTO #b
FROM dbo.Tally  WHERE N <= LEN(@str2)
ORDER BY N

IF EXISTS(
SELECT A.*,B.* FROM
(
SELECT letters,COUNT(*)C FROM #a GROUP BY letters
)A
FULL OUTER JOIN
(
SELECT letters,COUNT(*)C FROM #b GROUP BY letters
)B
ON A.letters = B.letters AND A.C = B.C
WHERE A.letters IS NULL OR B.letters IS NULL
)
PRINT ‘NOT ANAGRAM’
ELSE
PRINT ‘ANAGRAM’

Posted in SQL, Database | Tagged , | Leave a comment

Avoid Concurrency problem while running SQL Server Jobs

Hello everyone,

This time my problem was to delay the start of a SQL server job if another job is already running. Obviously, the reason for this is that the jobs use the same database resources. The sysjobhistory table in MSDB database has the job ids and their status stored. First, I selected the job id which have the run_status=4 (4 means “Job in progress”) from sysjobhistory table. Then I selected the job_id and name of my suspected jobs from sysjobs table. Then, I joined these two result sets and put the joined result set inside IF NOT EXISTS command. In short, I check if any of my suspected jobs is running before I start my job. If the “IF NOT EXISTS” command returns true, I break the while loop and start my job. If it returns false, the process goes to sleep mode for 10 minutes. After 10 minutes the process wakes up and continue from the beginning of the while loop. Below is the code I have used to solve my problem:

WHILE(1=1)
BEGIN
IF NOT EXISTS
(

SELECT A.job_id FROM

(

SELECT job_id FROM msdb.dbo.sysjobhistory WHERE  run_status = 4

)A

INNER JOIN

(

SELECT job_id,name FROM msdb.dbo.sysjobs WHERE name LIKE ‘%Mirror%’
OR name LIKE ‘%BR%’
OR name like ‘%Fact%’

)B

ON A.job_id = B.job_id

)

BEGIN
BREAK;
END

WAITFOR DELAY ’10:00′

END

—-start my job——
EXEC msdb..sp_start_job @job_name = ‘My Job’

Posted in SQL, Database, SSIS (SQL Server Integration Service) | Tagged , , , , , | Leave a comment

Update another table using Data Macro in MS Access 2010

Recently, I had to find a way to write a trigger in MS Access 2007. My aim was to update a field in one table (let’s say Project Table) based on Insert/Update/Delete operation on another table (Let’s say Investors table). Quickly, I realized that MS Access 2007 does not offer much when it comes to triggers, but Access 2010 does. It provides a feature called Data Macro. So, what I did was to write a Data Macro which updates the Project table. The Macro looks like below:

Macro Name: UpdateProjectAmount:

So, it takes the Project Id, AmountPledged, AmountReceived as parameters. Then, inside the procedure, it adds those amounts to the ProjectAmountRaised_Pledged and ProjectAmountRaised_Received field respectively for the corresponding ProjectId.

It’s pretty simple from now on. All I had to do was to create After Insert, After Update and After Delete triggers for the Investor table and call the above macro accordingly with the right values supplied as paramaters. Below is the descriptions of those triggers. Look at the use of Keyword “Old” in the Macros.

1. After Insert:

This macro gets the AmountPledged and AmountReceived from    each new record and send them as Parameters to the Macro called UpdateProjectAmount. The newly inserted ProjectId is also sent as a parameter.

RunDataMacro(Investors.UpdateProjectAmount,[ProjectId],[AmountPledged],[AmountReceived])

2. After Delete:

This macro gets the AmountPledged and AmountReceived from each deleted record and send them as Parameters to the Macro called UpdateProjectAmount. The newly deleted ProjectId is also sent as a parameter.

RunDataMacro(Investors.UpdateProjectAmount,[Old].[ProjectId],-[Old].[AmountPledged],[Old].[AmountReceived])

3. After Update:

This macro gets the updated AmountPledged and AmountReceived from the updated record and send them as Parameters to the Macro called UpdateProjectAmount. The newly updated ProjectId is also sent as a parameter.

RunDataMacro(Investors.UpdateProjectAmount,[ProjectId],[AmountPledged]-[Old].[AmountPledged],[AmountReceived]-[Old].[AmountReceived])


Posted in SQL, Database | Tagged , , , , | 2 Comments

Cursor Using Foreach Ado Enumerator in SSIS

Recently, I have been trying to make a cursor using SSIS components and came up with this idea. The trick is explained below:

Let’s say, we have a record set and we need to loop through the records to concatenate/append one after another in a sequential manner. For example, you have 6 records:

A

B

C

D

E

And, you want to make a comma separated list of records like this :     ‘A’,’B’,’C’,’D’,’E’,

First, you need to drag a Data Flow Component onto the designer area. Double click on it and it will take u to Data Flow Tab. Then u have to drag a Source component. After u set the connection string and everything inside the source component, u can write a query which will produce the record set. Next, u have to declare a package variable (User::Panels) with Object data type. Now, take a “Recordset Destination” and connect the source with it.

Now, double click on the Recordset Destination and the Advanced Editor will open. In the Custom Properties section, there is a field called VariableName. Choose the Object type variable (User::Panels) as the value of that field.

So, the record set will be stored inside that variable.

Next, a Foreach loop Container will iterate through the records stored inside that object variable. So, go back to the control flow tab and drag a Foreach loop container. Connect that with the previously designed data flow task. Double click the container and Editor will open. Go to the collection tab and select Foreach ADO Enumerator as Enumerator. Select ADO object source variable from the drop down list below. See the figure:

Now, the foreach container will read records one by one and keep it in another variable. We need to go to Variable Mappings section to mention that variable (User::PanelName). So, u go to variable mapping section and select another variable (string type for our example). Index will be 0, since we will read the first field (there is only one field in our example) of our records set.

Now, comes the final part. Drag an Execute SQL task inside the Foreach Loop Container and set its connection. Next, go to the Parameter Mapping section of the SQL task and set two variables. The first one is the same as the one used in Foreach loop container (User::PanelName in our example). The second one is a new string type variable (User::FinalString) which will store the final concatenated string.

Next, go to the general section, set Single row as the ResultSet and write down the below query in SQL statement.

Select ”” + ? + ”” + ‘,’ + ? FinalString.

This query gets the first variable and append the FinalString variable to its end and stores the final result in FinalString variable.

It works like,     FinalString = Panelnames + FinalString.

The default value for FinalString is blank (”).

Let’s say, the first record in PanelName variable is A. So, the first iteration will make it………… ” ‘A’,” and keep it in FinalString variable. If the next record from the loop is B, the query makes it………………………..     “‘B’,’A’,” and keeps the new string in FinalString variable. This way, we append each record to the other as we move through the loop. Now, the question is, where are we telling that the result should be kept in the FinalString variable? Yes, that’s why we need to go to the Result Set section of the Execute SQL task editor and map the result to FinalString variable there. Below is the screenshot of the General tab and Result tab of the Execute SQL Task Editor:

Now, u r good to go and execute the package.

Posted in SSIS (SQL Server Integration Service) | Tagged , , , , | 9 Comments

SQL Server Agent needs to Log on As Local System for using Oracle Oledb Provider in SSIS Job

This problem was one of the most terrible one in my short SSIS career. I was using a script task which uses Oracle Oledb provider to connect to an oracle server. After the package was complete, I created a SQL server job and obviously the job’s step was configured to run from SQL Server Agent Service account. When, I started the job , it was hanging forever. There was no error message, no failure…nothing. The job simply hangs forever. I was fortunate enough to find out the solution. I went to the SQL server configuration manager. I went to the properties of SQL Server Agent and this service was logged on as a domain user account. I decided to run the job using the built-in account and chose “Local System” from the dropdown list of available built-in accounts. This time the job finished in time. It seems to be a permission/right problem as the job was trying to use a provider. Often, when an SSIS script cannot access configuration file or any other external resource due to permission problem, it hangs forever. May be, this was the case for my provider also.

Posted in SSIS (SQL Server Integration Service) | Tagged , , , , , | Leave a comment

the connection string format is not valid . it must consist of one or more components of the form….error

Back again with another SSIS SQL Server based configuration problem. I was using SQL server based configuration and the SSIS configuration wizard created one table with the default field lengths. The table is like this:

CREATE TABLE [dbo].[SSIS Configurations](

    [ConfigurationFilter] [nvarchar](255) NOT NULL,
    [ConfiguredValue] [nvarchar](255) NULL,
    [PackagePath] [nvarchar](255) NOT NULL,
    [ConfiguredValueType] [nvarchar](20) NOT NULL
)
ON [PRIMARY]

While importing my configuration properties into this table, I realized that one of my package property’s “ConfiguredValue” is more than 255 in length and it was causing an error. So, I opened the table in design mode in Management Studio and modified the length of that field to 4000. Now, it was time for the great surprise, when I started the package, it was throwing an error like this:

the connection string format is not valid . it must consist of one or more components of the form…

I tried to find out where is the problem in my connection string and after a lot of search, I got my answer. I reduced the long “ConfiguredValue” of that SSIS package property to 255 and modified the table field’s length back to 255. Then the package worked. The lesson is, when the table is already created by the wizard, u cannot change its length afterwards. If u do so, u cannot run the package. So, always adjust the column length of the configuration table at the time of creation.


Posted in SSIS (SQL Server Integration Service) | Leave a comment

Password must be included in connection string for SQL server based configuration in SSIS

Recently, I have been working with SSIS configurations a lot. At the beginning, it seemed not much difference between XML configuration and SQL server configuration to me. Very quickly, I was proved wrong. In the configuration wizard, we can choose several properties of a connection; i.e. Connection String, User Name, Password, Server Name, Initial Catalog etc. When the xml file will be created by the wizard, you will see that the configured value for password is empty. So, u just have to open the xml file and set the password for yourself. And, things will work fine. In case you are wondering that the connection string itself can contain the password also, but SSIS does not store password in connection string. Yes, definitely u can open the xml file and write the password inside the connection string property value but SSIS will fail. It always tries to get the Password from the Password tag from the xml, not from the connection string.

My mistake was that I thought SQL server configuration reads the password the same way (from the password property of the connection). But, it does not. To my surprise, It looks for the password inside the connection string property. My package kept throwing the error: Connection may not be configured correctly or you may not have the right permission on this connection. My first step was, I opened the configuration table and went to the password property field and found the value for password “******” (encrypted). So, I wrote the real password instead of “******”. But, I kept getting the same error. After some nightmarish hours, I found what needed to be done and included the password in the connection string property. Then it worked finally.

Posted in SSIS (SQL Server Integration Service) | 1 Comment

Dynamic Column Mapping in SSIS : Part 2 (using dts variable)

Previously I have discussed the dynamic column mapping using script component. Now, I am going to mention one more way of dynamic column mapping in SSIS. Let’s say, our source table is Table_1:

ID    JAN    FEB    MAR

1    100    200    300

2    400    500    600

3    700    800    900

We want to map its columns dynamically in data flow task.

First, u need to create a DTS package variable called “COMMAND” in your SSIS and the type will be string. The value of the variable should be “Select 1 as Col1,2 as Col2,3 as Col3,4 as Col4”.

Now, drag a data flow task onto the designer. Double Click the task to go to the editor. On the editor, drag one oledb source and double click to edit it. Select the connection, then select the Data Access Mode as “SQL Command From Variable”. Then choose your variable “COMMAND” in variable name. Then, click on the column mapping and click ok.

Now, drag one OleDb destination on the editor of the data flow task. Connect it with the source and double click to edit it. Now, Configure the Destination in usual way. Below is a screen shot:

Now, click on Mappings and it should be like below:

As, u can see the columns from variable “COMMAND” will be working as the source and they will be mapped to the destination table’s columns. After clicking “OK”, go back to the original control flow designer area and drag one script task from the toolbox. Place it before data flow task and connect it with the data flow task.

Double click the script task, set ReadWriteVariable to User variable “COMMAND”. Click on the Edit Script button. Now write some C# code to

1. read the source table columns dynamically/programmatically

2. build an SQL command string using the column names of the source table

3. assign the SQL command string to the variable “COMMAND”.

After all these, u r ready to go. So, execute the package and data will be loaded into the destination. Below is the code snippet for doing the above 3 tasks:

public
void Main()

{


// TODO: Add your code here

Dts.TaskResult = (int)ScriptResults.Success;


SqlConnection Conn = new
SqlConnection(“Data Source=DACW0077;Initial Catalog=Demo;Integrated Security=SSPI;”);


SqlCommand Cmd = new
SqlCommand(“Select top 1* from Table_1”, Conn);


SqlDataReader Reader;

Conn.Open();

Reader = Cmd.ExecuteReader();


string Col1_Name = Reader.GetName(0); //GET THE COLUMN NAMES


string Col2_Name = Reader.GetName(1); //GET THE COLUMN NAMES


string Col3_Name = Reader.GetName(2); //GET THE COLUMN NAMES


string Col4_Name = Reader.GetName(3); //GET THE COLUMN NAMES

        //////BUILD THE SQL COMMAND AND ASSIGN IT TO VARIABLE/////////////////

Dts.Variables[“COMMAND”].Value = “Select “ + Col1_Name + ” AS Col1,” + Col2_Name + ” AS Col2,” +

Col3_Name + ” AS Col3,” + Col4_Name + ” AS Col4 from Table_1″;


//MessageBox.Show(Dts.Variables[“COMMAND”].Value.ToString());

}

Posted in SSIS (SQL Server Integration Service) | Tagged , , , , | 30 Comments