Showing posts with label Integration Services. Show all posts

Wednesday, October 28, 2020

Automate deletion of SSIS packages from the SSIS package store in MSDB

Once SSIS projects have been converted from Package Deployment to Project Deployment in Visual Studio, you may want to delete the deployed packages from the SSIS package store in MSDB on the SSIS servers. You can delete packages from the MSDB store one by one with SSMS, but that is a tedious job if you have hundreds of packages on several servers (development, test, and production environments). Instead, we can generate a script that deletes the packages with the DTUTIL utility.

Here is a very simple example:

Before:

  • SSIS project in Visual Studio in package deployment mode
    • 2 packages PackageA & PackageB
  • SSIS Servers (development and production)
    • MSDB Package store
After:

  • SSIS project in Visual Studio in project deployment mode
    • 2 packages PackageA & PackageB

  • SSIS Servers (development and production)
    • SSISDB Integration Services catalog

Once the packages have been deployed in the SSISDB catalog, there is no need to keep them any longer in the MSDB Package store.
The following SQL script lists all packages in a given subfolder and generates the DTUTIL commands to delete all of them.

DECLARE @SsisServers varchar(255) = 'ServerDEV;ServerTST;ServerPRD';
DECLARE @ParentFolder varchar(255) = 'GenLedger';
DECLARE @SubFolders varchar(255) = 'Project1;Project2';
DECLARE @DeleteSubfolder bit = 1;

-- Split the server and subfolder lists into table variables
DECLARE @ts TABLE (SsisServer varchar(255));
INSERT INTO @ts SELECT value FROM STRING_SPLIT(@SsisServers, ';');
DECLARE @tf TABLE (subfolder varchar(255));
INSERT INTO @tf SELECT value FROM STRING_SPLIT(@SubFolders, ';');

-- Generate one DTUTIL /DELETE command per package, per subfolder, per server
SELECT os.script
FROM @ts AS ts
OUTER APPLY (
    SELECT o.script
    FROM @tf AS t
    OUTER APPLY (
        SELECT ('DTUTIL "/SQL ' + @ParentFolder + '\' + t.subfolder + '\' + p.[name]
                + ' /DELETE /SOURCESERVER ' + ts.SsisServer + '"') AS script
        FROM msdb.dbo.sysssispackages AS p
        INNER JOIN msdb.dbo.sysssispackagefolders AS f
            ON f.folderid = p.folderid
        INNER JOIN msdb.dbo.sysssispackagefolders AS pf
            ON f.parentfolderid = pf.folderid
        WHERE pf.foldername = @ParentFolder
          AND f.foldername = t.subfolder
    ) AS o
) AS os;

-- Optionally generate the DTUTIL /FDELETE commands to remove the subfolders themselves
IF @DeleteSubfolder <> 0
    SELECT ('DTUTIL "/SOURCESERVER ' + ts.SsisServer + ' /FDELETE SQL;'
            + @ParentFolder + ';' + t.subfolder + '"') AS script
    FROM @tf AS t
    CROSS JOIN @ts AS ts;
GO
The output of the script can then be copied into a PowerShell script:
DTUTIL "/SQL GenLedger\Project1\PackageA /DELETE /SOURCESERVER ServerDEV"
DTUTIL "/SQL GenLedger\Project1\PackageB /DELETE /SOURCESERVER ServerDEV"
DTUTIL "/SQL GenLedger\Project2\PackageC /DELETE /SOURCESERVER ServerDEV"
DTUTIL "/SQL GenLedger\Project2\PackageD /DELETE /SOURCESERVER ServerDEV"
DTUTIL "/SQL GenLedger\Project1\PackageA /DELETE /SOURCESERVER ServerTST"
DTUTIL "/SQL GenLedger\Project1\PackageB /DELETE /SOURCESERVER ServerTST"
DTUTIL "/SQL GenLedger\Project2\PackageC /DELETE /SOURCESERVER ServerTST"
DTUTIL "/SQL GenLedger\Project2\PackageD /DELETE /SOURCESERVER ServerTST"
DTUTIL "/SQL GenLedger\Project1\PackageA /DELETE /SOURCESERVER ServerPRD"
DTUTIL "/SQL GenLedger\Project1\PackageB /DELETE /SOURCESERVER ServerPRD"
DTUTIL "/SQL GenLedger\Project2\PackageC /DELETE /SOURCESERVER ServerPRD"
DTUTIL "/SQL GenLedger\Project2\PackageD /DELETE /SOURCESERVER ServerPRD"
#delete subfolder
DTUTIL "/SOURCESERVER ServerDEV /FDELETE SQL;GenLedger;Project1"
DTUTIL "/SOURCESERVER ServerDEV /FDELETE SQL;GenLedger;Project2"
DTUTIL "/SOURCESERVER ServerTST /FDELETE SQL;GenLedger;Project1"
DTUTIL "/SOURCESERVER ServerTST /FDELETE SQL;GenLedger;Project2"
DTUTIL "/SOURCESERVER ServerPRD /FDELETE SQL;GenLedger;Project1"
DTUTIL "/SOURCESERVER ServerPRD /FDELETE SQL;GenLedger;Project2"

Run the script from the PowerShell ISE.
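As an illustration, the cross-join logic the SQL script performs over servers, subfolders, and packages can be sketched in plain Python. The server, folder, and package names below are the sample values from this post; the package-to-project mapping follows the example output above.

```python
# Sketch: build the same DTUTIL delete commands as the SQL script above.
# Server, folder and package names are the sample values from this post.
servers = "ServerDEV;ServerTST;ServerPRD".split(";")
parent = "GenLedger"
packages = {"Project1": ["PackageA", "PackageB"],
            "Project2": ["PackageC", "PackageD"]}

commands = []
# One /DELETE command per package, per subfolder, per server
for server in servers:
    for subfolder, names in packages.items():
        for name in names:
            commands.append(
                f'DTUTIL "/SQL {parent}\\{subfolder}\\{name} '
                f'/DELETE /SOURCESERVER {server}"')
# One /FDELETE command per subfolder, per server, to remove the empty folders
for server in servers:
    for subfolder in packages:
        commands.append(
            f'DTUTIL "/SOURCESERVER {server} /FDELETE SQL;{parent};{subfolder}"')

print("\n".join(commands))
```

With three servers, two subfolders, and two packages each, this produces the 12 delete commands and 6 folder-delete commands shown above.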


That's all folks!

Tuesday, February 10, 2015

SSIS 2012 increment variable in expression task

With SSIS 2012 the Expression Task is a very useful addition.


It can replace some of the tedious little script tasks when all you want to do is to change the value of a variable.

However! Although the C# programming language used in Script Tasks can look very similar to the expression language used in SSIS, the resemblance can be misleading.

Today I needed a counter in a foreach loop container that had to be incremented by one if an error was found in a preceding task. I created a variable of type int that was initially set to zero. Inside the container I added an Expression Task to increment my counter by 1.

There I made my mistake: being used to incrementing with C# I wrote the following (incorrect) statement:

@[User::Errorcounter] += 1

This statement will evaluate to a value of 1 when you click the ‘Evaluate Expression’ button. Since my variable was initially set to zero, this looked OK to me.


However, the variable will always stay at value 1, it will never increment!

The correct increment statement is (in a VB-like or SQL-like form):

@[User::Errorcounter] = @[User::Errorcounter] + 1

This also evaluates to 1, but it will increment to 2, 3, and so on when the package runs.

By the way: this can easily be verified when you set the initial value of the variable Errorcounter to another value, e.g. 5. With the incorrect statement the expression will still evaluate to 1. With the correct statement the expression will evaluate to 6.

Tuesday, October 28, 2014

Make a dynamic selection from SharePoint list data with the SSIS 2012 OData Source

Problem

Use SSIS to extract, transform, and load data from a SharePoint list into a SQL Server table. Extraction from the SharePoint list is done in a Data Flow Task by means of the OData Source for SSIS 2012. The data must be extracted dynamically, i.e. only list data from the last <n> days must be read.

The OData Source Editor connects to the SharePoint REST service.

The URL of the REST service: http://myserver/_vti_bin/ListData.svc

Appending a list name to the URL of the service returns SharePoint Foundation list data in XML format as a standard Atom feed: http://myserver/_vti_bin/ListData.svc/Incidents

In the OData Source Editor:


To make a selection of data created since 2014-08-04, the filter is set to: $filter=Created+gt+datetime'2014-08-04'


There is no way to add a parameter in the Query options text box. How can the filter on the Created date be made variable, so that only the last 5 days are read for a daily upload?

Solution

There are no expressions for the Filter property in the OData Source properties. However, in the Data Flow Task that contains the OData Source, we have Expressions that can be used to dynamically set the Query property of the OData Source.

First of all create a Project parameter called ‘LoadDays’ of type Int32 with the value 10.

In the SSIS package add a variable called ‘LoadCreated’ of type DateTime, with an expression that subtracts 10 days from the current date: DATEADD("d", -@[$Project::LoadDays], GETDATE())


In the Control Flow pane, select the Data Flow Task that contains the OData Source.


In the Properties Window, click on the ellipsis button of the Expressions property.


In the Property Expressions Editor window, select the OData Source Query property in the Property column, then click the ellipsis button to open the Expression Builder.


The expression we create will need to mimic an OData filter expression like $filter=Created+gt+datetime'2014-08-04'

In the expression builder, the expression is:

"$filter=Created+gt+datetime'" + (DT_WSTR, 4)YEAR(@[User::LoadCreated]) + "-" + ( MONTH( @[User::LoadCreated]) < 10 ? "0" : "") + (DT_WSTR,2)MONTH( @[User::LoadCreated]) + "-" + (DAY( @[User::LoadCreated])<10 ? "0" : "" ) + (DT_WSTR,2)DAY( @[User::LoadCreated]) +"'"

The OData filter does not accept date literals like ‘2014-8-4’, so we need to add a leading zero to the month and day parts, hence the conditional operator ( MONTH( @[User::LoadCreated]) < 10 ? "0" : "")
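As a side note, the zero padding that the SSIS expression performs by hand is exactly what a date formatter does automatically. A small Python illustration (not part of the SSIS package) of the target filter string:

```python
from datetime import date, timedelta

def odata_created_filter(load_days, today):
    """Build the OData $filter string for 'Created greater than <today - load_days>'."""
    cutoff = today - timedelta(days=load_days)
    # %Y-%m-%d zero-pads the month and day, matching what the SSIS
    # expression achieves with its conditional "0" prefixes.
    return "$filter=Created+gt+datetime'" + cutoff.strftime("%Y-%m-%d") + "'"

# Example: ten days before 2014-08-14 gives the filter shown in this post.
print(odata_created_filter(10, date(2014, 8, 14)))
```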

Click OK to close the Expression builder.

Click OK to close the Property Expressions Editor.

With the Data Flow Task selected, click the Data Flow tab. Select the OData Source component and edit it. The Query property now holds the calculated value of the expression:

$filter=Created+gt+datetime'2014-08-04'


When the SSIS package runs it will only load list data created after the calculated date.

Tuesday, October 21, 2014

A deadlock was detected while trying to lock variables for R/W access

Problem

While testing an Integration Service package with a File Watcher task as described in Using the Script Task in SSIS to Process Data Files When They Arrive, the following error was thrown:

The script threw an exception: A deadlock was detected while trying to lock variables <variablelist> for read access and variables <variablelist> for read/write access. A lock cannot be acquired after 16 attempts. The locks timed out.

This was developed as an SSIS 2012 project and deployed to the Integration Services Catalog on a SQL Server. The package ran fine when I tested it in Visual Studio and when I executed it from the catalog. The error only appeared when I created a SQL Server Agent job to run the package.

First I read some articles about variable locking in SSIS; but why did my package run without errors when I ran it under my account?

Solution

The cause of the problem: the account under which the package runs in the job step does not have read/write access to the folders where the File Watcher Task and the ensuing File System Tasks do their work. In this case the job step ran as the SQL Server Agent service account.

Either give this service account read/write access to the folders used by the SSIS package (which may not be a good idea), or use a dedicated Windows account with sufficient rights and create a SQL Server Agent proxy for it; see Use a windows account to grant SSIS packages folder and file access on a file share

Thursday, June 21, 2012

Use SSIS package data as source for Report in Reporting Services

It is possible to use the destination output of an SSIS package as the input source of a report in Reporting Services. In SSIS the package can send output to a DataReader destination; this in-memory output can then be captured by Reporting Services.

Configuring Reporting Services to Use SSIS Package Data

There is an article on MSDN that describes the configuration and also warns about possible security risks.

Creating Report Datasets from SQL Server Integration Services

The follow-up explains how to create a package and a report

Wednesday, May 09, 2012

Windows account has been assigned SQL sysadmin role but cannot connect to Integration services with SSMS

When a new Windows account (a user or a group) has been granted the sysadmin server role, it does not automatically follow that this account has full Integration Services administration privileges. To administer Integration Services you need a Windows account; a SQL Server account will not work.

  • In the database engine, the account has to be added to the Security Logins with the server roles public and sysadmin.
    • Use SSMS to connect to the database engine of the server that also runs Integration Services.
    • Under Security, right-click Logins and select New Login…
    • In the Login - New dialog window:
      • leave the option Windows authentication selected
      • enter the name of a Windows account (can be a user or a group)
      • under Server Roles, check public and sysadmin
      • click OK to create the login.
  • On the Windows server on which Integration Services is running, the account has to be added to the local Administrators group.
    • Log on to the Windows server.
    • Under Start / Administrative Tools, select Computer Management.
    • In the Computer Management console:
      • go to System Tools / Local Users and Groups / Groups
      • right-click Administrators in the right-hand pane and select Add to Group (or select Properties)
      • in the Administrators Properties dialog:
        • click the Add… button
        • enter the Windows account name (the same as was used in the SQL logins)

This Windows account will now be able to connect to the Integration Services server with SSMS and perform all tasks necessary to deploy and run packages.

Wednesday, March 28, 2012

SSIS package to export to a new Excel file

Problem

How do you export a large number of columns into a non-existing Excel file?

Developing an SSIS package to export to an existing Excel file is easy when the file already exists with column headings on the Excel sheet. But if you have a daily scheduled export of 50+ columns to a new Excel file, you have some careful typing and copy-pasting to do to create an Excel template file. There is an alternative.

Solution

What I often do is generate an SSIS package with the SQL Server Management Studio Export Wizard. The interesting thing about this wizard is that it can generate Excel files on the fly. After saving the package generated by SSMS, you can open it in Visual Studio to inspect what it is doing.


The package has a Data Flow Task, but it is preceded by an Execute SQL Task. The trick lies in the SQL statement that generates the columns for the Excel sheet. When we edit the Preparation SQL Task 1, we see a SQL statement that creates a table called `Query`:

CREATE TABLE `Query` (
`PlantKey` VarChar(8),

`OrderValue` Decimal(18,5)
)
GO





Conclusion


I can adapt the package generated by SSMS to my needs, for instance to generate a series of Excel filenames in a For Each Loop or For Loop container, store the Excel filename in a variable, and put the Preparation SQL Task and the Data Flow Task inside the container. I will still need the Excel file generated by the SSMS package to further design my Data Flow Task, but once the package is deployed it will use the Excel filename variable in the Excel connection.
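For illustration, the kind of per-run filename such a loop variable would hold can be sketched like this. The folder and naming pattern are made-up examples, not taken from the original package:

```python
from datetime import date, timedelta

def excel_filenames(start, days, folder=r"C:\Exports"):
    """Generate one dated Excel filename per day, as a For Each Loop
    might assign to the filename variable on each iteration.
    The folder and naming pattern are illustrative assumptions."""
    return [rf"{folder}\Export_{(start + timedelta(days=i)).strftime('%Y%m%d')}.xlsx"
            for i in range(days)]

print(excel_filenames(date(2012, 3, 28), 3))
```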


Remark


This solution does not address the problem of numeric values getting converted to text in the Excel sheet.

Monday, March 19, 2012

To speed up processing with the SSIS Lookup Data Flow item, use as few columns as necessary

The number of records in a table that was the basis of a Lookup Data Flow item had grown considerably, while the processing speed of the package was diminishing. In the Lookup item's connection properties, I had been using the table name as the data source. When running the package interactively, the Lookup cache was built up steadily in chunks until all data had been loaded. I changed the property to use a SELECT statement with only the essential columns. When running the package now, the Lookup cache build-up is a lot faster.

Thursday, March 15, 2012

Use SSIS lookup to add only new records to a table and avoid hangup

Problem

Table A is a transaction table; table B is a history table. Table B only needs to be populated with new rows from table A, and contains a copy of the business key of table A. This will be done by an SSIS package.

Since no updates are necessary, a Slowly Changing Dimension transformation will not be used.

As a first solution I created an SSIS package with a Data Flow; inside the Data Flow I had 3 steps: an OLE DB Source to read an inventory, a data transformation, and an OLE DB Destination. My OLE DB SELECT statement would read table A and join to table B to select only rows from A that are not yet written to B.

This is the SELECT statement on table A:

SELECT InventId, col1, col2, ...
FROM dbo.Inventory AS a
LEFT OUTER JOIN dbo.HistInvent AS b
    ON a.InventId = b.A_InventId
WHERE b.HistGUID IS NULL

The 3 steps worked together fine as long as there were not too many rows; a few thousand rows gave no problems when debugging the package.

But when there are more than 100,000 rows, the package hangs. My guess is that the data is read in chunks and written in chunks, but the writing in step 3 blocks table B from being read for the next chunk in step 1.

Solution

Only read from table A in step 1, then use a Lookup Transformation as a new step between the first and second steps. The Lookup Transformation reads the business key in table B and compares it to the primary key of table A.

The Lookup only redirects rows that are not found in B.
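The redirect-unmatched behaviour boils down to a key-set lookup. A minimal sketch in Python, with made-up data (the column names are the example's, the values are invented):

```python
# Sketch of the Lookup "redirect rows with no match" pattern:
# only rows of A whose business key is absent from B get through.
table_a = [{"InventId": 1, "col1": "x"},
           {"InventId": 2, "col1": "y"},
           {"InventId": 3, "col1": "z"}]
history_keys = {1, 3}  # business keys already present in history table B

# The Lookup caches B's keys once up front; each row of A is then a
# set probe, so this step never reads table B while B is being written.
new_rows = [row for row in table_a if row["InventId"] not in history_keys]
print(new_rows)
```

This is why the hang disappears: the blocking self-join between the read and the write is replaced by a one-time cached key comparison.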

The Dataflow in 4 steps:


This time the package runs fine with more than 500,000 rows.

Conclusion

Although it seemed efficient to let SQL Server limit the data flow in step 1, it is advisable to use the SSIS tools, even though they create more overhead.


Tuesday, January 31, 2012

Large SSIS package runs out of resources/memory on server

Problem

A large SSIS package, scheduled as a SQL Server job, that imports dozens of large Excel files into a data warehouse, always halted with out-of-resources or out-of-memory errors on a SQL Server 2008 R2 machine. We noticed that only 2 or 3 of the Excel files on the file server were handled, and moved to another folder, before the package halted.
Cause

After inspecting the way the package was constructed, I noticed that several Data Flow Tasks were not interconnected to run serially; because the Data Flow Tasks did not depend on each other, they all ran in parallel.

Solution

In Visual Studio 2008 we connected each Data Flow Task to the next, so that all Data Flow Tasks ran serially, not in parallel. After installing the package on the server, we launched the job, and this time it ran fine. Although it still consumed large amounts of CPU, it kept on going. Of course I will have to monitor this over the next few days to see whether the job keeps running properly.

Conclusion

Parallel processing may sound fine, but I doubt that SSIS has enough intelligence to handle things properly; it looks like it starts all the flows at once and then chokes on all the parallel threads. Therefore, if there is no need for parallel processing, just let the flows run serially. If one really wanted parallel processing, it would probably be better to split the package into separate packages, schedule those to start at the same time, and let the operating system handle the parallelism. But then maintaining the packages is more of a burden compared to whatever time you may gain over serial processing.