Thursday, December 05, 2013

Example of when to use the new Union All in SSIS 2012 instead of Merge

In a package developed with SSIS 2008 I had to log errors (such as missing data in records processed in the data flow) in a log table. In this data flow different types of error messages are generated and logged, while valid records are processed.further. In short, the valid data is processed and the users can have a look at the errorlog verify if any data was missing.

The errormessages differ but the packagename, taskname and some other data supplied to the errorlog is constant, so I used derived column transformations to generate the errormessage and merge to connect different error message outputs to the same derived column transform for taskname. Finally this data is written to the log. To merge I sometimes needed to sort the output before merging, which was a nuisance, because the order of the errormessages in the log for did not matter.

image

Each record from the data source undergoes some validation and additional data is looked up, if validation fails the a Dervied Errormessage is created (e.g.in No match). The errormessage and some other columns from the record are then sent through a sort and merged in the merge transformation. Each time a new erromessage is needed, the data flow is sorted and merged into the other error flow before task and packagename are added, and then the erroneous record is logged in an error log table.

In SSIS 2012 the new Union All data flow transformation makes it easier to create this kind of data flow logic. The Merge and Sort can be replaced by a single Union All. Below is a sample with similar logic.

image

Another usefull property of ‘Union All’ is the ability to accept more than 2 inputs. Whereas with the merge transformation you needed 2 merge blocks for 3 inputs (as is illustrated in the 1st screenshot)

image

Thursday, November 28, 2013

SSIS 2012 has no default instance of SQL Server

On a SQL 2012 that I was recently given access to develop some new SSIs packages, I received the following error message when trying to connect with SSMS to the SSIS server:

The SQL Server instance specified in SSIS service configuration is not present or is not available. This might occur when there is no default instance of SQL Server on the computer. For more information, see the topic "Configuring the Integration Services Service" in SQL Server 2012 Books Online.

The solution is to find the name instance of SQL server that can be used as default instance on the server, in my case there were 2 named instances.

Once the instance name is known, you have to edit the SSIS configuration file MsDtsSrvr.ini.xml, The file is located in the folder, %ProgramFiles%\Microsoft SQL Server\110\DTS\Binn.

Edit this file, find the line with <ServerName>.</ServerName> and replace the dot by the named instance.

<?xml version="1.0" encoding="utf-8"?>
<DtsServiceConfiguration xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <StopExecutingPackagesOnShutdown>true</StopExecutingPackagesOnShutdown>
  <TopLevelFolders>
    <Folder xsi:type="SqlServerFolder">
      <Name>MSDB</Name>
      <ServerName>SERVER_NAME\INSTANCE_NAME</ServerName>
    </Folder>
    <Folder xsi:type="FileSystemFolder">
      <Name>File System</Name>
      <StorePath>..\Packages</StorePath>
    </Folder>
  </TopLevelFolders>
</DtsServiceConfiguration>

Thursday, October 31, 2013

Reporting project Shared data source is losing its password

I have a Reporting Services project in BIDS 2008 R2 (or VS 2008 if you like). The project contains 4 different shared data sources. Those shared data sources were connecting to different database servers with database accounts and passwords. The project is under TFS source control. The project has been deployed to 2 different servers in a debug and release environment. Recently I received a new development system with a fresh installation of SQL 2008 R2 developer. With BIDS I connected to the TFS server and downloaded the most recent version of my reporting project. All project files had been checked in beforehand on the old development system. I had to check out and modify a report. When I wanted to preview the report in BIDS I noticed the error message that it could not connect to one of the four shared data sources. For deployment my project is setup to NOT overwrite shared data sources.

I presumed that the data source password was lost because it was not stored on this new system. So I checked out the data source opened it and edited the connection in the Shared Data Source properties. Indeed the password and login name where blank. I entered them, tested the connection and closed the properties dialog box by clicking OK. For the other 3 data sources I had to do the same: check out, edit property, reenter credentials and save.

However when I previewed the report I received the same error message, the data source could not connect. I went back to the data source properties and tested the connection: everything was OK. But the report preview still did not work. I saved everything in the project and closed the solution. Then reopened it, went to the shared data source properties and indeed the credentials were blank again.

I checked the file system, all 4 files where write enabled, but only 3 files had a modification time of my recent changes, while the data source with the blank credentials had the modification time of when I downloaded the file from TFS.

I stopped BIDS and deleted the rptproj.user file, restarted BIDS, but that did not help.

I tried running BIDS as administrator and save the credentials for that particular data source but that did not help either.

I checked out the complete solution, entered the credentials for the data source again, but that did not help either.

Since I had everything checked out, I decided to delete the data source and then create it again with the same name. This solution worked. The credentials where now stored each time I reopened the project, I could see the credentials stored in the data source properties and preview the report.

Monday, October 21, 2013

Use Windows account to grant SSIS packages R/W access to files on file share

Problem

To schedule a job that runs SSIS packages that need read/write access on folders and files on a file share, the windows account running the SSIS service needs access to the file share, or you can setup a proxy account with a dedicated windows account that has access to the file share and that can run the SSIS package. Packages are stored in the SSIS package store.

A solution is also explained in Running a SSIS Package from SQL Server Agent Using a Proxy Account

Solution

  • Create a windows account that has sufficient rights on the file share.
    • e.g. MYDOMAIN\ssisfileuser with password xyz
  • On the SQL server where the SQL Server Agent will be running the jobs, add the windows account as a new login.
    • In the user mapping of the Login properties, select the ‘msdb’ database and assign the ‘db_ssisoperator’ Database role membership for msdb to the login.
  • To run the package from a SQL job, the SQL Agent has to use the windows account as a proxy, to define the proxy we have to define a credential, where the password of the windows user can be stored.
    • Under Security / Credentials add a new credential
      • Credential name: ssisfileusercred
      • Identity: MYDOMAIN\ssisfileuser
      • Password: xyz
    • Under SQL Server Agent / Proxies, add a new proxy
      • Proxy name: ssisfileuserproxy
      • Credential name: ssisfileusercred
      • Activate the following subsystems:
        • Operating system (cmdexec) (if you are still running on 64-bit version of SQL 2008, not on SQL 2008 R2)
        • SQL server Integration Services Package
  • Define a new job to be run under the windows account
    • Under SQL Server Agent Jobs, add a new job
      • General
        • owner can be your own windows account name
        • category : I use ‘Data collector’ to put the job under this kind of category
      • Steps – add new step
        • Name: use a name of your choice, I use the same name as the package to run in this step.
        • Type: Sql Server Integration Services Package
        • Run as:ssisfileuserproxy
        • General
          • Package source: SSIS Package store
          • Server: my servername
          • Use windows Authentication is default and only setting
          • Package: select the package name to be run.
      • Schedules: add a schedule when to run the job.

With this approach you can schedule packages that need file access on a share to run from the SSIS package store.

Friday, October 18, 2013

BI Development Studio 2008 integrate with TFS 2010

To let a SQL developer create BI projects for SQL 2008 and save the source in TFS 2010.

Installation

  • Download and install Team Explorer 2008 (This is an ISO file, if you don’t have an ISO image reader try installing Virtual clonedrive or a similar utility)
    • from the ISO image run setup.exe
      image
  • Download and install Visual Studio 2008 SP1 (This small installer will download the full installa and upgrade both VS2008 and Team Explorer 2008 to SP1, if the machine on which BI development Studio is installed, has no full internet access and rights, you may have to look for the full package download)
    • run the executable VS90…..exe
    • image
  • Download and install VSTS 2008 Forward Compatibility Update
    • image

Connecto to TFS

  • Start BI Developmnet Studio
  • In the menu View, select Other windows / Source control Explorer
  • image
  • In the menu Tools, select Connect to Team Foundation Server…
  • image
  • In the Connect window, Click on Servers…
  • image
  • In the Add/Remove window click on Add…
    • image
    • or if the TFS server is not the default root and you have to specify a site name, e.g. ‘Tfs’ then type the full url to the TFS site, ‘http://myserver:8080/tfs’
    • image
    • this will connect to the default TFS collection
    • If you need access to another collection, you need to add a new server connection and type the full name of the collection. e.g. if the collection’s name is ‘Accounting projects’ (with spaces) then the full url will to connect is ‘http://myserver:8080/tfs/Accounting%20projects’

Monday, September 30, 2013

Use Recordset Destination as input in a Foreach Loop Container

The Foreach Loop containter can use ADO recordsets as enumerator. The ADO recordset is stored in a variable of type ‘Object’. Often the ADO recordset is populated with an Excecute SQL Task. But when the data that needs to be enumerated in the Foreach Loop container must be found by combining 2 databases on different servers, the use of an Excecute SQL Task is less obvious.. A solution is to create a DataFlow that populates a Recordset Destination. This Recordset destination is an ADO recordset that is stored in a variable of type ‘Object’. Therefore it can be used as the enumerator in a Foreach Loop Container.

Example

We need to send data from a central database to different on separate locations. The separate server information is stored in a configuration database. The Data to send to the servers comes from different database with sales and financial data.

ADO recordset stored in a Variable

Variable in the screenshot is called ‘AdoRecordset’ in namespace ‘User’ and of value type ‘Object’.

image

Using the Execute SQL Task

The Execute SQL Task will execute a select statement (or a stored procedure) that collects the necessary data that will be used to cycle through in the Foreach Loop Container. In this case it will read all servers and their connection data.

image

Excute SQL task general setting

ResultSet : full result set

Connectiontype ADO.Net (can be Oledb as well)

image

Execute SQL Task Result Set setting

Result Name: 0

Variable Name: AdoRecordset

image

Foreach Loop Container Collection settings

Enumerator: ‘Foreach ADO enumerator’

ADO object source variable: User::AdoRecordset, check ‘rows in the first table’.

image

Foreach Loop container Variable Mappings

Here the columns in the recordset need to be mapped to variables in the package.

Variables are created as needed

image

image

Variable: ServerId, Index 0

Variable: ConnectionString, Index 1

image

The connectionString variable

Is used as an expression in a Connection object. The connection object will be used in various dataflow inside the Foreach Loop container

image

Remote server Expression properties: use @User::ConnectionString as ConnectionString property.

image

This will allow to use the same connection in each iteration of the foreach loop with another connectionstring.

Using the Recordset Destination in a Dataflow instead of Execute SQL Task

Instead of a single select statement in an Execute SQL Task we will use a DataFlow with a combination of various sources to combine in a Recordset Destination. The Recordset destination is stored in the variable AdoRecordset.

In the control flow the Execute SQL Task is replaced by a Data Flow Task.

image

The Data flow task reads data from several databases and combines it into a Recordset Destination

image

The recordset Destination editor

VariableName: AdoRecordset.

image

Conclusion

When the Ado recordset for a foreach loop is composed of data from various databases or other Data Flow sources, or when complex decision logic is needed, use a Data Flow with a Recordset Destination instead of an Execute SQL Task to create an in memory recordset that is consumed by the foreach Loop container.

Wednesday, September 18, 2013

Repeat header rows on each page not working

The header rows for a simple tablix do not repeat on each page by default. Since I keep forgetting the solution, I posted the answer that I found on the internet here.

Posted by Stacia on http://stackoverflow.com

“It depends on the tablix structure you are using. In a table, for example, you do not have column groups, so Reporting Services does not recognize which textboxes are the column headers and setting RepeatColumnHeaders property to True doesn't work. Instead, you need to open Advanced Mode in the Groupings pane: click the arrow to the right of the Column Groups and select Advanced Mode. You'll see Static Groups appear in the Row Groups area. Clicking on a Static group highlights the corresponding textbox in the tablix. For the column headers that you want to repeat, select the Static group that highlights the leftmost column header. This is generally the first Static group listed. In the Properties window, set the RepeatOnNewPage property to True. Make sure that the KeepWithGroup property is set to After. The KeepWithGroup property specifies which group to which the static member needs to stick. If set to After then the static member sticks with the group af-ter, or below, it acting as a group header. If set to Before, then the static member sticks with the group before, or above it, acting as a group footer. If set to None, Reporting Services decides where to put the static member. Now when you view the report, the column headers repeat on each page of the tablix.”

Friday, May 17, 2013

Let Reporting Services use the User’s regional settings

US format by default

When designing a report currencies and dates use the US format, e.g. MM/dd/yyyy The user would like to read his report in his local format, e.g. dd/MM/yyyy.

Change the report for the user’s regional settings

In the report designer change the report property Language to Expression: =User!Language

A more elaborate explanation can be found at this link Reporting services tip about user regional settings

Thursday, May 16, 2013

Report parameter default value to 1st and last day of previous month

Problem

For a report that selects data in a period with a start and end date, I needed to create a subscription that is sending an email the beginning of each month with the data from the previous month.

The report has 2 parameters @FromDate and @ToDate. The dataset of the report uses a select statement with a where condition: … WHERE Reportdate BETWEEN @FromDate and @ToDate.

The ReportDate is a DATE field (not a Datetime field)

when the user opens the report, he will be shown the 2 parameter fields with a Datetime picker to select a date from a calendar. But with a subscription the parameter fields need to have a default value. In this case the default value must be the 1st day and the last day of the previous month.

Solution with SQL Date field

The FromDate parameter has an expression as default value. The expression calculates the 1st day of the previous month:

=DateSerial(Year(DateAdd(DateInterval.Month, -1, Today())), Month(DateAdd(DateInterval.Month, -1, Today())), 1)

The ToDate paramet has a similar expression as default value. The expression calculates the last day of the previous month (no matter how many days). It does so by calculating the first day of the current month with DateSerial and then it subtracts one day with DateAdd.

=DateAdd(DateInterval.Day, -1, DateSerial(Year(Today()), Month(Today()), 1))

In case of SQL Datetime field

2 possibilities: either use the CAST function in the select statement, and keep the parameters and default expressions as they are for the Date field:

… WHERE CAST(Reportdate AS Date) BETWEEN @FromDate and @ToDate

Or leave the select statement as it is and change the default value of the ToDate parameter, by not subtracting one Day but the smallest necessary timeinterval. Howver the smallest interval in Reporting Services code is ‘Second’. If your datatime fields contain millisceconds you cannot give an absolute guarantee that it will catch all possible values. So only use this if you will never have values between 23:59:59.000 and 23:59:59.999

=DateAdd(DateInterval.Second, -1, DateSerial(Year(Today()), Month(Today()), 1))

Tuesday, May 14, 2013

See What Other Users Have Checked Out in TFS

I found the answer on Adam Prescott's blog

Prerequisites

Visual Studio Team Foundation Power Tools – With Visual Studio 2010 – in the Tools menu, select Extension manager. Download the Power tools and install (close VS 2010)

image

See what others have checked out

After the installation of the tools has finished:

  • run VS 2010
  • open the Source control explorer window
  • In the VS 2010 main menu, select File / Source control / Find In Source Control / Status…
    image
  • In the dialog box, leave the path to $/ to find checked out sources in all workspaces, click on ‘find’
    image
  • You’ll be presented a list of files with the user names that are checked out

Tuesday, May 07, 2013

An easy way to rename oledb connection in a series of slowly changing dimensions

Problem

Over time we had developed many SSIS packages on a SQL 2008 server that transformed data over several databases. Each package used its own configuration file in which the same connections were defined again and again. To standardize the use of the database connections in all packages and to ease the use in development, testing and production environment, we decided to remove the connection definitions from the package configuration and put them in a separate configuration file. To achieve this, standardization of the connection manager names is required as well.

Instead of having one dtsConfig file for each package with:

  • variables
  • connection strings
    • the connection definition is repeated in different packages
    • the connection name is not standard
<?xml version="1.0"?>
<DTSConfiguration>
  <DTSConfigurationHeading>
    <DTSConfigurationFileInfo GeneratedBy="Mydomain\jdhondt" GeneratedFromPackageName="SomePackage" GeneratedFromPackageID="{86BF1AAA-9397-4DC5-A45A-CC89D6B478F0}" GeneratedDate="25/10/2011 12:07:28" />
  </DTSConfigurationHeading>
  <Configuration ConfiguredType="Property" Path="\Package.Connections[Sales].Properties[ConnectionString]" ValueType="String">
    <ConfiguredValue>Data Source=DBSERVER;User ID=DBuser;Initial Catalog=Sales;Provider=SQLNCLI10.1;Persist Security Info=True;Application Name=SSIS_Import;</ConfiguredValue>
  </Configuration>
  <Configuration ConfiguredType="Property" Path="\Package.Connections[Sales.ADO.NET].Properties[ConnectionString]" ValueType="String">
    <ConfiguredValue>Data Source=DBSERVER;User ID=DBuser;Initial Catalog=Sales;Application Name=SSIS_Import;</ConfiguredValue>
  </Configuration>
  <Configuration ConfiguredType="Property" Path="\Package.Variables[User::Value1].Properties[Value]" ValueType="Int32">
    <ConfiguredValue>10</ConfiguredValue>
  </Configuration>
  <Configuration ConfiguredType="Property" Path="\Package.Variables[User::Value2].Properties[Value]" ValueType="Int32">
    <ConfiguredValue>200</ConfiguredValue>
  </Configuration>
</DTSConfiguration>





We opted for several dtsConfig files per package:



  • dtsConfig with variables specific to the connection



<?xml version="1.0"?>
<DTSConfiguration>
  <DTSConfigurationHeading>
    <DTSConfigurationFileInfo GeneratedBy="Mydomain\jdhondt" GeneratedFromPackageName="SomePackage" GeneratedFromPackageID="{86BF1AAA-9397-4DC5-A45A-CC89D6B478F0}" GeneratedDate="25/10/2011 12:07:28" />
  </DTSConfigurationHeading>
  <Configuration ConfiguredType="Property" Path="\Package.Variables[User::Value1].Properties[Value]" ValueType="Int32">
    <ConfiguredValue>10</ConfiguredValue>
  </Configuration>
  <Configuration ConfiguredType="Property" Path="\Package.Variables[User::Value2].Properties[Value]" ValueType="Int32">
    <ConfiguredValue>200</ConfiguredValue>
  </Configuration>
</DTSConfiguration>




  • dtsConfig with connections specific to a database

    • sometimes an oledb and an ADO connection
    • the connection are renamed accordingly ( .._oledb, .._ADO)
    • the same dtsConfig file can be used by more than one SSIS package



<?xml version="1.0"?>
<DTSConfiguration>
  <DTSConfigurationHeading>
    <DTSConfigurationFileInfo GeneratedBy="Mydomain\jdhondt" GeneratedFromPackageName="SomePackage" GeneratedFromPackageID="{86BF1AAA-9397-4DC5-A45A-CC89D6B478F0}" GeneratedDate="25/10/2011 12:07:28" />
  </DTSConfigurationHeading>
  <Configuration ConfiguredType="Property" Path="\Package.Connections[Sales].Properties[ConnectionString]" ValueType="String">
    <ConfiguredValue>Data Source=DBSERVER;User ID=DBuser;Initial Catalog=Sales;Provider=SQLNCLI10.1;Persist Security Info=True;Application Name=SSIS_Import;</ConfiguredValue>
  </Configuration>
  <Configuration ConfiguredType="Property" Path="\Package.Connections[Sales.ADO.NET].Properties[ConnectionString]" ValueType="String">
    <ConfiguredValue>Data Source=DBSERVER;User ID=DBuser;Initial Catalog=Sales;Application Name=SSIS_Import;</ConfiguredValue>
  </Configuration>
</DTSConfiguration>



The big problem lies with the use of the package designer to rename a connection in existing Data Flow Sources or Slowly Changing Dimensions. Each time you dare to edit one of these and change the connection manager, the column names are lost and you have to redefine every one of them. The other problem is that you cannot easily find where the connection managers are being used.


Solution



  • Make a backup of your package file.
  • Use Visual Studio 2008
  • With the package editor, add the new connection names, keep the old ones.
  • Save the package
  • close the designer
  • In the solution explorer, right click on the package and select ‘view code’ in the dropdown menu.
  • In the xml code, look for the ConnectionManager definition of the old connection and take note of the DTSID value



    <DTS:ConnectionManager>
    <DTS:Property DTS:Name="DelayValidation">-1</DTS:Property>
    <DTS:Property DTS:Name="ObjectName">Sales</DTS:Property>
    <DTS:Property DTS:Name="DTSID">{495F9DDC-1119-4A10-B67C-BEE6F3C35E72}</DTS:Property>




  • now look for the ConnectionManager definition of the new connection and take note of the DTSID value


    <DTS:Property DTS:Name="DelayValidation">-1</DTS:Property>
    <DTS:Property DTS:Name="ObjectName">Sales_oledb</DTS:Property>
    <DTS:Property DTS:Name="DTSID">{06ACAA06-FE6F-4F94-B8D2-4164E9D33F9E}</DTS:Property>



  • Find and replace the old DTSID everywhere in the code (except on the connectionManager definition), e.g. somehwere in the code where a slowly changing dimension is used:


    <connections>
    <connection id="10" name="OleDbConnection" description="The OLE DB runtime connection used to access the database." connectionManagerID="{06ACAA06-FE6F-4F94-B8D2-4164E9D33F9E}"/>
    </connections>



  • Save the package file and close the code editor window.
  • Open the package with the package designer. Verify the connections in the Data flow Sources or slowly Changing Dimensions, The new name should be in there.
  • Remove the old connection from the connection manager.
  • Save and close the package.
  • Edit the old dtsconfig file and delete the old connection definition.
  • Open the package.
  • Setup the package configuration to use the new database dtsconfig file.

Tuesday, April 16, 2013

Cannot create a Data-driven subscription on a report

For an existing report in SQL 2008 R2 I want to create a Data-driven subscription, but in the report management, under subscriptions there is a warning sign along the button to create a data-driven subscription

image

  • The report credentials are stored to run the report
  • It is not a linked report
  • There are no user-defined parameter values

But I found a message in the reporting services forums stating that the report might contain expressions depending on user like User!UserID

When I looked in the footer of the report there was indeed a text box with the page and the text “printed by “ User!UserID

After removal of this expression, the option to create a data-driven subscription was available.

Monday, April 15, 2013