A large SSIS package, scheduled as a SQL Server Agent job, that imports dozens of large Excel files into a data warehouse kept halting with out-of-resources or out-of-memory errors on SQL Server 2008 R2. We noticed that only 2 or 3 of the Excel files on the file server were processed, and moved to another folder, before the package halted.
Inspecting how the package was constructed, I noticed that several data flow tasks were not connected by precedence constraints to run in sequence, so they ran in parallel, because the data flow tasks do not depend on one another.
In Visual Studio 2008 we connected each data flow task to the next so that all of them ran serially rather than in parallel. After deploying the package to the server, we launched the job again, and this time it ran fine. It still consumed large amounts of CPU, but it kept going. Of course I will have to monitor it over the next few days to see whether the job now runs reliably.
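An alternative to rewiring the control flow is the package-level MaxConcurrentExecutables property (its default of -1 means "number of logical processors + 2"); setting it to 1 forces SSIS to run the executables one at a time. It can be overridden at invocation time with dtexec's /Set option. A minimal sketch, shown as a dry run that only prints the command; the package path is a hypothetical example, not from the original post:

```shell
# Hypothetical package path -- substitute your own.
PACKAGE='C:\Packages\ImportExcel.dtsx'
# Property path per dtexec /Set syntax; 1 = run executables serially.
PROPERTY='\Package.Properties[MaxConcurrentExecutables];1'

# Dry run: print the command that would be executed on the SSIS server.
printf 'dtexec /File "%s" /Set "%s"\n' "$PACKAGE" "$PROPERTY"
```

On the actual server you would run the printed dtexec command (or set the property on the package itself in the designer), which throttles parallelism without adding precedence constraints between unrelated tasks.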
Parallel processing may sound attractive, but I doubt SSIS has enough intelligence to handle it properly: it appears to start all the flows at once and then choke across all the parallel threads. So if there is no need for parallel processing, just let the flows run serially. If you really want parallel processing, it is probably better to split the package into separate packages, schedule those to start at the same time, and let the operating system handle the parallelism. But then maintaining the packages becomes more of a burden compared with whatever time you might gain over serial processing.
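The split-package idea can be sketched as two independent dtexec invocations, each of which would be its own SQL Server Agent job with the same start time. The package names below are hypothetical, and this is again a dry run that only prints the commands:

```shell
# Assumption: the big package has been split into two smaller ones.
CMD_A='dtexec /File "C:\Packages\ImportExcel_A.dtsx"'
CMD_B='dtexec /File "C:\Packages\ImportExcel_B.dtsx"'

# Each command would run as a separate Agent job scheduled at the same
# time, so the OS scheduler -- not SSIS -- arbitrates the parallelism.
printf '%s\n%s\n' "$CMD_A" "$CMD_B"
```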