Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 SSIS and Import/Export (2005)
 Conditional Split not picking up everything

Author  Topic 

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-20 : 14:22:27
I have a flat file source where I am trying to pull out all 5W, 7W and 9W's from seven fields (DryCd1 - DryCd7). I added a Conditional Split to my data flow and put this conditon all on one line with an Output Name called ValidRows:

(SUBSTRING(DryCd1,1,2) == "5W" || SUBSTRING(DryCd1,1,2) == "7W" || SUBSTRING(DryCd1,1,2) == "9W" || SUBSTRING(DryCd2,1,2) == "5W" || SUBSTRING(DryCd2,1,2) == "7W" || SUBSTRING(DryCd2,1,2) == "9W" || SUBSTRING(DryCd3,1,2) == "5W" || SUBSTRING(DryCd3,1,2) == "7W" || SUBSTRING(DryCd3,1,2) == "9W" || SUBSTRING(DryCd4,1,2) == "5W" || SUBSTRING(DryCd4,1,2) == "7W" || SUBSTRING(DryCd4,1,2) == "9W" || SUBSTRING(DryCd5,1,2) == "5W" || SUBSTRING(DryCd5,1,2) == "7W" || SUBSTRING(DryCd5,1,2) == "9W" || SUBSTRING(DryCd6,1,2) == "5W" || SUBSTRING(DryCd6,1,2) == "7W" || SUBSTRING(DryCd6,1,2) == "9W" || SUBSTRING(DryCd7,1,2) == "5W" || SUBSTRING(DryCd7,1,2) == "7W" || SUBSTRING(DryCd7,1,2) == "9W")

My problem is the 5W, 7W and 9W's are not being picked up after DryCd1. So if I have 5W, 7W and 9W's in fields DryCd2 - DryCd7 they are being skipped. What am I doing wrong?

Should I put each field on it's own line eg(DryCd1, DryCd2...) then have 7 destinations then merge them all together?

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-20 : 14:33:05
Let me add that I have data in DryCd1.
I have some data in DryCd2 - DryCd4 but the rest DryCd5 - DryCd7 are all blank (null). Don't know if that could help.
Go to Top of Page

rgombina
Constraint Violating Yak Guru

319 Posts

Posted - 2008-10-20 : 14:37:29
SUBSTRING(DryCd#,1,2) - same field position?

What's your raw data looks like?
Go to Top of Page

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-20 : 15:07:36
These are all different fields DryCd1, DryCd2, DryCd3 etc... Don't know what you mean by position.

Here's some raw data

This is one DryCd1 This wouldn't be pulled out as it has an MR
01MR201101090
This is DryCd1 and DryCd2.
The JR is DryCd1 but the 9W is DryCd2
02JR2008040339W200809240

Go to Top of Page

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-20 : 15:09:57
So I could have four dryCd's (dryCd1 - DryCd4) together like this:

02JR2008040339W20080924002JR200804033JR200809240

I would want to pull out the 9W which is in DryCd2. I hope this makes sense.
Go to Top of Page

rgombina
Constraint Violating Yak Guru

319 Posts

Posted - 2008-10-20 : 16:35:36
I'm sorry but is the raw data one continuous string or stacked (multiple lines/rows)?

If stacked, try using FINDSTRING() method since your criteria is not fixed position.
http://msdn.microsoft.com/en-us/library/ms141748.aspx

To me, SUBSTRING(DryCd1,1,2) means 5W, 7W or 9W are always in the first two positions which explains why you're missing resutls.
Go to Top of Page

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-21 : 07:56:27
Yes the flat file is stacked.

Tried this but it's not working:

FINDSTRING(DryCd2,1,2) == "5W" || FINDSTRING(DryCd2,1,2) == "7W" || FINDSTRING(DryCd2,1,2) == "9W"

DryCd1 - DryCd7 are all String[DT_STR] I read that
DT_STR data type are implicitly cast to the DT_WSTR data type before FINDSTRING performs its operation Do I have to add a data conversion before I use the FindString?
Go to Top of Page

rgombina
Constraint Violating Yak Guru

319 Posts

Posted - 2008-10-21 : 08:29:25
[code]Here's what I got:

1) Use one column (ex 'Column 0' default is fine) from source
2) Use this as your conditional logic -
FINDSTRING([Column 0],"5W",1) > 0 || FINDSTRING([Column 0],"7W",1) > 0 || FINDSTRING([Column 0],"9W",1) > 0
3) Label your condition as true or false appropriately and you should get rows split

Result:
4 true, 1 false (no 5W, 7W or 9W on the data row)

Here's what I used as raw data:
02JR2008040339W20080924002JR200804033JR200809240
02JR200804033JR200809240027W200804033JR200809240
02JR200804033JR20080924002JR2008040335W200809240
027W200804033JR20080924002JR200804033JR200809240
02JR200804033JR20080924002JR200804033JR200809240[/code]
Go to Top of Page

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-21 : 10:48:36
Thanks that worked but didn't change my numbers. Getting the mainframe person to do a comparison to see which files I'm missing as this FindString is picking up everything.

One question why the 1 instead of the 2 I had listed before?

FINDSTRING([Column 0],"7W",1)
Go to Top of Page

rgombina
Constraint Violating Yak Guru

319 Posts

Posted - 2008-10-21 : 11:04:33
[code]FINDSTRING(character_expression, searchstring, occurrence)[/code]

This is totally different method you are using now. 2 before means "2 character length", 1 in FINDSTRING means how many times (at least) you want to find a specific string (searchstring) within your input string. So, if it finds at least one time, then "> 0" means TRUE statement, therefore, this particular row will go to the TRUE destination. If you set it to 2, that means "7W" must be present 2 times to make it a TRUE statement else FALSE.

SUBSTRING is locating a specific character(s) at a specific position within the data. So,

[code]SUBSTRING([Column 0],1,2) == "7W" is TRUE if your input string is
7W11112222AAAABBBBB

another example:
SUBSTRING([Column 0],5,3) == "7WW" is TRUE if your input string is
JR1117WW22AAAABBBBB[/code]

I hope this makes sense.
Go to Top of Page

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-21 : 15:25:30
Thanks for the explanation! I tried to use FindString with my below problem but can't get it to work.

I have another question. My record count is coming out short due to some of the records have a 5W and 7W on the same line. I need to add something to my conditional split to increase the count if Recno=02.

Example I am missing 35 records because it is not picking up the DryCd2 field that has the 5W, 7W or 9W in them. So instead the below record should be counted as two records instead of one.

how could I do something like if RECNO='02' then count that record again?

055W2008091312E2008101307W2008091715H2008092905B200810040
Go to Top of Page

rgombina
Constraint Violating Yak Guru

319 Posts

Posted - 2008-10-22 : 08:44:31
You need record count or output rows into destination? If you can't get your input data manipulated so that only one record per row, then you may need to use Script Component. You may end up doing a lot of logical expression/split and use Union All task to get your final output.

Perhaps other experts here have a different approach such as dumping data into temp table and parse out multiple-record row into separate lines.
Go to Top of Page

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-22 : 09:27:00
Yes I too was thinking of using a staging table. I really don't need the data I just need the count because I'm putting stats on the website.

What if I add to this conditional split:

FINDSTRING([Column 0],"7W",1) > 0 ||

if Column 0 has a 5W, 7W or 9W in the row then add another row/count to the total? Can I do this?

Go to Top of Page

rgombina
Constraint Violating Yak Guru

319 Posts

Posted - 2008-10-22 : 10:44:49
Okay now I got what you are trying to do... but I'm not sure why you are using SSIS to tally the counts. Anyhow, here's my solution but I'm sure there's other ones out there that works as well.

Variables needed: count5W, count7W, count9W and countTotal (all int). Also need to add all variable names on ReadWriteVariables script section.


Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Text.RegularExpressions
Imports System.IO

Public Class ScriptMain

Public Sub Main()

Dim myFlatFileConnection As String
myFlatFileConnection = _
DirectCast(Dts.Connections("myFlatFile").AcquireConnection(Dts.Transaction), _
String)

Dim str As String = File.ReadAllText(myFlatFileConnection).ToString
Dim count5W As Integer = NumOccurrences(str, "5W")
Dim count7W As Integer = NumOccurrences(str, "7W")
Dim count9W As Integer = NumOccurrences(str, "9W")
Dim countTotal As Integer = count5W + count7W + count9W

Dts.Variables("count5W").Value = count5W
Dts.Variables("count7W").Value = count7W
Dts.Variables("count9W").Value = count9W
Dts.Variables("countTotal").Value = countTotal

MsgBox(Dts.Variables("count5W").Value.ToString)
MsgBox(Dts.Variables("count7W").Value.ToString)
MsgBox(Dts.Variables("count9W").Value.ToString)
MsgBox(Dts.Variables("countTotal").Value.ToString)

Dts.TaskResult = Dts.Results.Success
End Sub

Function NumOccurrences(ByVal StrToCheck As String, ByVal LookFor As String) As Integer
'courtesty from: http://www.mikeross.tv/Count-Occurrences-in-String-VB-NET.aspx
Dim exp As New Regex(LookFor, RegexOptions.IgnoreCase)
Dim Count As Integer = exp.Matches(StrToCheck).Count

Return Count
End Function
End Class
Go to Top of Page

JJ297
Aged Yak Warrior

940 Posts

Posted - 2008-10-22 : 11:12:12
Wow thanks! I have this in SSIS so I can set up a job later for the mainframe person to run who will be pulling the new file each week.

Do I put this code in .Net or SSIS using a Script Task? Sorry confused.
Go to Top of Page

rgombina
Constraint Violating Yak Guru

319 Posts

Posted - 2008-10-22 : 11:18:17
Script Task (uses VB.NET). So long as you have the same filename/path, then code should be okay. Otherwise, the connection name "myFlatFile" needs to be dealt with if it changes.
Go to Top of Page
   

- Advertisement -