Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-20 : 14:22:27
|
I have a flat file source where I am trying to pull out all 5W, 7W and 9W's from seven fields (DryCd1 - DryCd7). I added a Conditional Split to my data flow and put this conditon all on one line with an Output Name called ValidRows:(SUBSTRING(DryCd1,1,2) == "5W" || SUBSTRING(DryCd1,1,2) == "7W" || SUBSTRING(DryCd1,1,2) == "9W" || SUBSTRING(DryCd2,1,2) == "5W" || SUBSTRING(DryCd2,1,2) == "7W" || SUBSTRING(DryCd2,1,2) == "9W" || SUBSTRING(DryCd3,1,2) == "5W" || SUBSTRING(DryCd3,1,2) == "7W" || SUBSTRING(DryCd3,1,2) == "9W" || SUBSTRING(DryCd4,1,2) == "5W" || SUBSTRING(DryCd4,1,2) == "7W" || SUBSTRING(DryCd4,1,2) == "9W" || SUBSTRING(DryCd5,1,2) == "5W" || SUBSTRING(DryCd5,1,2) == "7W" || SUBSTRING(DryCd5,1,2) == "9W" || SUBSTRING(DryCd6,1,2) == "5W" || SUBSTRING(DryCd6,1,2) == "7W" || SUBSTRING(DryCd6,1,2) == "9W" || SUBSTRING(DryCd7,1,2) == "5W" || SUBSTRING(DryCd7,1,2) == "7W" || SUBSTRING(DryCd7,1,2) == "9W")My problem is the 5W, 7W and 9W's are not being picked up after DryCd1. So if I have 5W, 7W and 9W's in fields DryCd2 - DryCd7 they are being skipped. What am I doing wrong?Should I put each field on it's own line eg(DryCd1, DryCd2...) then have 7 destinations then merge them all together? |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-20 : 14:33:05
|
Let me add that I have data in DryCd1. I have some data in DryCd2 - DryCd4 but the rest DryCd5 - DryCd7 are all blank (null). Don't know if that could help. |
 |
|
rgombina
Constraint Violating Yak Guru
319 Posts |
Posted - 2008-10-20 : 14:37:29
|
SUBSTRING(DryCd#,1,2) - same field position?What's your raw data looks like? |
 |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-20 : 15:07:36
|
These are all different fields DryCd1, DryCd2, DryCd3 etc... Don't know what you mean by position.Here's some raw dataThis is one DryCd1 This wouldn't be pulled out as it has an MR01MR201101090 This is DryCd1 and DryCd2. The JR is DryCd1 but the 9W is DryCd2 02JR2008040339W200809240 |
 |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-20 : 15:09:57
|
So I could have four dryCd's (dryCd1 - DryCd4) together like this:02JR2008040339W20080924002JR200804033JR200809240 I would want to pull out the 9W which is in DryCd2. I hope this makes sense. |
 |
|
rgombina
Constraint Violating Yak Guru
319 Posts |
Posted - 2008-10-20 : 16:35:36
|
I'm sorry but is the raw data one continuous string or stacked (multiple lines/rows)?If stacked, try using FINDSTRING() method since your criteria is not fixed position.http://msdn.microsoft.com/en-us/library/ms141748.aspxTo me, SUBSTRING(DryCd1,1,2) means 5W, 7W or 9W are always in the first two positions which explains why you're missing resutls. |
 |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-21 : 07:56:27
|
Yes the flat file is stacked.Tried this but it's not working:FINDSTRING(DryCd2,1,2) == "5W" || FINDSTRING(DryCd2,1,2) == "7W" || FINDSTRING(DryCd2,1,2) == "9W"DryCd1 - DryCd7 are all String[DT_STR] I read that DT_STR data type are implicitly cast to the DT_WSTR data type before FINDSTRING performs its operation Do I have to add a data conversion before I use the FindString? |
 |
|
rgombina
Constraint Violating Yak Guru
319 Posts |
Posted - 2008-10-21 : 08:29:25
|
[code]Here's what I got:1) Use one column (ex 'Column 0' default is fine) from source2) Use this as your conditional logic - FINDSTRING([Column 0],"5W",1) > 0 || FINDSTRING([Column 0],"7W",1) > 0 || FINDSTRING([Column 0],"9W",1) > 03) Label your condition as true or false appropriately and you should get rows splitResult:4 true, 1 false (no 5W, 7W or 9W on the data row)Here's what I used as raw data:02JR2008040339W20080924002JR200804033JR20080924002JR200804033JR200809240027W200804033JR20080924002JR200804033JR20080924002JR2008040335W200809240027W200804033JR20080924002JR200804033JR20080924002JR200804033JR20080924002JR200804033JR200809240[/code] |
 |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-21 : 10:48:36
|
Thanks that worked but didn't change my numbers. Getting the mainframe person to do a comparison to see which files I'm missing as this FindString is picking up everything. One question why the 1 instead of the 2 I had listed before?FINDSTRING([Column 0],"7W",1) |
 |
|
rgombina
Constraint Violating Yak Guru
319 Posts |
Posted - 2008-10-21 : 11:04:33
|
[code]FINDSTRING(character_expression, searchstring, occurrence)[/code]This is totally different method you are using now. 2 before means "2 character length", 1 in FINDSTRING means how many times (at least) you want to find a specific string (searchstring) within your input string. So, if it finds at least one time, then "> 0" means TRUE statement, therefore, this particular row will go to the TRUE destination. If you set it to 2, that means "7W" must be present 2 times to make it a TRUE statement else FALSE.SUBSTRING is locating a specific character(s) at a specific position within the data. So,[code]SUBSTRING([Column 0],1,2) == "7W" is TRUE if your input string is7W11112222AAAABBBBBanother example:SUBSTRING([Column 0],5,3) == "7WW" is TRUE if your input string isJR1117WW22AAAABBBBB[/code]I hope this makes sense. |
 |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-21 : 15:25:30
|
Thanks for the explanation! I tried to use FindString with my below problem but can't get it to work.I have another question. My record count is coming out short due to some of the records have a 5W and 7W on the same line. I need to add something to my conditional split to increase the count if Recno=02.Example I am missing 35 records because it is not picking up the DryCd2 field that has the 5W, 7W or 9W in them. So instead the below record should be counted as two records instead of one.how could I do something like if RECNO='02' then count that record again?055W2008091312E2008101307W2008091715H2008092905B200810040 |
 |
|
rgombina
Constraint Violating Yak Guru
319 Posts |
Posted - 2008-10-22 : 08:44:31
|
You need record count or output rows into destination? If you can't get your input data manipulated so that only one record per row, then you may need to use Script Component. You may end up doing a lot of logical expression/split and use Union All task to get your final output.Perhaps other experts here have a different approach such as dumping data into temp table and parse out multiple-record row into separate lines. |
 |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-22 : 09:27:00
|
Yes I too was thinking of using a staging table. I really don't need the data I just need the count because I'm putting stats on the website.What if I add to this conditional split:FINDSTRING([Column 0],"7W",1) > 0 ||if Column 0 has a 5W, 7W or 9W in the row then add another row/count to the total? Can I do this? |
 |
|
rgombina
Constraint Violating Yak Guru
319 Posts |
Posted - 2008-10-22 : 10:44:49
|
Okay now I got what you are trying to do... but I'm not sure why you are using SSIS to tally the counts. Anyhow, here's my solution but I'm sure there's other ones out there that works as well.Variables needed: count5W, count7W, count9W and countTotal (all int). Also need to add all variable names on ReadWriteVariables script section.Imports SystemImports System.DataImports System.MathImports Microsoft.SqlServer.Dts.RuntimeImports System.Text.RegularExpressionsImports System.IOPublic Class ScriptMain Public Sub Main() Dim myFlatFileConnection As String myFlatFileConnection = _ DirectCast(Dts.Connections("myFlatFile").AcquireConnection(Dts.Transaction), _ String) Dim str As String = File.ReadAllText(myFlatFileConnection).ToString Dim count5W As Integer = NumOccurrences(str, "5W") Dim count7W As Integer = NumOccurrences(str, "7W") Dim count9W As Integer = NumOccurrences(str, "9W") Dim countTotal As Integer = count5W + count7W + count9W Dts.Variables("count5W").Value = count5W Dts.Variables("count7W").Value = count7W Dts.Variables("count9W").Value = count9W Dts.Variables("countTotal").Value = countTotal MsgBox(Dts.Variables("count5W").Value.ToString) MsgBox(Dts.Variables("count7W").Value.ToString) MsgBox(Dts.Variables("count9W").Value.ToString) MsgBox(Dts.Variables("countTotal").Value.ToString) Dts.TaskResult = Dts.Results.Success End Sub Function NumOccurrences(ByVal StrToCheck As String, ByVal LookFor As String) As Integer 'courtesty from: http://www.mikeross.tv/Count-Occurrences-in-String-VB-NET.aspx Dim exp As New Regex(LookFor, RegexOptions.IgnoreCase) Dim Count As Integer = exp.Matches(StrToCheck).Count Return Count End FunctionEnd Class |
 |
|
JJ297
Aged Yak Warrior
940 Posts |
Posted - 2008-10-22 : 11:12:12
|
Wow thanks! I have this in SSIS so I can set up a job later for the mainframe person to run who will be pulling the new file each week.Do I put this code in .Net or SSIS using a Script Task? Sorry confused. |
 |
|
rgombina
Constraint Violating Yak Guru
319 Posts |
Posted - 2008-10-22 : 11:18:17
|
Script Task (uses VB.NET). So long as you have the same filename/path, then code should be okay. Otherwise, the connection name "myFlatFile" needs to be dealt with if it changes. |
 |
|
|
|
|
|
|