Showing posts with label profile. Show all posts
Showing posts with label profile. Show all posts

Monday, September 5, 2011

Data Profiling Script

I recently needed to profile a database - pull out the schema, and also do some basic data quality checks. So, in the spirit of laziness, er automation, I wrote the following:

if object_Id('tempdb..#Output') is not null
drop table #output
go
Set Nocount on
go
declare @DataBase varchar(200) -- the name of the Database

Select @DataBase = 'Put Your Table Name here' -- Remember to enclose in quotes

if OBJECT_ID('tempdb..#Output') is not null
drop table #Output

select SO.name as TableName
,SC.name as ColumnName
,ST.name as DataType
,SC.length as DatTypePrecision
,CONVERT(int,0) as [RowCount]
,CONVERT(int,0) as DistinctRowCount
,CONVERT(int,0) as MinLength
,CONVERT(int,0) as MaxLength
,CONVERT(float,0) as AVGValue
,CONVERT(varchar(200),'') as MinValue
,CONVERT(varchar(200),'') as MaxValue
,CONVERT(bit,0) as isDateField
,CONVERT(bit,0) as isValidDate
into #Output
from sysObjects SO
join sysColumns SC
on SO.id = SC.id
join sys.Types ST
on SC.xType = ST.system_type_ID

where so.name = @DataBase

declare curLoop cursor
For
Select TableName
,ColumnName
,DataType
from #Output

Declare @TableName varchar(200)
,@Column Varchar(200)
,@DataType Varchar(200)
,@SQL varchar(max)
Open CurLoop

Fetch Next From curLoop
into @TableName
,@Column
,@DataType


While @@FETCH_STATUS = 0
begin

Select @SQL = 'update #Output set [RowCount]= (select count([' + @Column + ']) from [' + @TableName + ']) where [ColumnName]=''' + @Column +''''
exec (@SQL)

Select @SQL = 'update #Output set [DistinctRowCount]= (select count(distinct [' + @Column + ']) from [' + @TableName + ']) where [ColumnName]=''' + @Column +''''
exec (@SQL)

Select @SQL = 'update #Output set [minLength]= (select min(len([' + @Column + '])) from [' + @TableName + ']) where [ColumnName]=''' + @Column +''''
exec (@SQL)

Select @SQL = 'update #Output set [MaxLength]= (select max(len([' + @Column + '])) from [' + @TableName + ']) where [ColumnName]=''' + @Column +''''
exec (@SQL)

Select @SQL = 'update #Output set [AVGValue]= (select AVG([' + @Column + ']) from [' + @TableName + ']) where [DataType] in (''int'',''float'') and [ColumnName]=''' + @Column +''''
BEGIN TRY
exec (@SQL)
END TRY
BEGIN CATCH
Select @SQL = 'update #Output set [AVGValue]= 0 where [ColumnName]=''' + @Column +''''
END CATCH

Select @SQL = 'update #Output set [MinValue]= (select min(convert(varchar(max),[' + @Column + '])) from [' + @TableName + ']) where [ColumnName]=''' + @Column +''''
exec (@SQL)

Select @SQL = 'update #Output set [MaxValue]= (select max(convert(varchar(max),[' + @Column + '])) from [' + @TableName + ']) where [ColumnName]=''' + @Column +''''
exec (@SQL)


Select @SQL = 'update #Output set [isDateField]= case when columnname like (''%Date%'') then 1 else 0 end where [ColumnName]=''' + @Column +''''
exec (@SQL)
--print @SQL

Select @SQL = 'update #Output set [isValidDate]= 1 where [ColumnName]=''' + @Column +'''and [isDateField]=1 and IsDate([MinValue]) = 1 and IsDate([MaxValue]) = 1'
exec (@SQL)


Fetch Next From curLoop
into @TableName
,@Column
,@DataType

end

close curLoop
Deallocate curloop





select *
from #Output

Thursday, August 7, 2008

Moving a user-profile in windows XP

We recently had a situation, where SSIS caching a lot of data in the TEMP folder of the user profile. The default drive only had about 2 GB available, while the E-Drive had about 100GB. We couldn't find a setting to force SSIS to use the E_Drive, so we moved the profile to the E-drive. This is how to move it...

* Log in as an admin account BUT NOT THE PROFILE WE WANT TO MOVE!!! ( Make sure the user is also not logged in on the machine)
* Right-click "My Computer" -> Properties -> Advanced -> User Profiles -> Settings -> Move profile
* We moved the profile to the E-Drive ("E:\Profiles\MyUserName")
* Run REGEDIT
* Go to HKEY_LOCAL_MACHINE -> Software -> Microsoft -> Windows NT -> Current Version -> Profile List
* Ignore the folders with the "short names"
* Look at each folder, and look at the value for ProfileImagePath to see the user name
* Change the path to the folder where you copied the profile to (e.g. ("E:\Profiles\MyUserName")
* Close Regedit, and log back in with the user

When we reran the packages, lo and behold, the E-Drive started getting hit. Problem solved