Tuesday, March 16, 2010

Powershell: Writing Text Files

I was recently trying to generate a bunch of data that I needed to pass through one of our in-house DLLs in order to use some custom encryption algorithms. Thanks to the wonderful Powershell community, I found several examples on using custom DLL methods. However, I ultimately wanted this data to end up in SQL Server. My first attempt used an “Invoke-SQLCmd” Cmdlet from the SQLPS snap-in. However, generating several million rows took a while to insert. I knew that I didn’t want to queue all of this data in memory, so was inserting one row at a time. I stopped this process after a while because it was too time-consuming.

My next attempt was to push these values to a text file using a variable pointing to a file and then using add-content to write each line to the file.  This took longer than doing the SQL inserts one at a time.

I remembered coming across an article by Linchi Shea titled Add-Content and Out-File are not for performance. After finding the link to that article, I found that he recommends using System.IO.StreamWriter to write out files in the most performant manner. I definitely recommend that you check out his article for his summary and examples. For my part, I can say that it took me the better part of 6 hours to generate and populate 10 million rows. I just generated 20 million into a file in < 15 minutes!

I’m including a mockup of my code below in case anyone is interested. I’m definitely not a Powershell expert so as always, be careful if you copy any of this code.  My first attempt at directly inserting into SQL Server worked, but threw an error at about 3.5 million rows. As the time was already pretty lengthy by then, I stopped the process and moved on to writing the text file, resuming where I’d left off. That ran for hours until I finally killed it and looked for a better way.  The uncommented code is what finally ended up working in a timely manner.

I started by loading my DLL. I called it’s “EncryptValue” function (against just the integer – that’s a longer story) and populated both out to a table. There are several reasons for doing this, but this gave me a good working set for a test project and definitely lays some groundwork for the future.  If anyone has any ideas on how I can speed this up, feel free to leave me some comments. If I find some, I’ll post an update.

 

[Reflection.Assembly]::LoadFile("C:\MyDLL.dll")

# First attempt – insert directly into SQL Server.
#foreach ($i IN 1..10000000) { invoke-sqlcmd –query '
#
("insert dbo.MyTable VALUES(" + "$i" + ",'" + [MyDLLNamespace]::EncryptValue("$i") + "')") -serverinstance "localhost" }

#Attempt #2 – Write to file with add-content
#$file = New-Item -type File "C:\Values.txt"

#foreach ($i IN 1..10000000) { add-content $file ("$i" + "," + [MyDLLNamespace]::EncryptValue("$i") ) }

$file = New-Object System.IO.StreamWriter "E:\Values.txt";

foreach ($i IN 10000001..30000000) { $file.Writeline("$i" + "," + [MyDLLNamespace]::EncryptValue("$i") ) }

$file.close();

Saturday, March 6, 2010

Powershell Links – 2010-03-06

Here are some of the more interesting Powershell links I came across recently, along with some thoughts about them.

Powershell, String Encryption, and GPGSQLServerCentral.com brought this one to my attention. Chad Miller discusses working with Oracle through Powershell, and the subject of encrypting the connection strings came up. There are times that we need to connect using something other than Windows authentication. Storing an encrypted connection string makes a lot of sense in those cases. I don’t normally work with Oracle, but the idea of passing and saving encrypted connection strings makes a lot of sense. I do a lot of work against SQL Servers that require SQL Logins so plan to investigate this more thoroughly in the near future.

Powershell Cheatsheet – If you don’t follow the #Powershell hashtag on Twitter, you may have missed out on this useful set of Powershell tips and tricks, including a way to set your current location to a UNC path. This is cool because it’s something you cannot do with a standard command prompt without mapping a drive. There are other tips here that make this well worth bookmarking if you are getting started with Powershell.

The T-SQL Hammer – This is taken from Chad Miller’s blog, but is a great reminder that just because your typical DBA knows T-SQL and thinks in T-SQL does not make that the best solution for everything. If you’re wondering why a DBA might want to learn Powershell, give this a read.

Powershell Help Brower – This script uses Primal Forms and Powershell to build a Treeview Help browser for Powershell. I tested this out and the only entry that seemed to give me fits was trying to get help for Get-ChildItem. That constantly threw an error for me. However, the rest was very nice – an easy way to browse the available commands and see the help for them.

As with anything else you find on the Internet, don’t just copy these scripts and run them. Check them out carefully and be sure you know what’s going to happen. (Someday I’m just gonna copy Buck Woody’s disclaimer for scripts and give attribution. :)  )