Blogger Jeff = new Blogger

Programming and stuff in Western PA

Using Linq to query a CSV File

I know there have been a ton of times I’ve had to deal directly with Comma Delimited files. Searching through the contents for certain pieces of data is usually not a whole lot of fun. However with Linq, things become a bit more enjoyable!

To start off, we’ll create a csv file with the following contents :

"Mets","New York","NL"
"Marlins","Florida","NL"
"Orioles","Baltimore","AL"
"Pirates","Pittsburgh","NL"
"Phillies","Philadelphia","NL"

Save that some place and name it “teams.csv”. Then we’ll add the following namespace to our class file :

using System.Data.OleDb

Then using OleDb, we can access the csv file, and load a data table into a data adapter :

String conn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\;
    Extended Properties=""Text;HDR=No;FMT=Delimited""";

OleDbConnection cn = new OleDbConnection(conn);
OleDbCommand cmd = new OleDbCommand(@"SELECT * FROM C:\Temp\teams.csv", cn);
OleDbDataAdapter da = new OleDbDataAdapter(cmd);

cn.Open();

DataTable dt = new DataTable();

da.Fill(dt);

At this point, we are good to go to query the datatable with Linq. Our entire function looks like this :

static void QueryCsv()
{

    OleDbConnection cn = new OleDbConnection (@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=
C:\;Extended Properties=""Text;HDR=No;FMT=Delimited""");
    OleDbCommand cmd = new OleDbCommand(@"SELECT * FROM C:\Temp\teams.csv", cn);
    OleDbDataAdapter da = new OleDbDataAdapter(cmd);

    cn.Open();

    DataTable dt = new DataTable();

    da.Fill(dt);

    var teams = from r in dt.AsEnumerable()
                where r.Field(2) == "NL"
                select new { City = r.Field(0), TeamName= r.Field(1) };

    foreach (var team in teams)
    {
        Console.WriteLine(String.Format("The {0} {1}",team.TeamName,team.City));
    }

    Console.ReadLine();

    cn.Close();

}

If you run this you should see :

Result Set

Now if your life is like mine, the client will always throw a screwball at you and say something like : “We need to return everything from the csv file where the record also exists in this other data source.” Linq makes this easy too. We’ll create a quick class to demonstrate to represent the other data source :

public class Mascot
{
    public string Name;
    public string City;
    public string Player;
}

We’ll then create a list with our data in it

List mlb = new List
{
    new Mascot{City="Philadelphia",Name="Philllies",Player="Cole Hammels"},
    new Mascot{City="Boston",Name="Red Sox",Player="Curt Schilling"},
    new Mascot{City="Florida",Name="Marlins",Player="Dontrelle Willis"},
    new Mascot{City="Cleveland",Name="Indians",Player="Jake Westbrook"},
    new Mascot{City="Baltimore",Name="Braves",Player="Eric Bedard"},
    new Mascot{City="Los Angeles",Name="Dodgers",Player="Randy Wolf"}
};

Adjusting our original code, we can query and join the two data sources to display some common data :

static void QueryCsvWithJoin()
{

	 String conn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\;
	     Extended Properties=""Text;HDR=No;FMT=Delimited""";

	 List mlb = new List
	 {
	     new Mascot{City="Philadelphia",Name="Philllies",Player="Cole Hammels"},
	     new Mascot{City="Boston",Name="Red Sox",Player="Curt Schilling"},
	     new Mascot{City="Florida",Name="Marlins",Player="Dontrelle Willis"},
	     new Mascot{City="Cleveland",Name="Indians",Player="Jake Westbrook"},
	     new Mascot{City="Baltimore",Name="Braves",Player="Eric Bedard"},
	     new Mascot{City="Los Angeles",Name="Dodgers",Player="Randy Wolf"}
	 };

	 OleDbConnection cn = new OleDbConnection(conn);
	 OleDbCommand cmd = new OleDbCommand(@"SELECT * FROM C:\Temp\teams.csv", cn);
	 OleDbDataAdapter da = new OleDbDataAdapter(cmd);

	 cn.Open();

	 DataTable dt = new DataTable();

	 da.Fill(dt);

	 var teams = from r in dt.AsEnumerable()
	             join o in mlb on r.Field(1) equals o.City
	             where r.Field(2) == "NL"
	             select new { City = r.Field(0), TeamName= r.Field(1),Player=o.Player }; 

	 foreach (var team in teams)
	 {
	     Console.WriteLine(String.Format("Pitcher {0} from the {1} {2}", team.Player,team.TeamName, team.City));
	 }

	 Console.ReadLine();

	 cn.Close(); 

}

When run this, we get

Result Set 2

Tags: , , , ,

January 28, 2008 Posted by | .Net, C#, Linq | 14 Comments

Using Linq to query Xml

If you’ve used the .Net’s framework new Linq functionality, you’ve no doubt seen how nice it is to query a data object directly. For me, one of my favorite parts of Linq is being able to query an Xml object. Gone are the days are writing awkward XPath queries, or "ripping through" an Xml document.

Here’s the old school way of querying Xml :

static void OldSchool()
{ 
    string teams = @" " +
        "PhilliesNL" +
        "YankeesAL" +
        "AngelsAL" +
        "BrewersNL" +
        " ";

    XmlDocument xml = new XmlDocument();
    xml.LoadXml(teams);

    XmlNodeList NLTeams = xml.SelectNodes("//League[. = 'NL']/parent::node()/Name");

    foreach (XmlNode team in NLTeams)
        Console.WriteLine(team.InnerText);
}

And here’s the new school way :

static void NewSchool()
{
    XElement teams = XElement.Parse(@" " +
        "PhilliesNL" +
        "YankeesAL" +
        "AngelsAL" +
        "BrewersNL" +
        " ");
    
    IEnumerable NLTeams = from t in teams.Descendants("Team")
                   where t.Element("League").Value.Equals("NL")
                   select t.Element("Name").Value;

    foreach (var team in NLTeams)
        Console.WriteLine(team);
}

You’re pretty much writing the same amount of code, but the Linq way is more preferable in that it standardizes a way to query and object in .net for data.

Tags: , , ,

January 14, 2008 Posted by | .Net, C#, Linq, Technology, Xml | Leave a comment

SQL Server 2005 Productivity tips

(1)  Synonyms
Tired of typing "SELECT ID FROM TBL_THELONGESTTABLENAMEINHISTORY" ? Then create a synonym and make life easier for yourself.

CREATE SYNONYM [dbo].[mininame]
FOR [NorthWind].[dbo].[TBL_THELONGESTTABLENAMEINHISTORY]

Now to access that mama-jamma, your query is whittled down to :

SELECT ID FROM mini


(2) Use those Templates young man!
SQL Management Studio supplies you with a wealth of pre-defined templates. Want the basic syntax for a cursor, or need the full create syntax for a table constraint?  Just open up the Templates window and drag it onto your SQL query window. 

But if you’re like me, you may have to run the same code a number of times, and templates are a wonderful place to store stuff like that.  All you have to do is right click on the SQL Server Template folder in the Template Explorer and choose to add a new template.  For my personal templates, I even created my own folder to store them in:

You can even add your own parameters to templates, and be prompted for the values at design time.  Parameters are enclosed in less than/greater than signs, take the form of <Parameter Name,Type,Value> :

IF EXISTS (

 SELECT * 

 FROM INFORMATION_SCHEMA.ROUTINES 

 WHERE SPECIFIC_SCHEMA = N'<Schema_Name, sysname, Schema_Name>'

 AND SPECIFIC_NAME = N'<Procedure_Name, sysname, Procedure_Name>' 

)

 DROP PROCEDURE <Schema_Name, sysname, Schema_Name>.<Procedure_Name, sysname, Procedure_Name>

GO



CREATE PROCEDURE <Schema_Name, sysname, Schema_Name>.<Procedure_Name, sysname, Procedure_Name>

 <@param1, sysname, @p1> <datatype_for_param1, , int> = <default_value_for_param1, , 0>, 

 <@param2, sysname, @p2> <datatype_for_param2, , int> = <default_value_for_param2, , 0>

AS

 SELECT @p1, @p2

GO

After saving your template, you can then add it to an existing script in a query window, and then click Ctrl-Shift-M to bring up a dialogue box for you to fill in your parameter values :

(3) Metadata views

I recently had to dive into a huge (50 gig) database.  I had to generate some rather large stored procedures that would delete a common key from a lot of tables. To learn the database, I used the information schema views a lot.  I was able to determine if a column existed in all the tables by querying the COLUMNS and TABLES views. I even modified the Sql to generate the delete statements for me, like this:

SELECT 'DELETE ' + a.TABLE_NAME + ' WHERE ProductID = @ProductID'
FROM information_schema.columns a
INNER JOIN INFORMATION_SCHEMA.TABLES b ON a.table_name=b.table_name
WHERE column_name LIKE '%ProductID%'
AND table_type='BASE TABLE'
ORDER BY b.table_name

-------------------
RESULTS
-------------------
>>DELETE Order Details WHERE ProductID = @ProductID
>>DELETE Products WHERE ProductID = @ProductID

What I had to do when I was generating this for 50 tables was run the sql and see if any constraint errors were thrown.  Most likely, I just had to change the order of the delete statements.

(4) Use Sql Profiler to see the Sql your app runs

SQL profiler is best known for troubleshooting performance issues in SQL Server.  But the thing I usually use it for is to see exactly what sql gets run by an application.  Curious to see what happen when you use the neat little WYSIWYG editor in Crystal Reports to query the database?  Run a quick profiler trace.  Want to see what tables are getting accessed by your application, run a profiler trace. The nice thing about getting to the trace data is that you can also copy and paste it directly into SQL Management Studio, or into a separate stored procedure.  Which brings us to our next tip….

(5) Debugging stored procedures

In SQL Server 2005, the only way you can step into a stored procedure is if you create a Visual Studio.Net 2005 Database Project. If you right-click on the stored procedure, and then select step into, you’ll also be prompted to fill in any needed parameter values.  This can be a real pain in the hind quarters if there are a lot of parameters. What I end up doing is saving the sql that gets executed into a separate, temporary stored procedure, and then stepping into that procedure.

Tags:

January 14, 2008 Posted by | .Net, SQL Server | 3 Comments

C# 3.0 Extension Methods

C# 3.0 introduces a concept known as extension methods.  What these allow you to do is to "extend" static methods onto an existing instance class.  Trust me, it’s a lot easier to understand when seeing an example.  In mine, I’ll extend a FormatSSN method onto the string object :

class Program
{
    static void Main(string[] args)
    {
        String val = "987654321";
        Console.WriteLine(val.FormatSSN());
        Console.ReadLine();
    }
}

public static class Utilities
{
    public static String FormatSSN(this String input)
    {
        return String.Format("{0}-{1}-{2}", input.Substring(0, 3),
            input.Substring(3, 2), input.Substring(5, 4));
    }
}

Microsoft says "Extension methods are less discoverable and more limited in functionality than instance methods. For those reasons, it is recommended that extension methods be used sparingly and only in situations where instance methods are not feasible or possible."

So why would you ever want to use an extension method?  I would say for convenience purposes, like in the example above, perhaps you already have a pre-existing class and rather than go back and add a "FormatSSN" method, you could simply extend one to the current code base.

Tags: ,

December 1, 2007 Posted by | .Net, C# | Leave a comment

C# coalesce operator

Damn Microsoft! I finally thought I had my thick head wrapped around C# when they decide to come out with a boatload of new features to further confuse, err, I mean enhance things.

I was quite happy to read that they added a coalesce operator to C#. For those of you who use Sql Server on a regular basis, you’re probably quite familiar with this function. What it does is allow you to pass a list of values, and will return the first non-null value.

For instance, in previous versions of C# we might have code that looks like this :

After running this, the string “brian” gets outputted to the console. We can now rewrite this a little cleaner with the new coalesce (??) operator :

After running this, you’ll see we get the same results. Thanks again to the C# team for giving me the capability to make my code even more unreadable 😉

Tags: , ,

November 23, 2007 Posted by | .Net, C#, Technology | 3 Comments