Blogger Jeff = new Blogger

Programming and stuff in Western PA

Using Linq to query a CSV File

I know there have been a ton of times I’ve had to deal directly with Comma Delimited files. Searching through the contents for certain pieces of data is usually not a whole lot of fun. However with Linq, things become a bit more enjoyable!

To start off, we’ll create a csv file with the following contents :

"Mets","New York","NL"
"Marlins","Florida","NL"
"Orioles","Baltimore","AL"
"Pirates","Pittsburgh","NL"
"Phillies","Philadelphia","NL"

Save that some place and name it “teams.csv”. Then we’ll add the following namespace to our class file :

using System.Data.OleDb

Then using OleDb, we can access the csv file, and load a data table into a data adapter :

String conn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\;
    Extended Properties=""Text;HDR=No;FMT=Delimited""";

OleDbConnection cn = new OleDbConnection(conn);
OleDbCommand cmd = new OleDbCommand(@"SELECT * FROM C:\Temp\teams.csv", cn);
OleDbDataAdapter da = new OleDbDataAdapter(cmd);

cn.Open();

DataTable dt = new DataTable();

da.Fill(dt);

At this point, we are good to go to query the datatable with Linq. Our entire function looks like this :

static void QueryCsv()
{

    OleDbConnection cn = new OleDbConnection (@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=
C:\;Extended Properties=""Text;HDR=No;FMT=Delimited""");
    OleDbCommand cmd = new OleDbCommand(@"SELECT * FROM C:\Temp\teams.csv", cn);
    OleDbDataAdapter da = new OleDbDataAdapter(cmd);

    cn.Open();

    DataTable dt = new DataTable();

    da.Fill(dt);

    var teams = from r in dt.AsEnumerable()
                where r.Field(2) == "NL"
                select new { City = r.Field(0), TeamName= r.Field(1) };

    foreach (var team in teams)
    {
        Console.WriteLine(String.Format("The {0} {1}",team.TeamName,team.City));
    }

    Console.ReadLine();

    cn.Close();

}

If you run this you should see :

Result Set

Now if your life is like mine, the client will always throw a screwball at you and say something like : “We need to return everything from the csv file where the record also exists in this other data source.” Linq makes this easy too. We’ll create a quick class to demonstrate to represent the other data source :

public class Mascot
{
    public string Name;
    public string City;
    public string Player;
}

We’ll then create a list with our data in it

List mlb = new List
{
    new Mascot{City="Philadelphia",Name="Philllies",Player="Cole Hammels"},
    new Mascot{City="Boston",Name="Red Sox",Player="Curt Schilling"},
    new Mascot{City="Florida",Name="Marlins",Player="Dontrelle Willis"},
    new Mascot{City="Cleveland",Name="Indians",Player="Jake Westbrook"},
    new Mascot{City="Baltimore",Name="Braves",Player="Eric Bedard"},
    new Mascot{City="Los Angeles",Name="Dodgers",Player="Randy Wolf"}
};

Adjusting our original code, we can query and join the two data sources to display some common data :

static void QueryCsvWithJoin()
{

	 String conn = @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\;
	     Extended Properties=""Text;HDR=No;FMT=Delimited""";

	 List mlb = new List
	 {
	     new Mascot{City="Philadelphia",Name="Philllies",Player="Cole Hammels"},
	     new Mascot{City="Boston",Name="Red Sox",Player="Curt Schilling"},
	     new Mascot{City="Florida",Name="Marlins",Player="Dontrelle Willis"},
	     new Mascot{City="Cleveland",Name="Indians",Player="Jake Westbrook"},
	     new Mascot{City="Baltimore",Name="Braves",Player="Eric Bedard"},
	     new Mascot{City="Los Angeles",Name="Dodgers",Player="Randy Wolf"}
	 };

	 OleDbConnection cn = new OleDbConnection(conn);
	 OleDbCommand cmd = new OleDbCommand(@"SELECT * FROM C:\Temp\teams.csv", cn);
	 OleDbDataAdapter da = new OleDbDataAdapter(cmd);

	 cn.Open();

	 DataTable dt = new DataTable();

	 da.Fill(dt);

	 var teams = from r in dt.AsEnumerable()
	             join o in mlb on r.Field(1) equals o.City
	             where r.Field(2) == "NL"
	             select new { City = r.Field(0), TeamName= r.Field(1),Player=o.Player }; 

	 foreach (var team in teams)
	 {
	     Console.WriteLine(String.Format("Pitcher {0} from the {1} {2}", team.Player,team.TeamName, team.City));
	 }

	 Console.ReadLine();

	 cn.Close(); 

}

When run this, we get

Result Set 2

Tags: , , , ,

January 28, 2008 Posted by | .Net, C#, Linq | 14 Comments

Using Linq to query Xml

If you’ve used the .Net’s framework new Linq functionality, you’ve no doubt seen how nice it is to query a data object directly. For me, one of my favorite parts of Linq is being able to query an Xml object. Gone are the days are writing awkward XPath queries, or "ripping through" an Xml document.

Here’s the old school way of querying Xml :

static void OldSchool()
{ 
    string teams = @" " +
        "PhilliesNL" +
        "YankeesAL" +
        "AngelsAL" +
        "BrewersNL" +
        " ";

    XmlDocument xml = new XmlDocument();
    xml.LoadXml(teams);

    XmlNodeList NLTeams = xml.SelectNodes("//League[. = 'NL']/parent::node()/Name");

    foreach (XmlNode team in NLTeams)
        Console.WriteLine(team.InnerText);
}

And here’s the new school way :

static void NewSchool()
{
    XElement teams = XElement.Parse(@" " +
        "PhilliesNL" +
        "YankeesAL" +
        "AngelsAL" +
        "BrewersNL" +
        " ");
    
    IEnumerable NLTeams = from t in teams.Descendants("Team")
                   where t.Element("League").Value.Equals("NL")
                   select t.Element("Name").Value;

    foreach (var team in NLTeams)
        Console.WriteLine(team);
}

You’re pretty much writing the same amount of code, but the Linq way is more preferable in that it standardizes a way to query and object in .net for data.

Tags: , , ,

January 14, 2008 Posted by | .Net, C#, Linq, Technology, Xml | Leave a comment