>LINQ Operators – Part IV

>

Grouping operators

GroupBy : GroupBy is used to group a sequence into subsets. At first look it feels like it is same as T-SQL "Group By" but it is not. It slices a sequence into multiple "rows" where each row have a "key" and subset of sequence matching to those key. Here is a screen shot how it looks with this simple query:

Customers.GroupBy( c => c.Country )

 

groupBy

GroupBy has an easy syntax with some overloads. You can optionally specify an element selector (projection) and also optionally specify a custom equality comparer. Equality comparer is less used and not different from the sample we used in OrderBy().

Here is a sample with an element selector:

Customers
   .GroupBy (
      c => c.Country,
      c =>
         new 
         {
            CustomerID = c.CustomerID,
            CompanyName = c.CompanyName
         }
   ) 
 
 
groupBySelector

Latter GroupBy selects only two properties of Customers instead of all properties. You can also group by multiple keys:

Customers
   .GroupBy (
      c =>
         new 
         {
            Country = c.Country,
            Region = c.Region
         }, // keySelector
      c =>
         new 
         {
            CustomerID = c.CustomerID,
            CompanyName = c.CompanyName
         }  //elementSelector
   ) 
 

groupByMultipleKeys

In all these queries you are getting a sequence of "groupings". You can "browse" the result with nested foreach() which is sort of nested scan…endscan:

var byCountry = Customers.GroupBy ( c => c.Country );
  
foreach(var g in byCountry)
{
   Console.WriteLine("Country: {0}", g.Key);
   foreach(var customer in g)
   {
      Console.WriteLine("\t {0}\t{1,-50}\t{2}",
         customer.CustomerID,
         customer.CompanyName,
         customer.ContactName);
   }
Partial result:

groupByMultipleKeys3

 

In other words we are dealing with a sequence of sequences. With LinqToSQL inner sequences are queried as they are needed resulting in "groups count" + 1 selects send to backend. This may or may not be the behavior you want depending on where you need it (you can see the series of SQLs in LinqPad’s SQL tab). To prevent this behavior you can convert Customers to an IEnumerable  from IQueryable (or to list, array…) first and then do grouping. For example:

Customers.AsEnumerable().GroupBy( c => c.Country )

AsEnumerable() is a conversion operator that converts the IQueryable to an IEnumerable ( in other words from a remote source to local source – similar to cursors ). We will use AsEnumerable() in some of the following samples that show statistics, nested grouping, distinct sequences.

This sample shows some statistics about Customers grouped by country:

Orders.GroupBy (
  o => o.OrderDate.GetValueOrDefault().Year )
.Select (
   sales =>
   new 
   {
     Year = sales.Key,
     Customers = sales
       .Select( s => s.CustomerID )
       .Distinct()
       .Count(),
     Countries = sales
       .Select( s => s.ShipCountry )
       .Distinct()
       .Count(),
     FirstSale = sales.Min( s => s.OrderDate ),
     LastSale  = sales.Max( s => s.OrderDate )
   }
  ).OrderBy( s => s.Year ) 

This sample groups customerId and countries by years they made an "order":

Orders.AsEnumerable()
  .GroupBy( o => o.OrderDate.GetValueOrDefault().Year )
  .Select (
      sales =>
      new 
      {
        Year = sales.Key,
        Customers = sales
            .Select( s => s.CustomerID )
            .Distinct(),
        Countries = sales
            .Select( s => s.ShipCountry )
            .Distinct()
       }
   ) 

This sample is a multilevel grouping. It first groups customers by first letter of countries and then by countries:

Customers.AsEnumerable()
.GroupBy( c => c.Country [0] )
.Select( c =>
      new 
      {
        c.Key,
        countries =
           c.GroupBy( cn => cn.Country )
            .Select( cs =>
                 new 
                 {
                   cs.Key,
                   clist =
                      cs.Select( cf =>
                        new 
                        {
                           cf.CompanyName,
                           cf.Country,
                           cf.Region})
                        }).OrderBy( cn => cn.Key )
         }).OrderBy( c => c.Key )

There is another grouping operator called ToLookup() that creates one to many dictionary. I reserved it into another category.

Advertisements
Posted in Uncategorized | Leave a comment

>LINQ Query Operators – Part III

>

Ordering Operators:

Ordering operators like T-SQL "Order By" is used for ordering sequences in different ways. There are 5 ordering operators:

  1. OrderBy
  2. OrderByDescending
  3. ThenBy
  4. ThenByDescending
  5. Reverse

Except the last operator Reverse() all those are used as:

operator( keySelector ) or operator( keySelector, customComparer )  – customComparer is for custom sorting and not supported by L2S (Reverse() is not supported either).

You should have noticed that "descending" is not a parameter but separate methods exist to sort descending.

OrderBy: Orders the sequence by a given "keySelector". i.e:

Linq
T-SQL
Customers.OrderBy( c => c.ContactName )
Select * from Customers order by ContactName

 

Seconds overload accepts a comparer. Here is a sample that sorts orders by :

// In LinqPad select C# Program from language combo
// Delete default code
// Copy Paste or better hands on, write yourself
// and press F5 to run

void Main()
{
// This overload is not supported by L2S
// Getting into a local list first
// We would review later what does "local" mean
// for now just consider queries on ordList 
// is not Linq To SQL 
// (sort of you are querying a local cursor which
// you created by a remote call to SQL server)
 
var ordList = Orders.Select( o => 
                  new {
                           OrderId = o.OrderID,
                           OrderDate = o.OrderDate
                         }).ToList();
 
// creating an instance of custom comparer
// and passing it as a parameter
// final Select projection is about showing
// the result of sort
// Note that the sort is by "Day Of Week" enumeration
// and not by "WeekDay" alphabetically

var myCustomList = 
               ordList
                  .OrderBy( o => 
                                 o.OrderDate, 
                                 new OrderComparer() ) 
                 .Select( o => new {
                            o.OrderId,
                            o.OrderDate, 
                            o.OrderDate.GetValueOrDefault().DayOfWeek} );

myCustomList.Dump();
}

// Here is our custom sorting class
// It accepts parameters of type nullable DateTime
// and returns comparison result based on
// DayOfWeek – like DOW() in VFP

public class OrderComparer : IComparer<DateTime?>
{
   public int Compare(DateTime? x, DateTime? y)
   {
     return 
        x.GetValueOrDefault().DayOfWeek
         .CompareTo( 
        y.GetValueOrDefault().DayOfWeek ); 
    }
}

 

OrderByDescending: Same as OrderBy() except that it sorts in descending order (it was obvious,isn’t it).

ThenBy, ThnenByDescending: ThenBy and ThenByDescending are used to sort on multiple columns. Hmm why would you need something like this? Can’t you simply use chain of OrderBy(), OrderByDescending() operators? Of course you can but I think these were added to prevent you from writing it the way you intended (like in natural English you would be saying order by X then by Y then by Z). Probably this is best explained by a sample. We want to sort customers by their country and then by company name. It is very easy to write it as in first query and fail to get what we really intended.

Linq Corresponding SQL
Customers
   .OrderBy(c => c.Country)
   .OrderBy(c => c.CompanyName)
   .Select( c => new {c.Country, c.CompanyName} )
Select [t0].[Country], [t0].[CompanyName]
FROM [Customers] As [t0]
ORDER BY [t0].[CompanyName], [t0].[Country]
Customers
   .OrderBy(c => c.CompanyName)
   .OrderBy(c => c.Country)
   .Select( c => new {c.Country, c.CompanyName} )

SELECT [t0].[Country], [t0].[CompanyName]
FROM [Customers] AS [t0]
ORDER BY [t0].[Country], [t0].[CompanyName]

Customers
   .OrderBy( c => c.Country)
   .ThenBy  ( c => c.CompanyName)
   .Select( c => new {c.Country, c.CompanyName} )

SELECT [t0].[Country], [t0].[CompanyName]
FROM [Customers] AS [t0]
ORDER BY [t0].[Country], [t0].[CompanyName]

In SQL the order of columns dictate it and it is easy to write exactly what we want ( order by country and then by company ). However in Linq method syntax, keep in mind with each and every chained method we get a new intermediate sequence as a result ( and you could see how a series of methods is working by attaching a "Dump()" to each of those methods ). i.e:

Customers
  .Take(10).Dump("Got top 10")
  .Select( c => new {c.Country, c.CompanyName} ).Dump("Projected to 2 columns: Country,Company")
  .OrderBy( c => c.Country).Dump("Sorted on Country")
  .OrderBy( c => c.CompanyName).Dump("Sorted on Company")

 

As you can see we are first sorting on country and "then by" without a ThenBy() operator on company. The net effect we are sorting on company name only. ThenBy() operator excellently translates our intention about "by Country  THEN BY  Company Name".

Reverse: On a local sequence reverses the sequence as its name suggest. i.e:

int[] numbers = {1,2,3,4,5};
numbers.Reverse().Dump();

string reversed = new string(
  "Reverse Me".Reverse().ToArray()
  );

Console.WriteLine(reversed);

Posted in Uncategorized | Leave a comment

>LINQ Query Operators – Part II

>

Projection Operators:

Select: Selects items from a source, projecting the results as necessary.

When the input sequence would be selected as is without any projection ( aka * in T-SQL ) using Select() is not necessary and looks a little ugly. i.e.:

Customers.Select( c => c )
 

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

Instead we would simply write just "Customers" which is the enumerable itself. In query syntax it is the same as writing:

from c in Customers select c
 
Here are some Select projections and its Query expression, T-SQL counterparts:
 
T-SQL
Select CompanyName from customers
Query Syntax
from c in Customers
     select c.CompanyName
Method Syntax
Customers.Select( c => c.CompanyName )
 
T-SQL
Select 
  firstName, lastName,
  firstName +  ‘ ‘ + lastName As fullName
from employees
Query Syntax
from e in Employees
select 
  new {
        e.FirstName,
        e.LastName,
        fullName = e.FirstName + " " + e.LastName }
Method Syntax
Employees
  .Select( e =>
  new {
     e.FirstName,
     e.LastName,
     fullName = e.FirstName + " " + e.LastName } )
 
There is another overload of Select() which accepts an index parameter and is not supported by Linq To SQL and EF. Here is a simple example that lists the DBF files from VFP samples data folder enumerating them with their position (reminder: in C# indexing, arrays start from 0):
 
Directory.GetFiles(
   @"c:\Program Files\Microsoft Visual Foxpro 9\Samples\Data""*.dbf")
   .Select( (filename,position) =>
        new {fileNo = position+1,filename} )
 
In the samples above, with the "new" keyword the type created is an anonymous type. Sometimes we would want to create a type of our own (similar to "into cursor cursorName" – cursorName might a different structure than the source). To do that we could create a new class and qualify the "new" type with that class. i.e.:
 
void Main()
{
  var q = from c in Customers
          select new
          myCustomer {
            ID = c.CustomerID,
            company = c.CompanyName,
            orderCount = c.Orders.Count
          };

  q.Dump();
}

public class myCustomer
{
 public string ID {get;set;}
 public string company {get;set;}
 public int orderCount {get;set;}
}

 
This produces the SQL:
 

SELECT
  [t0].[CustomerID] AS [ID],
  [t0].[CompanyName] AS [company],
  (
    SELECT COUNT(*)
    FROM [Orders] AS [t1]
    WHERE [t1].[CustomerID] = [t0].[CustomerID]
  ) AS [orderCount]
FROM [Customers] AS [t0]

SelectMany: The Select() method works as Select( input =>output ) where it produces one output per input. Consider a query getting customers and orders of customers (maybe customerID and CompanyName from Customers and OrderID, OrderDate from orders to show simpler output):
 
Select 
   [t0].[CustomerID], [t0].[CompanyName], [t1].[OrderID] As [OrderID], [t1].[OrderDate]
FROM [Customers] As [t0], [Orders] As [t1]
WHERE [t1].[CustomerID] = [t0].[CustomerID]
 
This trivial SQL generates a 2D flat table.  If we attempt to create this with a Select() we could do that using an expression like below:
 

Orders.Select( o =>
   new 
   {
     o.Customer.CustomerID,
     o.Customer.CompanyName,
     o.OrderID,
     o.OrderDate
   } )

Which produce this SQL:
 
Select 
[t1].[CustomerID], [t1].[CompanyName], 
[t0].[OrderID], [t0].[OrderDate]
FROM [Orders] As [t0]
LEFT OUTER JOIN [Customers] As [t1] 
On [t1].[CustomerID] = [t0].[CustomerID]
 
and the flat 2D result that we are accustomed to. If you look at the SQL closely you would notice that it is sort of cheat creating an inner join (Orders driving the results and already each Order has a "parent" Customer, this in effect is an inner join). We could also look at it as:
 
use Customers order tag CustomerID
select 0
use Orders
set relation to CustomerID into Customers
 
The old Indian trick that we used in reports in old days to get all orders of customers using relations. If we did the relation from Customers into Orders and then browse each table side by side remember per Customer in left browse we would see "Many" Orders on right. Or better yet we all wanted to have Grid in Grid or hierarchical grids showing one Customer in a row and her Orders in a grid in another grid cell. Doing that is not trivial except using relations and that would be a single "thing". Now that we are in objects world a single object would be representing a data structure where per customer "row" there are multiple order "rows" (think of a report where data is grouped on CustomerID). If we have started from Customers (and remember we get one output per input) we would end up with such an "object" which is an ordered enumerable of Customers along with its Orders. Here is our new query:
 
Customers.Select( c =>
   new  
   {
     c.CustomerID,
     c.CompanyName,
     OrderList = c.Orders.Select( o =>  new {o.OrderID, o.OrderDate})
   } )
 
Producing SQL:
 
Select 
  [t0].[CustomerID], 
  [t0].[CompanyName], 
  [t1].[OrderID], [t1].[OrderDate], 
  (
    Select COUNT(*)
    FROM [Orders] As [t2]
    WHERE [t2].[CustomerID] = [t0].[CustomerID]
  ) As [value]
FROM [Customers] As [t0]
  LEFT OUTER JOIN [Orders] As [t1]
  On [t1].[CustomerID] = [t0].[CustomerID]
ORDER BY [t0].[CustomerID], [t1].[OrderID]
 
If we run this SQL we would get a flat list as always. However the Linq query returns an object result which is more like "parent-child" packed data (like a 3D table). Here is part of output from LinqPad to understand better (and I hope you are following these testing in LinqPad yourself):
 
Select
 
This is more like grouping that we may like. We may want this as a flat 2D as we are accustomed to and SelectMany() comes into play there.  With query syntax it is as easy as:
 
from c in Customers
from o in c.Orders
select new
   {
      c.CustomerID,
      c.CompanyName,
      o.OrderID,
      o.OrderDate
   }
 
this is converted to SelectMany() as shown here:
 
Customers
   .SelectMany (
      c => c.Orders, 
      (c, o) => 
         new  
         {
            CustomerID = c.CustomerID, 
            CompanyName = c.CompanyName, 
            OrderID = o.OrderID, 
            OrderDate = o.OrderDate
         }
   )
 
This concludes projection operators subject.
 
Note: I intentionally didn’t mention some details about operators shown and left out some operator(s) that may be part of categories we talked about in order to prevent confusion. Also note that these are standard operators. Linq is evolving and already there are many more operators than you would see in standard operators set. Trying to keep it simple and summarized I might be skipping details more than I should, I hope not.
Posted in Uncategorized | Leave a comment

>LINQ Query Operators – Part I

>

Orders.OrderBy(o => o.OrderDate).Skip(5)

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

Before checking the operators lets remind some points.

  • Query operators are extension methods
  • Some Linq operations are only available with operators (or to say it otherwise "method syntax")
  • Syntax includes the source itself which need not be specified (because implemented as static extension method on IEnumerable, IOW source). Typical syntax looks like:

    public static IEnumerable<TSource> Where<TSource>( this IEnumerable<TSource> source, Func<TSource, Boolean> predicate)

      Looks like Chinese? It is very easy once you understand indeed. You can think of it as:

     IEnumerable<TSource> : It returns something that can be enumerated ( like an array, list, records …) – and its type is symbolically TSource.

     Where<TSource>( this IEnumerable<TSource> source :  First parameter is also something that can be enumerated and since it is an extension method of IEnumerable itself we don’t need to feed source parameter.

     Up to this point it is simple, gets something like a list and returns back another list. Just like an SQL select does.

     Func<TSource, Boolean> predicate) : Here comes the interesting part. It accepts a "function" that has the TSource type input parameter (you can think of TSource as a row of customer when the source is customers table) and returns back true or false. If true is returned "record" is selected.

     Now lets check this "Where" using in a Linq query and compare to its counterpart SQL select and Linq query syntax:

SQL select:

select * from Customers where Customers.Country = "USA" into cursor usaCustomers

LINQ Query Syntax:

IEnumerable<Customers> usaCustomers = from c in Customers where c.Country == "USA" select c;

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }LINQ Method Syntax:

IEnumerable<Customers> usaCustomers = Customers.Where( c => c.Country == "USA" );

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 

Lets dissect the method syntax and see different ways of writing the same thing:

Customers.Where( c => c.Country == "USA" );

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

TSource is Customers. In method Where(), c is the input parameter to function (remember it represents something like a Customers row, so it has columns as properties). it returns the result of check: c.Country == "USA" which is a true or false. We could write the same thing like this:

Customers.Where( ( c ) => c.Country == "USA" );

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

When parameter count is one, we wouldn’t want to do that however and simply use the previous syntax. When parameter count is multiple we put them within parentheses.

For readability we are allowed to write the exact same thing like this (the newline and extra whitespace between Customers and .Where is insignificant):

IEnumerable<Customers> usaCustomers = Customers
                                      .Where( c => c.Country == "USA" );

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 

Now that I think we are armed with basic syntax we can move on. LINQ operators are divided into categories. Members of categories may be different  from source to source but in general all sources have same categories (and some operators are listed in more than one category).

Filtering and Partitioning Operators:

Where:  This is the filtering operator which any VFP developer would be most familiar with. It has two forms

  1. Typical function getting a source and returning a boolean (filters out for the ones returning false)

    var usaCustomers = Customers.Where( c => c.Country == "USA" );

    .csharpcode, .csharpcode pre
    {
    font-size: small;
    color: black;
    font-family: consolas, “Courier New”, courier, monospace;
    background-color: #ffffff;
    /*white-space: pre;*/
    }
    .csharpcode pre { margin: 0em; }
    .csharpcode .rem { color: #008000; }
    .csharpcode .kwrd { color: #0000ff; }
    .csharpcode .str { color: #006080; }
    .csharpcode .op { color: #0000c0; }
    .csharpcode .preproc { color: #cc6633; }
    .csharpcode .asp { background-color: #ffff00; }
    .csharpcode .html { color: #800000; }
    .csharpcode .attr { color: #ff0000; }
    .csharpcode .alt
    {
    background-color: #f4f4f4;
    width: 100%;
    margin: 0em;
    }
    .csharpcode .lnum { color: #606060; }

  2. A function getting an index parameter besides source (this one is not supported with Linq To SQL or Linq to Entity Framework because there isn’t a supporting SQL for this. Practically speaking methods that do not a SQL counterpart do not work with L2S).

    int[] numbers = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
     
    var some = numbers
        .Where((c, index) => ( 
                             (c%2 == 0) &&  
                             (index >= 8) && (index <= 15)));
     
    some.Dump("Even numbers between positions 8-15");

    .csharpcode, .csharpcode pre
    {
    font-size: small;
    color: black;
    font-family: consolas, “Courier New”, courier, monospace;
    background-color: #ffffff;
    /*white-space: pre;*/
    }
    .csharpcode pre { margin: 0em; }
    .csharpcode .rem { color: #008000; }
    .csharpcode .kwrd { color: #0000ff; }
    .csharpcode .str { color: #006080; }
    .csharpcode .op { color: #0000c0; }
    .csharpcode .preproc { color: #cc6633; }
    .csharpcode .asp { background-color: #ffff00; }
    .csharpcode .html { color: #800000; }
    .csharpcode .attr { color: #ff0000; }
    .csharpcode .alt
    {
    background-color: #f4f4f4;
    width: 100%;
    margin: 0em;
    }
    .csharpcode .lnum { color: #606060; }

1st syntax simply select customers from USA. 2nd syntax limits the "records" to position (sort of recno()) to 8-15 and within that range select numbers that are even.

Skip: Skips N elements.

int[] numbers = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
var some = numbers.Skip(5);
some.Dump("Skipped first 5 numbers");

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 

In T-SQL it looks like:

select number from 
(select number,Row_Number() over (order by number) as row from numbers) numberList
where row > 5 

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 

Actually if you run this against Northwind:

Orders.OrderBy(o => o.OrderDate).Skip(5)

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

This is the generated T-SQL code:

-- Region Parameters
DECLARE @p0 Int SET @p0 = 5
-- EndRegion
SELECT [t1].[OrderID], [t1].[CustomerID], [t1].[EmployeeID],
   [t1].[OrderDate], [t1].[RequiredDate], [t1].[ShippedDate], 
   [t1].[ShipVia], [t1].[Freight], [t1].[ShipName], [t1].[ShipAddress],
   [t1].[ShipCity], [t1].[ShipRegion], [t1].[ShipPostalCode], [t1].[ShipCountry]
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY [t0].[OrderDate]) AS [ROW_NUMBER], 
        [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], 
        [t0].[OrderDate], [t0].[RequiredDate], [t0].[ShippedDate], 
        [t0].[ShipVia], [t0].[Freight], [t0].[ShipName], [t0].[ShipAddress], 
        [t0].[ShipCity], [t0].[ShipRegion], [t0].[ShipPostalCode], [t0].[ShipCountry]
    FROM [Orders] AS [t0]
    ) AS [t1]
WHERE [t1].[ROW_NUMBER] > @p0
ORDER BY [t1].[ROW_NUMBER]

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 

You can also achieve the same result with a query like:

SELECT * from Orders
WHERE OrderId not in 
   (select top 5 OrderID from orders ORDER BY OrderDate)

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 

Take: Takes top N – IOW typical TOP N query operator.

Orders.OrderBy(o => o.OrderDate).Take(5)

 
T-SQL counterpart:

SELECT TOP 5 * FROM [Orders] order by OrderDate

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 
Both Skip and Take have another version which is not supported by L2S:
 
SkipWhile: Skips over elements as long as a condition is true. It has 2 overloads:

  1. First overload skips while a condition is true.

    int[] numbers = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
    var some = numbers.SkipWhile( n => n < 5 );
    some.Dump("Skipped while number is less than 5");

    .csharpcode, .csharpcode pre
    {
    font-size: small;
    color: black;
    font-family: consolas, “Courier New”, courier, monospace;
    background-color: #ffffff;
    /*white-space: pre;*/
    }
    .csharpcode pre { margin: 0em; }
    .csharpcode .rem { color: #008000; }
    .csharpcode .kwrd { color: #0000ff; }
    .csharpcode .str { color: #006080; }
    .csharpcode .op { color: #0000c0; }
    .csharpcode .preproc { color: #cc6633; }
    .csharpcode .asp { background-color: #ffff00; }
    .csharpcode .html { color: #800000; }
    .csharpcode .attr { color: #ff0000; }
    .csharpcode .alt
    {
    background-color: #f4f4f4;
    width: 100%;
    margin: 0em;
    }
    .csharpcode .lnum { color: #606060; }

  2. Second overload skips while a condition is true and also gets an index parameter as input.

    int[] numbers = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
    var some = numbers.SkipWhile( (n, index) => index < 5);
    some.Dump("Skipped while index position is less than 5");

    .csharpcode, .csharpcode pre
    {
    font-size: small;
    color: black;
    font-family: consolas, “Courier New”, courier, monospace;
    background-color: #ffffff;
    /*white-space: pre;*/
    }
    .csharpcode pre { margin: 0em; }
    .csharpcode .rem { color: #008000; }
    .csharpcode .kwrd { color: #0000ff; }
    .csharpcode .str { color: #006080; }
    .csharpcode .op { color: #0000c0; }
    .csharpcode .preproc { color: #cc6633; }
    .csharpcode .asp { background-color: #ffff00; }
    .csharpcode .html { color: #800000; }
    .csharpcode .attr { color: #ff0000; }
    .csharpcode .alt
    {
    background-color: #f4f4f4;
    width: 100%;
    margin: 0em;
    }
    .csharpcode .lnum { color: #606060; }

TakeWhile: Takes elements as long as the condition is true. Similar to SkipWhile it has same 2 overloads.
 

int[] numbers = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
var some = numbers.TakeWhile( n => n < 5);
some.Dump("Take while number is less than 5");

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

int[] numbers = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17};
var some = numbers.TakeWhile( (n, index) => index < 5);
some.Dump("Take while index position is less than 5");

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

 

You can combine or in other words chain these operators. Here is a sample (and already we did one before using OrderBy() above):

Orders
 .OrderBy( o => o.CustomerID)
 .Where( o => o.Customer.Country == "USA" )
 .Skip(5)
 .Take(3)

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

Note that using Skip() and Take() combination you can easily create a "paging query" for a web application. For example this gets the records 21-30 (Page 3 where pages are 10 records high) in a particular order of a paged query:

Customers.OrderBy(c => c.CustomerID).Skip( 2*10 ).Take(10)

.csharpcode, .csharpcode pre
{
font-size: small;
color: black;
font-family: consolas, “Courier New”, courier, monospace;
background-color: #ffffff;
/*white-space: pre;*/
}
.csharpcode pre { margin: 0em; }
.csharpcode .rem { color: #008000; }
.csharpcode .kwrd { color: #0000ff; }
.csharpcode .str { color: #006080; }
.csharpcode .op { color: #0000c0; }
.csharpcode .preproc { color: #cc6633; }
.csharpcode .asp { background-color: #ffff00; }
.csharpcode .html { color: #800000; }
.csharpcode .attr { color: #ff0000; }
.csharpcode .alt
{
background-color: #f4f4f4;
width: 100%;
margin: 0em;
}
.csharpcode .lnum { color: #606060; }

Posted in Uncategorized | Leave a comment

>A few Linq resources

>

It has been a long time since I posted my last entry. Sorry about that. I still don’t have time for long posts but I thought I could at least provide some pointers to resources, samples, tools and such.

Here is a new book that is published recently:

Essential LINQ
by Charlie Calvert and Dinesh Kulkarni
Publisher: Addison-Wesley Professional
Pub Date: March 12, 2009
Print ISBN-10: 0-321-56416-2
Print ISBN-13: 978-0-321-56416-0
Web ISBN-10: 0-321-60475-X
Web ISBN-13: 978-0-321-60475-0
Pages: 600

I haven’t read this book myself and learned about it today just by accident while testing a Linq sample delivered by Microsoft:)

Be sure to check samples in LinqPad and if you have Visual Studio 2008 check the samples delivered with VS located at:

C:\Program Files\Microsoft Visual Studio 9.0\Samples\1033\CSharpSamples\LinqSamples

Samples do not exist in Express version as far as I know but you could get the online version which is updated and maybe more recent then the one you have from:

http://code.msdn.microsoft.com/csharpsamples

Some more Linq flavors (not released yet – alpha,beta):

  • Parallel Linq
  • Linq to SharePoint
  • Linq to Active Directory

And finally I want to include a short reply to a question asked by a friend:

Question: Is Linq about querying data or can we use it to update the data too?

Answer: As I remember I told this before, Linq might be a misnomer and it is also used for updating the data where applicable. How we do it depends on which flavor we are using, but the key point is that while doing that we use an "object oriented" approach.

Posted in Uncategorized | Leave a comment

>Query keywords – join, let

>

join: Similar to SQL join, the join keyword is used to join sets. One thing you should note that, if there is a direct relationship in the object model a join is not needed. This behavior differs from SQL. Consider the infamous Northwind database. In SQL server (and in VFP northwind.dbc sample database) there are relations between the tables defined.  You can’t simply create joins without specifying a relation in your SQL:

select * from Customers, Orders

would be a newbie’s error (most of the time) in an attempt to join two tables and expect them to be joined using CustomerId. End result would be a Cartesian join matching each Customer with every Order in Orders set. With Linq you can make the same mistake but it is harder to do:

from c in Customers
from o in Orders
select new { c.CustomerID, o.OrderID }

With the above SQL our intention would likely be an equijoin:

select Customers.CustomerID, Orders.OrderID from Customers, Orders where Customers.CustomerID = Orders.CustomerID

select Customers.CustomerID, Orders.OrderID from Customers inner join Orders on Customers.CustomerID = Orders.CustomerID

select Customers.CustomerID, Orders.OrderID from Customers left join Orders on Customers.CustomerID = Orders.CustomerID

First two are typical inner join queries and 3rd left outer join. In Linq writing such queries is easy. You could write the inner join like this one:

from c in Customers
from o in c.Orders
    select new { c.CustomerID, o.OrderID }

and left outer join like this one:

from c in Customers 
    select new { c.CustomerID, OrderID = c.Orders.Select( o => o.OrderID ) }

While you could use "join" in these, join in Linq is for creating joins on properties that doesn’t already define a relationship in object model (it is easy to forget we are doing a query against objects and object models). Within a database it is unlikely to make join queries that are not already defined with relationships but you may never know what you need. And you wouldn’t always use Linq against collections that come from database based or some other type of well defined object model. We will however see "join" here with already well defined relationships just for the sake of your familiarity with those table and key field names. It makes it easier to guess the result without actually seeing it then.

from c in Customers
join o in Orders on c.CustomerID equals o.CustomerID
select new {c.CustomerID, o.OrderID }

I should also note that using join on primary – foreign key properties causes an "inner join" to be generated. There is also another type of join called "Group Join". You can create group join using "join" and "into" clauses (keywords) together:

from c in Customers
join o in Orders on c.CustomerID equals o.CustomerID into orderGroup
select new {c.CustomerID, orderGroup }

Or if we remember that we are working with objects could write it like this as well:

from c in Customers
join o in Orders on c equals o.Customer into orderGroup
select new {c.CustomerID, orderGroup }

But also there is one another and maybe easier to write and understand variation of the same query (and both generate the same left outer join) – provided there is a relationship that we can use in object model (you may call it "relationship" or "navigational property"):

from c in Customers select new {c.CustomerID, c.Orders}

let: Let is a cool keyword  that allows you to store inline expressions. Those who use SQL server 2005 and later know that how much CTE (Common Table Expression – with {…} as ) is welcomed easing the writing of queries and increasing the performance as well. Let keyword is not CTE of course but looks very similar. Using it you assign a variable a complete expression inline creating a "sub-expression". That sub-expression is evaluated only once and can be used in operations within the outer expression. Now let’s see how let lets you < g > write more succinct expressions. I will start with an exact copy from LinqPad’s 5 minutes samples:

from p in Products
let spanishOrders = p.OrderDetails.Where (o => o.Order.ShipCountry == "Spain")
where spanishOrders.Any()
orderby p.ProductName
select new
{
    p.ProductName,
    p.Category.CategoryName,
    Orders = spanishOrders.Count(),   
    TotalValue = spanishOrders.Sum (o => o.UnitPrice * o.Quantity)
}

Here spanishOrders is an inline variable (well not very true to call it variable, purists should forgive me) whose value is assigned from 

p.OrderDetails.Where (o => o.Order.ShipCountry == "Spain")  

expression. This expression is the same as this one written in comprehension syntax:

from o in p.OrderDetails where o.Order.ShipCountry == "Spain" select o

(and you could replace it with this one enclosing it in parentheses). It is later used in 3 places:

where spanishOrders.Any()
orderby p.ProductName
select new
{
    p.ProductName,
    p.Category.CategoryName,
    Orders = spanishOrders.Count(),   
    TotalValue = spanishOrders.Sum (o => o.UnitPrice * o.Quantity)
}

I think you can see it without any further explanation. It is acting like an extraordinary SQL subquery (but it doesn’t need to be something like a subquery, any expression would do and if that expression is returning something queryable you can query it). Here is another sample (this sample is from .Net documentation, you can see the original here):

string[] strings =
        {
            "A penny saved is a penny earned.",
            "The early bird catches the worm.",
            "The pen is mightier than the sword."
        };

// Split the sentence into an array of words
// and select those whose first letter is a vowel.
var earlyBirdQuery =
    from sentence in strings
    let words = sentence.Split(‘ ‘)
    from word in words
    let w = word.ToLower()
    where w[0] == ‘a’ || w[0] == ‘e’
        || w[0] == ‘i’ || w[0] == ‘o’
        || w[0] == ‘u’
    select word;

earlyBirdQuery.Dump();

There are two let keywords here. First one gets each sentence and stores the result of Split(‘ ‘) function to words. "words" is another enumerable ( words of sentences in string ) and it is followed by another query ( from word in words ). For each word second "let" stores the result of word.Lower() to "w". "where" clause use that w multiple times. Without "let" word.Lower() itself would be called multiple times ( where w[0] == ‘a’ || w[0] == ‘e’ … w[0] == ‘u’ ).

Here is another sample which I wrote as a reply on MSDN C# forum. The question was (as I remember it):

There is tickdata like this one:

create table tickdata (timed datetime, price money, quantity int)
insert into tickData values (’06:01′,100,20)
insert into tickData values (’06:02′,90,30)
insert into tickData values (’06:03′,129,10)
insert into tickData values (’06:11′,112,8)
insert into tickData values (’06:12′,150,60)
insert into tickData values (’06:20′,110,10)
insert into tickData values (’06:23′,120,5)
insert into tickData values (’06:24:59′,130,15)
insert into tickData values (’06:25′,140,35)

From this data generate stock trade like data that looks like:

period open low high close volume
06:00 – 06:05 100.0000 90.0000 129.0000 129.0000 60
06:10 – 06:15 112.0000 112.0000 150.0000 150.0000 68
06:20 – 06:25 110.0000 110.0000 130.0000 130.0000 30
06:25 – 06:30 140.0000 140.0000 140.0000 140.0000 35

where period is defined as 5 minute intervals starting from the first available tickdata time and shows for each period with data. This was the Linq solution I came up with (besides "let" keyword there is grouping and query methods):

from myData in Tickdata.OrderBy( t => t.Timed ).AsEnumerable()
  group myData by ((int)myData.Timed.Value.TimeOfDay.TotalMinutes) / 5 into barData
  let bStart = TimeSpan.FromMinutes( barData.Key * 5 ).ToString()
  let bEnd   = TimeSpan.FromMinutes( (barData.Key+1) * 5 ).ToString()
  select new
  { period = bStart.Substring(0,5) + " – " + bEnd.Substring(0,5),
    open   = barData.FirstOrDefault().Price,  
    low    = barData.Min( bd => bd.Price ),
    high   = barData.Max( bd => bd.Price ),
    close  = barData.LastOrDefault().Price,
    volume = barData.Sum( bd => bd.Quantity ) }

This concludes our query keywords series. Other keywords are conceptual ones used to complete syntax (in, on, equals, by, ascending, descending).

Posted in Uncategorized | Leave a comment

>Query keywords – orderby

>

orderby: Is used for sorting the records. In comprehension syntax it is just like we use it in SQL. i.e:

from c in Customers
orderby c.Country, c.CompanyName descending
select c

Here we are ordering by Country (default ascending) and then by CompanyName in descending order.

In method syntax there are 4 methods:

OrderBy()
OrderByDescending()
ThenBy()
ThenByDescending()

Using method syntax above sample would be written as:

Customers
  .OrderBy( c => c.Country )
  .ThenByDescending( c => c.CompanyName )
  .Select( c => c)

All 4 methods support a custom comparison method as a second parameter (for example you could create a method that instead of widely known alphabetical sort, sorts the strings based on a custom ordering – "10" comes after "2").

Here is a sample modified from LinqPad’s OrderBy sample:

string[] strings = { "Tom", "Çetin",
   "I come before i", "Can", "in",
   "Dick", "Harry", "Mary", "Jay",
   "ş is between s and t", "smith"};

strings.OrderBy (n => n, StringComparer.Create( new System.Globalization.CultureInfo("tr-TR"), true))
    .Dump ("Case insensitive ordering based on Turkish alphabet");

This second parameter is not supported in Linq to SQL.

Posted in Uncategorized | Leave a comment