Back to blogging

It has been a long time, really long, maybe 9-10 years, since I last posted something to my blog. One of the reasons was, I initially started blogging on Microsoft Windows Live Spaces, without thinking it was Microsoft.

Probably it is the VFP community who knows best that Microsoft should not be trusted, I did 😦 Later, suddenly Microsoft pulled the plug off for Windows Live Spaces, sounds familiar? When I was just getting a hang of blogging, my blog was thrown on to WordPress. WordPress may be a wonderful place for blogging, but I couldn’t handle another learning curve since then.

Today, before starting again, I checked how could I write blog that would include many code and easy to write locally using markdown. For wordpress, what I found was, if I wanted to do it locally and have code, then I would need to make many hops. For some security reasons, WordPress didn’t allow embedding code and I would need to write the codes online. That didn’t sound right to me but that was what I have found after a short research. I believe there is a simple way to do it with Worpress, but searching for it is not something I would do now.

Luckily, I had started blogging not only on MS Live spaces, but also duplicated the blog on Google’s blogspot with the nickname FoxSharp. I should have remembered that earlier! After a short search, I found that I could write locally using markdown, preview while writing and then simply post there, just what I wanted. So here I am again.

I will post this one on to both blogs, at WordPress and FoxSharp at blogspot. Then supposedly, I will continue on FoxSharp only.

My first post will be about a series of routines that make it easier using Microsoft SQL Server from VFP. I promised I would supply the codes for the public, even if I may not have time to sample and explain the usage of every piece. I thought, instead of writing them in a forum thread, it would be better to have it in a blog. This way, I could update samples and explanations in time.

See you in my next post.

 

Posted in Uncategorized | Leave a comment

Hello world!

Welcome to my blog. Moving my blog from Windows live spaces and this is my first post here. It will take some time to get accustomed to. I hope this would be a new and motivating experience.

Posted in Uncategorized | Leave a comment

>LINQ Operators – Part X

>

I have been busy with a series of tasks and couldn’t find time to complete operators section. Although I had a safari account via university subscription and could read it online, I ordered a great book "C# 3.0 in a nutshell – O’Reilly – Joseph Albahari & Ben Albahari". Amazon estimate was I would get it sometime after August 11th but to my surprise I got it yesterday and maybe that has motivated me to spare extra time and write today’s (well tonight’s) blog. That is not the first book I read about C# and possibly won’t be the last but I must admit I liked it very much.

Conversion operators:

OfType: Returns the items from a sequence that are of a given type. This is especially useful with object collections where the collection can have multiple different types and ArrayList is a typical sample.  

ArrayList al = new ArrayList();
al.Add(5);
al.Add(7);
al.Add("Jim");
al.Add("Frank");
al.AddRange( Customers.Where( c => c.Country == "USA" ).ToList());

al.OfType<int>().Dump("Integers");
al.OfType<string>().Dump("Strings");
al.OfType<Customers>().Dump("Customers");

and here is the partial output in LinqPad:

OfType

With generic lists it is unlikely that you would need OfType.

Cast: Cast returns a sequence attempting to cast each member to a given type. If it can’t cast one ten it throws an error.  The difference between OfType and Cast is that, Cast throws an error if it encounters an incompatible type for casting. OfType, on the other hand, ignores incompatible types and simply returns an empty sequence if it cannot cast any member.

ArrayList al = new ArrayList();
al.Add(5);
al.Add(7);
al.Cast<int>().Dump("Integers");
al.OfType<string>().Dump("Strings");

Cast

If we try the OfType<string>() with Cast<string>() we will get an exception. Probably, Cast was added to support ArrayList and I can’t think of a good usage for it.

AsEnumerable: Upcasts a sequence to an IEnumerable sequence. Converting to an Enumerable effectively makes the sequence local and then you can apply methods that are not supported on a remote (IQueryable) source. For example Linq To SQL doesn’t support Last() but you could make the sequence local with AsEnumerable() and apply Last(). This nice sample is borrowed from LinqPad samples:

Regex wordCounter = new Regex (@"\b(\w|[-‘])+\b");

// Click the ‘SQL’ tab below after running this query - 
// notice that only the topic filtering
// predicate executes on SQL Server.

var query = MedicalArticles
           .Where (article => article.Topic == "influenza")
           .AsEnumerable()
           .Where (article => wordCounter.Matches (article.Abstract).Count < 100);

query.Dump();

 

The sample is getting medical articles whose topic is "influenza" and abstract is less than 100 words. Getting articles whose topic is "influenza" is trivial and the part before AsEnumerable() does that. That part is converted to this SQL:

— Region Parameters
Declare @p0 VarChar(9) Set @p0 = ‘influenza’
— EndRegion
Select [t0].[ID], [t0].[Topic], [t0].[Abstract]
  FROM [MedicalArticles] As [t0]
  WHERE [t0].[Topic] = @p0

The tricky part is counting the words. To count words regular expression is used and that wouldn’t execute on Linq To SQL. Applying AsEnumerable() converts the sequence to a local one and makes it possible to count the words using regular expression. 

AsQueryable: Converts a sequence to an IQueryable either by downcasting or by creating an IQueryable wrapper.  Converted to an IQueryable you can execute the query locally or remotely.

 ToArray and ToList: Convert a sequence to an array or generic list respectively. The generated array or list is a snapshot taken at the moment the query is run. That is particularly useful when a query needs to be executed often and a snapshot is sufficient.

ToDictionary and ToLookup: These operators create a snapshot dictionary of a given sequence. A dictionary is a key, element pair and you can access a member using the key as a indexer. ToDictionary creates a dictionary where sequence has  a unique entry per key (typically a primary key as in a table). ToLookup on the hand looks like foreign key and there can be multiple values for the same key in a sequence. The key is used as an indexer in both.

var cusDict = Customers.ToDictionary( c => c.CustomerID, c => c );
cusDict["BONAP"].Dump();

gives this result (note how we use the unique CustomerID as a key and the particular value to "BONAP" to access a member in dictionary via indexer):

ToDictionary

var clookup = Orders.ToLookup( o => o.CustomerID, o => o);

clookup["BONAP"].Dump("Orders of customer BONAP");

yields:

ToLookup

Other operators:

SequenceEqual: Also accepted as a quantifier operator this operator compares two sequences and returns true if the sequences have identical elements and in the same order.

var s1 = "ClintEastwood".ToLower().OrderBy(l => l);
var s2 = "OldWestAction".ToLower().OrderBy(l => l);
s1.SequenceEqual(s2).Dump("An anagram from wikipedia");

 

This concludes the standard Linq operators. There are other extension methods that are not yet available in .Net 3.5 and since it is extensible we should expect more in the future. Happy LINQinq.

Posted in Uncategorized | Leave a comment

>LINQ Operators – Part IX

>

Element operators:

First, FirstOrDefault: Used to return the first element from a sequence. T-SQL counterpart is "top 1".

Last, LastOrDefault: Used to return the last element from a sequence. T-SQL counterpart is "top 1 … order by desc"

Single, SingleOrDefault; Used the return the single result from a sequence. Similar to First/FirstOrDefault but this one throws an exception if there are more than 1 results.

Using these methods you can either get First, Last or Single element from a sequence ( like a "locate" without any additional clause ). Or you can use a boolean predicate to filter "for" a match. Sometimes there wouldn’t be a match to the predicate you supplied and no element would be returned from a sequence. That would cause an exception to be thrown unless you have used the FirstOrDefault, LastOrDefault or SingleOrDefault. "Or default" versions return an element from sequence a default( TSource ) – which is "null" for reference types like string, Customer and blank for value types (generally 0). 

Customers.Where( c => c.Country == "UK")
Returns a sequence of Customers whether it may be 0 or more Customers. We could get the first UK customer as a single element of Customers type using First like this:
Customers.Where( c => c.Country == "UK").First()
or:
Customers.First( c => c.Country == "UK" )
 
Since the result is now an element we could directly get its properties if we wanted to like:
Customers.First( c => c.Country == "UK" ).CompanyName

The problem with First() is that if the country was something like "Turkey" (there is no customer from Turkey in Northwind sample database) then the sequence that First() would operate on would have no elements to choose from hence an exception would be thrown. FirstOfDefault() on the other hand, would return a default Customers element:

Customers.First( c => c.Country == "Turkey" ) // throws exception

Customers.FirstOrDefault( c => c.Country == "Turkey" ) // default Customers element – null

Last and LastOrDefault works like First/FirstOrDefault but is not supported in Linq To SQL.

int[] numbers = {3,8,6,8,3,4,7};
numbers.Last( n => n % 2 == 0).Dump(); // 4

Single and SingleOrDefault is useful when we are getting data using a unique key (typically a primary key):

string id1 = "BONAP";
string id2 = "XXXXX";

Customers customer1 = 
  Customers
  .SingleOrDefault( c => c.CustomerID == id1);

Customers customer2 = 
  Customers
  .SingleOrDefault( c => c.CustomerID == id2);

( customer1 == null 
  ? "No such customer exists." 
  : customer1.CompanyName ).Dump("Key:" + id1);
  
( customer2 == null 
  ? "No such customer exists." 
  : customer2.CompanyName ).Dump("Key:" + id2);

 

ElementAt, ElementAtOrDefault: Similar to other element operators, returns element at Nth position (or default(T) with ElementAtOrDefault when there is no element at given position). Not supported in Linq To SQL. You can think of it as record number or array position:

IEnumerable<Customers> cus = Customers.AsEnumerable();

cus.ElementAtOrDefault( 34 ).Dump("34th"); // HILAA
cus.ElementAtOrDefault( 134 ).Dump("134th"); // null

 

 DefaultIfEmpty: Is used for returning a default(TSource) when the sequence is empty (normally it would return a "sequence" if existed). Used in "left join" type queries. 

Customers.SelectMany (
   c => c.Orders.DefaultIfEmpty (), 
   (c, o) => new  
   {
      CustomerID = c.CustomerID, 
      CompanyName = c.CompanyName, 
      oid = ((Int32?)(o.OrderID) == null) ? 0 : o.OrderID, 
      OrderDate = o.OrderDate
   }
)

It supports a parameter of TSource defining "empty" value. Here is a sample from MSDN documentation:

class Pet
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static void DefaultIfEmptyEx2()
{
    Pet defaultPet = new Pet { Name = "Default Pet", Age = 0 };

    List<Pet> pets1 =
        new List<Pet>{ new Pet { Name="Barley", Age=8 },
                       new Pet { Name="Boots", Age=4 },
                       new Pet { Name="Whiskers", Age=1 } };

    foreach (Pet pet in pets1.DefaultIfEmpty(defaultPet))
    {
        Console.WriteLine("Name: {0}", pet.Name);
    }

    List<Pet> pets2 = new List<Pet>();

    foreach (Pet pet in pets2.DefaultIfEmpty(defaultPet))
    {
        Console.WriteLine("\nName: {0}", pet.Name);
    }
}

/*
This code produces the following output:

Name: Barley
Name: Boots
Name: Whiskers

Name: Default Pet
*/

Posted in Uncategorized | Leave a comment

>LINQ Operators – Part VIII

>

Aggregation operators

Aggregation operators are Average, Count, LongCount, Sum, Max, Min and Aggregate. Aggregation operators are available only as methods in C# (VB also have a comprehension syntax for aggregation operators).

Average: Gets the average of a numeric collection. T-SQL counterpart is Avg()

Sum: Sums the values in a numeric collection. T-SQL counterpart is Sum()

Min,Max: Gets minimum and largest values in a sequence. T-SQL counterparts are Min(), Max()

Count, LongCount: Counts the elements in a collection. T-SQL counterparts are Count(), Count_Big().

Aggregate: Performs custom aggregation on a set of values. T-SQL has no counterpart.

int[] scores = {60,75,89};
scores.Average().Dump("Average");
scores.Sum().Dump("Sum");
scores.Max().Dump("Max");
scores.Min().Dump("Min");
scores.Count().Dump("Count");
scores.LongCount().Dump("LongCount");

Count and LongCount have a filtering parameter that filters what to count based on a boolean expression as in this sample:

int[] numbers = {17,122,23,12,22};
numbers.Count( n => n%2 == 0).Dump("Even Numbers count");

Count and LongCount are same except that LongCount returns a long (64 bits integer).

All these methods (except Count and LongCount) have a "selector" parameter to specify "what" to sum,average,min or max. Here are some samples:

Customers.Select( 
   c => new 
   {
      customer = c.CompanyName,
      firstPurchase = c.Orders.Min( o => o.OrderDate ),
      lastPurchase = c.Orders.Max( o => o.OrderDate ), 
      orders = c.Orders.Count,
      totalSale = c.Orders.Count == 0 ? 0 :
      c.Orders.Sum( o => o.OrderDetails
              .Sum( od => od.Quantity * od.UnitPrice ))
   } )
   .OrderBy( c => c.customer )

This sample uses Min, Max, Count, Sum to generate a summary statistics on customer purchases. Here is a sample for getting averages:

double quantity = OrderDetails.Average( od => od.Quantity );
decimal price =  OrderDetails.Average( od =>  od.UnitPrice );
decimal sale = OrderDetails.Average( od => od.Quantity * od.UnitPrice );

Console.WriteLine("{0} {1} {2}", quantity, price, sale );

 

Aggregate is the interesting one here that is new to VFP developers (and SQL developers as well). Aggregate allows custom aggregations to be done and is supported only in local data (not supported by Linq To SQL). Here are some samples:

IEnumerable<int> numbers = Enumerable.Range(1,10);
int sum = numbers.Aggregate( 
  (current, next) => current + next );
int sumProduct = numbers.Aggregate( 
  (current, next) => current * next );

sum.Dump();
sumProduct.Dump();

First one is simply a custom implementation of Sum(). Second one gets a product of the numbers in list ( 1 * 2 * 3 * 4 … 10 ). Next example shows that to use aggregate it doesn’t have to be a numeric.

Customers.Select(c => c.CustomerID).ToArray().
    Aggregate(   (current, next) =>  current  + ", " +next )

This sample creates a comma separated list of all customer ID values. Note that we get the customer IDs to an array first to make the collection local.

Aggregate have two more signatures. Next one uses an initial "seed" value (in the above samples the seed value was the first element in the list). I think VS Linq samples have one of the best samples for this one:

        [Category("Aggregate Operators")]
        [Title("Aggregate – Seed")]
        [Description("This sample uses Aggregate to create a running account balance that " +
                     "subtracts each withdrawal from the initial balance of 100, as long as " +
                     "the balance never drops below 0.")]
        public void Linq93() {
            double startBalance = 100.0;
            
            int[] attemptedWithdrawals = { 20, 10, 40, 50, 10, 70, 30 };
            
            double endBalance = 
                attemptedWithdrawals.Aggregate(startBalance,
                    (balance, nextWithdrawal) =>
                        ( (nextWithdrawal <= balance) ? (balance – nextWithdrawal) : balance ) );
            
            Console.WriteLine("Ending balance: {0}", endBalance);
        }

 

Next version of Aggregate allows you to have a custom selector projection on the result. 

Products.Select( p => new { p.ProductName, p.Category.CategoryName }).ToArray()
.Aggregate( 
       new {Beverages=0,Condiments=0,Other=0}, // initial seed value
       (acc,next) => next.CategoryName == "Beverages" 
                     ? new {Beverages=acc.Beverages+1, Condiments=acc.Condiments, Other=acc.Other}
                     : ( next.CategoryName == "Condiments"
                     ? new {Beverages=acc.Beverages, Condiments=acc.Condiments+1, Other=acc.Other}
                     : new {Beverages=acc.Beverages, Condiments=acc.Condiments, Other=acc.Other+1} ),
       result => String.Format("Beverages: {0}, Condiments: {1}, Others:{2}", result.Beverages, result.Condiments, result.Other)  // projection
)

It looks confusing and yes it is. Probably you wouldn’t want to aggregate like this but use a for loop. I wrote this one just to show how you could initialize a complex (here an anonymous type with Beverages,Condiments and Other properties) seed value, do accumulation based on category names and finally project the result to a custom string.

Posted in Uncategorized | Leave a comment

>LINQ Operators – Part VII

>

Generation operators:

Range: Given a start and count generates a list of integers.

Enumerable.Range( 3, 10)

Generates integers 3,4 … 12.

Repeat: Is similar to replicate() –but not limited to strings- and generates a list of a given element (either a single element or not) repeating it N times.

Enumerable( ‘A’, 5).Dump(); // Char A 5 times
Enumrable.Repeat(1, 3).Dump(); // int 1 3 times
var o = Customers.Single( c => c.CustomerID == "ALFKI" ); // Single is a method specifying that the result is a single element
Enumerable.Repeat(o,10).Dump(); Customers o 10 Times

var customers = Customers.Where( c => c.Country == "USA"  );
Enumerable.Repeat(customers, 3).Dump(); // a series of customers repeated 3 times 

Note that Repeat() repeats the given source which is a singleton element or a collection.

Empty: Generates an "empty" sequence of a given type of source. Though it is "empty", it is still a representation of given type rather than NULL.

var customers = Enumerable.Empty<Customers>();
customers.Dump(); // nothing
customers.Count().Dump(); // 0

Quantifier Operators:

Contains: Returns true if a given sequence contains a given element. T-SQL counterpart is IN, Exists.

string[] countries = "USA,UK,Germany".Split(‘,’);
Customers.Where( c => countries.Contains(c.Country)).Dump();

Selects customers where Customers.Country IN ("USA","UK","Germany").

Any: Returns true if a sequence have "any" elements.  T-SQL counterpart is Exists, IN.

Customers.Where( c => c.Orders.Any() )

Returns all Customers that has at least one Order.

Customers.Where(c => c.Orders.Any( o => o.OrderDetails.Count() > 5))

Returns all customers who have an Order with at least 6 items. Here is the result set using Northwind (picture cut):

Any

As you may have noticed Any() can do what Contains() do. For example we could rewrite Contains sample:

string[] countries = "USA,UK,Germany".Split(‘,’);
Customers
    .AsEnumerable()
    .Where( c => countries.Any( cn => cn == c.Country )).Dump();

However in case of an IQueryable (like Linq To SQL or Entity Framework) Contains() is supported to have a local sequence while Any() is not supported – and its syntax is easier for this sample.

All: All is like Any() but always have a filter condition where all elements of a sequence should match for All() to return true.

Orders
    .Where( o => o.OrderDetails
                            .All( od => od.Product.Category.CategoryName == "Beverages" ))

Selects all the orders where all order items are from "Beverages" category. You can take a look what those order ID, product name and category names are using the methods we learned earlier:

Orders
  .Where( o => o.OrderDetails
  .All( od => od.Product.Category.CategoryName == "Beverages" ))
  .SelectMany( 
    o => o.OrderDetails,
    (o, od) => new  
   {
       OrderID = o.OrderID, 
       Product = od.Product.ProductName,
       Category = od.Product.Category.CategoryName
   })
  .OrderBy( o => o.OrderID )
Posted in Uncategorized | Leave a comment

>LINQ Query Operators – Part VI

>

Set operators

In set operators we will discuss Distinct(), Union(), Intersect(), Except() and Concat().

Distinct: Removes the duplications and selects unique items. T-SQL counterpart is "Distinct".

Union: Returns the union of items from multiple sets. In other words adds one set’s elements to other removing duplicates. T-SQL counterpart is "Union".

Concat: Returns the union of two sets without dropping the duplicates. T-SQL counterpart is "Union All".

Intersect: Returns elements from two sets which exists in both sets. T-SQL counterpart "exists", "In (….)" (SQL2008 "intersect").

Except: Returns elements from a set which doesn’t exist in other set. T-SQL counterpart "not exists", "not In (….)" (SQL2008 "except").

Customers.Select(c=>c.Country).Distinct()

is the same as:  

Select DISTINCT Country FROM Customers 

If we have used:

Customers.Distinct()

The SQL would be:

Select * from Customers

Just as we expected. (The actual generated SQL lists the fields instead of using *)

Distinct accepts a parameter to define an "equality" interface (IEqualityComparer<TSource>). So you can for example accept the items that have say "Netherlands" and "Holland" as same while applying a distinct operation. This IEqualityComparer interface is not only used by Distinct but by a series of methods like Group, GroupJoin, Intersect, Except, Union. Before we go into sampling IEqualityComparer lets make a note that objects are compared using equality comparison and hashcode. Hashcode is a "signature" of object. Two different objects have different hashcode even if their properties are same. It is not hard to understand this. In VFP consider:

use myTable
scatter name o1
skip
scatter name o2

First two records might exactly have the same data in them and hence o1, o2 would match in their properties but still they are 2 different objects as shown in the sample code below:

Create Cursor sampledata ( firstName c(10), lastname c(10))
Insert Into sampledata Values ("Joe","Doe")
Insert Into sampledata Values ("Joe","Doe")
Insert Into sampledata Values ("Frank","Doe")

Locate
Scatter Name o1
Skip
Scatter Name o2

Clear
? o1.firstName,o1.lastname
? o2.firstName,o2.lastname

? "FirstNames are equal",o1.firstName == o2.firstName
? "LastNames are equal", o1.lastname == o2.lastname
? "Objects have identical properties", Compobj(o1,o2)
? "Objects are same", o1 = o2

Lets create some SQL server data to see all these visually:

create database LinqTest
go

use LinqTest
go

create table badData (
  personId int identity primary key,
  firstName varchar(15),  lastName varchar(20),
  email1 varchar(50),  email2 varchar(50),
  phone1 varchar(15),  phone2 varchar(15),  phone3 varchar(15))

insert into badData 
  (firstName,lastName, email1,email2, phone1,phone2,phone3)
  values
  (‘John’, ‘Doe’, ‘jdoe@myhost1.com’,’jdoe@myhost2.com’,’111-111 11 11′, ‘111-111 11 12’, ‘111-111 11 13’)

insert into badData 
  (firstName,lastName, email1,email2,phone1,phone2,phone3)
  values
  (‘John’, ‘Doe’, ‘jdoe@myhost1.com’,’jdoe@myhost2.com’,’111-111 11 11′, ‘111-111 11 12’, ‘111-111 11 13’)

insert into badData 
  (firstName,lastName, email1,email2,phone1,phone2,phone3)
  values
  (‘John’, ‘Doe’, ‘jdoe@myhost2.com’,’jdoe@myhost1.com’,’111-111 11 13′, ‘111-111 11 11’, ‘111-111 11 12’)

insert into badData 
  (firstName,lastName, email1,email2,phone1,phone2,phone3)
  values
  (‘Joe’, ‘Doe’,’jdoe@myhost2.com’,”,  ”, ”, ”)
 
insert into badData 
  (firstName,lastName, email1,email2,phone1,phone2,phone3)
  values
  (‘Joseph’, ‘Doe’, ”,”, ‘111-111 11 13’, ”, ”)

insert into badData 
  (firstName,lastName, email1,email2,phone1,phone2,phone3)
  values
  (‘Frank’, ‘Doe’, ‘fdoe@myhost1.com’,’fdoe@myhost2.com’,’111-111 11 11′, ”, ”)

insert into badData 
  (firstName,lastName, email1,email2,phone1,phone2,phone3)
  values
  (‘John’, ‘Doe’,’jdoe@myhost3.com’,’jdoe@myhost4.com’, ‘222-111 11 11’, ‘222-111 11 12’, ‘222-111 11 13’)
go

Select * from badData

Create the same data as badData2 – to use with Union and see the effect of hashcode. Now we will call Distinct() and Union() using this data. Before doing that lets check our data. It is a bad data design, but here did that on purpose. First two are plain duplicate entry. 3rd one is a duplication where data is in different columns (email1, email2 swapped, phone1,phone2,phone3 are scrambled). 4th, 5th and 6th have different names and only part of the data is entered but they match to others, which is likely to be another kind of duplication (or not a duplication but suppose we want to call offices and call each office only once, then first 6 are the same office). Last one is a match on name but neither his emails nor phones match others which we would think of as another person. Here is our code:

void Main()
{
   var b1 =
      BadData.Select( b =>
       new Person {
         FirstName = b.FirstName, LastName = b.LastName,
         Email1 = b.Email1, Email2 = b.Email2,
         Phone1 = b.Phone1, Phone2 = b.Phone2, Phone3 = b.Phone3
       });

   var b2 =
      BadData2.Select( b =>
       new Person {
         FirstName = b.FirstName, LastName = b.LastName,
         Email1 = b.Email1, Email2 = b.Email2,
         Phone1 = b.Phone1, Phone2 = b.Phone2, Phone3 = b.Phone3
       });

   b1.Union( b2 ).Dump("b1 union b2 – Union in SQL");
   b1.Distinct().Dump("b1 distinct – Distinct in SQL");

   PersonEqualityComparer cmp = new PersonEqualityComparer();
   b1.AsEnumerable().Union( b2.AsEnumerable(), cmp ).Dump("b1 union b2 with comparer");
   b1.AsEnumerable().Distinct( cmp ).Dump("b1 distinct with comparer");
}

// Define other methods and classes here
public class Person
{
  public string FirstName {get;set;}
  public string LastName {get;set;}
  public string Email1 {get;set;}
  public string Email2 {get;set;}
  public string Phone1 {get;set;}
  public string Phone2 {get;set;}
  public string Phone3 {get;set;}
}

public class PersonEqualityComparer : IEqualityComparer<Person>
{
   public bool Equals(Person x, Person y) {
      return IsMatch(
         new string[]{x.Email1,x.Email2},
         new string[]{y.Email1,y.Email2}
         ) || IsMatch(
         new string[]{x.Phone1,x.Phone2,x.Phone3},
         new string[]{y.Phone1,y.Phone2,y.Phone3});
   }

   public int GetHashCode(Person obj) {
      return obj.GetHashCode();
   }

   private bool IsMatch(string[] words1,string[] words2) {
      return (
         words1.Where(w => !String.IsNullOrEmpty( w ))
         .Intersect(
         words2.Where(w => !String.IsNullOrEmpty( w ))
         ).Count() > 0);
   }
}

Here are the results:

UnionAndDistinct

It is what you expected for the first two results (distinct and union is done directly in T-SQL but next two looks surprising. If I didn’t use a new Person class then those would be same as the first two results. But we created a custom comparer to mean that:

"If any email or phone matches then it is a duplicate"

so why would we get too many results? Much more than the one without a custom comparison? Well, we defined a custom Equals() method but we still return the Hash Code of the original object. Remember to be accepted same not only should Equals() return true but hash codes must match as well. For this reason none of them is equal and union returns a result like as if we wanted to do a "union all". To correct it we would just change the GetHashCode() implementation and see what we would get: 

   public int GetHashCode(Person obj) {
      return "".GetHashCode();
   }

and here are our new result sets (last two displayed):

UnionAndDistinct2

Now that we have covered defining our own equality operator, Union and Distinct lets check other methods Concat,Intersect and Except.

Customers
  .Where( c => c.Country == "USA" )
  .Select( c => new {c.CustomerID,c.CompanyName,c.Country} )
  .Concat( 
Customers
  .Where( c => c.Country == "UK" )
  .Select( c => new {c.CustomerID,c.CompanyName,c.Country} )
)

This sample is a typical "UNION ALL" of USA customers set and UK customers set ( Select( c => new … ) projection is for sampling only, if we didn’t have Select() parts it would be all fields).

Products.Where( p => p.OrderDetails.Any( od => od.Order.Customer.Country == "Argentina" ))

Here we use Any() which we will see later in "Quantifier" operators. What it does is to return true if any item in the set has a match to its filter expression ( T-SQL exists or any). This query is selecting products that are bought by Argentina customers. Here is the result set (partial):

ArgentinaPartial

Products.Where( p => p.OrderDetails.Any( od => od.Order.Customer.Country == "Argentina" ))
.Intersect(
Products.Where( p => p.OrderDetails.Any( od => od.Order.Customer.Country == "Belgium" ))
)

In this query Second part is doing the same for Belgium customers creating a seconds set. Then using INTERSECT we are getting the Products which are common to both sets. Here is the result set (picture cut for space):

intersect

As you should notice there are 14 common items (Argentina customers bought 32 different products). Looking at the results we could say that products with Id 5, 7, 11, 13, 14, 16, 32,34 … are not bought by Belgium customers. Lets find out which products bought by Argentina customers were not on demand by Belgium customers (in other words: what are the products that are bought by Argentina customers EXCEPT those are also bought by Belgium customers).

Products.Where( p => p.OrderDetails
  .Any( od => od.Order.Customer.Country == "Argentina" ))
.Except(
Products.Where( p => p.OrderDetails
  .Any( od => od.Order.Customer.Country == "Belgium" ))
)

Here is the result set:

except 

Like in Distinct and Union, Intersect, Except methods have support for additional custom comparer.

PS: Because of years of working with SQL I happened to write insert statements for LinqTest database in T-SQL. Instead I could do the data insertion part using Linq. For example if you wanted add few more it would look like:

void Main()
{
   List<BadData> persons = 
     new List<BadData>
     {
       new BadData {FirstName="Frank",LastName="Smith",
           Email1="m1",Email2="m2",
           Phone1="p1",Phone2="p2",Phone3="p3"},
       new BadData {FirstName="Cetin",LastName="Basoz"},
       new BadData {FirstName="Fred",LastName="Flintstone",
           Email1="fred@yellowstone.com",Phone1="555-111 11 11"}
     };

   BadData.InsertAllOnSubmit(persons);
   BadData.Context.SubmitChanges();

   // Check results
   BadData.Dump();
}

Or you could insert a single element like this:

BadData b = new BadData();
b.FirstName = "Jack";
b.LastName = "Smith";
b.Phone2 = "666-123 45 67";

BadData.InsertOnSubmit( b );
// more code if any
BadData.Context.SubmitChanges();

 

See you on next post.

Posted in Uncategorized | Leave a comment

>LINQ Operators – Part V

>

Joining operators:

Join: If you remember from query methods join is not needed in the case of direct relations in the model (however local queries are faster with join). You can create inner and left joins using Join operator. Syntax is simple:

Customers.Join(            // Customers
   Orders,                     // inner join Orders
   c => c.CustomerID,    // (Customers as c) on c.CustomerID
   o => o.CustomerID,   // (Orders as o) = o.OrderID
   (c, o) => new            // local aliases – select c.CustomerID, o.OrderID, o.OrderDate
       {
            c.CustomerID,
            o.OrderID,
            o.OrderDate
       } ) 
Is equivalent to (and generates) this SQL:
SELECT c.CustomerID, o.OrderID, o.OrderDate
FROM Customers c
INNER JOIN Orders o ON c.CustomerID = o.CustomerID

You can use a custom equality comparer delegate as a parameter.

GroupJoin: GroupJoin is  a type of join that groups the results into lists based on join key – list of lists where inner lists are "grouped" on key.  

Customers.GroupJoin(
   Orders,
   c => c.CustomerID,
   o => o.CustomerID,
   (input,outputSequence) => outputSequence
           .Select( r => new { r.CustomerID, r.OrderID, r.OrderDate } ) )
Posted in Uncategorized | Leave a comment

>LINQ Operators – Part IV

>

Grouping operators

GroupBy : GroupBy is used to group a sequence into subsets. At first look it feels like it is same as T-SQL "Group By" but it is not. It slices a sequence into multiple "rows" where each row have a "key" and subset of sequence matching to those key. Here is a screen shot how it looks with this simple query:

Customers.GroupBy( c => c.Country )

 

groupBy

GroupBy has an easy syntax with some overloads. You can optionally specify an element selector (projection) and also optionally specify a custom equality comparer. Equality comparer is less used and not different from the sample we used in OrderBy().

Here is a sample with an element selector:

Customers
   .GroupBy (
      c => c.Country,
      c =>
         new 
         {
            CustomerID = c.CustomerID,
            CompanyName = c.CompanyName
         }
   ) 
 
 
groupBySelector

Latter GroupBy selects only two properties of Customers instead of all properties. You can also group by multiple keys:

Customers
   .GroupBy (
      c =>
         new 
         {
            Country = c.Country,
            Region = c.Region
         }, // keySelector
      c =>
         new 
         {
            CustomerID = c.CustomerID,
            CompanyName = c.CompanyName
         }  //elementSelector
   ) 
 

groupByMultipleKeys

In all these queries you are getting a sequence of "groupings". You can "browse" the result with nested foreach() which is sort of nested scan…endscan:

var byCountry = Customers.GroupBy ( c => c.Country );
  
foreach(var g in byCountry)
{
   Console.WriteLine("Country: {0}", g.Key);
   foreach(var customer in g)
   {
      Console.WriteLine("\t {0}\t{1,-50}\t{2}",
         customer.CustomerID,
         customer.CompanyName,
         customer.ContactName);
   }
Partial result:

groupByMultipleKeys3

 

In other words we are dealing with a sequence of sequences. With LinqToSQL inner sequences are queried as they are needed resulting in "groups count" + 1 selects send to backend. This may or may not be the behavior you want depending on where you need it (you can see the series of SQLs in LinqPad’s SQL tab). To prevent this behavior you can convert Customers to an IEnumerable  from IQueryable (or to list, array…) first and then do grouping. For example:

Customers.AsEnumerable().GroupBy( c => c.Country )

AsEnumerable() is a conversion operator that converts the IQueryable to an IEnumerable ( in other words from a remote source to local source – similar to cursors ). We will use AsEnumerable() in some of the following samples that show statistics, nested grouping, distinct sequences.

This sample shows some statistics about Customers grouped by country:

Orders.GroupBy (
  o => o.OrderDate.GetValueOrDefault().Year )
.Select (
   sales =>
   new 
   {
     Year = sales.Key,
     Customers = sales
       .Select( s => s.CustomerID )
       .Distinct()
       .Count(),
     Countries = sales
       .Select( s => s.ShipCountry )
       .Distinct()
       .Count(),
     FirstSale = sales.Min( s => s.OrderDate ),
     LastSale  = sales.Max( s => s.OrderDate )
   }
  ).OrderBy( s => s.Year ) 

This sample groups customerId and countries by years they made an "order":

Orders.AsEnumerable()
  .GroupBy( o => o.OrderDate.GetValueOrDefault().Year )
  .Select (
      sales =>
      new 
      {
        Year = sales.Key,
        Customers = sales
            .Select( s => s.CustomerID )
            .Distinct(),
        Countries = sales
            .Select( s => s.ShipCountry )
            .Distinct()
       }
   ) 

This sample is a multilevel grouping. It first groups customers by first letter of countries and then by countries:

Customers.AsEnumerable()
.GroupBy( c => c.Country [0] )
.Select( c =>
      new 
      {
        c.Key,
        countries =
           c.GroupBy( cn => cn.Country )
            .Select( cs =>
                 new 
                 {
                   cs.Key,
                   clist =
                      cs.Select( cf =>
                        new 
                        {
                           cf.CompanyName,
                           cf.Country,
                           cf.Region})
                        }).OrderBy( cn => cn.Key )
         }).OrderBy( c => c.Key )

There is another grouping operator called ToLookup() that creates one to many dictionary. I reserved it into another category.

Posted in Uncategorized | Leave a comment

>LINQ Query Operators – Part III

>

Ordering Operators:

Ordering operators like T-SQL "Order By" is used for ordering sequences in different ways. There are 5 ordering operators:

  1. OrderBy
  2. OrderByDescending
  3. ThenBy
  4. ThenByDescending
  5. Reverse

Except the last operator Reverse() all those are used as:

operator( keySelector ) or operator( keySelector, customComparer )  – customComparer is for custom sorting and not supported by L2S (Reverse() is not supported either).

You should have noticed that "descending" is not a parameter but separate methods exist to sort descending.

OrderBy: Orders the sequence by a given "keySelector". i.e:

Linq
T-SQL
Customers.OrderBy( c => c.ContactName )
Select * from Customers order by ContactName

 

Seconds overload accepts a comparer. Here is a sample that sorts orders by :

// In LinqPad select C# Program from language combo
// Delete default code
// Copy Paste or better hands on, write yourself
// and press F5 to run

void Main()
{
// This overload is not supported by L2S
// Getting into a local list first
// We would review later what does "local" mean
// for now just consider queries on ordList 
// is not Linq To SQL 
// (sort of you are querying a local cursor which
// you created by a remote call to SQL server)
 
var ordList = Orders.Select( o => 
                  new {
                           OrderId = o.OrderID,
                           OrderDate = o.OrderDate
                         }).ToList();
 
// creating an instance of custom comparer
// and passing it as a parameter
// final Select projection is about showing
// the result of sort
// Note that the sort is by "Day Of Week" enumeration
// and not by "WeekDay" alphabetically

var myCustomList = 
               ordList
                  .OrderBy( o => 
                                 o.OrderDate, 
                                 new OrderComparer() ) 
                 .Select( o => new {
                            o.OrderId,
                            o.OrderDate, 
                            o.OrderDate.GetValueOrDefault().DayOfWeek} );

myCustomList.Dump();
}

// Here is our custom sorting class
// It accepts parameters of type nullable DateTime
// and returns comparison result based on
// DayOfWeek – like DOW() in VFP

public class OrderComparer : IComparer<DateTime?>
{
   public int Compare(DateTime? x, DateTime? y)
   {
     return 
        x.GetValueOrDefault().DayOfWeek
         .CompareTo( 
        y.GetValueOrDefault().DayOfWeek ); 
    }
}

 

OrderByDescending: Same as OrderBy() except that it sorts in descending order (it was obvious,isn’t it).

ThenBy, ThnenByDescending: ThenBy and ThenByDescending are used to sort on multiple columns. Hmm why would you need something like this? Can’t you simply use chain of OrderBy(), OrderByDescending() operators? Of course you can but I think these were added to prevent you from writing it the way you intended (like in natural English you would be saying order by X then by Y then by Z). Probably this is best explained by a sample. We want to sort customers by their country and then by company name. It is very easy to write it as in first query and fail to get what we really intended.

Linq Corresponding SQL
Customers
   .OrderBy(c => c.Country)
   .OrderBy(c => c.CompanyName)
   .Select( c => new {c.Country, c.CompanyName} )
Select [t0].[Country], [t0].[CompanyName]
FROM [Customers] As [t0]
ORDER BY [t0].[CompanyName], [t0].[Country]
Customers
   .OrderBy(c => c.CompanyName)
   .OrderBy(c => c.Country)
   .Select( c => new {c.Country, c.CompanyName} )

SELECT [t0].[Country], [t0].[CompanyName]
FROM [Customers] AS [t0]
ORDER BY [t0].[Country], [t0].[CompanyName]

Customers
   .OrderBy( c => c.Country)
   .ThenBy  ( c => c.CompanyName)
   .Select( c => new {c.Country, c.CompanyName} )

SELECT [t0].[Country], [t0].[CompanyName]
FROM [Customers] AS [t0]
ORDER BY [t0].[Country], [t0].[CompanyName]

In SQL the order of columns dictate it and it is easy to write exactly what we want ( order by country and then by company ). However in Linq method syntax, keep in mind with each and every chained method we get a new intermediate sequence as a result ( and you could see how a series of methods is working by attaching a "Dump()" to each of those methods ). i.e:

Customers
  .Take(10).Dump("Got top 10")
  .Select( c => new {c.Country, c.CompanyName} ).Dump("Projected to 2 columns: Country,Company")
  .OrderBy( c => c.Country).Dump("Sorted on Country")
  .OrderBy( c => c.CompanyName).Dump("Sorted on Company")

 

As you can see we are first sorting on country and "then by" without a ThenBy() operator on company. The net effect we are sorting on company name only. ThenBy() operator excellently translates our intention about "by Country  THEN BY  Company Name".

Reverse: On a local sequence reverses the sequence as its name suggest. i.e:

int[] numbers = {1,2,3,4,5};
numbers.Reverse().Dump();

string reversed = new string(
  "Reverse Me".Reverse().ToArray()
  );

Console.WriteLine(reversed);

Posted in Uncategorized | Leave a comment