Why LINQ beats SQL

If you're not a LINQ addict, you might wonder what the fuss is about. SQL isn't broken, so why fix it? Why do we need another querying language?

The popular answer is that LINQ is INtegrated with C# (or VB), thereby eliminating the impedance mismatch between programming languages and databases. While that's true, it's only part of the story. More importantly, LINQ is, in general, a significantly more productive querying language than SQL.

Compared to SQL, LINQ is simpler, tidier, and higher-level. It's rather like comparing C# to C++. Sure, there are times when it's still best to use C++ (as is the case with SQL), but in most situations, working in a modern tidy language and not having to worry about lower-level details is a big win.

SQL is a very old language—invented in 1974. Since then it's been extended endlessly, but never redesigned. This has made the language messy—rather like VB6 or Visual FoxPro. You might have become so accustomed to this that you can't see anything wrong!

Let's take an example. You want to write a simple query that retrieves customers as follows:

SELECT UPPER(Name)
FROM Customer
WHERE Name LIKE 'A%'
ORDER BY Name

That doesn't look too bad, right? But now suppose these results are feeding a web page, and we want to retrieve just rows 21-30. Suddenly, you need a subquery:

SELECT TOP 10 UPPER (c1.Name)
FROM Customer c1
WHERE
   c1.Name LIKE 'A%'
   AND c1.ID NOT IN
   (
      SELECT TOP 20 c2.ID
      FROM Customer c2
      WHERE c2.Name LIKE 'A%'
      ORDER BY c2.Name
   ) 
ORDER BY c1.Name

Not only is this complicated and messy, but it violates the DRY principle (Don't Repeat Yourself). Here's same query in LINQ. The gain in simplicity is clear:

var query =
   from c in db.Customers
   where c.Name.StartsWith ("A")
   orderby c.Name
   select c.Name.ToUpper();

var thirdPage = query.Skip(20).Take(10);

When we enumerate thirdPage, LINQ to SQL translates the query (that we composed in two steps) into a single SQL statement optimized for the version of SQL Server to which it's connected.

Composability

You might have noticed another more subtle (but important) benefit of the LINQ approach. We chose to compose the query in two steps—and this allows us to generalize the second step into a reusable method as follows:

IQueryable<T> Paginate<T> (this IQueryable<T> query, int skip, int take)
{
   return query.Skip(skip).Take(take);
}

We can then do this:

var query = ...
var thirdPage = query.Paginate (20, 10);

The important thing, here, is that we can apply our Paginate method to any query. In other words, with LINQ you can break down a query into parts, and then re-use some of those parts across your application.

Associations

Another benefit of LINQ is that you can query across relationships without having to join. For instance, suppose we want to list all purchases made by customers who live in Washington, whose purchased items exceed $1000 in total value. This requires joining across 4 tables (Purchase, Customer, Address and PurchaseItem). In LINQ, the query is effortless:

from p in db.Purchases
where p.Customer.Address.State == "WA"
where p.PurchaseItems.Sum (pi => pi.SaleAmount) > 1000
select p

No joining, grouping or messy subqueries. Extending this example, suppose we want to list the purchases in reverse order of value, and we want to include the salesperson's name and number of purchased items in the final projection:

from p in db.Purchases
where p.Customer.Address.State == "WA"
let purchaseValue = p.PurchaseItems.Sum (pi => pi.SaleAmount)
where purchaseValue > 1000
orderby purchaseValue descending
select new
{
   p.Description,
   p.Customer.SalesPerson.Name,
   PurchaseItemCount = p.PurchaseItems.Count()
}

Again, we have a clean, typesafe and composable query that follows the DRY principle.

Shaping Data

In SQL, queries come back as flat result sets. Quite often, though, it's far more useful to work with hierarchical data. For example, suppose we want to retrieve a selection of customers, each with their high-value purchases. In LINQ, you can simply do this:

from c in db.Customers
where c.Address.State == "WA"
select new
{
   c.Name,
   HighValuePurchases = c.Purchases.Where (p => p.Price > 1000)
}

HighValuePurchases, here, is a collection. And notice that because we were querying an association property, we didn't need to do any manual joining. We didn't even have to worry about whether to do a inner or outer join—LINQ abstracts this detail away! In this case, the query would translate to an outer join, because LINQ doesn't exclude rows just because a subcollection returns zero elements. If we wanted something that translated to an inner join, we'd do this:

from c in db.Customers
where c.Address.State == "WA"
let HighValuePurchases = c.Purchases.Where (p => p.Price > 1000)
where HighValuePurchases.Any()
select new
{
   c.Name,
   HighValuePurchases
}

LINQ also supports flat outer joins, adhoc joins, subqueries, and numerous other kinds of queries through a rich set of operators.

Parameterization

What if we wanted to parameterize our previous example, so that the state "WA" came from a variable? This is all we do:

string state = "WA";

var query =
   from c in db.Customers
   where c.Address.State == state
   ...

No messing with parameters on DbCommand objects or worrying about SQL injection attacks. LINQ's parameterization is inline, typesafe, and highly readable. It doesn't just solve the problem—it solves it really well.

Static Type Safety

In the preceding query, if we declared the state variable to be an integer rather than a string, the query would fail at compile-time rather than run-time. The same applies if you get any of the table or column names wrong. This is of real benefit when refactoring: the compiler will tell you if you haven't completely done your job.

Client Processing

LINQ lets you effortlessly shift part of the query onto the client for processing. Why would you want to do this? With a heavily burdened database server, it can actually sometimes improve performance. As long as you don't take more data than you need (in other words, you still do all the filtering on the server) you can often help performance by shifting some of the burden of reordering, transforming and regrouping the results onto a less loaded application server. With LINQ, all you need to do is to slip AsEnumerable() into the query, and everything from that point on executes locally.

When not to use LINQ

Despite its power, LINQ doesn't deprecate SQL. It takes more than 95% of the querying brunt, but you still sometimes need SQL for:

And of course you still need SQL for triggers. (SQL's also needed for stored procedures and functions, although the need for these crops up less often when you're using LINQ). You can combine LINQ with SQL by writing table-valued functions in SQL, and then calling those functions within more elaborate LINQ queries.

Having to know two querying languages is not really an issue because you'll want to learn LINQ anyway—LINQ is so useful for querying local collections and XML DOMs. If you're still using the old XmlDocument-based DOM,  you'll find LINQ to XML's DOM a dramatic step up.

And because LINQ is easier to master than SQL, the goal of writing really good queries is more achievable with LINQ than with SQL. 

LINQ to SQL in the Field

For ad-hoc queries, I use LINQ almost exclusively because it's simply more productive.

And for writing applications, my personal experience is the LINQ to SQL cuts the development time in writing a Data Access Layer by more than a half. The benefits of LINQ combined with a simple and lightweight object relational mapper (such as LINQ to SQL) make this one of the best technologies that Microsoft have ever produced.

- Joseph Albahari

Over 200,000 downloads

LINQPad

LINQPad


External links

Video Tutorials:
     Writing queries with LINQPad
       - Hi-res QuickTime
     LINQPad + EF
     V2 and Beyond
     Dimecasts LINQPad Intro
     LINQPad + OData

LINQPad Forum

Querying StackOverflow with LINQPad

LINQPad + Mindscape LightSpeed

LINQPad + OData

LINQPad + Windows Azure

LINQBugging WinForms/WPF

Terry's LINQPad Extensions

The LINQ Quiz

More...
 


LINQPad was written
by Joseph Albahari.

© 2007-2010