Wei's Blogger: Should you return IQueryable from Your Repositories

Should you return IQueryable<T> from Your Repositories

Posted a year ago by Alex Ford
 (edited a year ago)

The internet has been abuzz for a long time with fierce
 debates about whether or not you should return IQueryable<T> types from
 your repositories. I do have a fairly strong opinion on this issue so I
 figured I would weigh in on the discussion. First let's take a look at what
 they are talking about. We'll say they are using Entity
 Framework for the sake of my examples but they
 could just as easily be using LINQ to SQL.

Why would someone want to return IQueryable<T>?

Well it is true that IQueryable<T> is
 extremely powerful because any LINQ written against an IQueryable collection
 is supposed to be translated into an expression tree which is then translated
 into the data store's native language, such as SQL. This means that you only
 get back the data you query for rather than having all the data returned and
 then filtered down in memory on the server. Take a look at this quick example
 of a repository that returns IQueryable.

public interface IBookRepository

  {

      IQueryable<Book> Books { get; }

      void AddBook(Book book);

      void DeleteBook(Book book);

      void SaveChanges();

  }

public class BookRepository : IBookRepository

  {

      private Entities _objectContext;

      public IQueryable<Book> Books { get { return _objectContext.Books; } }

public BookRepository()

      {

          _objectContext = new Entities();

      }

public void AddBook(Book book)

      {

          _objectContext.Books.AddObject(book);

      }

public void DeleteBook(Book book)

      {

          _objectContext.Books.DeleteObject(book);

      }

public void SaveChanges()

      {

          _objectContext.SaveChanges();

      }

  }

The
 thing people love about this interface is that you can get books however you
 want.

IBookRepository bookRepo = new BookRepository();

  var booksByRowling =
 bookRepo.Books.Where(x => x.Author == "J.K.
 Rowling");

You
 are in complete control over how the data is retrieved which at first seems
 really awesome, but I'll show you in a minute why I don't think it's a good
 practice.

Why I don't think you should use IQueryable<T> outside of your repository.

I don't believe that you should return IQueryable for
 several reasons. First, it breaks unit testing capabilities. Half the point of
 compartmentalizing your data persistence logic into repositories is so that
 you can isolate and test that logic. If you're passing the buck up to the
 domain logic level then you're no longer able to test data persistence logic
 and domain logic by itself. In my opinion it completely breaks separation of
 concerns. You can't unit test a property of type IQueryable because
 you couldn't possibly account for all the logic possibilities that could be
 handed off from your domain logic. The person writing the domain logic is
 completely responsible for writing sound queries against the database! That's
 just absurd in my humble opinion.

Straight away the inability to properly test a component
 of your application should be reason enough to steer clear. However, there is
 another major issue that I haven't really seen discussed elsewhere on the web
 in relation to this debate. Returning IQueryable from your repository automatically forces the domain
 logic to make several assumptions. As I sort of mentioned before the first
 assumption is that the domain is responsible for writing query logic. If you
 divide up the pieces of your application among different developers then those
 developers working on domain logic are forced to write code against that IQueryable interface
 in order to retrieve the data they want. The second assumption it makes is
 that the data store will ALWAYS know how to properly implement IQueryable. If you
 ever try to re-factor your repository layer to use a different data storage
 technique then you will likely also have to re-factor your domain logic.

Aside from test-ability the other major reason for
 implementing the repository pattern is to create an application seam. The
 repository layer is supposed to hide the data persistence code from the rest
 of the application. The domain logic is ideally supposed to talk to the
 repository layer through interfaces and remain ignorant of what is actually
 going on behind those interfaces. This enables you to re-write your interfaces
 at any time to use a completely different data persistence technology. By
 returning IQueryable you automatically tie your repository layer to any data
 persistence technology that properly implements IQueryable. Yes technically you
 could implement IQueryable yourself and attempt to translate the LINQ written
 against it into something usable for whatever data persistence tech you're
 using, but that's a huge commitment and far from fool-proof. If you still
 decide to return IQueryable then you may as well resolve yourself to the fact that
 you will be stuck using either LINQ to SQL or Entity Framework because as of
 this writing I am not aware of any other tech that properly implements IQueryable.

The proper way to do it.

I definitely understand the desire to return such a
 useful and powerful type. But for the sake of test-ability and maintainability
 I highly advise against it. Instead you should be exposing necessary functions
 through your repository interfaces. Make your methods more verbose so that
 writing domain logic that uses it is simple and easy. For example, instead of
 writing a method called GetBooks write a method called GetBooksByAuthor(string author) and another
 called GetBooksByDateRange(DateTime startDate, DateTime endDate), etc. Yes
 the method names get larger and yes you will be writing more methods, but when
 you think about it from the perspective of maintainability it becomes pretty
 clear that it's the right way to go.

Many people have argued that IQueryable<T> makes it so easy to do paging because you can just do var pageOfBooks =
 bookRepo.Books.skip(itemsPerPage * (page-1)).Take(itemsPerPage); in your domain logic. It's not that much harder to
 write a method on your repository and implement that paging logic there.

public IList<Book> GetBooksByPage(int page, int
 itemsPerPage)

  {

      return _objectContext.Books.Skip(itemsPerPage *
 (page-1)).Take(itemsPerPage).ToList();

  }

You
 still get the benefit of having that paging executed in the data store so
 you're not returning superfluous rows that you're not even going to use, but
 it's compartmentalized in a repository method so that it's hidden from the
 rest of your application.

The counter-argument to this is a scenario where you
 need a page of results but you also need the count for how many items there
 are total (usually for calculating the maximum number of pages). To do this
 people often think they need two methods. One method to get the paged result
 set and another method to get the item count. If you did have to use two
 methods for that it does seem a little silly, but you don't. See
 my post about how to properly return a paged
 result set.

A related discussion is whether or not you should return
 IEnumerable<T> or IList<T>. Personally
 I think it really depends. IEnumerable is usually used for lazy loading, which means that the collection isn't populated until
 you iterate over it (such as in a foreach loop or by using certain LINQ methods like ToList()). This means that if you loop over data in an MVC view
 then the data is being retrieved at the time it's being rendered to the view.
 There are definitely edge-cases where this is useful, such as when accessing a
 navigation property on an entity object, but it's not really useful in the
 verbose repository pattern that I'm promoting. This is because I like my
 repository methods to be very specific which almost always means that whatever
 data I'm getting back is going to be used because I'm calling specific methods
 to get specifically what I want, instead of calling generic methods and then
 filtering through the results. So I always return IList and call the ToList() LINQ method after my query results in my repository
 method so that the query is executed and the data is loaded immediately. By
 the time my data hits my domain it is already in memory waiting to be
 consumed.

The bottom line is that returning IEnumerable is
 questionable at best, but returning IQueryable is useless and ruins maintainability and test-ability.

7 Comment(s)

Darryl Park

posted 2 months ago

It's
 all a matter of opinion. In your scenario your repository is eventually going
 to have a very large number of properties and methods as you'll have to go
 back to modify your repository every time you want a result in a different
 layout or format. Also, if your repository is based on an interface to support
 Dependency Injection, your interface is going to be massive. You'll have to
 modify your interface, then any fakes that you're using in your tests will
 have to implement those new methods and will probably just have
 notsupportedexceptions in them. 

An
 even bigger problem comes when you start to have a more complex object model
 in your entities and you need to pull back a large number of entities which
 reference a sub entity collection (one to many). if you're always returning
 IEnumerable<entity> then you'll actually be returning entire rows of
 data when you might only need a couple fields from the entities in question. 

An
 example (pretend it's a controller action method): 

var author = bookRepo.GetAuthorByName("J.K.
 Rowling"); // say this returns an Author

  var books = author.Books;

  return View(books.Select(b => b.Title);

In
 the above, the entire author record is returned, then the entire book records
 are returned just to get the name.

Instead
 if it were returning IQueryable<Author>, with a slight modification the
 select from the database server returns only the book titles in a simpler
 query. Less across the wire, less for the SQL server to process.

var bookTitles =
 bookRepo.Authors.Where(a => a.Name == "J.K.
 Rowling")

  .SelectMany(a => a.Books.Select(b => b.Title))

  .ToList();

  var books = author.Books;

  return View(bookTitles);

Analyze
 the query with sql profiler or the like.

I
 agree with the Testing pains or IQueryable, but the flexibility of IQueryable
 outweighs that in my opinion.

Just
 be sure to have both unit tests using faked/mocked entities along with
 integration tests against a test database.

Mattias 

posted 6 months ago

Interesting.
 I am making my first web api using entity framework these days, and I have
 been returning IQueryables from my repositories so far, knowing about the
 limitations it puts on my ability to write good tests. The reason is of course
 because my controller needs to shape the data (include/exclude fields) based
 on the request. Sometimes you want the response to include images, sometimes
 you'd just want links to them. Also paging is something I handle outside the
 repository by now. I will probably refactor this tomorrow. I can't really
 picture it without seeing my code, but I am afraid that, sometimes, when I use
 multiple repositories (in a unit of work) to get my job done, I will have to
 access the database multiple times. We'll see. :)

Alex Ford

posted 7 months ago

Actually
 Robert that's why my repository methods are more specific and only return a
 page of results.

public IList&lt;Book&gt; GetBooksByPage(int page, int
 itemsPerPage)

  {

      return _objectContext.Books.Skip(itemsPerPage *
 (page-1)).Take(itemsPerPage).ToList();

  }

_objectContext.Books is still IQueryable within the repository so SQL is generated that only
 returns a single page of results from the database and then that page of
 results is returned from the repository as a list.

Roberto Rodríguez Bouzo 

posted 7 months ago

About
 the using of Iqueryable or Ienumerable there is concern I think you haven´t
 talked about. What happen when you have a database with let´s say 500000
 records. If you dont use IQueryable you will be getting all records from the
 database in any single query you make. And therefore you app will we really
 slow.

Thanks

steve 

posted 7 months ago

I
 think your post assumes you are doing all the design and development from
 scratch. And have a limited number of tables... a perfect world scenario.

As
 always, legacy issues come into account, and the current project I am working
 on has hundreds of tables and the majority of code is very old.

I
 do not have the time nor patience to write an individual repository for every
 single object. In this case, a generic Repository&lt;T&gt; that
 returns IQueryable&lt;T&gt; works out really nicely.

And
 I am still able to implement a fake repository that takes an
 IEnumerable&lt;T&gt; in the constructor to unit test my queries
 against. Sure it isn't the exact same
 thing as pure SQL, but at some point, don't you have to trust the framework is
 doing what it is supposed to be doing? Hence why you don't write unit tests
 for Math.Add() 

You
 are allowed to treat those things as reliable/safe.

Custom
 repositories per table/object is not a realistic pattern in my experience. If
 it is a heavily used application, it isn't maintainable over a significant
 length of time. You will be swamped by these individual classes, and when
 maintaining that layer is harder than maintaining an ORM layer, what is the
 point of even having an ORM anyway? Just write stored procs in custom classes,
 it'll be faster. Bleh!

Daniel
 Flöijer 

posted 9 months ago

Thanks
 for this article. I've been reading a lot about what patterns to use for data
 access with LINQ to SQL recently and I really agree with you.

Now
 I just need to figure out how to write efficient LINQ to SQL queries with
 eager loading :-)

SoftJunk 

posted a year ago

I
 love this post very much as I always believe repository should not let higher
 layers to be linked with database. Thought my repository do have only one
 SELECT method, but my service layer sets all kind of where conditions an d
 order conditions and paging etc information in different properties of
 repository (List, dictionary, string etc).

And
 in the select method I generate the expression to pull the data from database.
 So I am not in problem if I have to go to different database source, cause I
 just have to change the code in repository to interact with database :)

Inserted from <http://www.codetunnel.com/blog/post/should-you-return-iqueryablet-from-your-repositories>

Wei's Blogger

页面

2013年12月24日星期二

Should you return IQueryable from Your Repositories