Monday, August 18, 2008

ASP.NET: DataSet vs DataReader (Pt. 03/03)

DataTables vs. DataSets
A number of commenters brought up DataTables, correctly pointing out that DataTables are more efficient than DataSets while offering more features than the DataReader. For example, one comment read:

I mostly use DataTables (sometimes DataReader) since pages usually deal with a single table of data. It is quite upsetting to find books and tutorials forever teaching the use of DataSet together with DataAdapter. I wonder how many know the DataAdapter works with DataTables stand alone.
-- icelava
Yes, it is true that the DataTable is more efficient than the DataSet. The DataSet is, after all, a set of DataTables; you can programmatically access the DataSet's collection of DataTables through its Tables property. And DataTables offer functionality not found in DataReaders: DataTables support random access, can be sorted and filtered using DataViews, and can have their contents modified through insertions, updates, and deletes.

So should you use DataTables in lieu of DataSets? Sure, if you can. A DataTable doesn't support all of the features of a DataSet, so it might not always be an appropriate choice. (For example, in .NET 1.x DataTables cannot be natively serialized into XML; additionally, if you need to represent relationships among multiple DataTables, you'll need to use a DataSet.) What it all comes back to - and this was my thesis from the original Why I Don't Use DataSets article - is that you should use the right object for the job at hand. Each data object has a time and place - my contention is that DataSets (and DataTables) have limited use in ASP.NET applications.

Returning DataSets from Web Services
In my original article I mentioned that one use of DataSets is as a conduit for returning database data from an XML Web service. Since DataSets can trivially be serialized into XML, developers are quick to use DataSets when returning database information from a Web service. In fact, a number of commenters mentioned this:

If you're using webservices in your application that return result sets, then you would want to use a dataset to hold these result sets.
-- Wessam Zeidan
While DataSets are an easy way to return data, I think they are less than ideal for a couple of reasons. First, they add a lot of bloat to the returned XML payload since DataSets return not only the data they contain but also the data's schema. If you are only returning a small number of records, the payload's size can be dominated by the schema information. Second, DataSets have the air of being platform specific. While it's true that an XML serialized DataSet is, after all, just XML and therefore can be processed by a client on any platform, it still has a platform-specific feel to it since the resulting XML markup is dictated by Microsoft. A .NET client can automatically deserialize a DataSet's XML payload returned from a Web service back into a DataSet, thereby making DataSets an easy and attractive option. Clients using other platforms, however, will find they have to invest a lot more effort in order to work with the server's returned serialized DataSet.

The solution? Create a serializable, custom business object and have your Web services return an array of these custom objects. The overall payload will be significantly smaller and more inviting to clients not using .NET. One commenter summed up this sentiment nicely with:

For me, DataSets are too platform specific to use in Web Services. I prefer returning XML serialized arrays of objects. It may require my consumer to do a little bit more work, but it ensures that I have a wider base of consumers.
-- Scott
The one downside of returning custom collections from a Web service is that for .NET clients, these custom classes are serialized as classes with public fields as opposed to public properties. When binding a custom collection to a DataGrid or other data Web control, the data binding only allows binding to properties. Fortunately this nuisance is fixed in .NET 2.0. (For more information on the benefits of returning custom collections from Web services as well as how to work around the field/properties pain, be sure to check out the "Binding to Web Services" section of Dino Esposito's Collections and Data Binding article.)
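
As a rough illustration of the "array of custom objects" idea, here is a minimal sketch in Python rather than .NET. The Product class, its fields, and the ArrayOfProduct element name are all hypothetical, chosen only for the example; the point is simply that the payload carries the data and nothing else, with no embedded schema.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class Product:
    # Hypothetical business object; the field names are illustrative only.
    product_id: int
    name: str
    unit_price: float


def products_to_xml(products):
    """Serialize a list of Product objects to a compact XML string."""
    root = ET.Element("ArrayOfProduct")  # assumed wrapper element name
    for product in products:
        item = ET.SubElement(root, "Product")
        for field in fields(product):
            # One child element per field; only the data is written out.
            child = ET.SubElement(item, field.name)
            child.text = str(getattr(product, field.name))
    return ET.tostring(root, encoding="unicode")


payload = products_to_xml([
    Product(1, "Chai", 18.0),
    Product(2, "Chang", 19.0),
])
print(payload)
```

A consumer on any platform only needs an ordinary XML parser to read this payload back, which is the trade-off the commenter above describes: a little more work for the client in exchange for a wider base of consumers.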

Yes, Using DataReaders You Can Do That, Too!
There were a number of comments from readers who seemed to be using the DataSet because they (incorrectly) believed that the same functionality could not be accomplished through the use of DataReaders. One reader asked: "How would you handle sorting, paging and updating if you fill a DataReader into a DataGrid?" Assuming this commenter was asking about ASP.NET DataGrids, the answers can be found throughout the An Extensive Examination of the DataGrid Web Control article series here on 4Guys. See Part 4 for information on sorting, Part 15 for the low-down on paging, and Part 6 for the scoop on editing.

Additionally, in reading many of the comments in favor of DataSets, it appeared that a number of commenters thought the DataSet was an ideal object for computing aggregates - counts, sums, and so on - or for retrieving two related tables and displaying fields from both. While the DataSet can accomplish these tasks, I'd recommend performing these operations at the database if possible. SQL has a rich set of aggregate functions - COUNT, SUM, MAX, and MIN, to name a few - and a join at the database level is generally going to be more efficient than bringing back all of the data from both tables and relating the resulting rows at the ADO.NET layer. (Granted, there is a time and a place for this, such as when working with disconnected data in a desktop application, but I would contend that these features are not often needed in Web applications.)
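
To make the "do it at the database" recommendation concrete, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for a real database and data access layer. The customers and orders tables and the order_summary function are assumptions made up for the example. The join and the aggregates run inside the database, so only one summary row per customer travels back to the application.

```python
import sqlite3

# In-memory database with two hypothetical related tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 15.5), (3, 2, 7.25);
""")


def order_summary(connection):
    """Return (name, order count, order total) per customer.

    The JOIN, COUNT, and SUM are all evaluated by the database, so the
    application never sees the individual order rows.
    """
    sql = """
        SELECT c.name, COUNT(o.id), SUM(o.total)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY c.name
    """
    return connection.execute(sql).fetchall()


summary = order_summary(conn)
for name, order_count, order_total in summary:
    print(name, order_count, order_total)
```

The alternative, pulling every row of both tables into memory and relating them in application code, returns the same numbers but moves far more data than the page ever displays.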

How Important is Performance?
In my original article one of my main arguments against using DataSets was the performance disparity between DataSets and DataReaders. As I cited from A Speed Freak's Guide to Retrieving Data in ADO.NET, when bringing in several hundred or several thousand records, a DataSet can take several seconds to be populated, whereas a DataReader still boasts sub-second response times. Of course, these comparisons against several hundred to several thousand records are moot if you are working with a much smaller set of data, say just 10 to 50 records in total. For these smaller result sets, the DataSet will cost you less than a second (although the DataReader is still, percentage-wise, much more efficient).

This gives rise to the question, then, of "How often are you bringing back hundreds of records from the database?" And, more importantly, "Should you be bringing back hundreds (or thousands) of records?" One commenter noted that:

The [Speed Freak] article shows how performance degrades all the way up to 10,000 records... realistically you should never be displaying a client more than 100-200 over the web if not for performance purposes but just readability.
-- Eric Wise
I agree wholeheartedly, but what one "should do" vs. what one is "asked to do" can be two different things. I've had clients in the past who, despite my suggestions, were adamant about having large amounts of data shown on the page, be it a gargantuan DataGrid or a drop-down list with hundreds (or even thousands - eep!) of records. Additionally, many developers, when adding paging support to a DataGrid, either always use the default paging model or implement the default paging model at first with plans to later upgrade it to the custom paging model. Default paging, as you may know, is far less efficient than custom paging because it requires that all of the data being paged through be retrieved for each and every page, even though only a small subset of that data is actually shown. Custom paging is more intelligent - yet more difficult to implement - because it retrieves only the precise subset of data to display for the current page. (For more on default paging see An Extensive Examination of the DataGrid Web Control: Part 15; for information on custom paging, refer to my book, ASP.NET Data Web Controls Kick Start.)
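
The difference between the two paging models can be sketched outside of ASP.NET. The example below uses Python's sqlite3 module and a hypothetical products table; the function names and PAGE_SIZE are assumptions for illustration, and this is not the DataGrid's actual implementation. Note that LIMIT and OFFSET are SQLite syntax, and other databases express the same idea differently.

```python
import sqlite3

# In-memory database holding 100 hypothetical product rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products (id, name) VALUES (?, ?)",
    [(i, f"Product {i}") for i in range(1, 101)],
)

PAGE_SIZE = 10


def default_paging(connection, page_index):
    """Fetch every row, then slice out one page in application code."""
    all_rows = connection.execute(
        "SELECT id, name FROM products ORDER BY id"
    ).fetchall()
    start = page_index * PAGE_SIZE
    # Second value: how many rows were pulled from the database.
    return all_rows[start:start + PAGE_SIZE], len(all_rows)


def custom_paging(connection, page_index):
    """Ask the database for only the rows on the requested page."""
    page_rows = connection.execute(
        "SELECT id, name FROM products ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE, page_index * PAGE_SIZE),
    ).fetchall()
    return page_rows, len(page_rows)


default_page, default_fetched = default_paging(conn, 2)
custom_page, custom_fetched = custom_paging(conn, 2)
print(default_fetched, custom_fetched)
```

Both functions hand back the same ten rows for the third page, but the first one retrieves all 100 rows to do so while the second retrieves only 10. That gap widens as the table grows.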

Even if you are able to convince your client to show only a reasonable amount of data per page, and even if you have the foresight and expertise to use custom paging on all of your pageable DataGrids, the performance gain from using DataReaders as opposed to DataSets will still make your application snappier and better able to scale.

What It All Boils Down To: Design/Develop-Time Efficiency vs. Run-Time Efficiency
What the DataReader vs. DataSet argument really comes down to is design- and develop-time efficiency vs. run-time efficiency. As I mentioned in the original Why I Don't Use DataSets article, the DataReader boasts much greater run-time efficiency than the DataSet. But, as numerous readers noted, the DataSet comes with a number of features that are lacking in the DataReader, thereby reducing the time it takes to code the application.

For read-only display data, Scott may have point. But for Web sites that collect information from the user and pass it back to the database, DataSets (or DataTables) are much more efficient because the data adapters have the update methods already constructed (auto generated code).
-- Dave C

I've ventured down both paths and came to the conclusion to use the DataSet over custom objects for complex data requests. It may be a heav[i]er object, but it's very powerful and very flex[i]ble. If you['re] pulling back a large set of relational or semi-relational data it's extraordinarily convenient and easy to implement. The features and services are just to[o] compelling to ignore.
-- Lynn

The #1 key benefit [of using DataSets] for me, though, is maintainability of the resultant code base and the rapidity that I can modify/add new functionality. Not to mention the fact that most any .NET developer walking-in off the street is going to be familiar with DataSets/DataAdapters....
-- Mike
If you are already familiar with the DataSet and/or working on a team that is familiar with that model, and if optimal run-time performance is not a concern, then you may find the DataSet to be a better tool for the job. Of course, not everyone enjoys the added features of the DataSet. Some, like this commenter, prefer the simplicity and higher degree of control that custom business objects afford:

I totally agree with Scott on this one. The DataSet, although a very novel disconnected design, brings with it a rather complex model to support. The more they work into your architecture, the more cumbersome things can become. That statement comes from using them for things like reading XML into the DataSet and having to do a bunch of customized sorting. The syntax is right down bewildering...

My personal approach is using business objects. There are a lot of reasons they end up being highly valuable. One of those reasons is the ability to port designs into other OO languages like Java. ... I was a huge advocate and had the whole DataSet stuff forced down my throat from a "top 3" consulting firm. I did use them and they are still in my architectures. But, from now on I think I am sticking to good old business objects.

-- David

Conclusion
This article served as a follow-up to my earlier article, Why I Don't Use DataSets in My ASP.NET Applications, addressing a number of comments made by readers in the associated blog entry. Thanks to all those who took the time to comment on my blog! If you'd like to continue this discussion, you can do so at http://scottonwriting.net/sowblog/posts/3867.aspx.


Article written by Scott Mitchell (available here)
