Multithreading

WilliaBlog.Net

I dream in code

About the author

Robert Williams is an internet application developer for the Salem Web Network.

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.


Using System.Threading.Tasks and BlockingCollections to FTP multiple Files at the same time

I recently needed to write an application that would loop through a queue of files and FTP them to our Content Delivery Network for streaming. Users upload files, and our administrators can mark some of them as urgent. Urgent ones need to jump to the front of the queue; otherwise everything should be ordered by broadcast date. My initial code was basically a loop that looked something like this:

while (GetFtpQueue().Count > 0)
{
    // Gather all the info from the db, FTP the file, then clean up (move the source file, update the db, etc.)
}

It worked beautifully while we were just uploading small audio files, but as soon as we started adding a lot of video files to the queue it became so slow that it might take 2 hours or more to upload a single video file. So, we experimented with FileZilla to see how many concurrent uploads we could add before the overall speed of each upload started to drop. We found that at our location, 4 simultaneous FTP uploads seemed to hit the sweet spot: instead of uploading 1 file at 500 kb/s we could upload all four and each one would still be at that speed, quadrupling our throughput.

I read up on using the new Threading classes in .Net 4.0, and began refactoring my FTP application. I decided to use the Task Factory to manage the threads, in conjunction with a BlockingCollection to create a classic Producer/Consumer pattern. My first attempt looked a lot like this:

int maxThreads = 4;
var filesToFtp = new BlockingCollection<FtpItem>(maxThreads);
var processedFiles = new BlockingCollection<FtpItem>();

// Stage #1: Producer
var initFtp = Task.Factory.StartNew(() =>
{
    try
    {
        while (GetFtpQueue().Count > 0)
        {
            // Gather all the info from the db and use it to create FtpItem objects
            // Add them to list of filesToFtp, which only allows maxThreads in at a time (this allows us to have an urgent item jump to the top while current items are still FTPing)
            filesToFtp.Add(new FtpItem { ... });
        }
    }
    finally { filesToFtp.CompleteAdding(); }
});

// Stage #2 Consumer of initFtpTask and Producer for Cleanup Task
var process = Task.Factory.StartNew(() =>
{
    try
    {
        foreach (var file in filesToFtp.GetConsumingEnumerable())
        {
            // Ftp the file
            // Add to list of processedFiles
            processedFiles.Add(file);
        }
    }
    finally { processedFiles.CompleteAdding(); }
});

// Stage #3
var cleanup = Task.Factory.StartNew(() =>
{
    foreach (var file in processedFiles.GetConsumingEnumerable())
    {
        // Clean up (move the source file, update the db, etc.)
    }
});

Task.WaitAll(initFtp, process, cleanup);

Initially, this looked quite promising. I wrote a bare-bones version of it like the one above that just called Thread.Sleep to simulate work and iterated through a list of ints. I was able to verify that each "stage" was running on its own thread, that it never allowed more than 4 items through at a time, that I could add items to the front of the queue and get them processed next, and that it never tried to 'clean up' a file until that file had passed through both stage 1 and stage 2. However, I did notice that the elapsed time was the same as when I ran a similar unit test in a simple while loop. It might be obvious to you why this is, but at the time I put it down to a limitation of the unit test and pushed my new code to production. The first thing I noticed was that it wasn't any faster. Not even slightly. It took me hours of staring at the code to finally figure out why my multithreaded code was not running any faster, but the answer is simple: I only created one consumer of filesToFtp. I had incorrectly assumed that because I was creating up to 4 FtpItems at a time, and the FTP process was running on its own thread, it would consume as many as it could. In reality, while each of the three stages in the code above runs on its own thread, the whole process was still happening in series: stage 1 doesn't create 4 items at once, it creates them one after the other; stage 2 does begin working before stage 1 is complete (as soon as there is an item to consume), but it is then busy FTPing that first item until the upload finishes, and only then will it grab the next file.

To resolve this problem, I simply wrapped stage 2 in a for loop, and created an IList of Tasks to wait on:

int maxThreads = 4;
var filesToFtp = new BlockingCollection<FtpItem>(maxThreads);
var processedFiles = new BlockingCollection<FtpItem>();
IList<Task> tasks = new List<Task>();

// Stage #1: Producer
tasks.Add(Task.Factory.StartNew(() =>
{
    try
    {
        while (GetFtpQueue().Count > 0)
        {
            // Gather all the info from the db and use it to create FtpItem objects
            // Add them to list of filesToFtp, which only allows maxThreads in at a time (this allows us to have an urgent item jump to the top while current items are still FTPing)
            filesToFtp.Add(new FtpItem { ... });
        }
    }
    finally { filesToFtp.CompleteAdding(); }
}));

// Start multiple instances of the ftp process
for (int i = 0; i < maxThreads; i++)
{
    // Stage #2 Consumer of initFtpTask and Producer for Cleanup Task
    tasks.Add(Task.Factory.StartNew(() =>
    {
	try
	{
		foreach (var file in filesToFtp.GetConsumingEnumerable())
		{
			// Ftp the file
			// Add to list of processedFiles
			processedFiles.Add(file);
		}
	}
	finally { processedFiles.CompleteAdding(); }
	}));
}

// Stage #3
tasks.Add(Task.Factory.StartNew(() =>
{
	foreach (var file in processedFiles.GetConsumingEnumerable())
	{
		// Clean up (move the source file, update the db, etc.)
	}
}));

Task.WaitAll(tasks.ToArray());

I reran the unit test and it was faster! Very nearly 4 times faster in fact. Wahoo! I updated the code, published my changes and sat back. Sure enough, the FTP process finally started to make up some ground. In the meantime, I went back to my unit test and began tweaking. The first thing I noticed was that sometimes I would get a "System.InvalidOperationException: The BlockingCollection&lt;T&gt; has been marked as complete with regards to additions." Luckily, this didn't take a lot of head scratching to figure out: the first thread to reach the 'finally' clause of stage 2 closed the processedFiles collection, leaving the other three threads hanging. A final refactoring resolved the issue:

int maxThreads = 4;
var filesToFtp = new BlockingCollection<FtpItem>(maxThreads);
var processedFiles = new BlockingCollection<FtpItem>();
IList<Task> tasks = new List<Task>();

// maintain a separate list of wait handles for the FTP Tasks, 
// since we need to know when they all complete in order to close the processedFiles blocking collection
IList<Task> ftpProcessTasks = new List<Task>();

// Stage #1: Producer
tasks.Add(Task.Factory.StartNew(() =>
{
	try
	{
		while (GetFtpQueue().Count > 0)
		{
			// Gather all the info from the db and use it to create FtpItem objects
			// Add them to list of filesToFtp, which only allows maxThreads in at a time (this allows us to have an urgent item jump to the top while current items are still FTPing)
			filesToFtp.Add(new FtpItem { ... });
		}
	}
	finally { filesToFtp.CompleteAdding(); }
}));

// Start multiple instances of the ftp process
for (int i = 0; i < maxThreads; i++)
{
	// Stage #2 Consumer of initFtpTask and Producer for Cleanup Task
	ftpProcessTasks.Add(Task.Factory.StartNew(() =>
	{
		try
		{
			foreach (var file in filesToFtp.GetConsumingEnumerable())
			{
				// Ftp the file
				// Add to list of processedFiles
				processedFiles.Add(file);
			}
		}
	}));
}

// Stage #3
tasks.Add(Task.Factory.StartNew(() =>
{
	foreach (var file in processedFiles.GetConsumingEnumerable())
	{
		// Clean up (move the source file, update the db, etc.)
	}
}));


// When all the FTP Threads complete
Task.WaitAll(ftpProcessTasks.ToArray());

// Notify the stage #3 cleanup task that there is no need to wait, there will be no more processedFiles.
processedFiles.CompleteAdding();

// Make sure all the other tasks are complete too.
Task.WaitAll(tasks.ToArray());

Download a working example (Just enter your FTP Server details prior to running):

ProducerConsumer.zip (11.18 mb)


Posted by Williarob on Monday, April 18, 2011 11:47 AM

Multi-Threading in ASP.NET

ASP.Net Threading

Inside the ASP.Net Worker Process there are two thread pools. The worker thread pool handles all incoming requests and the I/O Thread pool handles the I/O (accessing the file system, web services and databases, etc.). Each App Domain has its own thread pool and the number of operations that can be queued to the thread pool is limited only by available memory; however, the thread pool limits the number of threads that can be active in the process simultaneously.

  Asp.Net Threading, Threadpools
  Source: Microsoft Tech Ed 2007 DVD: Web 405  "Building Highly Scalable ASP.NET Web Sites by Exploiting Asynchronous Programming Models" by Jeff Prosise.

So how many threads are there in these thread pools? I had always assumed that the number of threads varies from machine to machine – that ASP.NET and IIS were carefully and cleverly balancing the number of available threads against available hardware, but that is simply not the case. The fact is that ASP.Net installs with a fixed, default number of threads to play with: the 1.x Framework defaults to just 20 worker threads (per CPU) and 20 I/O threads (per CPU). The 2.0 Framework defaults to 100 threads in each pool, per CPU. Now this can be increased by adding some new settings to the machine.config file. The default worker thread limit was raised to 250 per CPU and 1000 I/O threads per CPU with the .NET 2.0 SP1 and later Frameworks. 32 bit windows can handle about 1400 concurrent threads, 64 bit windows can handle more, though I don’t have the figures.
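If you want to see the numbers in force on a particular machine, the thread pool will tell you directly. Below is a minimal console sketch using the same ThreadPool.GetMaxThreads and ThreadPool.GetAvailableThreads calls that appear in the diagnostic code later in this post; in a web application you would write the values to a Label or trace output instead:

using System;
using System.Threading;

class ThreadPoolLimits
{
    static void Main()
    {
        int maxWorker, maxIo, availWorker, availIo;

        // Upper limit on the number of threads the pool will create for this process
        ThreadPool.GetMaxThreads(out maxWorker, out maxIo);

        // How many of those threads are currently free
        ThreadPool.GetAvailableThreads(out availWorker, out availIo);

        Console.WriteLine("Worker threads:          {0} of {1} available", availWorker, maxWorker);
        Console.WriteLine("Completion port threads: {0} of {1} available", availIo, maxIo);
    }
}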

In a normal (synchronous) Page Request a single worker thread handles the entire request from the moment it is received until the completed page is returned to the browser. When the I/O operation begins, a thread is pulled from the I/O thread pool, but the worker thread is idle until that I/O thread returns. So, if your page load event fires off one or more I/O operations, then that main worker thread could be idle for 1 or more seconds and in that time it could have serviced hundreds of additional incoming page requests.

  Asp.net Threadpool Saturation
  Source: Microsoft Tech Ed 2007 DVD: Web 405  "Building Highly Scalable ASP.NET Web Sites by Exploiting Asynchronous Programming Models" by Jeff Prosise.

So long as the number of concurrent requests does not exceed the number of threads available in the pool, all is well. But when you are building enterprise-level applications the thread pool can become depleted under heavy load, and remember, by default heavy load is more than just 200 simultaneous requests, assuming a dual-CPU server.

When this happens, new requests are entered into the request queue (and the users making the requests watch that little hourglass spin and consider trying another site). ASP.NET will allow the request queue to grow only so big before it starts to reject requests, at which point it returns Error 503, Service Unavailable.

If you are not aware of this “Glass Ceiling of Scalability”, this is a perplexing error – one that never happened in testing and may not be reproducible in your test environment, as it only happens under extreme load.

Asynchronous Programming models in ASP.NET

To solve this problem ASP.NET provides four asynchronous programming models: Asynchronous Pages, Asynchronous HttpHandlers, Asynchronous HttpModules and Asynchronous Web Services. The only one that is well documented and reasonably well known is the asynchronous Web Services model. Since there is quite a lot of documentation on that, and since in future web services should be implemented using the Windows Communication Foundation, we shall concentrate only on the other three.

Let’s begin with the first asynchronous programming model, Asynchronous Pages.

Asynchronous Pages

  Lifecycle of Synchronous/Asynchronous Pages in ASP.NET
  Source: Microsoft Tech Ed 2007 DVD: Web 405  "Building Highly Scalable ASP.NET Web Sites by Exploiting Asynchronous Programming Models" by Jeff Prosise.

To make a page Asynchronous, we insert what we refer to as an “Async Point” into that page’s lifecycle, which you can see in green on the right. We need to write and register with ASP.NET a pair of Begin and End Events. At the appropriate point in the page’s lifecycle, ASP.NET will call our begin method. In the begin method we will launch an asynchronous I/O operation, for example an asynchronous database query, and we will immediately return from the begin method. As soon as we return, ASP.Net will drop the thread that was assigned to that request back into the thread pool where it may service hundreds or even thousands of additional page requests while we wait for our I/O operation to complete.

As you’ll see when we get to the sample code, we return from our begin method an IAsyncResult Interface, through which we can signal ASP.NET when the async operation that we launched has completed. It is when we do that, that ASP.NET reaches back into the thread pool, pulls out a second worker thread and calls our end method, and then allows the processing of that request to resume as normal.

So, from ASP.NET’s standpoint it is just a normal request, but it is processed by 2 different threads; and that will bring up a few issues that we’ll need to discuss in a few moments.

Now, none of this was impossible with the 1.1 framework, but it was a lot of extra work, and you lost some of the features of ASP.NET in the process. The beauty of the 2.0 and later frameworks is that this functionality is built right into the Http pipeline, and so for the most part everything works in the asynchronous page just as it did in the synchronous one.

In order to create an Asynchronous page you need to include the Async=”True” attribute in the page directive of your .aspx file. That directive tells the ASP.NET engine to implement an additional Interface on the derived page class which lets ASP.NET know at runtime that this is an asynchronous page.

What happens if you forget to set that attribute? Well, the good news is that the code will still run just fine, but it will run synchronously, meaning that you did all that extra coding for nothing. I should also point out that to make an asynchronous data call, you also need to add “async=true;” or “Asynchronous Processing=true;” to your connection string – if you forget that and make your data call asynchronously, you will get a SQL exception.
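For example, a connection string with asynchronous processing enabled might look like this (the server, database and security values below are placeholders; the last attribute is the important part):

using System.Data.SqlClient;

class ConnectionStringExample
{
    // Without "Asynchronous Processing=true" (or "async=true"), asynchronous data
    // calls on this connection will fail at runtime as described above.
    const string ConnectionString =
        "Data Source=MyServer;Initial Catalog=MyDatabase;" +
        "Integrated Security=SSPI;Asynchronous Processing=true";

    static SqlConnection CreateConnection()
    {
        return new SqlConnection(ConnectionString);
    }
}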

The second thing we need to do in order to create an asynchronous page is to register Begin and End Events. There are 2 ways to register these events. The first way is to use a new method introduced in ASP.NET 2.0 called AddOnPreRenderCompleteAsync:

using System;
using System.Net;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;

public partial class temp : System.Web.UI.Page
{
    private static readonly Uri c_UrlImage1 = new Uri(@"http://williablog.net/williablog/image.axd?picture=2010%2f1%2fSlide6.JPG");
    private HttpWebRequest request;

    void Page_Load(object sender, EventArgs e)
    {
        request = (HttpWebRequest)WebRequest.Create(c_UrlImage1);

        AddOnPreRenderCompleteAsync(
            BeginAsyncOperation,
            EndAsyncOperation
        );
    }

    IAsyncResult BeginAsyncOperation(object sender, EventArgs e,
        AsyncCallback cb, object state)
    {
        // Begin async operation and return IAsyncResult
        return request.BeginGetResponse(cb, state);
    }

    void EndAsyncOperation(IAsyncResult ar)
    {
        // Get results of async operation
        HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(ar);
        Label1.Text = String.Format("Image at {0} is {1:N0} bytes", response.ResponseUri, response.ContentLength);
    }
}

 


The second way is to use RegisterAsyncTask:

using System;
using System.Net;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;

public partial class temp : System.Web.UI.Page
{
    private static readonly Uri c_UrlImage1 = new Uri(@"http://williablog.net/williablog/image.axd?picture=2010%2f1%2fSlide6.JPG");
    private HttpWebRequest request;

    void Page_Load(object sender, EventArgs e)
    {
        request = (HttpWebRequest)WebRequest.Create(c_UrlImage1);

        PageAsyncTask task = new PageAsyncTask(
            BeginAsyncOperation,
            EndAsyncOperation,
            TimeoutAsyncOperation,
            null
        );

        RegisterAsyncTask(task);
    }

    IAsyncResult BeginAsyncOperation(object sender, EventArgs e,
        AsyncCallback cb, object state)
    {
        // Begin async operation and return IAsyncResult
        return request.BeginGetResponse(cb, state);
    }

    void EndAsyncOperation(IAsyncResult ar)
    {
        // Get results of async operation
        HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(ar);
        Label1.Text = String.Format("Image at {0} is {1:N0} bytes", response.ResponseUri, response.ContentLength);
    }

    void TimeoutAsyncOperation(IAsyncResult ar)
    {
        // Called if async operation times out (@ Page AsyncTimeout)
        Label1.Text = "Data temporarily unavailable";
    }
}

 

These methods can be called anywhere in the page’s lifecycle before the PreRender event, and are typically called from the Page_Load event or from the click event of a button during a postback. By the way, you can register these methods from within a UserControl, as long as that control is running on a page that has set the async = true attribute. Again, if it runs on a page without that attribute, the code will still run just fine, but it will run synchronously.

As you can see from just these simple examples, building asynchronous pages is more difficult than building synchronous ones – I’m not going to lie to you. And real-world use of these techniques is even more complicated: there is no business logic or data layer in the examples above. I don’t want you to leave here believing that you need to make every page asynchronous. You don’t. What I recommend is doing surgical strikes: identify the handful of pages in your application that perform the lengthiest I/O and consider converting those into asynchronous pages. The cool thing about this is that it can improve not only scalability but also performance, because when you are not holding onto threads, new requests get into the pipeline faster and spend less time waiting in the application request queue. Users are happier because pages they would have had to wait on before – even the ones you have not converted to asynchronous pages, but which might have been delayed while threads were idle – will now load faster. What’s more, as you’ll see in a moment, using RegisterAsyncTask will allow you to perform I/O operations in parallel, which may also improve performance. Having said that, making pages asynchronous is not really about improving performance, it is about improving scalability – making sure that we use the threads in the thread pool as efficiently as we possibly can.

Now I’m sure you are wondering why there are two ways, what the differences are between them, and when you should choose one over the other. Well, there are 3 important differences between AddOnPreRenderCompleteAsync and RegisterAsyncTask.

  1. As we have seen, RegisterAsyncTask allows you to specify a timeout method. It is important to note that the timeout value you specify in the Page Directive of your .aspx page <%@ Page Async="true" AsyncTimeout="5" ... %> is the timeout for ALL asynchronous tasks the page is performing, not 5 seconds per async task - all async tasks must be completed within 5 seconds, so be sure to allow enough time here.
  2. If you use AddOnPreRenderCompleteAsync, you may find that some things that worked before no longer work. For example, suppose you are using User.Identity.Name in your code to get the authenticated username in order to personalize a page. If it is called on the first thread it will work fine, but if you call it on the second thread – in your End method or any of the events that fire after the end method – User.Identity.Name will be null. This is because as a request travels through the ASP.NET HTTP pipeline, it is accompanied by an object of type HttpContext that basically encapsulates all of the information that ASP.NET knows about that request. When you use AddOnPreRenderCompleteAsync, ASP.NET does not take the extra time to map everything in that context object from thread one to thread two. That’s why User.Identity.Name does not work on thread two. In fact, you will often find that HttpContext.Current is null on thread two. However, if you use RegisterAsyncTask, ASP.NET DOES map everything in that context from thread one to thread two. It does take a few microseconds longer to do this, but it will make your life considerably easier.
  3. The third difference is probably the most important of all. AddOnPreRenderCompleteAsync is a quick and easy way of making a page asynchronous and works well if you have a simple page that needs to perform only one asynchronous I/O operation. In real life, a page often needs to perform multiple database queries, or grab data from a web service and pass it to a database, or something like that. The cool thing about RegisterAsyncTask is that it allows you to quickly and easily queue up multiple async I/O operations. The last argument is a Boolean value that allows you to specify whether each task can run in parallel. Sometimes you need to wait for one data call to complete in order to send that data somewhere else, but other times you may need to get data from multiple, unrelated sources, and this allows you to fetch them all at the same time instead of one after the other (see the sketch after this list).
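Here is a minimal sketch of that third point: two unrelated HttpWebRequest operations queued with RegisterAsyncTask so that they run in parallel. The URLs and member names are invented for illustration; the begin/end pattern is the same as in the examples above.

using System;
using System.Net;
using System.Web.UI;

public partial class paralleltasks : System.Web.UI.Page
{
    private HttpWebRequest requestA;
    private HttpWebRequest requestB;

    void Page_Load(object sender, EventArgs e)
    {
        requestA = (HttpWebRequest)WebRequest.Create("http://example.com/feed-a");
        requestB = (HttpWebRequest)WebRequest.Create("http://example.com/feed-b");

        // The final 'true' argument marks each task as able to execute in parallel,
        // so both requests are in flight at the same time.
        RegisterAsyncTask(new PageAsyncTask(BeginRequestA, EndRequestA, TimeoutRequest, null, true));
        RegisterAsyncTask(new PageAsyncTask(BeginRequestB, EndRequestB, TimeoutRequest, null, true));
    }

    IAsyncResult BeginRequestA(object sender, EventArgs e, AsyncCallback cb, object state)
    {
        return requestA.BeginGetResponse(cb, state);
    }

    void EndRequestA(IAsyncResult ar)
    {
        // Consume the first response here
        requestA.EndGetResponse(ar).Close();
    }

    IAsyncResult BeginRequestB(object sender, EventArgs e, AsyncCallback cb, object state)
    {
        return requestB.BeginGetResponse(cb, state);
    }

    void EndRequestB(IAsyncResult ar)
    {
        // Consume the second response here
        requestB.EndGetResponse(ar).Close();
    }

    void TimeoutRequest(IAsyncResult ar)
    {
        // Called if the combined work exceeds the page's AsyncTimeout.
    }
}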

  Controlling Order of Operations
  Source: Microsoft Tech Ed 2007 DVD: Web 405  "Building Highly Scalable ASP.NET Web Sites by Exploiting Asynchronous Programming Models" by Jeff Prosise.

N-Tier Applications

OK. So I expect some of you are thinking “But what if I have a data access layer in my application? My pages can’t go directly to the database, they have to go through that data access layer, or they have to go through my BLL, which calls the Data Access Layer.”

Well, ideally, you should simply add the asynchronous methods to your DAL. If you wrote the DAL yourself or have access to its source code, you should add the Begin and End methods to it. Adding the asynchronous methods to your DAL is the best, most scalable solution and doesn’t change the example code much at all: instead of calling begin and end methods defined inside the page class, you simply call MyDAL.Begin… or MyBll.Begin… when you call RegisterAsyncTask or AddOnPreRenderCompleteAsync.
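As a rough sketch of what that could look like in a hand-written DAL (the class name, query and connection string here are hypothetical), the Begin method can be given the same signature as a BeginEventHandler so the page can hand it straight to RegisterAsyncTask:

using System;
using System.Data.SqlClient;

public class CustomerDal
{
    // Placeholder connection string; note "Asynchronous Processing=true".
    private const string ConnectionString =
        "Data Source=MyServer;Initial Catalog=MyDatabase;" +
        "Integrated Security=SSPI;Asynchronous Processing=true";

    private SqlConnection connection;
    private SqlCommand command;

    public IAsyncResult BeginGetCustomers(object sender, EventArgs e, AsyncCallback cb, object state)
    {
        connection = new SqlConnection(ConnectionString);
        command = new SqlCommand("SELECT CustomerId, Name FROM Customers", connection);
        connection.Open();

        // Hand the page's callback straight to the underlying asynchronous data call.
        return command.BeginExecuteReader(cb, state);
    }

    public SqlDataReader EndGetCustomers(IAsyncResult ar)
    {
        // The caller reads from the reader and then closes it (and the connection).
        return command.EndExecuteReader(ar);
    }
}

The page would then keep a CustomerDal field, pass its BeginGetCustomers method as the begin handler, and bind the reader returned from EndGetCustomers inside the page's own end handler.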

Unfortunately, neither LLBLGen nor the Enterprise Library (nor LINQ for that matter) supports asynchronous data calls natively. However, I believe that you can modify the generated code in LLBLGen to enable asynchronous data calls. You could also crack open the source code of the Enterprise Library and add the asynchronous methods yourself, but before you try, check to see if it has already been done.

Asynchronous HTTP Handlers

The second asynchronous programming model in ASP.NET is for HttpHandlers and has been around since .NET 1.x, but was not documented any better in version 2 than it was in version 1. HTTP handlers are one of the two fundamental building blocks of ASP.NET; an HTTP handler is an object built to handle HTTP requests and convert them into HTTP responses. For the most part, each handler corresponds to a file type. For example, there is a built-in handler in ASP.NET that handles .aspx files. It is that handler that knows how to instantiate a control tree and send that tree to a rendering engine. The ASMX handler knows how to decode SOAP and allows us to build web services.

Basically an HTTP handler is just a class that implements the IHttpHandler interface, which consists of an IsReusable Boolean property and a ProcessRequest method. ProcessRequest is the heart of an HTTP handler, as its job is to turn a request into a response. The ProcessRequest method is passed an HttpContext object containing all the data ASP.NET has collected about the request, as well as exposing the Session, Server, Request and Response objects that you are used to working with in page requests.

 

using System.Web;

public class HelloHandler : IHttpHandler
{
    public void ProcessRequest(HttpContext context)
    {
        string name = context.Request["Name"];
        context.Response.Write("Hello, " + name);
    }

    public bool IsReusable
    {
        get { return true; }
    }
}

 

There are 2 ways to build them. One way is to add the class to your project and register it in the web.config. If you want to register it for any file extension not currently handled by ASP.NET, you would need to add that extension to IIS. The easier way is to deploy them as .ASHX files: the ASHX extension has already been registered in IIS, it is auto-compiled, no changes are required in the web.config and performance is the same. OK, so you know what they are and how to build one; when is an appropriate time to use one?

Handlers are commonly used to generate custom XML and RSS feeds, or to unzip and render files stored as BLOB fields in the database, including image files or logos; HTTP handlers can also be used as the target of AJAX calls.

A common mistake made by programmers new to .NET, especially those like myself who came from classic ASP or PHP, is to use the Page_Load event of a page to create a new HTTP response. For example, before I learned about HTTP handlers, I would use the page load event to create an XML document or a dynamic PDF file and output it to the response stream, with a Response.End() to prevent the page continuing after I output my file. The problem with that approach is that you are executing a ton of code in ASP.NET that doesn’t need to execute. When ASP.NET sees that request come in, it thinks it is going to need to build and render a control tree. By pointing the link at the handler instead, you will gain a 10-20% performance increase every time that request is fetched, just because of the overhead you have reduced. Put simply, HTTP handlers minimize the amount of code that executes in ASP.NET.

To implement an Asynchronous handler you use the interface IHttpAsyncHandler, which adds BeginProcessRequest and EndProcessRequest methods. The threading works the same way as with an async page. After the begin method is called, the thread returns to the thread pool and handles other incoming requests until the I/O thread completes its work, at which point it grabs a new thread from the thread pool and completes the request.

Page.RegisterAsyncTask cannot be used here, so if you need to run multiple async tasks you will need to implement your own IAsyncResult Interface and pass in your own callbacks to prevent the EndProcessRequest method being called before you have completed all your async operations.
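A minimal sketch of such a handler is shown below; the URL is a placeholder, and a real handler would do something more useful with the response:

using System;
using System.Net;
using System.Web;

public class ImageSizeHandler : IHttpAsyncHandler
{
    private HttpWebRequest request;
    private HttpContext context;

    public IAsyncResult BeginProcessRequest(HttpContext ctx, AsyncCallback cb, object extraData)
    {
        context = ctx;
        request = (HttpWebRequest)WebRequest.Create("http://example.com/somefile.jpg");

        // Launch the I/O and return immediately; the worker thread goes back to the pool.
        return request.BeginGetResponse(cb, extraData);
    }

    public void EndProcessRequest(IAsyncResult result)
    {
        // Runs on a (potentially different) worker thread once the I/O has completed.
        HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(result);
        context.Response.Write(String.Format("{0:N0} bytes", response.ContentLength));
        response.Close();
    }

    // The synchronous members are required by the interface but are not used here.
    public void ProcessRequest(HttpContext ctx)
    {
        throw new NotSupportedException();
    }

    public bool IsReusable
    {
        get { return false; }
    }
}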

Asynchronous HTTP Modules

HTTP Modules are another fundamental building block of ASP.NET. They don’t handle requests, instead they sit in the HTTP Pipeline where they have the power to review every request coming in and every response going out. Not only can they view them, but they can modify them as well. Many of the features of ASP.NET are implemented using httpmodules: authentication, Session State and Caching for example, and by creating your own HTTP Modules you can extend ASP.NET in a lot of interesting ways. You could use an HTTP Module for example to add google analytics code to all pages, or a custom footer. Logging is another common use of HTTP Modules.

E-Commerce web sites can take advantage of HTTP Modules by overriding the default behavior of the Session Cookie. By default, ASP.NET Session Cookies are only temporary, so if you use them to store shopping cart information, after 20 minutes of inactivity, or a browser shut down they are gone. You may have noticed that Amazon.com retains shopping cart information much longer: You could shut down your laptop, fly to Japan and when you restart and return to Amazon your items will still be there. If you wanted to do this in ASP.NET you could waste a lot of time writing your own Session State Cookie Class, or you could write about 10 lines of code in the form of an HTTP Module that would intercept the cookie created by the Session Object before it gets to the browser, and modify it to make it a persistent cookie. So, there are lots and lots of practical uses for HTTP Modules.
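Here is a sketch of that persistent session cookie idea, assuming the default cookie name (ASP.NET_SessionId) and an arbitrary 14-day lifetime:

using System;
using System.Web;

public class PersistentSessionCookieModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        application.EndRequest += new EventHandler(OnEndRequest);
    }

    void OnEndRequest(object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;
        HttpCookieCollection cookies = application.Response.Cookies;

        // Only touch responses that are actually issuing a session cookie.
        if (Array.IndexOf(cookies.AllKeys, "ASP.NET_SessionId") >= 0)
        {
            // A cookie with no Expires value is discarded when the browser closes;
            // giving it one turns it into a persistent cookie.
            cookies["ASP.NET_SessionId"].Expires = DateTime.Now.AddDays(14);
        }
    }

    public void Dispose() { }
}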

An HTTP module is nothing more than a class that implements the IHttpModule interface, which involves an Init method for registering any and all events that you are interested in intercepting, and a Dispose method for cleaning up any resources you may have used.

 

using System;
using System.Web;

public class BigBrotherModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        application.EndRequest +=
            new EventHandler(OnEndRequest);
    }

    void OnEndRequest(Object sender, EventArgs e)
    {
        HttpApplication application = (HttpApplication)sender;
        application.Context.Response.Write("Bill Gates is watching you");
    }

    public void Dispose() { }
}

 

The events you can intercept in an HTTP Module:

 

  HttpApplication Events 
  Source: Microsoft Tech Ed 2007 DVD: Web 405  "Building Highly Scalable ASP.NET Web Sites by Exploiting Asynchronous Programming Models" by Jeff Prosise.

Notice the HTTP handler at the end there that converts the request into a response. These events will always fire in this order, in every request. The AuthenticateRequest event is the one fired by ASP.NET when a requested page requires authentication: it checks to see if you have an authentication cookie and, if you do not, redirects the request to the login page. In the simple example above, I was using the EndRequest event, which is the last one before the response is sent to the browser.

So, that is what HTTP Modules are for, and how they work. Why do we need an Asynchronous version? Well if you really want to see how scalable your application is, add an HTTP Module that makes a synchronous call to a webservice or a database. Since the event you register will be fired on every request, you will tie up an additional thread from the asp.net thread pool on every single request that is just waiting for these I/O processes to complete. So, if you write a synchronous HTTP Module that inserts a record into a database for every single request, and that insert takes 1 second, EVERY single request handled by your application will be delayed by 1 second. So if you need to do any type of I/O from within an HTTP Module, I recommend you make the calls asynchronously and if you are retrieving data, cache it!

To register async event handlers in an HTTP module, simply register your begin and end methods in the Init method using the HttpApplication's AddOnPreRequestHandlerExecuteAsync:

 

using System.Web;

public void Init (HttpApplication application)
{
    application.AddOnPreRequestHandlerExecuteAsync (
        new BeginEventHandler (BeginPreRequestHandlerExecute),
        new EndEventHandler (EndPreRequestHandlerExecute)
    );
}

IAsyncResult BeginPreRequestHandlerExecute (Object source,
    EventArgs e, AsyncCallback cb, Object state)
{
    // TODO: Begin async operation and return IAsyncResult
}

void EndPreRequestHandlerExecute (IAsyncResult ar)
{
    // TODO: Get results of async operation
}
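Filling in those TODOs, a complete module might look something like the following sketch, where the asynchronous work is a request to a hypothetical logging endpoint:

using System;
using System.Net;
using System.Web;

public class AsyncLoggingModule : IHttpModule
{
    private HttpWebRequest logRequest;

    public void Init(HttpApplication application)
    {
        application.AddOnPreRequestHandlerExecuteAsync(
            new BeginEventHandler(BeginPreRequestHandlerExecute),
            new EndEventHandler(EndPreRequestHandlerExecute));
    }

    IAsyncResult BeginPreRequestHandlerExecute(object source, EventArgs e,
        AsyncCallback cb, object state)
    {
        HttpApplication application = (HttpApplication)source;
        string url = "http://logging.example.com/log?path=" +
            HttpUtility.UrlEncode(application.Request.Path);

        // Start the I/O and hand the thread back to the pool until it completes.
        logRequest = (HttpWebRequest)WebRequest.Create(url);
        return logRequest.BeginGetResponse(cb, state);
    }

    void EndPreRequestHandlerExecute(IAsyncResult ar)
    {
        // Complete and close the response; a failing log server should never be
        // allowed to take the site down, so the error is swallowed here.
        try
        {
            logRequest.EndGetResponse(ar).Close();
        }
        catch (WebException) { }
    }

    public void Dispose() { }
}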

Error Handling while multithreading

Errors can happen at any point during the execution of a command. When ASP.NET can detect errors before initiating the actual async operation, it will throw an exception from the begin method; this is very similar to the synchronous case in which you get the exceptions from a call to ExecuteReader or similar methods directly. This includes invalid parameters, bad state of related objects (no connection set for a SqlCommand, for example), or some connectivity issues (the server or the network is down, for example).

Now, once we send the operation to the server and return, ASP.NET doesn’t have any way to let you know if something goes wrong at the exact moment it happens. It cannot just throw an exception as there is no user code above it in the stack when doing intermediate processing, so you wouldn't be able to catch an exception if it threw one. What happens instead is that ASP.Net stores the error information, and signals that the operation is complete. Later on, when your code calls the end method, ASP.Net detects that there was an error during processing and the exception is thrown.

The bottom line is that you need to be prepared to handle errors in both the begin and the end methods, so it is wise to wrap the body of each in a try/catch block.
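As a sketch, applied to the HttpWebRequest page examples from earlier (it reuses the request field and Label1 control from those listings):

IAsyncResult BeginAsyncOperation(object sender, EventArgs e, AsyncCallback cb, object state)
{
    try
    {
        // Problems that can be detected up front (a bad URI, a proxy failure, etc.)
        // are thrown from the Begin call itself.
        return request.BeginGetResponse(cb, state);
    }
    catch (WebException ex)
    {
        Label1.Text = "Could not start the request: " + ex.Message;
        throw;
    }
}

void EndAsyncOperation(IAsyncResult ar)
{
    try
    {
        // Errors that occurred while the operation was in flight are rethrown here.
        HttpWebResponse response = (HttpWebResponse)request.EndGetResponse(ar);
        Label1.Text = String.Format("{0:N0} bytes", response.ContentLength);
        response.Close();
    }
    catch (WebException ex)
    {
        Label1.Text = "The request failed: " + ex.Message;
    }
}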

Conclusion

Now that you have seen three of the asynchronous programming models ASP.NET has to offer, hopefully I have impressed upon you how important it is to at least consider using them when creating pages that do I/O, if you expect those pages to be heavily trafficked. Remember you can also create asynchronous web services. I didn’t cover those here because there is pretty good documentation for that already.

The good thing about Asynchronous Programming models is that it enables us to build scalable and responsive applications that use minimal resources (threads/context switches).

What is the down side? Well it forces you to split the code into many callback methods, making it hard to read, confusing to debug and difficult for programmers unfamiliar with asynchronous programming to maintain.

With this in mind, whenever I add an asynchronous method to an object in my projects, I also add a traditional synchronous version. For example, if I had created a BeginUpdatexxx() method in the BLL, there would also be a traditional Updatexxx() method, so that if anyone else finds themselves having to use that object, they won’t be left scratching their heads, wondering “how on earth do I use that?”

Asynchronous command execution is a powerful extension to .NET. It enables new high-scalability scenarios at the cost of some extra complexity.

For more information on multi-threading in ASP.NET I highly recommend you read "Custom Threading in ASP.Net".


Posted by Williarob on Tuesday, December 16, 2008 5:30 PM

Custom Threading in ASP.NET

or "One Mans obsession with finding a way to call synchronous methods asynchronously in ASP.NET".

This is for anyone interested in exploring the System.Threading Namespace in ASP.NET. There are many wrong ways to do it, one right way, and one other way that should only be used when you have no alternative.

First, the wrong ways. As you may or may not know, in .NET you can call any synchronous method asynchronously simply by creating a delegate method and calling the delegate’s BeginInvoke and EndInvoke methods. Knowing this, you might be tempted to try this in ASP.NET. For example, suppose you are using a prebuilt library object that contains the following method which makes a synchronous WebRequest:

        private static readonly Uri c_UrlImage1 = new Uri(@"http://williablog.net/williablog/image.axd?picture=2010%2f1%2fSlide6.JPG");
        private HttpWebRequest request;
        public string Result; // Public Variable to store the result where applicable

        public string SyncMethod()
        {
            // prepare the web page we will be asking for
            request = (HttpWebRequest)WebRequest.Create(c_UrlImage1);

            // execute the request
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();
            return String.Format("Image at {0} is {1:N0} bytes", response.ResponseUri, response.ContentLength);
        }

Then you read my article on MultiThreading in asp.net and decide that this should be an Asynchronous call. If you don't have easy access to the source code of that prebuilt library, you might be tempted to try this:

 

        public delegate string AsyncMethodDelegate();

        public IAsyncResult BeginUpdateByDelegate(object sender, EventArgs e, AsyncCallback cb, object state)
        {
            AsyncMethodDelegate o_MyDelegate = SyncMethod;
            return o_MyDelegate.BeginInvoke(cb, state);
        }

        public void EndUpdateByDelegate(IAsyncResult ar)
        {
            AsyncMethodDelegate o_MyDelegate = (AsyncMethodDelegate)((AsyncResult)ar).AsyncDelegate;
            Result = o_MyDelegate.EndInvoke(ar);
            lblResult.Text = Result;
        }

        protected void Page_Load(object sender, EventArgs e)
        {
            RegisterAsyncTask(new PageAsyncTask(BeginUpdateByDelegate, EndUpdateByDelegate, AsyncUpdateTimeout, null, false));
        }

        public void AsyncUpdateTimeout(IAsyncResult ar)
        {
            Label1.Text = "Connection Timeout";
        }

 

On paper this looks great: you just converted a synchronous method to an asynchronous one and called it using RegisterAsyncTask. In spite of how promising this technique looks, I'm sorry to say that it will do nothing for scalability or performance. Unfortunately, the thread used by BeginInvoke is taken from the same worker thread pool that ASP.NET uses to handle page requests. So what you are actually doing is returning the main thread to the thread pool when BeginUpdateByDelegate returns, grabbing a second thread from the same pool to run the delegate via BeginInvoke, and keeping that thread busy until the synchronous work completes. ASP.NET then pulls a third thread from the same pool to call EndUpdateByDelegate and complete the request. Net gain: 0 threads, and your code is harder to read, debug and maintain.

 

OK. What about using ThreadPool.QueueUserWorkItem()? You rewrite your code again and now it looks like this:

 

 

        public IAsyncResult BeginThreadPoolUpDate(object sender, EventArgs e, AsyncCallback cb, object state)
        {
            AsyncHelper helper = new AsyncHelper(cb, state);
            ThreadPool.QueueUserWorkItem(ThreadProc, helper);
            return helper;
        }

        // The queued work item: run the synchronous method, capture its result
        // or any exception in the helper, then signal completion.
        void ThreadProc(object state)
        {
            AsyncHelper helper = (AsyncHelper)state;
            try
            {
                helper.Result = SyncMethod();
            }
            catch (Exception ex)
            {
                helper.Error = ex;
            }
            finally
            {
                helper.CompleteCall();
            }
        }

        public void EndThreadPoolUpDate(IAsyncResult AR)
        {
            AsyncHelper helper = (AsyncHelper)AR;

            // If an exception occurred on the other thread, rethrow it here
            if (helper.Error != null)
            {
                throw helper.Error;
            }

            // Otherwise retrieve the results
            Result = (string)helper.Result;
            lblResult.Text = Result;
        }

        protected void Page_Load(object sender, EventArgs e)
        {
            RegisterAsyncTask(new PageAsyncTask(BeginThreadPoolUpDate, EndThreadPoolUpDate, AsyncUpdateTimeout, null, false));
        }

        public void AsyncUpdateTimeout(IAsyncResult ar)
        {
            Label1.Text = "Connection Timeout";
        }

        class AsyncHelper : IAsyncResult
        {
            private AsyncCallback _cb;
            private object _state;
            private ManualResetEvent _event;
            private bool _completed = false;
            private object _lock = new object();
            private object _result;
            private Exception _error;

            public AsyncHelper(AsyncCallback cb, object state)
            {
                _cb = cb;
                _state = state;
            }

            public Object AsyncState
            {
                get { return _state; }
            }

            public bool CompletedSynchronously
            {
                get { return false; }
            }

            public bool IsCompleted
            {
                get { return _completed; }
            }

            public WaitHandle AsyncWaitHandle
            {
                get
                {
                    lock (_lock)
                    {
                        if (_event == null)
                            _event = new ManualResetEvent(IsCompleted);
                        return _event;
                    }
                }
            }

            public void CompleteCall()
            {
                lock (_lock)
                {
                    _completed = true;
                    if (_event != null)
                        _event.Set();
                }

                if (_cb != null)
                    _cb(this);
            }

            public object Result
            {
                get { return _result; }
                set { _result = value; }
            }

            public Exception Error
            {
                get { return _error; }
                set { _error = value; }
            }
        }

Yikes, this time a helper class was needed to provide the IAsyncResult Interface. Now your code is even more unreadable and I'm sorry to tell you that the thread used by ThreadPool.QueueUserWorkItem also comes from the same thread pool that is used by ASP.Net for handling Requests.
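You can confirm that for yourself with the same IsThreadPoolThread check used in the diagnostic code later in this post; a minimal console sketch:

using System;
using System.Threading;

class QueueUserWorkItemCheck
{
    static void Main()
    {
        ThreadPool.QueueUserWorkItem(delegate(object state)
        {
            // Prints True: the callback runs on a managed thread pool thread -
            // in ASP.NET, the same pool that services incoming requests.
            Console.WriteLine("IsThreadPoolThread: {0}",
                Thread.CurrentThread.IsThreadPoolThread);
        });

        Thread.Sleep(1000); // give the callback time to run before the process exits
    }
}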

Fine. I'll just use Thread.Start() and create my own thread. Well, you could do that – by creating your own thread, you won't be stealing one from the ASP.NET thread pool. But not so fast! If you have determined that this method needs to be made asynchronous for reasons of scalability, then this page is under heavy load. Think about that for a second. If you are creating a new thread every time your page is requested and your page is being hammered with 1000 requests a second, then you are creating 1000 new threads almost simultaneously, and they are all fighting for CPU time. Clearly, using Thread.Start() risks unconstrained thread growth, and you can easily find yourself creating so many new threads that the increased CPU contention actually decreases rather than increases scalability, so I don't recommend using Thread.Start() in ASP.NET.

So, you troll the internet looking for other ways.  In the bowels of the System.Threading Namespace, you find the ThreadPool.UnsafeQueueNativeOverlapped method that promises to open up an I/O Completion Port Thread, drawn from the I/O thread pool, just for you to run your method on. So you modify your code again, recompiling with the Allow unsafe code box checked. Now it looks something like this:

 

        public IAsyncResult BeginIOCPUpDate(object sender, EventArgs e, AsyncCallback cb, object state)
        {
            AsyncHelper helper = new AsyncHelper(cb, state);
            IOCP.delThreadProc myDel = SyncMethod;
            IOCP myIOCp = new IOCP(myDel);

            try
            {
                myIOCp.RunAsync();
            }
            catch (Exception ex)
            {
                helper.Error = ex;
            }
            finally
            {
                helper.CompleteCall();
            }
            return helper;
        }

        public void EndIOCPUpDate(IAsyncResult AR)
        {
            AsyncHelper helper = (AsyncHelper)AR;

            // If an exception occurred on the other thread, rethrow it here
            if (helper.Error != null)
            {
                throw helper.Error;
            }

            // Otherwise retrieve the results
            Result = (string)helper.Result;
            lblResult.Text = Result;
        }

        protected void Page_Load(object sender, EventArgs e)
        {
            RegisterAsyncTask(new PageAsyncTask(BeginIOCPUpDate, EndIOCPUpDate, AsyncUpdateTimeout, null, false));
        }

        public void AsyncUpdateTimeout(IAsyncResult ar)
        {
            Label1.Text = "Connection Timeout";
        }

        class IOCP
        {
            public delegate string delThreadProc();
            private readonly delThreadProc _delThreadProc;

            public IOCP(delThreadProc ThreadProc)
            {
                _delThreadProc = ThreadProc;
            }

            public void RunAsync()
            {
                unsafe
                {
                    Overlapped overlapped = new Overlapped(0, 0, IntPtr.Zero, null);
                    NativeOverlapped* pOverlapped = overlapped.Pack(IocpThreadProc, null);
                    ThreadPool.UnsafeQueueNativeOverlapped(pOverlapped);
                }
            }

            unsafe void IocpThreadProc(uint x, uint y, NativeOverlapped* p)
            {
                try
                {
                    _delThreadProc();
                }
                finally
                {
                    Overlapped.Free(p);
                }
            }
        }

 

        class AsyncHelper : IAsyncResult
        {
            // Code omitted for clarity: see above for the full AsyncHelper class
        }

 

It seems to work but how can you be sure? You add the following code at various points to test:

            Label1.Text += "<b>EndIOCPUpDate</b><br />";

            Label1.Text += "CompletedSynchronously: " + AR.CompletedSynchronously + "<br /><br />";

            Label1.Text += "isThreadPoolThread: " + System.Threading.Thread.CurrentThread.IsThreadPoolThread.ToString() + "<br />";

            Label1.Text += "ManagedThreadId : " + System.Threading.Thread.CurrentThread.ManagedThreadId + "<br />";

            Label1.Text += "GetCurrentThreadId : " + AppDomain.GetCurrentThreadId() + "<br />";

            Label1.Text += "Thread.CurrentContext : " + System.Threading.Thread.CurrentContext.ToString() + "<br />";

 

            int availWorker = 0;

            int maxWorker = 0;

            int availCPT = 0;

            int maxCPT = 0;

            ThreadPool.GetAvailableThreads(out availWorker, out availCPT);

            ThreadPool.GetMaxThreads(out maxWorker, out maxCPT);

            Label1.Text += "--Available Worker Threads: " + availWorker.ToString() + "<br />";

            Label1.Text += "--Maximum Worker Threads: " + maxWorker.ToString() + "<br />";

            Label1.Text += "--Available Completion Port Threads: " + availCPT.ToString() + "<br />";

            Label1.Text += "--Maximum Completion Port Threads: " + maxCPT + "<br />";

            Label1.Text += "===========================<br /><br />";

 

And you discover that while it does indeed execute on an I/O thread, it also uses or blocks a worker thread, perhaps to run the delegate method. In any case, the net gain is -1 threads – at least with the two previous techniques you were only using a worker thread. Now you are using a worker thread and an I/O thread, and using unsafe code, and your code looks even worse!

So, you think for a while and decide to use a custom thread pool. Perhaps you recall that the folks at Wintellect.com used to offer a free custom thread pool in their PowerThreading Library. You find a copy among your archives and set it up. (The code looks just like the ThreadPool.QueueUserWorkItem code above, but it is using the custom thread pool.) You add your diagnostic code to verify and yes, it does everything you need it to: Available Worker Threads = Maximum Worker Threads, Available I/O Threads = Maximum I/O Threads. No threads are being stolen from ASP.NET and you don't risk unconstrained thread growth. Short of finding a way to add the asynchronous method to the prebuilt DLL (obtaining and modifying the source code, or contacting the author/vendor to request that asynchronous calls be added to their library), switching to your own code, or switching to another library that already supports asynchronous methods, this is your only option.

The only right way to do asynchronous programming in ASP.NET is to find a way to add true asynchronous capabilities to your library. If this is impossible, then a custom thread pool is your only option. Now before you rush off to wintellect.com to download this magical solution, I should warn you that the custom thread pool is no longer part of the Power Threading Library. Curious to understand why, I contacted the author (Mr. Jeffrey Richter) and he told me that he removed the custom thread pool as he believed it promoted poor programming practice. He explained that the default number of threads has been greatly increased with the introduction of the 3.x framework (and of course you can increase it in the machine.config), but that ultimately if your library does not support asynchronous I/O calls, your application will not be scalable.

“If they don’t support async operations then they are not usable for scalable apps – period... Personally, I would not use a [library] that didn't support async operations unless I was using it in a situation where I knew I would always have just a few clients.”

I encourage you to use asynchronous delegates and ThreadPool.QueueUserWorkItem freely in console applications and Windows Forms programs; just don't bother using them in ASP.NET, as it is a waste of time.

So. There you have it. The sample code I used to test which thread pool threads were used in each of the techniques discussed above can be downloaded here. The sample includes a version of Wintellect's Power Threading Library that still contains the Custom Threadpool - look for it in the bin folder.

AsyncThreadTests.zip (264.52 kb)

Note: I recommend you test under IIS 6.0 or IIS 7.0: running on the Cassini web server or on IIS 5.1 under Windows XP shows inconsistent results.


Posted by Williarob on Tuesday, December 16, 2008 10:19 AM

Delayed Automatic Exit from Console Application

Suppose you need to write a console application that will run as a scheduled task on a busy production server, and you want to delay the exit so that anyone watching can read the results on the screen, but not force someone to push a key, as there may not always be someone at the console to do so. Ending with Console.ReadKey() is no good, as it will wait until you push a key before it exits. Here is my solution:

 

using System;
using System.Threading;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            //... code here for whatever your console app does

            Console.WriteLine("Press any key to exit...");

            delay = new ExitDelay();
            delay.Start();
            MyTimer = new Timer(TimesUp, null, 10000, Timeout.Infinite);
        }

        static ExitDelay delay;
        static Timer MyTimer;

        // Timer callback: they didn't press any key, but we don't want this window open forever!
        private static void TimesUp(object state)
        {
            delay.Stop();
            MyTimer.Dispose();
            Environment.Exit(0);
        }
    }

    public class ExitDelay
    {
        private readonly Thread workerThread;

        public ExitDelay()
        {
            this.workerThread = new Thread(this.work);
            this.workerThread.Priority = ThreadPriority.Lowest;
            this.workerThread.Name = "ExitTimer";
        }

        public void Start()
        {
            this.workerThread.Start();
        }

        public void Stop()
        {
            this.workerThread.Abort();
        }

        private void work()
        {
            Console.ReadKey();
            this.Stop();
        }
    }
}

 

This gives someone 10 seconds to read the result and press a key before exiting the application automatically.

Edit: A newer version of this technique is available. The new version provides an on-screen countdown, notifying the user that the program is about to exit – otherwise, if there is someone at the console and it suddenly quits while instructing them to hit any key to quit, that might be a bit worrying! I leave this example here because it is a good example of how to start and stop a new Thread, as well as how to use the System.Threading.Timer class in a slightly different way to the new version, which uses polling.


Categories: C# | Multithreading
Posted by Williarob on Friday, December 12, 2008 6:12 PM

Instantly Increase ASP.NET Scalability

Each time a request for a .NET resource (an .aspx page, .ascx control, etc.) comes in, a thread is grabbed from the available worker thread pool in the ASP.NET worker process (aspnet_wp.exe on IIS 5.x, w3wp.exe on IIS 6/7) and is assigned to the request. That thread is not released back into the thread pool until the final page has been rendered to the client and the request is complete.

Inside the ASP.NET worker process there are two thread pools. The worker thread pool handles all incoming requests and the I/O thread pool handles the I/O (accessing the file system, web services and databases, etc.). But how many threads are there in these thread pools? I had assumed that the number of threads would vary from machine to machine – that ASP.NET and IIS would carefully balance the number of available threads against available hardware, but that is simply not the case. ASP.NET installs with a fixed, default number of threads to play with: the CLR for the 1.x Framework sets these defaults at just 20 worker threads and 20 I/O threads per CPU. Now this can be increased by modifying the machine.config, but if you are not aware of this, then 20 threads is all you’re playing with. If you have multiple sites sharing the same worker process, then they are all sharing this same thread pool.

So long as the number of concurrent requests does not exceed the number of threads available in the pool, all is well. But when you are building enterprise-level applications the thread pool can become depleted under heavy load, and remember, by default heavy load is more than just 20 simultaneous requests. When this happens, new requests are entered into the request queue (and the users making the requests watch an hourglass spin). ASP.NET will allow the request queue to grow only so big before it starts to reject requests, at which point it starts returning Error 503, Service Unavailable.

If you are not aware of this “Glass Ceiling of Scalability”, this is a perplexing error – one that never happened in testing and may not be reproducible in your test environment, as it only happens under extreme load.

So the first thing you can do to improve scalability is to raise these values. The defaults for the ASP.NET 2.0 CLR are 100 threads in each pool per CPU, and the defaults for the ASP.NET 3.x CLR are 250 per CPU for worker threads and 1000 per CPU for I/O threads; however, you can tune them further using the guidelines below. 32-bit Windows can handle about 1400 concurrent threads; 64-bit Windows can handle more, though I don’t have the figures.

You can tune the ASP.NET thread pool using the maxWorkerThreads, minWorkerThreads, maxIoThreads, minFreeThreads, minLocalRequestFreeThreads and maxconnection attributes in your machine.config file. Here are the default settings:

<system.net>
  <connectionManagement>
     <add address="*" maxconnection="2" />
  </connectionManagement>
</system.net>
<system.web>
  <httpRuntime minFreeThreads="8" minLocalRequestFreeThreads="4"  />
  <processModel maxWorkerThreads="100" maxIoThreads="100"  />
</system.web>

Here is the formula to reduce contention. The default values shown are for the .NET Framework 1.1; apply the recommended changes across all of the settings together, not in isolation.

  • maxconnection - default 2; recommended 12 * #CPUs. Controls the maximum number of outgoing HTTP connections that you can initiate from a client (in this case, ASP.NET is the client).
  • maxIoThreads - default 20; recommended 100. Controls the maximum number of I/O threads in the .NET thread pool. This number is automatically multiplied by the number of available CPUs.
  • maxWorkerThreads - default 20; recommended 100. Controls the maximum number of worker threads in the thread pool. This number is also automatically multiplied by the number of available CPUs.
  • minFreeThreads - default 8; recommended 88 * #CPUs. Used by the worker process to queue all incoming requests if the number of available threads in the thread pool falls below this value. It effectively limits the number of requests that can run concurrently to maxWorkerThreads – minFreeThreads, i.e. 12 per CPU with the recommended values.
  • minLocalRequestFreeThreads - default 4; recommended 76 * #CPUs. Used by the worker process to queue requests from localhost (where a Web application sends requests to a local Web service) if the number of available threads in the thread pool falls below this number. It is similar to minFreeThreads, but it only applies to requests from the local computer.

Note: The recommendations provided in this section are not rules. They are a starting point. Test to determine the appropriate settings for your scenario. If you move your application to a new computer, make sure you recalculate and reconfigure the settings based on the number of CPUs in the new computer.
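To make that concrete, here is a sketch of mine showing what the same sections might look like with the recommended values applied on a single-CPU machine (recalculate the per-CPU numbers for your own hardware, and note that on ASP.NET 2.0 and later you may also need autoConfig="false" on <processModel> for manual values to take effect):

<system.net>
  <connectionManagement>
    <!-- 12 * 1 CPU -->
    <add address="*" maxconnection="12" />
  </connectionManagement>
</system.net>
<system.web>
  <!-- 88 * 1 CPU and 76 * 1 CPU respectively -->
  <httpRuntime minFreeThreads="88" minLocalRequestFreeThreads="76" />
  <!-- these two are multiplied by the CPU count automatically -->
  <processModel autoConfig="false" maxWorkerThreads="100" maxIoThreads="100" />
</system.web>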

By raising these values, you raise the “glass ceiling of scalability”, and in many cases that may be all you need to do. But what happens when you start getting more than 250 simultaneous requests per CPU? To make optimum use of the thread pool, all I/O requests that you know could take a second or more to process should be made asynchronously. More information on asynchronous programming in ASP.NET is coming soon.

I recommend testing your new machine.config locally or on a virtual machine first, because if you make a mistake - for example, you paste this into the wrong area, or there is already a <system.web> element and you paste in a second one - ALL websites on the box will stop functioning, as ASP.NET will not be able to parse the machine.config!


Posted by Williarob on Tuesday, December 02, 2008 7:41 AM
Permalink | Comments (1) | Post RSSRSS comment feed

Add Asynchronous Data Methods to the Enterprise Library

A Data Access Layer that does not support asynchronous I/O is not scalable. Period. However, since Microsoft kindly provided the source code along with the Enterprise Library, it is possible to make the Enterprise Library's Data Access Application Block scalable by adding the asynchronous methods (BeginExecuteNonQuery, BeginExecuteReader, etc.) to the SqlDatabase object. I took the code from the 3.x library and modified the file SqlDatabase.cs, found in c:\EntLib3Src\App Blocks\Src\Data\Sql (assuming you installed the source files to the root of your C drive). The following download includes the modified binaries and my SqlDatabase.cs code file:

AsyncEL.zip (348.39 kb)

I don't think I implemented every overload, and again this is for the 3.0 library, but you should be able to modify the 4.x libraries and add more overloads by following the patterns I'm using. To use the binaries included in the zip file above, simply drop them into the bin folder of your web site or web application project, add references if necessary, and don't forget to add the Async="true" attribute to your <%@ Page %> directive and async=true to your connection string - a quick sketch of those two settings follows, and after that is a simple example of how you might use the new methods, to get you started:
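Something like this (the server, database, and file names here are placeholders of mine, not part of the download):

<%@ Page Language="C#" Async="true" AsyncTimeout="30" CodeFile="temp.aspx.cs" Inherits="temp" %>

<connectionStrings>
  <add name="AsyncNorthwind"
       connectionString="Data Source=.;Initial Catalog=Northwind;Integrated Security=True;async=true"
       providerName="System.Data.SqlClient" />
</connectionStrings>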

 

using System;
using System.Collections;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Web;
using System.Web.Configuration;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Xml.Linq;

public partial class temp : System.Web.UI.Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        if (!IsPostBack)
        {
            // Register the Begin/End/Timeout handlers as an asynchronous page task
            RegisterAsyncTask(new PageAsyncTask(
                new BeginEventHandler(BeginUpdateByAsyncEL),
                new EndEventHandler(EndUpdateByAsyncEL),
                new EndEventHandler(AsyncUpdateTimeout),
                null, true));
        }
    }

    /// <summary>
    /// Creates a SqlDatabase object and stores it in HttpContext for the duration of the request.
    /// </summary>
    /// <returns>A SqlDatabase object</returns>
    public static Microsoft.Practices.EnterpriseLibrary.Data.Sql.SqlDatabase AsyncNorthWindDB
    {
        get
        {
            if (HttpContext.Current == null)
            {
                return new Microsoft.Practices.EnterpriseLibrary.Data.Sql.SqlDatabase(WebConfigurationManager.ConnectionStrings["AsyncNorthwind"].ConnectionString);
            }
            if (HttpContext.Current.Items["AsyncNorthWindDB"] == null)
            {
                HttpContext.Current.Items.Add("AsyncNorthWindDB", new Microsoft.Practices.EnterpriseLibrary.Data.Sql.SqlDatabase(WebConfigurationManager.ConnectionStrings["AsyncNorthwind"].ConnectionString));
            }
            // HttpContext.Items stores plain objects, so cast back to SqlDatabase
            return (Microsoft.Practices.EnterpriseLibrary.Data.Sql.SqlDatabase)HttpContext.Current.Items["AsyncNorthWindDB"];
        }
    }

    /// <summary>
    /// Begins an asynchronous call to the Northwind database.
    /// </summary>
    /// <param name="sender"></param>
    /// <param name="e"></param>
    /// <param name="cb"></param>
    /// <param name="state"></param>
    /// <returns>An IAsyncResult interface</returns>
    /// <remarks>These methods do not exist in the standard Enterprise Library</remarks>
    public IAsyncResult BeginUpdateByAsyncEL(object sender, EventArgs e, AsyncCallback cb, object state)
    {
        SqlCommand cmd = new SqlCommand("UPDATE Products SET UnitPrice = 79.0 WHERE ProductID = 20");
        return AsyncNorthWindDB.BeginExecuteNonQuery(cmd, cb, state);
    }

    public void EndUpdateByAsyncEL(IAsyncResult ar)
    {
        AsyncNorthWindDB.EndExecuteNonQuery(ar);
    }

    /// <summary>
    /// Async timeout handler.
    /// </summary>
    /// <param name="ar"></param>
    public void AsyncUpdateTimeout(IAsyncResult ar)
    {
        Label1.Text = "Connection Timeout";
    }
}

 

Download the Full Enterprise Library (including the complete source code) from http://www.codeplex.com/entlib.

 



Posted by Williarob on Monday, December 01, 2008 8:50 AM
Permalink | Comments (0) | Post RSSRSS comment feed

Begin/End Async WebService Proxy Methods No Longer Generated

If you have added a Web Reference for a Web Service recently, you may have noticed that the wsdl tool no longer creates Begin and End methods for you; instead it implements what is known as the event-based async pattern. That is fine if you are building a Windows or console application, but if you are trying to call a web service asynchronously using the AddOnPreRenderCompleteAsync or RegisterAsyncTask methods in ASP.NET, you really need a Begin method that returns an IAsyncResult interface. There are several ways you can work around this.

You could, assuming you still have it, create a project in Visual Studio 2003, add the web reference there, and copy it into your 2005 project. Or you can run the wsdl tool manually and use the /parameters option to specify a configuration file with the "oldAsync" flag. For example, for a reference to http://my.server.com/service, to create a proxy class with both begin/end (a.k.a. "old style") and event-based (a.k.a. "new style") methods, you would use the command

  wsdl.exe /parameters:MyParameters.xml http://my.server.com/service

Where MyParameters.xml is as below:

MyParameters.xml:

<wsdlParameters xmlns="http://microsoft.com/webReference/">
  <nologo>true</nologo>
  <parsableerrors>true</parsableerrors>
  <sharetypes>true</sharetypes>
  <webReferenceOptions>
    <verbose>false</verbose>
    <codeGenerationOptions>properties oldAsync newAsync</codeGenerationOptions>
    <style>client</style>
  </webReferenceOptions>
</wsdlParameters>

However, my preferred method is to set a project-level property, WebReference_EnableLegacyEventingModel, to true.

To do that, you need to unload the project from Visual Studio and edit the project file directly in a text editor. (I recommend creating a backup first.)

In the first PropertyGroup section, you will see many properties like:

    <ProjectGuid>{F4DC6946-F07E-4812-818A-F35C5E34E2FA}</ProjectGuid>
    <OutputType>Exe</OutputType>
...

Don't change any of those, but do add a new property into that section:


    <WebReference_EnableLegacyEventingModel>true</WebReference_EnableLegacyEventingModel>


Save the file, and reload the project into Visual Studio.

After that, you have to regenerate all of the proxy code (by updating the reference, or by running the custom tool on the .map file).

This is my preferred method because, unlike the others I described, this one lets you refresh or update the web reference as normal - which, if your project is shared with other developers, is a big plus!
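Once the old-style methods are back, you can wire the proxy into an asynchronous page in the usual way. Here is a minimal sketch; the MyService.Service proxy class, its HelloWorld web method, and Label1 are hypothetical stand-ins for whatever your own web reference and page contain:

    using System;
    using System.Web.UI;

    public partial class ServiceCallerPage : Page
    {
        // Hypothetical proxy generated from the web reference
        private MyService.Service proxy;

        protected void Page_Load(object sender, EventArgs e)
        {
            proxy = new MyService.Service();

            // Requires Async="true" in the @ Page directive
            RegisterAsyncTask(new PageAsyncTask(
                new BeginEventHandler(BeginCall),
                new EndEventHandler(EndCall),
                new EndEventHandler(CallTimeout),
                null, true));
        }

        private IAsyncResult BeginCall(object sender, EventArgs e, AsyncCallback cb, object state)
        {
            // Old-style Begin method restored by WebReference_EnableLegacyEventingModel
            return proxy.BeginHelloWorld(cb, state);
        }

        private void EndCall(IAsyncResult ar)
        {
            Label1.Text = proxy.EndHelloWorld(ar);
        }

        private void CallTimeout(IAsyncResult ar)
        {
            Label1.Text = "Web service call timed out";
        }
    }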


Posted by Williarob on Monday, December 01, 2008 7:12 AM
Permalink | Comments (2) | Post RSSRSS comment feed

Process and Thread Basics

Programs, Processes and Threads

In .NET terms, a program can be defined as an assembly, or group of assemblies, that work together to accomplish a task. Assemblies are little more than a way of packaging instructions into maintainable elements and are generally compiled into a dynamic link library (DLL) or an executable (EXE), or a combination of the two.

A process gives a program a place to run, allowing access to memory and resources. Generally, each process runs relatively independent of other processes. In particular, the memory where your program variables will reside is completely separate from the memory used by other processes. Your email program cannot directly assign a new value to a variable in the web browser program. If your email program can communicate with your web browser—for instance, to have it open a web page from a link you received in email—it does so with some form of communication that takes much more time than a memory access.

Putting programs into processes and allowing only restricted, mutually agreed-upon communication between them has a number of advantages. One advantage is that an error in one process is less likely to interfere with other processes. Before multitasking operating systems, it was much more common for a single program to be able to crash the entire machine. Putting tasks into processes, and limiting interaction with other processes and the operating system, has greatly added to system stability.

All modern operating systems support the subdivision of processes into multiple threads of execution. Threads run independently, like processes, and no thread knows what other threads are running or where they are in the program unless they synchronize explicitly. The key difference between threads and processes is that the threads within a process share all the data of the process. Thus, a simple memory access can accomplish the task of setting a variable in another thread. Every program will have at least one thread.
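As a tiny illustration of that last point - threads sharing the memory of their process - here is a sketch of mine in which a worker thread sets a field and the main thread reads it directly; the names and the message are purely illustrative:

    using System;
    using System.Threading;

    class SharedMemoryDemo
    {
        // Both threads in this process can read and write this field directly
        static string message = "not set yet";

        static void Main()
        {
            Thread worker = new Thread(() => message = "set by the worker thread");
            worker.Start();
            worker.Join();   // wait for the worker so the write happens before we read

            // A plain memory read is all the "communication" required
            Console.WriteLine(message);
        }
    }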

In his book ".NET Multithreading", Alan Dennis compares a process to a house and a thread to a housecat. He writes:

The cat spends most of its time sleeping, but occasionally it wakes up and performs some action, such as eating. The house shares many characteristics with a process. It contains resources available to beings in it, such as a litter box. These resources are available to things within the house, but generally not to things outside the house. Things in the house are protected from things outside of the house. This level of isolation helps protect resources from misuse. One house can easily be differentiated from another by examining its address. Most important, houses contain things, such as furniture, litter boxes, and cats.

Cats perform actions. A cat interacts with elements in its environment, like the house it lives in. A housecat generally has a name. This helps identify it from other cats that might share the same household. It has access to some or the entire house depending on its owner’s permission. A thread’s access to elements may also be restricted based on permissions, in this case, the system’s security settings.

Multitasking

Multitasking means that more than one program can be active at a time. You may take it for granted that you can have an email program and a web browser running at the same time, yet not that long ago this was not the case. In the days of DOS you would need to save and close your spreadsheet before opening your word processor. With the advent of Windows, you could open multiple applications at once. Windows 3.x used something called cooperative multitasking, which is based on the assumption that all running processes will yield control to the operating system at frequent intervals. The problem with this model was that not all software developers followed these rules, and a program that did not return control to the system, or did so very infrequently, could destabilize the operating system, causing it to "lock up". When Windows 3.x started a new application, that application was invoked from the main thread. Windows passed control to the application with the understanding that control would quickly be returned to Windows. If that didn't happen, all other running applications, including the operating system, could no longer execute instructions. Today, Windows employs preemptive multitasking. In this model, instead of relying on programs to return control to the system at regular intervals, the OS simply takes it.

The main thread of a typical Windows program executes a loop (called a message pump). The loop checks a message queue to see if there is work to do, and if there is, it does the work. For example, when a user clicks a button in a Windows application, the click event adds work to the message queue indicating which method should be executed. This method is, of course, known as an event handler. While the loop is executing an event handler, it cannot process additional messages. Multithreading (literally, using more than one thread) is how we work around this limitation: instead of having the main thread that was assigned to the program do the time-consuming work, we assign the work to a separate thread and have it do the work.

There are a number of ways to create and manage these new threads - using System.Threading directly, creating a delegate method, implementing the event-based asynchronous pattern, using wait handles, etc. - and I intend to explore all of them in future articles; a minimal example of the first option follows below. On a single-core processor, execution on a separate thread is periodically interrupted by the operating system to allow other threads a chance to get work done; but after decades in a world where most computers had only one central processing unit (CPU), we are now in a world where only "old" computers have one CPU. Multi-core processors are now the norm. Therefore, every software developer needs to Think Parallel.
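To make the idea concrete, here is a minimal console sketch of the simplest of those options - handing a long-running job to a new System.Threading.Thread so the "main" thread stays free to respond. The method names and the simulated work are purely illustrative:

    using System;
    using System.Threading;

    class Program
    {
        static void Main()
        {
            // Hand the slow work to a background thread so the main thread stays responsive
            Thread worker = new Thread(DoSlowWork);
            worker.IsBackground = true;
            worker.Start();

            // Meanwhile the main thread can keep processing its own "messages"
            while (worker.IsAlive)
            {
                Console.WriteLine("Main thread is still responsive...");
                Thread.Sleep(250);
            }

            Console.WriteLine("Worker finished; main thread exiting.");
        }

        static void DoSlowWork()
        {
            // Simulate a time-consuming task
            Thread.Sleep(1000);
            Console.WriteLine("Slow work complete.");
        }
    }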

Multithreading

Multithreading allows a process to overlap I/O and computation: one thread can execute while another thread is waiting for an I/O operation to complete. Multithreading also makes a GUI (graphical user interface) more responsive. The thread that handles GUI events, such as mouse clicks and button presses, can create additional threads to perform long-running tasks in response to those events, which leaves the event-handler thread free to respond to more GUI events. Finally, multithreading can speed up performance through parallelism: a program that makes full use of two processors may run in close to half the time, although this level of speedup usually cannot be obtained, due to the communication overhead required to coordinate the threads.


Posted by Williarob on Saturday, June 28, 2008 9:00 PM
Permalink | Comments (0) | Post RSSRSS comment feed

Asynchronous DataSet

If you haven't yet read the article "Scalable Apps with Asynchronous Programming in ASP.NET" by Jeff Prosise or seen his presentation at TechEd, then you should certainly take the time to do so; however, I'll summarize the key points briefly here. Basically, there is a finite number of threads available to ASP.NET for request handling, and by making database calls the way that most textbooks and articles recommend, many of the threads that should be handling requests are actually tapping their feet waiting for your database request to complete before they can serve your page and return to the thread pool. When all the threads are busy, incoming requests are queued, and if that queue becomes too long then users start to see HTTP 503 Service Unavailable errors. In other words, with synchronous I/O requests there is a glass ceiling to scalability in ASP.NET.

To make optimum use of the thread pool, all I/O requests that you know could take a second or more to process should be made asynchronously, and the links above will give you plenty of examples of how to do this. The purpose of this article is to demonstrate how you can return a DataSet asynchronously, which is not something I could find an example of anywhere. Another thing I could not find in my research was how to use asynchronous database calls when you have a data layer, as opposed to a page or user control that contacts the database directly, and I will provide you with both here.

If you have looked at examples of asynchronous database calls elsewhere on the web before arriving here, you have probably become familiar with the BeginExecuteReader and EndExecuteReader methods in the SqlClient namespace, but where is the BeginExecuteDataSet method? If you need a DataSet, why should that have to be a synchronous request? I could not find any asynchronous methods on the DataAdapter, and while I did find ways to call a WebMethod or WebService that returns a DataSet asynchronously, why should you have to break the methods that return DataSets out into web services? I also found some examples of how to create delegates or use System.Threading.Thread.Start to create DataSets on their own thread, but according to Mr. Prosise these are worthless in ASP.NET, because both of these approaches actually steal threads from the same thread pool ASP.NET is using anyway! So by using System.Threading.Thread.Start to create your DataSet, all you are doing is returning a thread to the thread pool and immediately grabbing another one. So how can you do it?

For this example I borrowed the Datalayer from the Job Site Starter Kit and added some Asynchronous methods to it. Here is my BeginExecuteDataSet method:

       Public Function BeginExecuteDataSet(ByVal callback As System.AsyncCallback, _
          ByVal stateObject As Object, ByVal behavior As CommandBehavior) As System.IAsyncResult   
             Dim res As IAsyncResult = Nothing
             Me.Open()
             res = cmd.BeginExecuteReader(callback, stateObject, behavior)
             Return res
       End Function

But how is that different from BeginExecuteReader? It isn't - it is exactly the same. I don't see the need to rewrite the DataAdapter class from scratch to support asynchronous functions when I can simply use a DataReader to populate a DataSet. The key differences are in the EndExecuteDataSet method:

       Public Function EndExecuteDataSet(ByVal asyncResult As System.IAsyncResult) As DataSet   
             Dim ds As Dataset = Nothing
             Dim rdr As SqlClient.SqlDataReader = cmd.EndExecuteReader(asyncResult)
             Dim dt As DataTable = New DataTable()
             dt.Load(rdr)
             ds = New DataSet
             ds.Tables.Add(dt)
             Return ds
       End Function

Calling these methods is therefore no different from calling BeginExecuteReader. For example, the following code would work from either an .aspx page or an .ascx control.


    Dim db As Classes.Data.DAL
    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        'Append async attribute to connection string
        db = New Classes.Data.DAL(String.Concat _
        (ConfigurationManager.ConnectionStrings("MyDB").ConnectionString, ";async=true"))
        db.CommandText = "asyncTest"
        ' Launch data request asynchronously using page async task  
        Page.RegisterAsyncTask(New PageAsyncTask(New BeginEventHandler(AddressOf BeginGetData), _
        New EndEventHandler(AddressOf EndGetData), New EndEventHandler(AddressOf GetDataTimeout), _
        Nothing,True))
    End Sub
    Function BeginGetData(ByVal sender As Object, ByVal e As EventArgs, _
        ByVal cb As AsyncCallback, ByVal state As Object) As IAsyncResult
       Return db.BeginExecuteDataSet(cb, state, CommandBehavior.CloseConnection) ' supply the behavior the data layer signature expects
    End Function
    Sub EndGetData(ByVal ar As IAsyncResult)
        Try
           gv1.DataSource = db.EndExecuteDataSet(ar)
           gv1.DataBind()
        Catch ex As Exception
           lblMsg.Text = ex.ToString()
        Finally
           db.Dispose()
        End Try
    End Sub
    Sub GetDataTimeout(ByVal ar As IAsyncResult)
        db.Dispose()
        lblMsg.Text = "Async connection timed out"
    End Sub

There you have it - a DataSet returned asynchronously from a Data Access Layer. Download the entire Data Access Layer (the zip file contains both Visual Basic and C# versions - 3.05 kb).


Posted by Williarob on Wednesday, December 19, 2007 8:07 AM
Permalink | Comments (0) | Post RSSRSS comment feed