Recently I had a project that required transferring large files via a service call to a remote server, where each file would be ingested by an always-on application for further processing. When thinking about the requirements for such a service, a few things came to mind:
- The transfer had to be streaming, as attempting to buffer requests for files in excess of 100MB would be taxing on the host server.
- To maximize throughput, requests needed to be released as quickly as possible so that new requests could be processed.
- The service should be as simple and quick to build as possible.
After a bit of research on the matter, I hit the jackpot with the
HttpTaskAsyncHandler class and the recently introduced
GetBufferlessInputStream() method, and I soon got to work mocking up the functionality.
HttpTaskAsyncHandler is an implementation of the
IHttpAsyncHandler interface introduced in .NET 4.0, but with the added benefit of the async and await features introduced in .NET 4.5, which offer a much easier syntax for asynchronous processing than the Begin/End async pattern of the older interface. The only method that needs to be implemented is
ProcessRequestAsync(HttpContext context). We can then use the await keyword to signal to the handler that the work will be longer running. Let’s look at the code:
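The original listing isn't reproduced here, so what follows is a minimal sketch of such a handler. The class name FileUploadHandler, the target file path, and the TransferFileAsync helper are illustrative assumptions on my part, not names from the original code:

```csharp
using System.IO;
using System.Threading.Tasks;
using System.Web;

// A minimal sketch of the async handler; names and paths are illustrative.
public class FileUploadHandler : HttpTaskAsyncHandler
{
    public override async Task ProcessRequestAsync(HttpContext context)
    {
        // The await below tells the handler this is longer-running work;
        // the compiler rewrites it into the Begin/End pattern for us.
        await TransferFileAsync(context);
        context.Response.StatusCode = 200;
    }

    private static async Task TransferFileAsync(HttpContext context)
    {
        // Acquire the request stream without waiting for the full body.
        using (var reader = new StreamReader(context.Request.GetBufferlessInputStream()))
        using (var writer = new StreamWriter(@"C:\uploads\incoming.dat"))
        {
            // Reading the whole body at once is for demonstration only;
            // a production handler would copy in chunks.
            string content = await reader.ReadToEndAsync();
            await writer.WriteAsync(content);
        }
    }
}
```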
When the compiler sees the await keyword, it rewrites the method into the Begin/End async pattern for us. Let’s take a look at the implementation of
ProcessRequestAsync(). During execution, when
await TransferFileAsync(context) is reached, each await that calls into the ReadAsync or WriteAsync methods suspends execution of the TransferFileAsync method and returns the thread to the pool to process other requests. When the async operation completes, .NET requisitions another thread from the pool to resume the method right where it left off. This ensures that the threads handling requests remain available to process the maximum number of requests while the worker threads handle the disk I/O. Even better, if your current execution environment has a SynchronizationContext, it is captured and used to marshal the callback back to the same thread (in case you call await from the UI thread of a Windows Forms app, for example).
The handler can get wired up in the web.config as follows:
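The original configuration isn't preserved here; a registration along these lines would do it for IIS integrated mode (the handler name, path, verb, and type are hypothetical and should match your project):

```xml
<!-- Hypothetical names; adjust the path and type to your project. -->
<system.webServer>
  <handlers>
    <add name="FileUploadHandler"
         verb="POST"
         path="upload"
         type="MyApp.FileUploadHandler, MyApp" />
  </handlers>
</system.webServer>
```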
Normally, when accessing the Request.InputStream property of the
HttpContext object, ASP.NET will not make the stream available until the entire message body has been received. For a large message (such as an extremely large file), that translates into the memory use of the process inflating like a balloon to the size of the file being transferred, then being released once processing has taken place. You can imagine that if the service handles multiple large file requests concurrently, the machine can quickly run out of memory to process further requests. In contrast, the
Request.GetBufferlessInputStream() method allows access to the stream immediately, as the request body starts to flow in. This gives full control over processing of the input stream as well as a smaller memory footprint. Let’s take a look at the part of our handler example that will process the incoming stream:
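Again, the original snippet isn't preserved, so this is a sketch of that part of the handler; the method name and output path are assumptions:

```csharp
private static async Task TransferFileAsync(HttpContext context)
{
    // GetBufferlessInputStream() hands us the stream as the body arrives,
    // instead of waiting for ASP.NET to buffer the entire request.
    using (var reader = new StreamReader(context.Request.GetBufferlessInputStream()))
    using (var writer = new StreamWriter(@"C:\uploads\incoming.dat"))
    {
        // ReadToEndAsync() is for demonstration only (see the caveat below).
        string content = await reader.ReadToEndAsync();
        await writer.WriteAsync(content);
    }
}
```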
In the example above, we wrap the stream returned by
GetBufferlessInputStream() in a
StreamReader so that we can control how much data we want to read and write. Here I’m calling the ReadToEndAsync() method of the reader. Keep in mind that it’s horrible practice to read an entire file into memory; I’m simply trying to demonstrate the benefits of the async and await features of the .NET 4.5 framework.
There’s one last setting that needs to be taken care of to ensure that ASP.NET will allow access to the incoming stream as soon as possible. In the config file for our handler, we add the following:
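The original config fragment isn't preserved here. One setting that controls how much of the request entity body IIS reads before handing the request over to the handler is uploadReadAheadSize; keeping it small is a plausible candidate for this step, though this is my assumption rather than something confirmed by the original post:

```xml
<!-- Assumption: limit how much of the body IIS preloads before
     invoking the handler. The serverRuntime section may need to be
     unlocked at the server level. -->
<system.webServer>
  <serverRuntime uploadReadAheadSize="0" />
</system.webServer>
```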
It may also be beneficial to override the ASP.NET maximum request size limits:
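Both ASP.NET and IIS request filtering impose size caps, so both typically need raising. The values below are examples (roughly 1 GB); note that maxRequestLength is in kilobytes while maxAllowedContentLength is in bytes:

```xml
<!-- Example values only: ~1 GB. -->
<system.web>
  <httpRuntime maxRequestLength="1048576" />
</system.web>
<system.webServer>
  <security>
    <requestFiltering>
      <requestLimits maxAllowedContentLength="1073741824" />
    </requestFiltering>
  </security>
</system.webServer>
```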
That’s all there is to it. Happy streaming!