Wow, it's been over 9 months since the last blog post. Time to get back on the wagon!
I recently ran into an undocumented timeout limit for services hosted in Azure App Service, and I struggled to figure out whether I was doing something stupid or whether the issue was the hosting platform.
There is an undocumented hard timeout limit of ~4 minutes on HTTP(S) calls to APIs hosted in Azure App Service. There are no plans to address this, so if you have long-running API calls you will need to implement an alternative mechanism (polling, webhook push, etc.), which is no bad thing assuming you have access to the API code.
500 - Request Timed Out
I was recently involved in a project that called an existing API which was now hosted in Azure App Service. The API triggered a long-running process, which meant the API calls were blocked for a number of minutes (yes, this is bad, but it does happen!). During local development there wasn't any indication that the long-running nature of this API call would be a problem, as we had access to the API hosted locally and had already tweaked the IIS timeouts.
As soon as we deployed to App Service (S1 plan) and began testing, we kept seeing 500 - Request Timed Out after approximately 230 seconds. After checking all the possible timeout values, including some of the more obscure ones, I was stumped.
Finally, I reached out to some of the Azure App Service engineering team, who quickly told me it was a limit of App Service and that there are no plans to lift it.
So what can you do?
Unfortunately, there isn't a simple workaround if you are working with legacy(ish) code as I was. I had to implement a second API endpoint which was asynchronous and supported polling to check for progress (webhooks would probably be a better option if your upstream stack can support them). Most of you reading this will say that having an API that runs for minutes is shocking, and I would agree that it's not ideal, but we often have to deal with older/legacy components which need to be surfaced via an API. Still, I will reiterate that being forced to work in an async manner is no bad thing.
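To make the polling idea a bit more concrete, here is a minimal Python sketch of the general pattern. It is not the code from the project: the names (start_job, get_status, wait_for_job, slow_process) are made up for illustration, the job store is just an in-memory dictionary, and the HTTP layer is left out entirely. A real service would probably want a durable queue or job store rather than a background thread inside the web process.

```python
import threading
import time
import uuid

# Hypothetical in-memory job store; a real service would likely use durable storage.
_jobs = {}
_jobs_lock = threading.Lock()


def start_job(work, *args):
    """Start `work` on a background thread and return a job id straight away.

    An HTTP endpoint wrapping this could reply 202 Accepted with the job id
    instead of holding the request open for minutes.
    """
    job_id = str(uuid.uuid4())
    with _jobs_lock:
        _jobs[job_id] = {"status": "running", "result": None, "error": None}

    def runner():
        try:
            update = {"status": "succeeded", "result": work(*args), "error": None}
        except Exception as exc:
            update = {"status": "failed", "result": None, "error": str(exc)}
        with _jobs_lock:
            _jobs[job_id] = update

    threading.Thread(target=runner, daemon=True).start()
    return job_id


def get_status(job_id):
    """Return a snapshot of the job state (what a status endpoint would serve)."""
    with _jobs_lock:
        job = _jobs.get(job_id)
        return dict(job) if job is not None else None


def wait_for_job(job_id, poll_interval=0.05, timeout=5.0):
    """Client-side polling loop: check until the job finishes or we give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status is None:
            raise KeyError(job_id)
        if status["status"] != "running":
            return status
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} still running after {timeout}s")


def slow_process(seconds):
    # Stand-in for the legacy long-running operation.
    time.sleep(seconds)
    return "done"


if __name__ == "__main__":
    job = start_job(slow_process, 0.2)
    print(wait_for_job(job))
```

The point is simply that every individual request (start, then each status check) returns quickly, so none of them gets anywhere near the ~230 second limit, however long the underlying work takes.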
Running into undocumented "features" of a platform is thankfully much rarer than in the dark old days, but I thought it might be useful to highlight this for anyone else who might run into the problem.