Troubleshooting Sitecore on Azure PaaS

I am currently working on a Sitecore (9.0.2) implementation on Azure PaaS. Sitecore gains a lot from running on Azure PaaS: environments can be set up quickly, resources are easier to maintain (on your own or with the help of a Cloud Solution Provider, CSP), and scaling and monitoring come with the platform.

Troubleshooting Sitecore on Azure can get complicated, and it differs from troubleshooting on your own infrastructure. Here are a few ways to troubleshoot your Sitecore implementation on Azure PaaS, based on my recent experience.

Instance not coming up

After setting up the initial single-instance setup on Azure PaaS for our Dev environment, we had to restart the Web App from time to time, either manually when there was an issue or automatically as part of the deployment process from Octopus. After a restart, the Web App showed as Running in the Azure portal, as shown below:

[Screenshot: Azure portal (Home - Microsoft Azure) showing the Web App as Running]

But when I accessed the Sitecore instance, I got a 503 Service Unavailable error.

[Screenshot: 503 Service Unavailable error page]

Even after restarting the Web App multiple times, the result was the same: the Azure portal showed it as Running, but the Sitecore instance did not come up.
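Since Running in the portal did not mean the site was answering requests, it helps to poll the site itself after a restart. Here is a minimal Python sketch of that idea. It is not part of our actual setup: the function names, the attempt count, and the delay are assumptions, and the HTTP call is passed in as a callable so the retry logic can be tried without a real site.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code (e.g. 503) is what we want.
        return err.code
    except urllib.error.URLError:
        # No HTTP response at all (DNS failure, refused connection, timeout).
        return 0


def wait_until_up(fetch_status: Callable[[], int], attempts: int = 10,
                  delay_seconds: float = 30.0) -> bool:
    """Poll until fetch_status() returns 200 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        status = fetch_status()
        print(f"attempt {attempt}: HTTP {status}")
        if status == 200:
            return True
        if attempt < attempts:
            time.sleep(delay_seconds)
    return False
```

You would call it with your own site address, for example `wait_until_up(lambda: http_status(site_url))`, where `site_url` is a placeholder for your instance. If it keeps reporting 503 while the portal says Running, the problem is somewhere behind the Web App.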

Solution:

The first suspect in such a case is the Web App itself, which might be running high on CPU or memory usage. But when I checked, CPU and memory usage were normal, so the issue was not there.

The next thing to check is the database. On Azure PaaS, the database turns out to be the real issue much of the time. Sitecore creates many databases, and we were not sure which one could be the problem.

I restarted the Web App, accessed the environment, and at the same time started checking the master, web, and core databases one by one. When you select a database in the Azure portal, you see a resource utilization graph that includes the DTU percentage metric.

What is DTU?

A Database Transaction Unit (DTU) is one of the two units in which Azure SQL capacity is sized (the other is vCores). It is a blended measure of CPU, memory, data I/O, and transaction log I/O.

See Microsoft's Azure SQL documentation for more information on DTUs.

Back to our problem and its solution.

When I checked the databases one by one, I found that the Core database was using 100% of its DTUs while the instance was starting up.

[Screenshot: DTU percentage graph for the Core database in the Azure portal]

Because the database was hitting its limit, the necessary data I/O operations could not complete.

The solution is to increase the DTUs for the Core database to at least 50 to 100 and then adjust from there. We raised the DTU count to 100 for the Core database only, and after that our Web App came up without any issue.
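The check I did by eye on the portal graphs can be sketched in Python. This is an illustration only: the sample readings, the 95% threshold, and the minimum number of high samples are made up, and how you export the DTU percentage metric is left out.

```python
from typing import Dict, List


def saturated_databases(dtu_percent: Dict[str, List[float]],
                        threshold: float = 95.0,
                        min_samples: int = 3) -> List[str]:
    """Return databases with at least min_samples readings at or above threshold."""
    flagged = []
    for name, samples in dtu_percent.items():
        high = sum(1 for value in samples if value >= threshold)
        if high >= min_samples:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical DTU percentage readings taken while the instance starts up.
startup_samples = {
    "core": [40.0, 100.0, 100.0, 100.0, 100.0],
    "master": [10.0, 35.0, 42.0, 30.0, 12.0],
    "web": [5.0, 20.0, 18.0, 9.0, 4.0],
}

print(saturated_databases(startup_samples))  # prints ['core']
```

Requiring several high readings rather than one avoids flagging a database for a single short spike, which is normal during startup.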

Package installation takes forever

We install the TDS packages generated by our solution using Hedgehog's Sitecore Package Deployer from Octopus as part of our continuous deployment process. When we ran this step for a new instance we were setting up, it took more than 8 hours, which is far too long given that the same step completes in about 20 minutes in our other environments.

When this step started on the new instance, it kept installing for a long time and sometimes made the instance unavailable. We decided to create a memory dump to see what exactly was blocking or causing the issue.

Generating a memory dump is much easier on Azure PaaS. Visit https://kb.sitecore.net/articles/111669 to learn how to generate memory dumps. The simplest way is to use Diagnostics as a Service (DaaS).

Solution:

After generating a full dump, we found that many threads were trying to write logs to a CSV file in storage at the same time, which made all the other threads wait for the write to complete. It never did.

I will cover how Application Insights fits in with Sitecore on Azure PaaS in another post. We then checked the log level of our newly configured Web App, which was set as shown below:

[Screenshot: Web App Logs settings (Logs - Microsoft Azure) before the change]

So, the log level was Information. When I checked the log files, there were many entries for the installation of a single item. That is what was slowing the process down, and it also pushed memory utilization to 100%.
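To get a feel for how much of a log is Information-level noise, you can count entries per level. The Python sketch below rests on assumptions: it expects each entry to contain a level token such as INFO, WARN, or ERROR surrounded by whitespace, which may not match your log layout, and the sample lines are invented for the example.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed layout: the level appears as a standalone token on each line.
LEVEL_PATTERN = re.compile(r"\s(DEBUG|INFO|WARN|ERROR|FATAL)\s")


def count_levels(lines: Iterable[str]) -> Counter:
    """Count log lines by the first level token found on each line."""
    counts = Counter()
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines, not taken from a real log.
sample = [
    "1234 10:00:01 INFO  Installing item /sitecore/content/Home",
    "1234 10:00:01 INFO  Item saved",
    "1234 10:00:02 WARN  Slow operation",
    "1234 10:00:03 ERROR Failed to write log entry",
]

for level, count in count_levels(sample).most_common():
    print(level, count)
```

If INFO dominates the counts during a package installation, raising the log level to Error should cut most of the write volume.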

We changed the log levels as shown in the screenshot below:

[Screenshot: Web App Logs settings (Logs - Microsoft Azure) after the change]

We disabled file system application logging and changed the log level for Blob storage application logging from Information to Error.

After making these changes, we started the deployment again, and the whole process completed in 30 minutes, whereas package installation alone had previously taken more than 8 hours.

So, if you experience similar issues with Sitecore on Azure PaaS, it is worth checking the steps above. They might solve your issue too and save you troubleshooting time.

Follow the troubleshooting process in the infographic below for your Sitecore instance on Azure PaaS.

[Infographic: troubleshooting process for Sitecore on Azure PaaS]

Keep building Sitecore on Azure PaaS.
