Monday, May 16, 2022

Large AsyncOperationBase increase in Dataverse/Dynamics 365 CE: the canary in the coalmine

In normal circumstances, you probably rarely look at the System Job views in your  Dynamics 365 environment. But if you see a sudden and unexpected (large) increase in the storage used by the AsyncOperationBase table or AsyncOperation in File storage is like a canary in the coalmine - something is wrong in your Dynamics365/Dataverse environment.


System jobs (or Async Operations) are tasks that Dynamics performs in the background that don't have to be completed straight away. When such an asynchronous task is being setup, Dynamics creates a row in the AsyncOperationBase table and puts in all the details about when to run, what function/plugin/cloud flow to call, parameters to pass etc... 

All of this information used to be stored in columns of the AsyncOperationBase table but it this can get really large (KB or MBs) depending on the actual task being scheduled.  Part of this payload has now been moved out of the database and into a file stored in Azure Blob Storage and the database row now has a reference to the blob file.  You can follow up on the numbers in the capacity details for an instance in the Power Platform Admin Center (PPAC) - where you will now also see AsyncOperation in the file section after this architectural change from Microsoft. (Also see Storage capacity management for Dynamics 365/Dataverse - how to track storage evolution with Power BI and New Microsoft Dataverse storage capacity (Microsoft docs))




Dynamics 365 has a recurring system job to delete successfully completed system jobs  - see screenshot below - when an asyncoperation record is cleared from the database, the linked file storage is also cleared up.


One of the reasons you might see a large increase in storage consumed is because of a large amount of failed or cancelled system jobs.  The system jobs status (internal name statecode, possible values = ready, suspended, locked and completed)  and status reason ( internal name statuscode, possible values for completed jobs = succeeded, cancelled and failed)  can be used to identify possible issues.

I really like the SQL 4 CDS XrmToolBox add-in to troubleshoot issues with system jobs - it allows you to write simple SQL queries to identify system jobs causing issues. 

I also use  SQL4CDS for further analysis when troubleshooting async plugins using SQL queries e.g. select top 100 * from plugintracelog where typename like 'JOPX.CRM.Plugins.Notification%' but there are also other tools in XrmToolBox which you can use



No comments: