I'm running SQL Profiler on a busy Able store that was updated to R6 last week. I'm seeing massive SQL queries to ac_CustomUrls, even ac_Stores.
Isn't that all being cached? The store can't even stay up long enough to take an order, let alone update a CustomUrl or the Store object itself. Why are those objects constantly being queried from SQL??
The w3wp.exe process is pegging 100% constantly. I don't even know where to begin to troubleshoot this.
We updated the url cache setting from 1,000 to 4,000, didn't help.
We need the SQL schema changes for R6 ASAP so we can unwind this upgrade.
R6 is caching actually happening?
R6 is caching actually happening?
Joe Payne
AbleCommerce Custom Programming and Modules http://www.AbleMods.com/
AbleCommerce Hosting http://www.AbleModsHosting.com/
Precise Fishing and Hunting Time Tables http://www.Solunar.com
AbleCommerce Custom Programming and Modules http://www.AbleMods.com/
AbleCommerce Hosting http://www.AbleModsHosting.com/
Precise Fishing and Hunting Time Tables http://www.Solunar.com
Re: R6 is caching actually happening?
We'll look into this right away. I've alerted the dev team already. They should be in early AM.
Thank you for choosing AbleCommerce!
http://help.ablecommerce.com - product support
http://wiki.ablecommerce.com - developer support
http://help.ablecommerce.com - product support
http://wiki.ablecommerce.com - developer support
Re: R6 is caching actually happening?
Ok we've got the site running again.
Seems like there is an thread contention issue when a large number of tax rules exist in the store. Like 180+ rules. At some point a massive nHibernate query gets built and exceeds the 2,100 parameter limit on SQL queries.
This was causing CPU on the IIS process to peg to nearly 100% and the site to become virtually unusable.
We cleared all the tax tables like ac_taxrules, ac_taxcodes etc. Re-added the washington state tax provider, re-applied taxcodeid to all the products and everything is back to normal.
I need a drink.
Seems like there is an thread contention issue when a large number of tax rules exist in the store. Like 180+ rules. At some point a massive nHibernate query gets built and exceeds the 2,100 parameter limit on SQL queries.
This was causing CPU on the IIS process to peg to nearly 100% and the site to become virtually unusable.
We cleared all the tax tables like ac_taxrules, ac_taxcodes etc. Re-added the washington state tax provider, re-applied taxcodeid to all the products and everything is back to normal.
I need a drink.
Joe Payne
AbleCommerce Custom Programming and Modules http://www.AbleMods.com/
AbleCommerce Hosting http://www.AbleModsHosting.com/
Precise Fishing and Hunting Time Tables http://www.Solunar.com
AbleCommerce Custom Programming and Modules http://www.AbleMods.com/
AbleCommerce Hosting http://www.AbleModsHosting.com/
Precise Fishing and Hunting Time Tables http://www.Solunar.com
Re: R6 is caching actually happening?
Can you please provide some more details? Actully stack trace to be exact. I am looking into codes and there are two places which I suspect could be possible culprits. In one case either we query tax rules for 2100 tax codes and in second we maybe querying against 2100 ship zones. Do you think that any of these two can be the problem in your case?
Re: R6 is caching actually happening?
Thanks Katie and Mazhar for the quick responses, here's what I do know:
Client's IT provider upgraded the R4 install to R6 a week ago. Initially the site was much improved. App pool memory usage was down significantly. Originally 800mb to 1gb of RAM now taking only 250Mb. But worker process utilization was screaming, constantly at about 60-70% CPU utilization on a dual-cpu quad-core dedicated server. Even with 300+ orders a day, that's too high for the equipment specs they are on.
As they rolled through last weekend, site stability degraded. Starting this past Monday, the site started hanging and worker process was pegging 90-100% cpu. Maybe 1-2 times a day the site would crash entirely. By Tuesday it increased to 3-4 times a day. By yesterday, the site couldn't run for more than 10-15 minutes before pages simply stopped responding.
You could stop the website, stop the app pool and restart IIS. The IIS process would still keep running. A new IIS process would spawn, but the old one continued for a period of time (say 3-5 minutes).
Seeing the old process continue to run after the app pool stopped led me to suspect a thread lock or contention issue of some kind. I loaded up DebugDiag 2.0 and started watching .Net threading performance counters using performance monitor. DebugDiag gave me some indicators that garbage collection was running like crazy. About that same time, we noticed the /app_data/app.log recording massive queries from nhibernate against the tax tables.
At some point earlier in the day, the client IT provider had deleted some regions from the system thinking they somehow were causing the massive queries.
Since it's a Washington State business, they are using a custom written washington state tax provider. This tax provider (indirectly) caches the tax responses from the state by creating an Able tax rule after each tax lookup. This would (in theory) provide faster tax responses by finding a local table entry and eliminate having to call the state provider (again). This custom provider has been working fine all the way through R4.
We first suspected user maintenance being the original cause, so we cleared out baskets and basketitems just to be safe. We immediately noticed the basket items table was accumulating thousands of line items shortly after being cleaned out. A quick check revealed these line items were all tax line items. It was almost as if the system was now creating a tax line item on every basket for every anonymous hit, regardless if the basket had actual items in it.
So seeing this was possibly a tax calculator issue, we turned off sales tax calculation in Able. Worker process immediately dropped to normal CPU levels, thread contentions per second immediately dropped as well. Suddenly, the storm was over. The site was snappy and functioning normally.
So, now we know it's a data issue with taxes, an issue with the custom provider, or both. At this point, we wiped the slate clean. We disabled the custom state tax provider and turned on the able state tax provider. We cleaned out all records from all the tax tables, created a new 'Taxable' tax code, created 1 base rule for Washington State and updated all products with the new taxcodeid. Originally there were 180+ tax rules, now there is only 1.
The app continued to work well. Performance continued to remain normal. After 24 hours, we have seen very good performance and no stability issues. So it was definitely tax rule related.
The gigantic nHibernate query suggests the TaxRuleHelper.GetPotentialTaxRules() has the potential to create a massive query depending on the number of tax rules defined. I'll leave it to you guys to figure out if that's actually the case.
I'm hesitant to publish full dumps and log messages here in a public forum for security reasons, but I would be happy to provide them privately. Just PM me how you want them delivered. it's about 1.2gb of dump data in total. I'm not entirely sure i have stack traces, but I have a bunch of stuff from DebugDiag. I use Dropbox for large file transfers, I can send you a folder invite if you'd like.
Client's IT provider upgraded the R4 install to R6 a week ago. Initially the site was much improved. App pool memory usage was down significantly. Originally 800mb to 1gb of RAM now taking only 250Mb. But worker process utilization was screaming, constantly at about 60-70% CPU utilization on a dual-cpu quad-core dedicated server. Even with 300+ orders a day, that's too high for the equipment specs they are on.
As they rolled through last weekend, site stability degraded. Starting this past Monday, the site started hanging and worker process was pegging 90-100% cpu. Maybe 1-2 times a day the site would crash entirely. By Tuesday it increased to 3-4 times a day. By yesterday, the site couldn't run for more than 10-15 minutes before pages simply stopped responding.
You could stop the website, stop the app pool and restart IIS. The IIS process would still keep running. A new IIS process would spawn, but the old one continued for a period of time (say 3-5 minutes).
Seeing the old process continue to run after the app pool stopped led me to suspect a thread lock or contention issue of some kind. I loaded up DebugDiag 2.0 and started watching .Net threading performance counters using performance monitor. DebugDiag gave me some indicators that garbage collection was running like crazy. About that same time, we noticed the /app_data/app.log recording massive queries from nhibernate against the tax tables.
At some point earlier in the day, the client IT provider had deleted some regions from the system thinking they somehow were causing the massive queries.
Since it's a Washington State business, they are using a custom written washington state tax provider. This tax provider (indirectly) caches the tax responses from the state by creating an Able tax rule after each tax lookup. This would (in theory) provide faster tax responses by finding a local table entry and eliminate having to call the state provider (again). This custom provider has been working fine all the way through R4.
We first suspected user maintenance being the original cause, so we cleared out baskets and basketitems just to be safe. We immediately noticed the basket items table was accumulating thousands of line items shortly after being cleaned out. A quick check revealed these line items were all tax line items. It was almost as if the system was now creating a tax line item on every basket for every anonymous hit, regardless if the basket had actual items in it.
So seeing this was possibly a tax calculator issue, we turned off sales tax calculation in Able. Worker process immediately dropped to normal CPU levels, thread contentions per second immediately dropped as well. Suddenly, the storm was over. The site was snappy and functioning normally.
So, now we know it's a data issue with taxes, an issue with the custom provider, or both. At this point, we wiped the slate clean. We disabled the custom state tax provider and turned on the able state tax provider. We cleaned out all records from all the tax tables, created a new 'Taxable' tax code, created 1 base rule for Washington State and updated all products with the new taxcodeid. Originally there were 180+ tax rules, now there is only 1.
The app continued to work well. Performance continued to remain normal. After 24 hours, we have seen very good performance and no stability issues. So it was definitely tax rule related.
The gigantic nHibernate query suggests the TaxRuleHelper.GetPotentialTaxRules() has the potential to create a massive query depending on the number of tax rules defined. I'll leave it to you guys to figure out if that's actually the case.
I'm hesitant to publish full dumps and log messages here in a public forum for security reasons, but I would be happy to provide them privately. Just PM me how you want them delivered. it's about 1.2gb of dump data in total. I'm not entirely sure i have stack traces, but I have a bunch of stuff from DebugDiag. I use Dropbox for large file transfers, I can send you a folder invite if you'd like.
Joe Payne
AbleCommerce Custom Programming and Modules http://www.AbleMods.com/
AbleCommerce Hosting http://www.AbleModsHosting.com/
Precise Fishing and Hunting Time Tables http://www.Solunar.com
AbleCommerce Custom Programming and Modules http://www.AbleMods.com/
AbleCommerce Hosting http://www.AbleModsHosting.com/
Precise Fishing and Hunting Time Tables http://www.Solunar.com