I have a question about the reliability of the SharePoint actions. Last Friday (24-08), requests timed out after hanging for 35 minutes. Hanging for 35 minutes and then erroring out can’t happen, as this is for a larger client. As this solution is going to be presented to the client, I would first like to understand the reliability of the action services. Another forum post said that improvements were going to be made, but I cannot find any place for announcements, or whether this kind of thing would be announced at all.
Firstly, I want to apologize for any inconvenience related to the recent service upgrade.
I understand your concern related to the stability of our service.
But I want to assure you that the recent upgrade helped to improve it.
Currently, all requests go into a queue, and the queue is processed by a few of our servers.
We can add additional servers to process the jobs, based on the workload.
Additionally, we are constantly monitoring key metrics and tuning settings based on the workload.
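To illustrate the queue model described above, here is a minimal sketch in Python. It is only an illustration and not the actual service implementation: the function name `process_jobs`, the `worker_count` parameter, and the doubling "work" are all hypothetical. It shows requests being placed into one shared queue and drained by a configurable number of workers, so adding workers increases throughput without changing how requests are submitted.

```python
import queue
import threading


def process_jobs(jobs, worker_count):
    """Put all jobs into one shared queue and drain it with worker_count threads.

    Hypothetical sketch: each "job" is a number and the "work" is doubling it.
    """
    job_queue = queue.Queue()
    for job in jobs:
        job_queue.put(job)

    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                # Take the next waiting job; stop when the queue is empty.
                job = job_queue.get_nowait()
            except queue.Empty:
                return
            outcome = job * 2  # stand-in for the real action
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Same jobs, more workers: completion order may differ, the set of results does not.
    print(sorted(process_jobs([1, 2, 3, 4], worker_count=2)))
```

In this sketch, raising `worker_count` corresponds to adding servers: the queue and the submitted jobs stay the same, and only the number of consumers changes.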
Hi, we have also had, and continue to have (we just opened another ticket today), these stability issues since July, and the situation doesn’t seem to be improving after the mentioned service upgrades. We have deployed PlumSail actions into production and have reported this issue every other month. The only response is that servers are added as needed and/or that a service upgrade will fix the problem, but none of these has improved the service.
- How can you constantly monitor the service metrics yet not add servers or resolve these issues until they are reported?
- Isn’t your service a cloud SaaS that can automatically add servers as needed when these kinds of spikes occur?
Please be aware that your service is used in different time zones, and as customers we can’t wait until it is a business day in your offices to get a response. This is extremely disruptive.
Please accept our apologies for the incident. The issue was fixed yesterday, and everything should now work as expected. We’re currently investigating the root cause of the issue, and I’ll post an update later today.
We investigated the issue and here are the steps we are going to implement to improve the situation:
- Improve monitoring of action execution
- Introduce additional fallback techniques and servers to improve stability
We are going to complete the first steps in this process next week and will keep working on it over the next two weeks.