Archive for the ‘Monitoring’ Category

Dear all,

While you can plan and organize availability at the farm level by configuring Network Load Balancing, SQL Server database mirroring and failover clustering, you can also plan for high availability at the service application level, especially for critical service applications like PerformancePoint. To do this, you need a failover database server alongside your normal SQL Server.

Higher availability for PerformancePoint Services can be achieved by adding a failover database server to the PerformancePoint Service Application.

 

In Central Administration, select the row of the PerformancePoint Service Application (don’t click the name, otherwise you’ll open the “Manage” page), and choose “Properties” from the ribbon, as shown in Fig. 1.

 

Fig. 1: Open the properties window for the PerformancePoint Service Application

 

Next, in the “Failover Database Server” field, enter the name of your failover database server.

Fig. 2: Specify the database failover server

Then click “OK” to confirm.

 

Before fully relying on this mechanism, it makes sense to do a test run to verify that the failover actually works.

 

Caution: Do not execute the following steps in a production environment unless you know what you are doing. Before testing the settings, always notify users, schedule downtime and involve a SQL administrator. Before applying any changes in production, test them in a comparable test environment. The following steps assume that you have a working failover cluster for SQL Server. If you don’t, DO NOT PROCEED at this point. If you have only a single instance of SQL Server (tested or untested), or a failover cluster that is untested or known not to work, DO NOT EXECUTE the following steps. Otherwise you might lose data and functionality of your SharePoint environment.

 

1. Open a PPS dashboard.

2. Then shut down the principal SQL Server instance (again, caution: do not do this in a production environment or without experience; you might temporarily lose some or all SharePoint functionality). This is the machine that is running Configuration Manager.

3. The failover server should now take over all the workload of the principal server.

4. Refresh the dashboard. It should work fine.

5. Try to modify it to check whether write operations also work against the PPS failover database.

6. Make sure to start the SQL Server instance again and verify that everything is up and running.

 

The PowerShell approach:

The same result can also be achieved using the following PowerShell cmdlet:

 

Set-SPPerformancePointServiceApplication -Identity <Identity> -DatabaseFailoverServer <Servername>

 

This is very straightforward and increases the availability of your PerformancePoint Services.

Upshot: If you have a running SQL Server failover cluster, you can achieve high availability for your Business Intelligence data as well, with little effort. Not bad, is it?

 

Stay tuned, and till next time!

Martin (still in Calabria)


Dear all,

While reviewing the Microsoft Operations Framework (MOF) and its checklists for day-to-day health monitoring of SharePoint environments, I found some interesting information about memory leak detection.

 

Memory leaks are certainly among the worst things that can happen to a SharePoint administrator, since they can cause intermittent downtime and cryptic error messages.

 

As with almost any troubleshooting approach for SharePoint, you should go to the ULS logs and search for specific error messages. These are the two most common indications of memory leaks:

 

1. “An SPRequest object was not disposed before the end of this thread. To avoid wasting system resources, dispose of this object or its parent (such as a SPSite or SPWeb) as soon as you are done using it. This object will now be disposed”

Careful: Watch out for large numbers of this error message. Individual messages can be “false positives”. Luckily, the ULS logs also provide object counts, so you can easily see whether this occurs often or is only a single event.

 

2. No specific error message, but:

Symptom: Intermittent application pool recycles.

Result: Downtime.

Problem: Hard to reproduce and debug for the administrator.

Explanation: Objects that are not properly disposed prevent the garbage collector from reclaiming the memory they use. As a result, the memory consumption of the application pool can grow considerably, until a protection mechanism triggers an application pool recycle.
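To judge whether the SPRequest message from item 1 is a one-off or a pattern, it helps to count how often it appears and in which process. The following is only a minimal sketch: it assumes the ULS log lines are tab-separated with the process name in the second column (verify this against your own logs), and the function name is purely illustrative.

```python
from collections import Counter
from typing import Iterable

# Leading text of the ULS message quoted in item 1 above.
SPREQUEST_MARKER = "An SPRequest object was not disposed before the end of this thread"


def count_undisposed_by_process(lines: Iterable[str]) -> Counter:
    """Count SPRequest warnings per process.

    Assumption: tab-separated ULS lines whose second column is the process
    name. Matching lines with fewer columns are counted under "unknown".
    """
    counts: Counter = Counter()
    for line in lines:
        if SPREQUEST_MARKER not in line:
            continue
        columns = line.rstrip("\n").split("\t")
        process = columns[1].strip() if len(columns) > 1 else "unknown"
        counts[process] += 1
    return counts
```

Pass it any iterable of log lines, for example an open ULS log file. A handful of hits may well be false positives, while a steadily growing count for one worker process is worth a closer look.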

 

Solution for both problems: Dispose of objects properly. Microsoft provides a guide on how to do this here:

 

Best Practices: Using Disposable Windows SharePoint Services Objects

 

There is also a very handy tool that detects possible memory leaks in custom SharePoint solutions that do not comply with Microsoft’s best practices. You can find a description and the download here:

 

SharePoint Dispose Checker Tool

 

Happy (and safe) coding, and stay tuned till next time!

 

Best regards,

Martin


1. Close Distance Rule:

Keep Web Frontends (WFEs), Application Servers (ASs) and Database Servers physically as close together as possible.

Rule of thumb: No more than 1 ms of latency between WFEs/ASs and DB servers. In practice, this means: WFEs/ASs should reside in the same data center as DB servers.
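A rough way to sanity-check the 1 ms rule from a WFE or application server is to time how long it takes to open a TCP connection to the database server. This is only a sketch under assumptions: connection setup time is a proxy that tends to overestimate pure network latency, and both function names are illustrative.

```python
import socket
import statistics
import time


def median_connect_latency_ms(host: str, port: int, samples: int = 5) -> float:
    """Median time in milliseconds to open a TCP connection to host:port."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection resolves the name and completes the TCP handshake,
        # which costs roughly one network round trip.
        with socket.create_connection((host, port), timeout=2.0):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def within_close_distance_rule(latency_ms: float, limit_ms: float = 1.0) -> bool:
    """True if the measured latency respects the 1 ms rule of thumb."""
    return latency_ms <= limit_ms
```

For example, `within_close_distance_rule(median_connect_latency_ms("sqlserver01", 1433))` would check a default SQL Server instance on a hypothetical host named sqlserver01. Use the port your instance actually listens on.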

 

 

2. Co-Location/Separation of Databases

Certain databases should be co-located, while others should ideally be separated from one another.

 

Rule of thumb: Separate the following databases:

 

 

 

[Image: list of databases to separate]

Source: Microsoft TechNet

 

Rule of thumb: Co-locate the following databases:

 

 

[Image: list of databases to co-locate]

Source: Microsoft TechNet

 

3. Constantly Monitor Database Servers

Size: Rules of thumb:

    Pre-grow databases and logs.

    Monitor disk space at all times.

    < 50 databases per SQL Server instance (when mirroring)

    < 200 GB per content database

 

 

Metrics: Rules of thumb:

    Network Queue: 0 or 1 for best performance

    Average Disk Queue length (latency): < 5 ms

    Memory used: < 70%

    Free disk space: > 25%

    Buffer cache hit ratio: >= 90%
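The metric thresholds above lend themselves to a simple automated check. The sketch below assumes the values have already been collected by whatever monitoring you use. The dictionary keys are hypothetical names chosen for this example, not performance counter names.

```python
from typing import Callable, Dict, List

# One predicate per rule of thumb listed above; keys are illustrative names.
RULES: Dict[str, Callable[[float], bool]] = {
    "network_queue": lambda v: v <= 1,                 # 0 or 1 for best performance
    "disk_latency_ms": lambda v: v < 5,                # < 5 ms
    "memory_used_pct": lambda v: v < 70,               # < 70%
    "free_disk_space_pct": lambda v: v > 25,           # > 25%
    "buffer_cache_hit_ratio_pct": lambda v: v >= 90,   # >= 90%
}


def violated_rules(metrics: Dict[str, float]) -> List[str]:
    """Return the names of supplied metrics that break a rule of thumb.

    Metrics that are missing from the input are skipped, not reported.
    """
    return [
        name
        for name, is_ok in RULES.items()
        if name in metrics and not is_ok(metrics[name])
    ]
```

An empty result means all supplied values are within the rules of thumb. Anything else names the metrics that deserve attention on that database server.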

 

4. Transaction Logs Backup

Rule of thumb: Back up and truncate the transaction logs every 5 minutes. Shrinking the transaction logs is not recommended, since it has a performance impact while the log re-grows.

 

 

To prevent the transaction logs from growing unexpectedly, see the following KB article: http://support.microsoft.com/kb/873235

 

That’s it for this time. Have fun configuring your DB servers for your SP2010 environment. As usual, no responsibility is taken for any damage that could occur. And as usual, I highly recommend trying all changes in a test environment first and adapting them to your specific IT infrastructure.

 

Enjoy and best regards,

Martin
