SP 2010 Capacity Planning
One of the most important parts of planning, managing and governing a SharePoint 2010 environment is understanding the boundaries and thresholds that affect performance and maintenance. Capacity Planning is extremely important when architecting a new SharePoint 2010 environment. A lot of times, people designing SharePoint environments or creating SharePoint solutions completely forget that Capacity Planning is directly tied to use cases and business scenarios. Knowing where your limits are and how you plan to grow will affect both your physical and logical topology.
Microsoft has actually put out a ton of information that you can use for SharePoint 2010 Capacity Planning.
In this blog I will touch on:
- Web Site and Site Collection Capacity Planning
- Content Database Capacity Planning
- List and Library Capacity Planning
- Search Capacity Planning
- Security Capacity Planning
- User Profile Service and Social Networking Capacity Planning
- Web Analytics Capacity Planning
- Workflow Capacity Planning
- Excel Services Capacity Planning
- PerformancePoint Service Capacity Planning
- InfoPath Services Capacity Planning
- Visio Services Capacity Planning
- Word Automation Services Capacity Planning
- Office Web Application Services Capacity Planning
- Access Services Capacity Planning
The following are the resources that I used for this analysis:
- SharePoint 2010 Capacity Planning (Boundaries and Limits) Whitepaper (View or Download). This is a really good whitepaper, which should be used as a baseline for architecting your SharePoint 2010 environments. I know there will be creative solutions over the long run that will allow you to exceed these limits; however, this is the best place to start. I actually enjoyed reading through this.
- SharePoint Server 2010 performance and capacity test results and recommendations (View or Download). This is a set of several detailed whitepapers that go into the nuts and bolts of how SharePoint 2010 was tested and provide tons of very detailed recommendations.
All the information in this blog comes from these whitepapers; these are basically my notes.
Web Application Limits
- Try to limit each web application to 300 content databases. This is an important one, because a well-managed SharePoint environment will not have one big content database; it will instead have many, to facilitate backup and recovery procedures. If you plan to have lots of content databases, the recommendation is to use PowerShell to manage the web application instead of the management interface.
- Zones have a hard limit of five (default, intranet, extranet, internet and custom); nothing new there.
- It is recommended that you do not exceed 10 managed paths per web application without testing.
Thoughts: Not much new here other than the most important one everyone should know – do not have one big massive content database for your web application.
Web Server and Application Pools
- It is recommended to have no more than 10 Application Pools per web server. This is mostly driven by the amount of RAM and usage by the users.
Content Database Limits
- Content databases should not exceed 200 GB of data. SharePoint 2010 does scale up to 1 TB, but that is only recommended for large, single-site repositories.
- Time to access the first byte of data from Remote BLOB Storage (RBS) cannot exceed 20 milliseconds.
Thoughts: I can say this over and over, but you need to plan to keep your content database sizes down and not let them exceed the recommended limits. RBS is good for performance, but you still need to think about not letting your content databases store TBs of data. That is a business requirements issue.
Site Collection Limits
- It is recommended to not exceed 250,000 sites per Site Collection. Basically the deal is that there is performance overhead when both creating and deleting sites. If you have a highly dynamic SharePoint web site, where this many sites are created and deleted on a regular basis, you will experience performance issues across the board.
- Site Collections should not exceed 100 GB. Again, the issue is to make backup and restore procedures run quickly.
Thoughts: If this becomes an issue you probably have too many people with rights to create sites. Another way I can see this happening is if you create some sort of automated site provisioning process that just creates too many sites. Your SharePoint Topology should plan for not exceeding these limits.
SQL Server Column Limits
- As mentioned before, row wrapping needs to be limited to 6 rows. SQL Server's threshold is 64 columns per row, so you should not exceed 384 columns in your SharePoint list or library definition.
- As mentioned before, the amount of data stored for a SharePoint item per row cannot be more than 8,000 bytes.
- The whitepaper goes into the specific size limits by SharePoint column type.
Thoughts: I had never really thought about this in the past. It will become a more important factor in your SharePoint design given the amount that can now be stored in SharePoint.
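To make the row-wrapping math concrete, here is a quick sanity-check sketch in Python. These are my own illustrative helpers built from the numbers above (64 columns per SQL row, 6 wrapped rows max), not anything from the SharePoint API:

```python
import math

# SQL Server stores up to 64 SharePoint columns per physical row, and
# SharePoint caps row wrapping at 6 rows: hence the 384-column limit.
COLUMNS_PER_ROW = 64
MAX_WRAPPED_ROWS = 6

def rows_needed(column_count):
    """Estimate how many physical SQL rows one list item will occupy."""
    return math.ceil(column_count / COLUMNS_PER_ROW)

def within_limit(column_count):
    """True if the list definition stays within the 6-row wrapping cap."""
    return rows_needed(column_count) <= MAX_WRAPPED_ROWS

print(rows_needed(80))    # 2: one extra row of wrapping
print(within_limit(384))  # True: exactly at the 6-row ceiling
print(within_limit(400))  # False: over the 384-column limit
```

Run this against your widest list definitions before you deploy; anything needing more than 1 or 2 extra rows is a design smell per the guidance later in this post.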
Web Page Limits
- It is not recommended to exceed 25 web parts per SharePoint web page.
Thoughts: If you have that many on your screen, you are trying to display too much or you have made your web parts way too granular.
List and Library Limits
Understanding SharePoint list and library capacity is extremely important, especially with the new improvements to SharePoint 2010.
Here are some specifics:
- For each row in a SharePoint List or Document Library List Item you have a hard maximum of 8,000 bytes. What this basically means is the amount of data stored in the columns of a list item cannot total more than 8,000 bytes.
- The maximum File Size is 2GB. The default is 50 MB.
- It is recommended to not exceed 30,000,000 documents per document library.
- It is recommended to not exceed 30,000,000 items per SharePoint list.
- A list or document library item should not require more than 6 table rows in the content database. What this basically means is that if your list or library definition has lots of columns, SharePoint will break the storage of that data across rows. There is a fixed number of columns in a SQL table, and if there are more list columns than SQL table columns, the data for the list item will be stored across multiple rows (referred to as row wrapping). So this will only become an issue for you if you have an extreme number of columns in your SharePoint list definitions.
- The SharePoint list user interface only allows 100 items to be processed in bulk at any one time.
- SharePoint List queries have a maximum of 8 joins allowed per query. If there are more than 8 joins in the query, it will be blocked.
- The list view threshold by default is 5,000 items. Going beyond this, you will experience poorly performing queries, and overall throughput will decrease. The list view threshold for auditors/administrators is 20,000 items.
- It is not recommended to exceed 2,000 subsites for each website. Performance will degrade on several of the out of the box pages, controls and web parts when 2,000 subsites are exceeded in a single website.
- It is recommended that no more than 10 users simultaneously edit the same document. If there are more than 10, you will experience performance issues when committing the document to SharePoint. The hard boundary is 99 users.
- When exporting a datasheet view to Excel, there is a hard limit of 50,000 items that can be exported.
- A SharePoint Workspace has a hard limit of 30,000 items for synchronization.
Here are some other recommendations that you should be aware of:
- A basic way of calculating the potential size of a list is the following. First, estimate the average size of a document to be stored. Multiply that by the number of versions of that document. Then add 20% for the associated data stored in the content database. It is also recommended to add in an appropriate buffer.
- When designing how content will be stored in lists, you need to consider whether to store data in a single list, across multiple lists, or even across lists in different site collections. There is a detailed discussion of this in the "Single list, multiple lists, or multiple site collections" section in DesigningLargeListsMaximizingListPerformance.docx.
- Even though lists and libraries can store large amounts of content, we still need to get it back out. Microsoft recommends that when retrieving data from large lists using list views or CAML queries, the data be partitioned across folders, indexes or both. If not, the best and most efficient way to retrieve items is through search.
- Lots of permissions applied to a library will degrade performance. It is worse if you are applying item-level permissions to large numbers of documents. There is a configuration that limits a list to 50,000 unique permissions; however, it is recommended to lower this to 5,000.
- I mentioned row wrapping earlier and why it is used. For lists with large amounts of data, it is recommended to not exceed 1 or 2 additional rows of wrapping.
- Lookup columns can again be another source for performance issues. Lookup columns in a list view will result in a join being applied to the query which increases the complexity. By default each list view will support up to 8 lookup columns. It is strongly recommended to not exceed this number as it will cause significant throughput decrease for queries in a view and use lots of SQL resources.
- List indexes have little effect on list query performance but can slow insert and update operations if there are lots of them. Up to 20 indexes can be applied per list. It is recommended to only use indexes when they are needed.
- There are three basic ways to access list data: list views, content query web parts and search. List views access data directly from SQL Server, which incurs slower query performance and higher load on the SQL Server. They also generate the most HTML, which can slow down page rendering. However, list views provide the best user experience when it comes to managing data in a list. Content query web parts display a configured view of data that is cached in the portal site map provider. They generate the least amount of HTML and are cached, so overall performance will be better. Search web parts offer the best performance, as search is optimized for querying large amounts of data. However, the data returned by search is only as recent as the last crawl.
- Microsoft has identified several types of lists, and when designing your SharePoint site, it may be good to classify your lists in this manner so you can plan and govern better over the long run.
- There are unstructured document libraries, which are basically team libraries with tens to hundreds of documents. They are highly utilized with expensive operations, but the volume is low. It is important to make sure these libraries stay under 5,000 items.
- There are collaborative large lists, which can have thousands of documents that can be edited, as in a knowledge management solution. Typically these lists grow significantly faster than anticipated and require lots of administration.
- There are structured large repositories which may have thousands to hundreds of thousands of documents. Typically these are departmental archives with automated processes to publish the documentation in a controlled manner. Almost all of the interactions with the list are read only.
- Finally, there are large-scale archives, which contain millions of documents spread across multiple lists. Typically there is a low volume of reads and updates, and the purpose is long-term storage for compliance requirements.
- Content Organizers are a new feature that will route content to configured libraries, folders or sites. Users can submit data and not be concerned about the rules for storing content. It can support evenly storing data within a library to better manage performance.
- Metadata Navigation is a new feature that allows users to find content in a tree based on the metadata associated with the document. This feature will use the most efficient query possible to return items instead of allowing the user to freely search and filter for content.
- Throttling of SharePoint lists is also new. This can be configured at the web application level so that inefficient operations by a user will not affect performance of the entire farm.
- Another new feature is compound indexes so indexes can be built across multiple columns instead of just one.
- The developer dashboard is another new feature that can help when understanding potential performance issues with a page. When it is turned on, tons of statistics are available, including database queries and load times.
- Allow object model override is a new configuration that allows the object model to override the list view threshold. This allows developers to create web parts that query for content without adhering to the list view threshold, which is a good thing.
- Daily time window is another new configuration that, when turned on, allows queries that do not adhere to the list view threshold to run during a specified window.
- You need to be aware that if the list will grow very large and exceed thresholds, several operations will be blocked, prevented or limited. For the details read the "Differences between large lists and regular lists" section in DesigningLargeListsMaximizingListPerformance.docx.
- Microsoft testing of extremely large document libraries (millions of documents) generally concluded that most of the bottlenecks would occur at the SQL Server level. To get around this bottleneck you should split content across multiple instances of SQL Server.
Thoughts: I am a broken record on this topic. Yes, SharePoint 2010 resolved the challenges of storing mass amounts of data in SharePoint lists. However, I still believe large quantities of list data, especially "highly relational data", should be stored in a relational database and then exposed to SharePoint through the appropriate means.
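The list-sizing rule of thumb from the bullets above (average document size, times versions, plus roughly 20% content database overhead, plus a buffer) can be sketched like this. The 10% buffer default here is my own illustrative assumption, not a number from the whitepaper:

```python
def estimate_list_size_gb(avg_doc_mb, doc_count, versions_per_doc,
                          metadata_overhead=0.20, buffer=0.10):
    """Rough library sizing: document bytes times versions, plus ~20%
    content-database overhead, plus a safety buffer (10% is my own
    assumed default, not from the whitepaper)."""
    raw_mb = avg_doc_mb * doc_count * versions_per_doc
    total_mb = raw_mb * (1 + metadata_overhead) * (1 + buffer)
    return total_mb / 1024

# e.g. 10,000 documents averaging 2 MB with 5 versions each
print(round(estimate_list_size_gb(2, 10_000, 5), 1))  # roughly 128.9 GB
```

A number like this feeds directly into the earlier content database guidance: at nearly 130 GB, this one library alone would blow past the 100 GB site collection recommendation.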
Security Limits
- It is recommended that a user not belong to more than 5,000 SharePoint groups. There are no severe performance penalties if you go over; it is just a recommendation. If you do go over, the user's token is bigger, it takes longer to cache access control lists (ACLs) for search, and the amount of time to check security increases.
- It is recommended to not exceed 2,000,000 users per site collection. Again, you can exceed this number; however, there are long-term management issues with that many users, and it is recommended that you use PowerShell.
- It is recommended to not exceed 5,000 users or external groups (AD groups) per SharePoint Group. The more users, the longer it takes to validate permissions or render a screen.
- It is not recommended to exceed more than 10,000 SharePoint Groups per Site Collection.
- It is not recommended to exceed 5,000 permission items in an access control list (ACL).
Thoughts: This is important to keep in mind for very large organizations and when planning to pull in data from external directory services (which may not always be Active Directory).
SharePoint Search Limits (not FAST)
When it comes to search, this is just such a big topic. I highly recommend reading all of "Estimate performance and capacity requirements for SharePoint Server 2010 Search" (SearchforSPServer2010CapacityPlanningDoc.docx). Understanding the search architecture is critical for sizing and capacity management. I am going to go over some high level concepts to get you started:
- First, when planning, you need to understand what data needs to be searchable and how much of it there is. You need to know how available the data must be, how fresh the data must be kept, and how many people will be searching for data simultaneously.
- For SharePoint 2010 there is a search life-cycle that you need to understand. First there is Index Acquisition, where full crawls of data are performed; this depends on the size of, and access to, the content being crawled. Next there is Index Maintenance, which is incremental crawls of all content. Finally there is Index Cleanup, which occurs when content sources change or are removed from the crawler.
- Next there is the query service, which is responsible for querying for data. It is important to know both the query latency and query throughput. You want to decrease latency, which is the amount of time it takes for a query to return results. You want to increase throughput, which is the number of queries that can be served at a time.
- There are really just so many strategies for configuring your SharePoint logical and physical topology based on your requirements. Here are some simple things to think about. First, never run both the query and crawl services on the same machine. Second, if you have the hardware, it is also recommended to run your query service on a separate machine, which will decrease latency and increase throughput. Third, it is good to have multiple search servers building the indexes, so there is redundancy in case of a failure or if you need to dedicate resources to crawling specific content sources.
- There are some calculations you can do to estimate the amount of space you will need for the indexes that will be built. Again, read the whitepaper for further details.
Some general thresholds and boundaries are:
- It is recommended to not have more than 20 SharePoint search service applications deployed to a single farm.
- For each SharePoint search service, do not exceed 10 crawl databases, which store crawl data. The optimal configuration is 4 crawl databases per search service. Each crawl database should not exceed 25 million items.
- It is recommended to not exceed 16 total crawl components per service application.
- It is recommended to not exceed 20 index partitions per search service. The limit is 128 index partitions. Partitioning an index allows smaller indexes to be created, which can be searched faster; however, having too many can have the opposite effect.
- It is recommended to not exceed 10 million items per search index. The overall recommended limit for all indexes in a search service is 100 million items.
- It is recommended to not exceed 100 million crawl log items per search service.
- For each search service, it is not recommended to exceed 10 property databases. The property database contains metadata for items in each index partition. There is a hard limit of 128 property databases.
- There is a hard limit of 128 query components per search application.
- For search scopes, you should not exceed 100 scope rules per scope. As well you should not exceed 600 scope rules per search service. Exceeding these thresholds will affect performance. As well, you should not exceed more than 200 scopes per web site, which again can affect performance.
- It is recommended to not exceed 25 display groups per site. Display groups are used for grouped display of scopes through the user interface. Exceeding this number will affect performance.
- For alerts, there is a limit of 1 million per search application.
- For search content sources, it is recommended not to exceed 50 content sources; the top threshold is 500.
- There is a threshold for running 20 concurrent crawls within the same search service; exceeding this will cause performance issues.
- During crawling, each search application will support up to 500,000 properties.
- Recommended not to exceed 100 crawl impact rules per farm.
- Recommended not to exceed 100 crawl rules per search service. This is because the administrative screen performance will degrade.
- There is a threshold of 100,000 managed properties per search service. Managed properties are used in search queries. Basically, the crawled properties are mapped to the managed properties.
- It is not recommended to exceed 100 mappings per managed property. Exceeding this will degrade crawl speed and query performance.
- Each search service will support 100 URL removals per operation.
- It is recommended to have only 1 top level authoritative page and minimize second and third level pages.
- It is recommended not to exceed 200 keywords per site collection but the maximum is 5,000 per site collection. The only real effect of exceeding the recommendation is performance of the administration page.
- There is a limit of 10,000 metadata properties that can be crawled per item crawled.
Thoughts: Microsoft has put a lot of emphasis into this. Even though FAST is very compelling, the out-of-the-box SharePoint search has been dramatically improved.
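A few of the search thresholds above lend themselves to a back-of-the-envelope check when sizing a farm. This is my own illustrative sketch using the numbers quoted in this section (25 million items per crawl database, 10 million per index partition, 100 million per search service), not an official sizing tool:

```python
import math

# Thresholds quoted above for a single SharePoint 2010 search service
ITEMS_PER_CRAWL_DB = 25_000_000
ITEMS_PER_INDEX_PARTITION = 10_000_000
MAX_ITEMS_PER_SERVICE = 100_000_000

def search_sizing(total_items):
    """Back-of-the-envelope crawl database / index partition counts."""
    if total_items > MAX_ITEMS_PER_SERVICE:
        raise ValueError("corpus exceeds the 100M-item service limit; "
                         "plan for multiple search service applications")
    return {
        "crawl_dbs": math.ceil(total_items / ITEMS_PER_CRAWL_DB),
        "index_partitions": math.ceil(total_items / ITEMS_PER_INDEX_PARTITION),
    }

print(search_sizing(60_000_000))
# {'crawl_dbs': 3, 'index_partitions': 6}
```

Obviously real sizing also depends on item size, freshness requirements and query load, which is why the full whitepaper is worth the read.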
User Profile Service and Social Networking limits
- The user profiles service supports up to 2 million user profiles with full social networking features. This basically means this is the limit of profiles that can be imported into the people profiles repository.
- There is a limit of 500 million total social tags, notes and ratings in the social database. Exceeding this runs the risk of significant performance issues.
- Microsoft's testing found a linear increase in throughput up to 8 WFEs; past that there was no improvement. Scaling can be further achieved by separating the content and services databases onto two separate database servers.
- Most of the bottlenecks that will be experienced are typically due to WFEs. They found no bottlenecks associated with any of the application servers when evaluating social networking features.
- It is expected that a separate web site with dedicated resources is created for My Sites and the new social networking features. When planning for performance, this is not just maintaining a user's profile information, connections, newsfeeds, etc. It is also storage of documents, collaboration, etc.
- The User Profile service does have caching. The user profile service's job is to take data from whatever location you are loading it from and save that data in the SharePoint profile database. Once the user profile cache is loaded, WFE requests for profile data are returned from the application server without requiring a call to the SQL database. I guess Microsoft's strategy is that profile information is not really dynamic data, and they want to keep transactions down.
- I am not going to go into the details of the Outlook Social Connector; however, there is very detailed information about its performance in the Social Computing Capacity Planning whitepaper.
Thoughts: Much of this is associated to the new social networking features of SharePoint 2010.
Business Connectivity Services limits
- There is a hard limit of 5,000 external content type definitions that can be loaded into memory at any given time.
- There is a limit of 500 external system connections that can be active or open at any given time. The default is set to 200, and it does not matter what type of connection is being used.
- The default is 2,000 database items that can be returned in a request when using the database connector. The boundary is 1 million.
- There are three variables that you must always consider: the number of items in the external list, the number of columns per item and the size of each item.
- Profile pages display data from external content type data. The performance of these pages is driven by the complexity of the associations to external systems.
- The internal process for bringing external data into SharePoint via BCS is pretty simple. There is load (queries external source and loads into SharePoint), process (applies sort, filter, group processing) and render (display data onto page). BCS does not have in-memory cache for external items. Data has to go through load, process and render each time an external list is refreshed. Knowing this you need to make sure you control the amount of data that is processed at any given time.
- Microsoft's recommendation is to keep the number of items to be processed as low as possible by reducing the number of items returned from external systems. It is recommended to keep the number of rows returned between 100 and 500, and to not exceed 2,000 rows. It is recommended to use filters to ensure you work within these guidelines. More can be returned if needed, but this needs to be done by an administrator.
- Rendering external lists can be intensive for both the WFE and the application server. It is recommended to keep the number of items being displayed at any time to around 30. Note that the number of items that are rendered is not the same as the number of items that were processed. The number of items rendered is controlled by the external list view that is on a SharePoint page.
- It is recommended to reduce the number of columns returned from the external list. Obviously, a large number of columns will affect performance.
- When rendering, it is recommended not to use large-sized columns in list views. Columns larger than 1 KB should not be used in a view. However, performance is affected more by the number of items than by their size, so always try to keep the number of items lower for better results.
- When designing an external list that uses BCS, make sure the default view is the view the user needs to see most. If the user needs to sort or filter the view, the data has to go back through the load, process and render pipeline.
- For a profile page, the number of associations is the key to good performance. It is recommended to not exceed two associations. Both throughput and latency will suffer when there are large numbers of items in an association.
- Diagnostic logging of BCS can become a factor in performance, and it is recommended to lower it when not testing.
- Performance of the external system has performance implications for BCS, and you need to make sure those systems perform well.
Thoughts: I personally think this is an area that has not been explored enough. Understand that BCS is another layer that the data must pass through. In Microsoft's detailed testing results for BCS, they only tested WCF web services using .NET data types and SQL Server 2008 databases. As we know, in an enterprise architecture not everything will be SQL Server or web-service based. Data resides in formats all over the organization. Performance will be driven by factors out of scope of this blog.
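The BCS throttling guidance above (2,000-row default throttle, 100-500 rows recommended per finder) can be turned into a simple design-time check. This is a hypothetical helper of my own for reviewing external content type designs, not part of the BCS API:

```python
# Default BCS database-connector throttle is 2,000 rows; the guidance
# above is to keep result sets in the 100-500 range via finder filters.
DEFAULT_THROTTLE = 2_000
RECOMMENDED_MAX = 500

def check_finder(expected_rows):
    """Classify a BCS finder's expected result size (hypothetical helper)."""
    if expected_rows > DEFAULT_THROTTLE:
        return "blocked: raise the throttle (admin) or add a filter"
    if expected_rows > RECOMMENDED_MAX:
        return "allowed, but add a filter to get under 500 rows"
    return "ok"

print(check_finder(5_000))  # blocked by the default throttle
print(check_finder(1_200))  # allowed but over the recommendation
print(check_finder(300))    # within the recommended range
```

Running expected row counts through something like this during design review is a cheap way to catch finders that will hit the throttle in production.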
Web Analytics limits
- There are three basic categories of analytics – traffic, search and inventory. Reports can be aggregated at the site, site collection, and web application level. The web analytics service mostly utilizes the application servers and SQL server, so that is where most of the capacity planning needs to be.
- At a high level, this is how the data is gathered. On each web server, usage data is gathered, and a usage timer job then calls the Logging Web Service to submit the data. The data is stored in a staging database for seven days. The web analytics components clean and process the submitted data, and every 24 hours the data is aggregated from the staging database and written to a reporting database. The aggregated data is stored for 25 months (by default).
- Performance of the logging web service is directly related to the number of application servers on the farm. However the performance of the web analytics components is dependent on the performance of the analytics and reporting databases.
- Other areas of performance challenges are number of queries run each day, number of unique users each day, total number of unique assets clicked each day and existing data size in the reporting database. Basically the point is, the more data and more interactive your SharePoint farm is, the more performance overhead there will be to process that data for reporting purposes.
Workflow limits
- Workflow throughput can be affected by numerous factors such as the number of users, the type of workflow, the complexity of the workflow, dependencies on external calls and the frequency with which users access the workflow. Note that workflows that use data from the content database or are registered for lots of events will run slower. So if you reference a large SharePoint list or call out to an external database, that will obviously take a little more time.
- Microsoft's testing found that throughput for workflows topped out at three to four WFEs.
- There are some administrative settings that can be set which will affect the performance of workflows.
- Postpone Threshold is the maximum number of workflows that can run concurrently against a single content database; subsequent ones are placed into a queue. The default is 15 workflows. It is important to understand this is the number of workflows actively executing at one time, not the total number of workflows that can be in progress.
- Batch Size is an exception to the Postpone Threshold limit and can pick up larger amounts of workflow items from the queue. The default batch size is 100 and this can be set per service instance.
- Another configuration is the Timer Job Frequency, which is how often the workflow timer service runs. This service is responsible for picking items up off the workflow queue at each interval. The default is every 5 minutes.
- I found this interesting: you should consider how all three of these configurations work together to affect workflow performance. Workflow throughput is impacted by how quickly operations get out of the queue and are processed. In the example I read, if there are 1,000 work items in the queue, it will take the timer job 10 runs to execute all of them (Batch Size), which would take close to 50 minutes (Timer Job Frequency). Increasing the Batch Size would reduce that time at the expense of taking up lots of processing resources.
- Another thing to make sure you do not do is use the same list for workflow history and task lists. Lots of items will be written to these lists as time goes on. Allowing these lists to get very large will affect performance.
- To keep the task lists from getting too large, there is a workflow job (autoclean) that will delete workflow instances and associated tasks that have been in a completed state for more than 60 days. It is recommended that if historical information is needed past that, the workflow write the data to another location. Workflow history items are not part of this autoclean job, and a script should be written to purge those items.
- Another maintenance trick I learned that can affect performance is removing workflow columns. If you have a list with, say, 50,000 items in it, removing the workflow status column will cause database operations proportional to the number of items in the list, which can flood your SharePoint environment with transactions. It is recommended to just turn off the workflow so no new instances can be created, and perhaps do the operation during off hours.
- Another performance consideration when building workflows is making sure you will not violate SharePoint's database locking by modifying the same data at the same time. You need to make sure users and/or other workflows do not try to access the same item at the same time.
Thoughts: I found the discussion on how to configure the workflow service very interesting because I had really never thought about that when I had built workflows in MOSS 2007. I can see how you may want to configure this very differently based on the site where the workflow is running. They had a creative example that was discussed where if you want to dedicate running workflows to a specific machine in the topology, you could try to lower the postpone threshold to force all workflows onto the queue so the batch job would pick them up on a specific machine. You would have to set the batch size to be high so that it picks up the work items before another machine. If you want to make sure the processing is shared across all the machines, you could try increasing the postpone threshold so items never sit around for a long time.
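The queue-drain arithmetic from the Batch Size / Timer Job Frequency example above is easy to reproduce. A small sketch of my own, using the default values quoted in this section:

```python
import math

def queue_drain_minutes(queued_items, batch_size=100, timer_minutes=5):
    """Roughly how long the workflow timer job takes to drain a queue,
    given the Batch Size and Timer Job Frequency settings discussed
    above (defaults: 100 items per batch, 5-minute interval)."""
    runs = math.ceil(queued_items / batch_size)
    # One batch is picked up per timer interval, so the queue drains
    # over runs * interval minutes.
    return runs * timer_minutes

print(queue_drain_minutes(1_000))  # 10 runs x 5 min = 50 minutes
print(queue_drain_minutes(1_000, batch_size=500))  # 2 runs = 10 minutes
```

This makes it obvious why tuning Batch Size (at the cost of processing resources) is the lever for a backlogged queue, exactly as the whitepaper example describes.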
Excel Services limits
- Excel Services performance is directly correlated to the size and complexity of the workbooks hosted within SharePoint.
- Excel Services is stateful, meaning that workbooks remain in memory between user interactions.
- Excel Services places the most load on the application servers and the WFEs. It will not heavily utilize SQL Server, because the workbook is read as a binary blob and used within the Excel Service. Microsoft's finding was that bottlenecks only occurred on the application or WFE servers, which can be addressed by adding CPU and memory or by adding new Excel Services machines to the topology.
- Within Central Admin there are several configurations which can be used to better manage performance.
- Maximum Private Bytes is the percentage of a machine's memory that Excel Services can use. If the machine is dedicated to Excel Services, you can raise this percentage. This was one of the main bottlenecks identified by Microsoft.
- Memory Cache Threshold is another available configuration. Excel Services caches unused objects in memory, and by default the cache can use 90% of the Maximum Private Bytes. Increasing this number improves the chance that a requested workbook is already in memory when a user accesses it; lowering it improves the performance of other services if the application server is not dedicated to Excel Services.
- Maximum Unused Object Age controls how long objects are kept in memory; again, you will want to lower this if other services are hosted on the application server.
- There are also settings that are not global to Excel Services but are specific to each trusted location where Excel Services is used. For instance, max workbook size, max chart/image size, session timeout, volatile function cache lifetime and allow external data can all be used to better manage performance.
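As a back-of-envelope sketch of how the two memory settings compose (illustrative arithmetic only; the server size and percentage are assumptions):

```python
def excel_cache_budget_gb(machine_ram_gb, max_private_bytes_pct,
                          cache_threshold_pct=90):
    """Maximum Private Bytes is a percentage of machine memory, and the
    Memory Cache Threshold is a percentage of that (90% by default)."""
    private_gb = machine_ram_gb * max_private_bytes_pct / 100
    return private_gb * cache_threshold_pct / 100

# A 16GB app server that grants Excel Services 50% of memory can cache
# up to 7.2GB of unused objects at the default threshold.
print(excel_cache_budget_gb(16, 50))  # -> 7.2
```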
Thoughts: For SharePoint 2010 Excel Services, I would say there is a lot of anticipation for a more feature-rich solution than what was available in SharePoint 2007. In my opinion it is very hard to come up with a one-size-fits-all assessment of Excel Services performance, because each hosted workbook can be dramatically different from the next.
InfoPath Service limits
- There are numerous factors that affect the throughput of InfoPath Services, some of which are: number of users, type/complexity/frequency of user operations, number of postbacks per operation, and performance of data connections. When building and deploying forms, this should always be accounted for.
- In the design of the form, try to optimize the first request to the form template. Try to exclude custom onLoad events, queries or business logic that run when the form is opened; make these operations as user-driven as possible. The goal is to delay the creation of session state data in the database until an actual post occurs. If the form only has one post, i.e. when it is submitted, no session state for the form will ever be created.
- If the InfoPath form needs to be saved to a document library, it is better to submit the form into the document library rather than saving it. The reason is a submit operation triggers only one post but a Save will trigger two posts.
- Document library forms can support more throughput than InfoPath list forms.
- Form complexity such as the number of controls, rules, encapsulated data, etc. dramatically affects performance. The more complex, the more pressure will be put on the WFEs to support greater throughput.
- It is recommended to minimize the number of controls displayed at any given time. If you do have a lot of controls that must be displayed, put them into a secondary view so that you reduce the first-request load discussed earlier. This is not just about reducing the amount of HTML generated; presentation rules can also slow performance down.
- Large amounts of complex XML can cause more performance issues. Be careful of using attachment controls as the binary of the attachment will actually be embedded into the XML of the InfoPath form, making it very large.
- Database locks prevent multiple users from making changes to the same set of data. Since an InfoPath form is stored as a single piece of data in SharePoint, a lock is placed on it while it is edited, so concurrent editing of the same form is not well supported.
- It is recommended to return filtered data directly to the InfoPath form instead of filtering it within SharePoint.
- When reviewing the results of Microsoft's tests, there was always more performance overhead when the form submits via web services.
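The postback guidance above can be summed up in a small sketch (my own accounting, not an InfoPath API): opening a form is a plain request, saving round-trips twice, and a submit-only form never accumulates the extra post that triggers session state.

```python
# Rough post counts per user operation (illustrative values only).
POSTS_PER_OPERATION = {"open": 0, "save": 2, "submit": 1}

def total_posts(operations):
    return sum(POSTS_PER_OPERATION[op] for op in operations)

def creates_session_state(operations):
    """Session state is persisted once a post occurs before the final
    submit; a form whose only post is the submit never creates it."""
    return total_posts(operations) > 1

print(creates_session_state(["open", "submit"]))          # -> False
print(creates_session_state(["open", "save", "submit"]))  # -> True
```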
Thoughts: From a performance perspective, not much has changed. Yes – InfoPath 2010 Forms Services will be faster than SharePoint 2007 – but the design concepts for creating forms that perform well have not really changed. The design of your forms drives performance, and you should not lean on adding more WFEs to solve your problems.
Visio Services limits
- There are three basic factors that affect performance: drawing size, complexity and data connectivity.
- Typically Visio Services is tough on the WFEs, which will require you to scale up or out.
- The default maximum file size of Visio web drawings is 50MB. This can be changed by the administrator; however, larger files mean more memory required for Visio Services, increased CPU usage, fewer service requests per second, increased latency, and increased SharePoint farm network load.
- There is a configuration called Visio Web drawing recalculation timeout, which sets the maximum amount of time the service will spend recalculating a drawing after a data refresh. The default is 120 seconds. Increasing the timeout increases CPU and memory usage, reduces application requests per second and increases latency across all documents.
- For Visio Services there is a Minimum cache age setting, used for data-connected diagrams, which ranges from 0 to 24 hours. Setting this to a lower value reduces throughput and increases latency because recalculations have to occur more often, though it also reduces the memory held by the cache.
- There is likewise a Maximum cache age for non-data-connected diagrams that determines how long to keep a diagram in memory. Increasing this value decreases latency for frequently used diagrams. However, it also increases latency and slows throughput for items that have not been cached, because the cached items consume available memory.
Thoughts: I found this section interesting to read because it gave me insight into how Visio Services actually works. I had heard that Visio Services can be data-driven. When reading this, you need to focus on caching as part of your design and understand the types of documents you plan to post.
PerformancePoint Service limits
- The factors that most affect throughput are number of users, type of user interaction, frequency of use, number of postbacks in an operation and data connection performance.
- When a scorecard uses an Excel Services data source, there is a limit of 1 million cells per query.
- In a dashboard using Excel as a data source, the maximum is 15 columns by 60,000 rows.
- If you have business logic or complex controls, the demand on the WFEs will increase; adding more WFEs will alleviate those issues.
Word Automation Services limits
- The input file size cannot exceed 512MB.
- The frequency of conversions is configured in minutes; a lower number lets the timer job run more often. The default is every 15 minutes, but the recommendation is every 1 minute.
- There is a threshold for the number of conversions to start per conversion process. If the value is set too high, it can cause intermittent failures.
- A conversion job can contain up to 100,000 items at one time. A large number of conversion items increases the time it takes to execute.
- The total number of conversion processes is directly related to the number of processor cores. You should not run more conversion processes than you have cores. The reason this matters is that Word Automation Services can at times fully utilize a processing core and cause issues if other services are hosted on the application server. There is a Word Automation Services property called Total Active Conversion Processes which can be used to throttle the service down so that it does not use too many processing cores per application server.
- The Word Automation Services database cannot exceed 2 million conversion items. Items are not deleted from the database automatically after processing, so administrators will have to write jobs to remove the history. Exceeding 2 million items can cause performance challenges the longer Word Automation Services runs.
- When scaling out Word Automation Services, it is recommended to add more application servers and potentially a new SQL Server with the Word Automation Services database dedicated to it. Another scaling option is to create dedicated Word Automation application servers.
- There is a fundamental performance limitation with converting PDF and XPS file types. Adding servers will not always solve the challenge.
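A sketch of the conversion-process sizing rule above (the one-core reserve for shared servers is my assumption, not official guidance):

```python
def active_conversion_processes(cores, shares_server):
    """Never run more Word Automation conversion processes than
    processor cores; leave one core free when the application server
    also hosts other services (assumption, not official guidance)."""
    cap = cores - 1 if shares_server else cores
    return max(1, cap)

print(active_conversion_processes(8, shares_server=False))  # -> 8
print(active_conversion_processes(8, shares_server=True))   # -> 7
```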
Office Web Application Services limits
- This includes the Word Web App, PowerPoint Web App and OneNote Web App. Each of these must be looked at from several different perspectives; specifically, you should distinguish viewing from editing, because the resources required are quite different.
- The drivers for performance are the expected number of concurrent users and the types of operations they will perform. Microsoft's initial recommendations: 100 daily users with an average of 10 concurrent can be supported by 1 WFE and 1 app server; 1,000 daily and 30 concurrent require 2 WFEs and 2 app servers; 10,000 daily and 300 concurrent require 4 WFEs and 3 app servers.
- Heavy Word Web App viewing typically requires more CPU on the application servers.
- Heavy Word Web App editing and OneNote Web App viewing/editing require more CPU on the WFEs.
- Heavy PowerPoint Web App viewing requires more CPU on the app servers and more memory on the WFEs.
- Heavy PowerPoint Web App editing requires more memory on the application servers.
- PowerPoint Broadcast, if you did not know, is a feature that enables presenters to broadcast a slide show from Microsoft PowerPoint 2010 to remote viewers over the web through SharePoint. It is positioned as a low-infrastructure presentation capability. Heavy PowerPoint Broadcast use will require more CPU on the WFEs. If you use the PowerPoint Broadcast feature extremely heavily, it is recommended to create a dedicated environment to support it, because viewers' web browsers hit the server every second to get the latest changes to the slide deck.
- Many of the limits associated with the OneNote Web App are directly tied to the limits for lists and libraries. This is because each OneNote section is stored as folders and documents in a library.
- The maximum size of a OneNote section is likewise driven by the file size limits for lists and libraries.
- If there are embedded images, files, etc. in OneNote, that are greater than 100KB, they will be split out into their own binary files within the SharePoint library.
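The sizing tiers quoted above can be captured as a simple lookup for a first-cut topology (only the three quoted tiers; anything bigger needs its own load testing):

```python
def owa_starting_topology(daily_users):
    """Returns (web front ends, application servers) for the Office Web
    Apps sizing tiers quoted above."""
    if daily_users <= 100:
        return (1, 1)
    if daily_users <= 1000:
        return (2, 2)
    if daily_users <= 10000:
        return (4, 3)
    raise ValueError("beyond the tested tiers - load test and scale out")

print(owa_starting_topology(1000))  # -> (2, 2)
```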
Access Services limits
- The performance of Access Services depends on the other applications hosted alongside the service. It is recommended to dedicate servers to Access Services if there is a lot of data to be managed.
- The amount of data in the tables and the size of the queries have the biggest impact on Access Services performance. It is recommended to limit the size and complexity of queries as well as control the amount of data that flows through. This can be done in Central Admin. First, the query can be configured by controlling the max sources per query and max records per table. Second, results can be configured via max columns per query, max rows per query and whether outer joins are allowed. Third, processing complexity can be configured via max calculated columns per query and max order-by clauses per query.
- There is another configuration called Allow Non-Remotable Queries. The way Access Services works under the hood is that it augments SharePoint query operations to support Access Services features, such as working with large amounts of data. If there are performance challenges, you can change this configuration so that Access Services uses only the standard SharePoint query operations – making data fetches less complex (though at the same time not as robust).
- Access Services is stateful. It maintains in-memory cursors and recordsets between user interactions. Microsoft's testing did not find this to be a bottleneck.
- There are no special hard requirements for WFE or application servers to support Access Services. It is recommended to increase capacity by scaling up your existing servers or scale out by adding more servers into the SharePoint topology.
- To support more users, it is suggested to add more CPU to the servers and more Access Services machines if needed. Microsoft's testing found that if you add too many Access Services machines, the WFEs can become a bottleneck for Access Services; this tends to happen around the fourth Access Services machine.
- Reporting Services is used with Access Services. It is recommended to dedicate a machine to it if report processing is taking a long time.
- There are several performance counters listed in the whitepaper that can be used to assess performance.
Farms and Web Applications
|Zones per web application||5||Boundary||Hard-coded – Default, Intranet, Extranet, Internet and Custom (just labels, can use for different purposes if needed)|
|Managed paths||20 per web application||Supported||Cached on web server so can affect web front-end performance. You normally get 2 by default in your first web app: ‘sites’ and ‘my’. ‘My’ should be in a separate web app if you have more than 100 users or allow large mysites.|
|Solution cache size||300MB per web application||Threshold||Allows InfoPath Forms Services to hold solutions in cache to speed retrieval of data. If exceeded, solutions are retrieved from disk, which may slow response times|
|Application pools||10 per web server||Supported||Limit depends on RAM allocated to web servers and workload of the farm (user base and usage characteristics – a single highly active application pool can reach 10GB or more)|
|SharePoint Entities with Web Analytics enabled||30,000 per farm||Supported||Do not enable web analytics if a farm is going to contain more than 30,000 entities: includes all web applications, site collections and sites.|
|Office Web Application Cache||100GB||Threshold||Space available to render documents, created as part of a content database. Not advised to increase this default. Note: the cache places itself in the first site collection of the content database. Can use PowerShell to manage. Don’t have it in a busy or important site collection.|
Content Databases and Site Collections
|Content databases||300 per web application||Supported||Increasing the number of content DBs doesn’t affect end user ops but will affect admin ops such as creating new site collections.|
|Content DB size (general usage)||200GB||Supported||Limit to 200GB for active site collections and general usage scenarios – i.e. collaborative working, search etc. If using remote blob storage (RBS), total volume of remote BLOB storage and metadata in the content database must not exceed 200GB|
|Content DB size (all usage)||4 TB*||Supported||Big change with Service Pack 1; previous limit was 1TB and only for archiving/records management – see next row for that scenario. Can now go beyond 200GB for any scenario provided you meet certain criteria, including figuring out how you are going to backup and restore in a timely fashion, because the cheap methods won’t work. Microsoft still recommends keeping site collections to 100GB. Disk sub-system performance must be 0.25 IOPS per GB, preferably 2 IOPS per GB = expensive… If refactoring site collections, stick to the 200GB limit for content DBs.|
|Content DB size (document archive)||No explicit limit*||Supported||Previously 1TB. Updated with Service Pack 1. Read Microsoft’s guidance before considering adopting this approach (links at the end of this post). Sites must be based on the Document Center or Records Center site templates (but no longer only 1 site). Less than 5% of content accessed per month and less than 1% modified or written per month. No alerts, workflows, or item-level security (but can use Content Routing workflows to get content in)|
|Content DB items or documents||60,000,000*||Supported||60 million is the largest number of items per content database that has been tested on SharePoint 2010. So your 4TB or unlimited content DB must have fewer than 60 million items in it (items = list items and library documents).|
|Site collections per content DB||2,000 recommended (5,000 maximum)||Supported||The larger the number of site collections in a content database, the slower the upgrade. Limit is linked to size – i.e. must keep within the 200GB database limit. 2,000 site collections in a single content DB = 100MB per site collection! Note: if you have more than 5,000 users, you will need to split MySites (each MySite is a site collection) across more than one content DB. Easy enough to do, even in Central Admin (just online/offline each DB), but do put them in a dedicated web app.|
|Site collection size||Maximum size of the content database*||Supported||This was 100GB pre-Service Pack 1. However Microsoft still strongly recommends keeping the size of site collections to 100GB for two reasons: 1. Certain site collection actions may affect performance or fail if other site collections are active in the content database. (Solution: have 1 site collection per content database.) 2. SharePoint site collection backup and restore is only supported for a maximum site collection size of 100GB. If larger, the entire content DB must be backed up. If multiple site collections larger than 100GB are in a single content DB, operations may take a long time and fail. (Solution: have 1 site collection per content database.)|
|Number of web sites||250,000 per site collection||Supported||Nesting sites is recommended, e.g. a shallow hierarchy of 100 sites, each with 1,000 subsites, or a deep hierarchy with 100 sites, each with 10 subsite levels, 100 per level. Both = 100,000.|
Note: My recommendation: don’t have more than 100 at the top-level… wouldn’t go more than 5 levels deep either – you’ll hit the URL string limit. And I avoid having more than 250 at any level.
|Subsites||2,000 per site view||Threshold||The interface for enumerating subsites of a given web site does not perform well above 2,000. Will also affect the All Site Content page and the Tree View control.|
|Remote BLOB storage (RBS) response time||20ms||Boundary||If using Remote Blob Storage (RBS), Network Attached Storage (NAS) must respond to a BLOB request within 20ms (i.e. SharePoint must receive the first byte from the NAS within 20 milliseconds). For other RBS limits, see also the content DB size for general usage – RBS deployments must keep the content DB under 200GB.|
|Content deployment jobs running on different paths||20||Supported||For paths connected to site collections in the same source content database – exceeding this increases the risk of deadlocks on the database. If you are using SQL Server snapshots for content deployment, each path creates a snapshot, increasing I/O requirements for the source database|
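The disk criteria attached to the 4 TB supported limit above translate into concrete IOPS targets. A quick worked example (arithmetic only):

```python
def content_db_iops(size_gb):
    """IOPS criteria for large (post-Service Pack 1) content databases:
    0.25 IOPS per GB minimum, 2 IOPS per GB preferred."""
    return {"minimum": size_gb * 0.25, "preferred": size_gb * 2}

# A full 4TB (4,096GB) content database needs at least ~1,024 IOPS,
# and ideally ~8,192 IOPS, from the disk subsystem.
print(content_db_iops(4096))
```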
Lists and Libraries, Documents and Pages
|List row size||8,000 bytes||Boundary||Each list or library item can only occupy 8,000 bytes total in the database. 256 bytes are reserved for built-in columns, leaving 7,744 bytes to use. The limit applies even with row-wrapping (a single item can take up to 6 rows to overcome SQL type limits, but the list row size must still be less than 8,000 bytes). See the notes at the end of this table for designing large lists.|
|Row size limit||6 table rows||Supported||SQL supports row-wrapping that SharePoint can make use of – it means an item can wrap across up to 6 rows when the number of columns exceeds SQL type limits. See the notes at the end of this table for more details.|
|File size||2GB||Boundary||Default size is 50MB. Can be increased but other limits will need to be reduced to protect performance (i.e. fewer major versions of documents)|
Note: All document max. values are based on the default size. Change it and you change the values.
|Documents||30,000,000 per library||Supported||You will need to nest folders to reach this maximum. The value varies depending on how documents and folders are organised and by the type and size of documents stored. This number can be misleading. See also Major Versions and List View Thresholds – views must return fewer than 5,000 items. Also, this limit is based on the max. upload size of 50MB; if you increase the upload size, reduce the number of documents per library. You still need to consider your chosen content DB limit. If 200GB, that means about 40,000 documents based on an average file size of 5MB.|
|Major versions||400,000 per library*||Supported||The number of documents per library includes all versions where version history is used. If 400,000 major versions is the supported limit, then the library supports 400,000 documents from the end user’s perspective (each document is only listed once, regardless of the number of versions behind it). The actual max. value will depend on your content DB size. If it’s a collaborative site collection with a 200GB limit and you keep 4 versions per document, that’s roughly 50,000 documents at an average version size of 1MB – and fewer if they are larger.|
|Items||30,000,000 per list or library||Supported||Note that Microsoft gets around large lists by suggesting the use of views, site hierarchies and metadata navigation to break up large data repositories. See List View thresholds for performance concerns with large lists.|
|Content DB limit||60,000,000 items or documents||Supported||Limit introduced with the Service Pack 1 update. If you have 120 million items, you need 2 content databases, each containing only 2 lists or libraries and nothing else, regardless of the size of the database.|
|List View threshold||5,000 items||Threshold||Maximum number of list or library items that a database operation, such as query, can process at the same time outside of daily time window (overnight), when queries are unrestricted. If you want to have a list with more than 5,000 items, you need views that will always return less than 5,000 items.|
|List View threshold for authors and admins||20,000||Threshold||The increase above the 5,000 user limit exists for auditing purposes, requires appropriate admin permissions, and works with the Allow Object Model Override setting.|
|List view lookup threshold||8 join operations per query||Threshold||Maximum number of joins per query, such as lookup columns, person/group columns or workflow status columns. The operation blocks after more than 8 joins, i.e. you only see the first 8. Doesn’t apply to single item ops.|
|Bulk Operations||100 items per bulk operation||Boundary||The user interface allows a maximum of 100 items to be selected for bulk operations. If you are using datasheet view and try selecting all, you’ll have problems if all means more than 100 items. Ditto if using the checkboxes instead to select items.|
|Web Parts||25 per wiki or web part page||Threshold||Estimate based on simple web parts. Complexity will determine performance, e.g. displaying Excel Services or InfoPath forms, connecting Web Parts etc., meaning fewer web parts are likely|
|Coauthoring Office files||10 concurrent editors recommended||Threshold||Recommended maximum is 10 concurrent editors per document. The boundary is 99; the 100th+ user will get a ‘file in use’ error and can only view a read-only copy. Applies to Word (.docx) and PowerPoint (.pptx and .ppsx) when authored in Office 2010 (coauthoring is not yet supported in the browser…)|
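Putting the version-history and content DB numbers together, here is a hedged back-of-envelope calculator (my own arithmetic, assuming every major version counts against the storage budget):

```python
def docs_per_content_db(db_limit_gb, versions_kept, avg_version_mb):
    """How many documents fit in a content database when each document
    keeps `versions_kept` major versions averaging `avg_version_mb`."""
    mb_per_document = versions_kept * avg_version_mb
    return int(db_limit_gb * 1024 // mb_per_document)

# 200GB collaborative content DB, 4 versions kept, ~1MB per version:
print(docs_per_content_db(200, 4, 1))  # -> 51200, i.e. roughly 50,000
```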
Column Limits (Large Lists)
|Column type||Per SQL row||Maximum per list||Bytes per column||Notes|
|Single line of text||64||276||28||Maximum is capped from 384 (64 × 6 wrapped rows) due to exceeding 7,744 bytes|
|Multiple lines of text||32||192||28||
|Number or currency||12||72||12||
|Date and Time||8||48||12||
|Lookup, Person or Group||16||96||4||
|Hyperlink or Picture||32||138||56||Maximum is capped from 192 due to exceeding 7,744 bytes|
Note: the first managed metadata field in a list is allocated 4 columns (tag, string value, catch-all and spillover); subsequent managed metadata fields use 2 columns (tag and string value).
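The byte costs above make it easy to check whether a proposed column set fits the 7,744-byte row budget. A small sketch (byte values taken from the table; the function itself is mine):

```python
# Approximate bytes consumed per column, from the table above.
BYTES_PER_COLUMN = {
    "single_text": 28, "multi_text": 28, "number": 12,
    "datetime": 12, "lookup": 4, "hyperlink": 56,
}
ROW_BUDGET = 8000 - 256  # 7,744 usable bytes after built-in columns

def row_bytes(columns):
    """columns: mapping of column type -> count in the list design."""
    return sum(BYTES_PER_COLUMN[kind] * n for kind, n in columns.items())

# 276 single-line-of-text columns use 7,728 bytes and just fit;
# a 277th would push the row past the budget.
print(row_bytes({"single_text": 276}) <= ROW_BUDGET)  # -> True
print(row_bytes({"single_text": 277}) <= ROW_BUDGET)  # -> False
```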
Security
|Number of SharePoint groups a user can belong to||5,000||Supported||Not a hard limit; consistent with AD guidelines.|
|Users (security objects) in a site collection||2 million per site collection||Supported||Based on the number of entries listed in permissions. The workaround is to use Windows security groups instead of individual users to apply permissions. Also, use PowerShell to manage users instead of the UI if rendering is slowing down|
|Active Directory principals/users in a SharePoint group||5,000 per SharePoint group||Supported||Can have up to 5,000 AD users or groups in a SharePoint group.|
|SharePoint groups||10,000 per site collection||Supported||Above 10,000 groups, operations such as adding users to a group slow down|
|Security scope for lists and libraries||1,000 per list||Threshold||Maximum number of unique security scopes set for a list. Scope contains ACL but can include security principals specific to SharePoint as well as Windows user accounts (i.e. SharePoint groups, Forms-based accounts, AD groups).|
|Size of security scope||5,000 per ACL||Supported||The size of the scope affects the data used for a security check calculation on security principals|
Search
|Search service apps||20 per farm||Supported||Shouldn’t need more than 20…|
|Content Sources||50 per search service app||Threshold||The recommended limit of 50 can be exceeded up to the boundary of 500 per search service application if fewer start addresses are used and the concurrent crawl limit must be followed.|
|Start addresses||100 per content source||Threshold||The recommended limit of 100 can be exceeded up to the boundary of 500 per content source if fewer content sources are used (i.e. less than 50). Ditto for vice versa. Use fewer than 100 start addresses if you have more content sources. Recommended tip if you have lots of start addresses is to use an HTML page containing the start addresses instead – the HTML crawler will then hit the page first and follow the links on the page|
|Concurrent crawls||20 per search app||Threshold||The number of crawls (content sources/start addresses) that can be underway at the same time. Exceeding this will just slow down crawling, defeating the purpose.|
|Crawl rules||100 per search app||Threshold||Can be exceeded but may struggle to view the rules in search admin UI.|
|Crawl impact rule||100 per farm||Threshold||Recommendation can be exceeded but display of site hit rules in search admin UI may be affected. At approx. 2,000 site hit rules, the Manage Site Hit Rules page becomes unreadable|
|Crawl DBs and crawl DB items||10 crawl DBs per search service app; 25 million items per crawl DB||Threshold||The crawl database stores the crawl data (time, status, etc.) about all items indexed. The supported limit is 10 crawl databases per SharePoint Search service application. The recommended limit is 25 million items per crawl database; at that limit, only 4 crawl databases per search service application are needed (due to the indexed items limit of 100 million per search service app).|
|Crawl components||16 per search app||Threshold||Require 2 components per crawl database and 2 per server, assuming server has at least 8 cores. Total number per server must be less than 128/(total query components) to avoid performance problems with I/O.|
|Crawl log entries||100 million per search app||Supported||Number of individual log entries in the crawl log – matches the Indexed items.|
|Indexed items||100 million per search service application||Supported||Items includes everything that is indexed – people (profile pages), list items, documents, web pages, files|
|Index partitions||20 per search service app||Threshold||Each index partition holds a subset of the search service application index. Each partition is recommended to not hold more than 10 million items – i.e. if you have less than 10 million items to index, you only need one partition. Plus at limit, could only have 10 per search app due to Indexed items limit of 100 million. Each partition will also have a property database.|
|Property DBs||10 per search service app||Threshold||Stores the metadata for items in the index partition associated with it. An index partition can only be associated with one property store. (Doesn’t mention if 2 partitions can share a property DB – presumably, given the difference in max. value per search service app).|
|Crawled Properties||500,000 per search app||Supported||The properties (metadata that goes into the property DB) that are discovered during crawling – i.e. 500,000 equates to 500,000 columns created in the Property DB that contain the metadata.|
|Managed Properties||100,000 per search app||Threshold||Mapped from crawled properties (for use in queries and scopes)|
|Mappings||100 per managed property||Threshold||Mappings for each managed property (i.e. mapping the number of crawled properties to a single managed property). Exceeding decreases both crawl speed and query performance|
|Metadata properties recognised||10,000 per item crawled||Boundary||This is the number of metadata properties that can be determined and indexed when an item is crawled. If you’ve got a content source with more than 10,000 columns, only the contents of the first 10,000 columns will be indexed.|
|Query components||128 per search app||Threshold||Total number of query components is limited by the ability of the crawl components to copy files, i.e. query components need to absorb files propagated from crawl components|
|Scopes||200 site scopes and 200 shared scopes per search service app||Threshold||Exceeding the limit can reduce crawl efficiency (scope membership is determined as items are indexed) and if too many are added to display groups, can affect browser latency for end users. Admin interface will also be affected.|
|Scope rules||100 per scope||Threshold||Exceeding the limit can reduce crawl freshness and delay potential results from scoped queries|
|Display groups||25 per site*||Threshold||Exceeding this degrades the search admin UI. *I assume this is per site collection – display groups are only configured as part of site collection admin or in the search site template, not per site for other templates.|
|Alerts||1,000,000 per search app||Supported||This is the tested limit. Note: you can configure how many alerts an individual can create. The default is 500.|
|Keywords||200 per site collection||Supported||The boundary (ASP.NET imposed) is 5,000 per site collection given 5 best bets per keyword. Max limit can be modified by tweaking Web.Config or Client.Config files but just don’t|
|Authoritative pages||1 top level and minimal 2nd and 3rd level per search app||Supported||The boundary is 200 per relevance level per search app, but if you add more than a few, you blur relevance rather than improve it. Stick to 1 for the first level. (Note: an authoritative page adds a relevance boost to all content under that link – the site, in effect.)|
|URL removals||100 removals per operation||Supported||Similar to bulk operation limit in lists – how many URLS can be removed from the index in one go|
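The crawl component rule a few rows up (components per server kept below 128 divided by total query components) can be expressed as a one-liner; treat it as a planning heuristic, not a hard API:

```python
def max_crawl_components_per_server(total_query_components):
    """Planning heuristic from the whitepaper: keep crawl components
    per server at or under 128 / (total query components) to avoid I/O
    contention between crawl propagation and query serving."""
    return 128 // max(1, total_query_components)

# With 8 query components in the farm, cap each server at 16 crawl
# components - matching the 16-per-search-app threshold above.
print(max_crawl_components_per_server(8))  # -> 16
```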
User Profile Service and Social Networking
|User profiles||2,000,000 per service app||Supported||If you want to profile more than 2 million people, you’ll need to split into multiple profile services, e.g. Europe, Americas, Asia-Pacific. Exceed 2 million in one profile service app and directory import will likely fail.|
|Social tags, notes and ratings||500 million per database||Supported||Across all users. The limit is due to backup/restore concerns, so if you have that solved, you can have more.|
|Blog posts||5,000 per site||Supported||Won’t be using SharePoint for busy blog sites then… Plenty of other reasons not to. For large and public blog sites, use a dedicated tool like WordPress (which this post is being written on)|
|Comments||1,000 per site||Supported||Techcrunch definitely couldn’t run their site on SharePoint…|
Business Connectivity Services
|ECT (in-memory)||5,000 per web server||Boundary||Total number of external content type (ECT) definitions loaded into memory at a given point in time on a web server|
|External system connections||500 per web server||Boundary||Number of active/open external system connections at a given point in time. Default value is 200. Limit is enforced at the web server scope, regardless of the kind of external system.|
|Database items returned per request||2,000 per request||Threshold||The default maximum of 2,000 is used by the database connector to restrict the number of results that can be returned per page|
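The BCS connection and database-item throttles above are not hard-coded – they can be inspected and adjusted with the BDC throttle cmdlets in the SharePoint 2010 Management Shell. A minimal sketch (the proxy lookup and the new values are illustrative; test the impact before changing these in production):

```powershell
# Find the Business Data Connectivity service application proxy
# (the TypeName filter is an assumption about your farm's naming)
$proxy = Get-SPServiceApplicationProxy |
    Where-Object { $_.TypeName -like "*Business Data Connectivity*" }

# Read the current throttle for database connector results per request
$throttle = Get-SPBusinessDataCatalogThrottleConfig -Scope Database `
    -ThrottleType Items -ServiceApplicationProxy $proxy
$throttle   # shows the current Default and Maximum values

# Raise the per-request item limit (illustrative values)
Set-SPBusinessDataCatalogThrottleConfig -Identity $throttle `
    -Default 3000 -Maximum 10000
```

The same pair of cmdlets covers the connection throttle (`-Scope Global -ThrottleType Connections`), which is where the 500-connections-per-web-server limit lives.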
Workflow
|Workflow postpone threshold||15||Threshold||Maximum number of workflows allowed to be executing against a content database at the same time, excluding instances that are running in the timer service. When the threshold is reached, new requests will be queued to run by the timer service later.|
|Workflow timer batch size||100||Threshold||The number of events that each run of the workflow timer job will pick up and deliver to workflows. Configured via PowerShell|
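Both workflow settings above are farm-wide values exposed through the farm configuration cmdlets in the SharePoint 2010 Management Shell. A quick sketch (the values shown are simply the defaults from the table):

```powershell
# Read the current farm-wide workflow settings
Get-SPFarmConfig | Select-Object WorkflowPostponeThreshold, WorkflowBatchSize

# Adjust the postpone threshold and timer batch size
# (defaults shown; raising them shifts more load onto the timer service)
Set-SPFarmConfig -WorkflowPostponeThreshold 15 -WorkflowBatchSize 100
```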
Managed Metadata term store
|Levels of nested terms||7 per term store||Supported||A term store hierarchy can have up to seven levels.|
|Number of term sets||1,000 per term store||Supported|
|Number of terms in a term set||30,000||Supported||Additional labels (synonyms and translations) for the same term do not count.|
|Total number of items in a term store||1,000,000||Supported||An item is either a term or a term set.|
PerformancePoint Services
|Cells||1,000,000 per query on Excel Services data source||Boundary||A PerfPoint scorecard can't query more than 1 million cells in an Excel Services data source, per query.|
|Columns and rows||15 columns by 60,000 rows||Threshold||Maximum number when rendering any PerfPoint dashboard object that uses an Excel workbook as the data source. The number of rows can change depending on the number of columns – fewer of one allows more of the other.|
|Query on a SharePoint list||15 columns by 5,000 rows||Supported||Officially, same as above – fewer columns = more rows. However, I recommend sticking with 5,000 max – it fits the query threshold limit for list views.|
|Query on a SQL Server data source||15 columns by 20,000 rows||Supported||As with previous – number of rows can change depending on number of columns.|
SharePoint Workspace (formerly known as Groove)
|Synchronisation item limit||30,000 items per list||Boundary||SharePoint Workspace won’t synchronise lists that have more than 30,000 items due to download time being too long and occupying resources.|
|Synchronisation document limit||1,800 documents in a workspace||Boundary||Users receive a warning when they have more than 500 documents in a workspace.|
Project Server
|End of project time||Date: 31 Dec 2049||Boundary||Project plans cannot stretch into 2050… draw your own conclusions about that.|
|Deliverables per project plan||1,500 deliverables||Boundary|
|Number of fields in a view||256||Boundary|
|Number of clauses in a filter for a view||50||Boundary|