Externalizing BLOB Storage in SharePoint 2010
These are some notes from a helpful session today at the SharePoint Conference 2009 in Las Vegas delivered by Srini and Burzhin (I typed too slowly to get their last names), product engineers from Microsoft. The notes are as organized as I can make them while I’m sitting here in the room, but they will of course not be as polished as I’d like them to be. I’ve decided to err on the side of more information and less polish. In the interest of full disclosure, this is a pretty new topic for me personally. Therefore, it is possible that some of these notes represent things a bit differently than the presenters intended. To read things in the team’s own words, see the RBS team blog: http://blogs.msdn.com/sqlrbs
BLOBs are Binary Large OBjects – a container of unstructured bytes of data. SharePoint data that is not meta-data (documents – most other list items are completely meta-data) is stored in BLOBs in SQL databases. BLOBs typically account for 60-70% of all content storage. Most SharePoint operations act against the meta-data, not the BLOB data – until you go to click on the link and open the document. By default, BLOB data is stored in the content database with the meta-data.
This model works well, but it does have some pain points. SQL storage is inherently expensive, especially if it’s on a SAN. The more data that lives in SQL, the more load it takes to retrieve it. Large data sets are slow to back up and recover. It is difficult to guarantee retention and deletion of SQL data for compliance. So, Remote BLOB Storage (RBS) will solve all of our problems and bring about world peace by allowing us to store BLOB data outside of our content databases.
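To make the split between meta-data and BLOB data concrete, here is a toy sketch in Python. It is not how SharePoint or RBS is implemented. The class names (ExternalBlobStore, ContentDatabase) and the use of a random ID as the BLOB reference are my own assumptions for illustration. The point is only that listing documents touches meta-data, while opening one follows a reference out to the external store.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ExternalBlobStore:
    """Hypothetical stand-in for a store outside SQL (e.g. a file share)."""
    blobs: Dict[str, bytes] = field(default_factory=dict)

    def put(self, data: bytes) -> str:
        blob_id = uuid.uuid4().hex  # assumed reference format, illustration only
        self.blobs[blob_id] = data
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self.blobs[blob_id]


@dataclass
class ContentDatabase:
    """Hypothetical content database that keeps meta-data plus a BLOB reference."""
    store: ExternalBlobStore
    rows: Dict[str, dict] = field(default_factory=dict)

    def add_document(self, name: str, author: str, data: bytes) -> None:
        blob_id = self.store.put(data)  # the bytes go to the external store
        self.rows[name] = {"author": author, "size": len(data), "blob_id": blob_id}

    def list_documents(self) -> List[Tuple[str, str, int]]:
        # Meta-data only: the external store is never touched here.
        return [(n, r["author"], r["size"]) for n, r in self.rows.items()]

    def open_document(self, name: str) -> bytes:
        # Only opening a document follows the reference to the BLOB store.
        return self.store.get(self.rows[name]["blob_id"])
```

The caller only ever talks to ContentDatabase, which is the same idea as the user never noticing where the bytes actually live.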
Previously with 2007, EBS (External BLOB Storage) meant that third-party providers were responsible for both managing external BLOB storage and creating the API libraries to interface with SharePoint. The objective now is for SharePoint itself to provide a common set of API libraries to do so. The result is a downloadable add-in component that can be registered for a SharePoint farm via the SQL 2008 R2 Feature Pack (see below). EBS is supported in 2010, but is deprecated. Migration from EBS to RBS can be performed via PowerShell commands.
RBS is fully managed code, can be scoped to individual content databases (instead of at the farm level), can be configured and managed via PowerShell, supports many providers (including third-party), and supports migration both ways. It ships with a native RBS FileStream store provider.
From the user’s perspective, SharePoint 2010 does all of the dancing transparently. They’ll never know something is different.
From the administrator’s perspective, there are new PowerShell cmdlets that talk to the relevant SQL stored procedures for installing, configuring, provisioning, and maintaining RBS.
From the third-party provider’s perspective, there is now no need to write the BLOB store libraries.
The RBS add-in must be installed first in SQL (the SQL RBS 2008 R2 with FILESTREAM Provider).
RBS and Provider DLLs must be installed on all WFEs.
RBS must be enabled and configured using PowerShell:
- SetActiveProvider (1 BLOB store to many content databases)
- Migrate (copy entire BLOBs in or out of the db with no downtime)
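The two operations above can be sketched as a toy model in Python. This is not the SharePoint API or the real PowerShell syntax. The names BlobStore, ContentDb, set_active_provider and migrate are hypothetical, and the sketch does not model the no-downtime part. It only shows one store serving several content databases, and BLOBs moving out to the active provider or back in when there is none.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class BlobStore:
    """Hypothetical external BLOB store; may be shared by many databases."""
    name: str
    blobs: Dict[str, bytes] = field(default_factory=dict)


@dataclass
class ContentDb:
    """Hypothetical content database; BLOB ids are assumed unique across databases."""
    name: str
    inline: Dict[str, bytes] = field(default_factory=dict)        # BLOBs held in the db
    external: Dict[str, BlobStore] = field(default_factory=dict)  # BLOB id -> its store
    active_provider: Optional[BlobStore] = None


def set_active_provider(db: ContentDb, store: Optional[BlobStore]) -> None:
    """Choose where this database's BLOBs should live (None means in the db)."""
    db.active_provider = store


def migrate(db: ContentDb) -> None:
    """Move every BLOB to the active provider, or back into the db if there is none."""
    target = db.active_provider
    # Pull in any BLOB that currently lives in a store other than the target.
    for blob_id, store in list(db.external.items()):
        if store is not target:
            db.inline[blob_id] = store.blobs.pop(blob_id)
            del db.external[blob_id]
    # Push in-database BLOBs out to the target store, if one is active.
    if target is not None:
        for blob_id in list(db.inline):
            target.blobs[blob_id] = db.inline.pop(blob_id)
            db.external[blob_id] = target
```

Calling set_active_provider(db, None) followed by migrate(db) is how this sketch represents migrating back in.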
Backup and Restore
This will by necessity be more complicated with multiple stores for SharePoint. However, it is workable if you follow some simple rules.
- Always start SQL backups first (the windows can overlap)
- Always start BLOB restores first (the windows can overlap)
A longer BLOB retention policy can make it realistic to back up the BLOB store less frequently than your SQL databases. The RBS Maintainer keeps track of deletions and propagates them to the BLOB store. A deletion in SharePoint doesn’t have to reach the BLOB store at the same moment. You should retain BLOBs long enough to allow you to restore the previous version of the content database without also restoring the BLOB store.
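The retention reasoning can be sketched in a few lines of Python. This is a toy model under my own assumptions, not the real RBS Maintainer. The Maintainer class, its method names, and the 30-day window used in the comments are hypothetical. The idea is that a deleted BLOB is only purged once its deletion is older than the retention window, so any content database backup younger than that window still finds all of its BLOBs in the store.

```python
from datetime import datetime, timedelta
from typing import Dict, List


class Maintainer:
    """Hypothetical maintainer that propagates deletions after a retention window."""

    def __init__(self, retention: timedelta) -> None:
        self.retention = retention               # e.g. timedelta(days=30), an assumed value
        self.blob_store: Dict[str, bytes] = {}   # BLOB id -> bytes in the external store
        self.pending: Dict[str, datetime] = {}   # BLOB id -> when SharePoint deleted it

    def record_deletion(self, blob_id: str, when: datetime) -> None:
        # The BLOB stays in the store for now; only the deletion is noted.
        self.pending[blob_id] = when

    def run(self, now: datetime) -> List[str]:
        """Purge BLOBs whose deletion is at least `retention` old; return their ids."""
        expired = [b for b, t in self.pending.items() if now - t >= self.retention]
        for blob_id in expired:
            self.blob_store.pop(blob_id, None)
            del self.pending[blob_id]
        return expired


def db_backup_restorable(backup_time: datetime, now: datetime,
                         retention: timedelta) -> bool:
    """True if a db backup taken at backup_time still has all its BLOBs in the store.

    Anything the backup references was deleted after backup_time, so it cannot
    have been purged while the backup is no older than the retention window.
    """
    return now - backup_time <= retention
```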
RBS seems to add little to no performance overhead in internal Microsoft testing with 128 users. In fact, with larger files, it may have a slight advantage. Third-party providers may vary a bit, but are expected to cause no more than a 5-10% degradation, and less for larger files.