Capacity and availability management in Lync Server 2013

 

Topic Last Modified: 2014-08-18

The purpose of capacity management and availability management is to measure and control system performance, and we recommend that you implement procedures for both. You have to know whether the system is available and whether it can handle current and projected demand. To find out, set baselines and monitor the system for trends.

Capacity management

Capacity management involves planning, sizing, and controlling service capacity to help guarantee that the minimum performance levels specified in your SLA are met or exceeded. Good capacity management helps ensure that you can provide IT services at a reasonable cost and still meet the performance levels defined in your SLAs with the client. These performance criteria can include the following:

  • System Response Time   This is the measured time that the system takes to do typical actions. Examples include the time that the audio/video server role requires to process audio/video traffic, the time that a client requires to create and join a conference, and the time taken for presence to be updated in all watcher clients.

  • Storage Capacity   This is the capacity of a storage system, whether it is a content database, a backup device, or a local drive. Examples include the maximum amount of storage space to be provided per site and the length of time that backups should be kept before they are overwritten.

Adjusting capacity is frequently a matter of making sure that enough physical resources, such as disk space and network bandwidth, are available. The following table lists typical resolutions for capacity-related issues.

Issue: Remote users have poor audio/video performance.
Possible resolution: Check whether appropriate bandwidth is available on the WAN links and whether QoS is enabled and appropriately configured. Check QoE data.

Issue: Overall response of the Lync environment is slow.
Possible resolution: Run tests to check that the existing front-end servers can handle the load. Introduce a new front-end server if one is needed. Check SQL database response times and fix the causes of any delays (for example, improve disk I/O).

Troubleshooting is covered in greater detail in the Lync Server Networking Guide.

Capacity is affected by system configuration and depends on physical resources such as network bandwidth. For example, if a Lync environment is configured to perform a full backup nightly, take care to minimize the effect of the backup on the interactive performance that end users experience.

Capacity management is the process of keeping the capacity of a system within acceptable levels. It addresses the following issues:

  • Reacting to changes in requirements   Capacity requirements have to be adjusted to account for changes in the system or the organization. For example, if your organization decides to implement Enterprise Voice, the number and placement of Mediation Servers and public switched telephone network (PSTN) gateways will be very important. If you plan to use Session Initiation Protocol (SIP) trunking or direct SIP, the overall design changes significantly to provide the best Enterprise Voice performance.

  • Predicting future requirements   Some capacity requirements change predictably over time. By tracking trends, you can plan upgrades in advance. For example, available bandwidth between the various Lync sites must be monitored to create a baseline. This baseline lets you predict when you have to add more bandwidth to these links as the user count in the remote sites increases over time.
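The trend tracking described above can be sketched as a straight-line fit over baseline measurements. This is a minimal illustration, not a Lync tool. The function names, the weekly sampling interval, the 80 percent planning threshold, and the sample figures are all assumptions made for the example, and real bandwidth growth may not be linear.

```python
def fit_line(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept). Needs at least two samples.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def weeks_until_threshold(weekly_peak_mbps, link_capacity_mbps, threshold=0.8):
    """Estimate the weeks left before peak usage reaches threshold * capacity.

    weekly_peak_mbps: hypothetical weekly peak usage samples, oldest first.
    Returns None when usage is flat or falling (no upgrade date predicted).
    """
    slope, intercept = fit_line(weekly_peak_mbps)
    if slope <= 0:
        return None
    limit = link_capacity_mbps * threshold
    week_at_limit = (limit - intercept) / slope
    last_week = len(weekly_peak_mbps) - 1
    return max(0.0, week_at_limit - last_week)


# Hypothetical baseline: peak usage on a 100 Mbps WAN link over five weeks.
baseline = [40, 42, 44, 46, 48]
print(weeks_until_threshold(baseline, link_capacity_mbps=100))
```

A planning threshold below 100 percent leaves time to order and install the extra bandwidth before users notice degraded performance.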

Availability management

Availability management is the process of making sure that any IT service consistently and cost-effectively delivers the level of reliable service that the customer requires. Availability management deals with minimizing loss of service and with making sure that appropriate action is taken if service is lost. In a Lync environment, you may be concerned about whether the Enterprise Voice service is available, whether users can join scheduled conferences, and so on. An SLA defines an acceptable frequency and length of outages and allows for certain periods when the system is unavailable for planned maintenance.

If you have to provide reports to your management about the availability of systems, or if you face financial or other penalties for missing availability targets, you must record availability data. Even if you do not have such formal requirements, it is a good idea to at least know how frequently a system has failed in a certain time period. For example, you might track system availability over the last 12 months and how long it took to recover from each failure. This information helps you measure and improve your team's effectiveness in responding to a system failure. It can also give you useful information if there is a dispute.

Measures related to availability are as follows:

  • Availability   This is the time that a system or service can be accessed compared to the time that it is down, and it is typically expressed as a percentage. (You may see references to "three nines" or "five nines". These refer to 99.9 percent or 99.999 percent availability.)

  • Reliability   This is a measure of the time between failures of a system and is sometimes expressed as mean (or average) time between failures (MTBF).

  • Time to Repair   This is the time taken to recover a service after a failure has occurred and is often expressed as mean (meaning average) time to repair (MTTR).
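To make the "nines" in the availability measure concrete, the following sketch converts an availability percentage into the downtime it permits. The function name and the choice of a 365-day year as the measurement period are assumptions for illustration. An actual SLA may use a different period and may exclude planned maintenance.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years are ignored


def allowed_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Return the downtime, in minutes, that a target availability permits."""
    return period_minutes * (1 - availability_percent / 100)


for label, target in (("three nines", 99.9), ("five nines", 99.999)):
    minutes = allowed_downtime_minutes(target)
    print(f"{label} ({target}%): about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, three nines leaves roughly 8.8 hours of downtime per year, while five nines leaves a little over five minutes.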

Availability, reliability, and time to repair are related as follows:

Availability = (MTBF - MTTR) / MTBF

For example, if a server fails two times over a six-month period and is unavailable for an average of 20 minutes each time, the MTBF is three months, or 90 days (129,600 minutes), and the MTTR is 20 minutes. Therefore, Availability = (90 days - 20 minutes) / 90 days = 99.985 percent.
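The worked example can be reproduced with a short script, which also makes the unit conversion explicit. This is a sketch using the formula as stated in this topic. The function name and the individual outage durations (15 and 25 minutes, chosen so that they average 20) are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def availability_percent(mtbf_minutes, mttr_minutes):
    """Availability = (MTBF - MTTR) / MTBF, expressed as a percentage."""
    return (mtbf_minutes - mttr_minutes) / mtbf_minutes * 100


# Six months is treated as 180 days, matching the example in the text.
observation_minutes = 180 * MINUTES_PER_DAY
outage_minutes = [15, 25]  # hypothetical outages averaging 20 minutes

mtbf = observation_minutes / len(outage_minutes)  # 129,600 minutes (90 days)
mttr = sum(outage_minutes) / len(outage_minutes)  # 20 minutes

print(f"MTBF: {mtbf / MINUTES_PER_DAY:.0f} days, MTTR: {mttr:.0f} minutes")
print(f"Availability: {availability_percent(mtbf, mttr):.3f}%")  # 99.985%
```

Both values must be in the same unit before the subtraction. Mixing days and minutes is the most likely mistake when applying the formula by hand.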

Availability management is the process of making sure that availability is maximized and kept within the parameters that are defined in SLAs. Availability management includes the following processes:

  • Monitoring   Examining when and for how long services are unavailable.

  • Reporting   Availability figures should be regularly provided to management, users, and operations teams. These reports should highlight trends and identify areas that are doing well and areas that require attention. Each report should summarize compliance with the targets set in the SLAs.

  • Improvement   If availability does not meet the targets that are defined in the SLAs, or if the trend is toward reduced availability, the availability management process should plan remedial steps. This should include working with other responsible teams to identify the reasons for outages and to plan remedial actions that prevent the outages from recurring.
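The monitoring and reporting steps above can be sketched as a small script that turns recorded outages into monthly availability figures and flags months that miss the SLA target. The record format, the function names, the sample outages, and the 99.9 percent target are all hypothetical. The sketch also assumes that every recorded outage counts against the SLA, whereas a real SLA may exclude planned maintenance.

```python
from collections import defaultdict

MINUTES_PER_DAY = 24 * 60


def monthly_availability(outages, minutes_in_month):
    """Compute availability per month as a percentage.

    outages: iterable of (month_label, downtime_minutes) records.
    minutes_in_month: dict mapping each reported month to its total minutes.
    """
    downtime = defaultdict(float)
    for month, minutes in outages:
        downtime[month] += minutes
    return {
        month: (total - downtime[month]) / total * 100
        for month, total in minutes_in_month.items()
    }


def sla_report(availability, target_percent):
    """Return (month, availability rounded to 3 places, meets_target) rows."""
    return [
        (month, round(value, 3), value >= target_percent)
        for month, value in availability.items()
    ]


# Hypothetical outage log for two months.
months = {"2014-06": 30 * MINUTES_PER_DAY, "2014-07": 31 * MINUTES_PER_DAY}
outage_log = [("2014-06", 20), ("2014-07", 30), ("2014-07", 45)]

for month, value, ok in sla_report(monthly_availability(outage_log, months), 99.9):
    status = "meets target" if ok else "BELOW TARGET"
    print(f"{month}: {value}% ({status})")
```

A month that falls below the target, or a run of months with declining figures, is the trigger for the improvement step.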

Capacity and availability measurements are repetitive tasks that are ideally suited to automated tools and scripts. One such tool is Microsoft System Center Operations Manager (formerly Microsoft Operations Manager), which is discussed later in this document.