Office 365 performance tuning using baselines and performance history

There are some simple ways to check the connection performance between Office 365 and your business that will let you establish a rough baseline of your connectivity. Knowing the performance history of your client computer connections can help you detect emerging issues early, and identify and predict problems.

If you're not used to working on performance issues, this article is designed to help you consider some common questions: How do you know the problem you're seeing is a performance issue and not an Office 365 service incident? How can you plan for good performance over the long term? How can you keep an eye on performance? If your team or clients are seeing slow performance while using Office 365, and you wonder about any of these questions, read on.

Important

Have a performance issue between your client and Office 365 right now? Follow the steps outlined in the Performance troubleshooting plan for Office 365.

Something you should know about Office 365 performance

Office 365 lives inside a high-capacity, dedicated Microsoft network that is steadily monitored not just by automation, but by real people. Part of the role of maintaining the Office 365 cloud is building in performance tuning and streamlining where possible. Because clients of the Office 365 cloud have to connect across the Internet, there is a continuous effort to fine-tune performance across Office 365 services too. Performance improvements never really stop in the cloud, and there is a lot of accumulated experience with keeping the cloud healthy and quick. Should you experience a performance issue connecting from your location to Office 365, it's best not to start with, and wait on, a Support case. Instead, you should begin investigating the problem from 'the inside out'. That is, start inside of your network, and work your way out to Office 365. Before you open a case with Office 365 Support, you can gather data and take actions that will explore, and may resolve, your problem.

Important

Be aware of capacity planning and limits in Office 365. That information will put you ahead of the curve when trying to resolve a performance issue. A good starting point is the Microsoft 365 and Office 365 service descriptions. This is a central hub, and all the services offered by Office 365 have a link that goes to their own service descriptions from there. That means, should you need to see the standard limits for SharePoint Online, for example, you would click SharePoint Online Service Description and locate its SharePoint Online Limits section.

Go into your troubleshooting with the understanding that performance is a sliding scale. It's not about achieving an idealized value and maintaining it permanently. If you believe it is, then occasional high-bandwidth tasks like on-boarding a large number of users or doing large data migrations will be very stressful, so plan for performance impacts at those times. You can, and should, have a rough idea of your performance targets, but a lot of variables play into performance, so performance varies. That's the nature of performance.

Performance troubleshooting isn't about meeting specific goals and maintaining those numbers indefinitely; it's about improving existing activities, given all the variables.

Okay, what does a performance problem look like?

First, you need to make sure that what you are experiencing is indeed a performance issue and not a service incident. A performance problem is different from a service incident in Office 365. Here's how to tell them apart.

If the Office 365 service is having issues, that's a service incident. You will see red or yellow icons under Current health in the Microsoft 365 admin center, and you may also notice slow performance on client computers connecting to Office 365. For example, if Current health reports a red icon and you see Investigating beside Exchange, you might then also receive a bunch of calls from people in your organization who complain that client mailboxes that use Exchange Online are performing badly. In that case, it's reasonable to assume that your Exchange Online performance just became a victim of issues within the service.

Figure: The Office 365 health dashboard with all workloads showing green, except Exchange, which shows Service restored.

At this point, you, the Office 365 admin, should check Current health and then View details and history frequently, to keep up to date on maintenance we perform on the system. The Current health dashboard was made to update you about changes to, and problems in, the service. The notes and explanations written to health history, admin to admin, are there to help you gauge the impact on you, and to keep you posted about ongoing work.

Figure: The Office 365 health dashboard explaining that the Exchange Online service has been restored, and why.

A performance issue isn't a service incident, even though incidents can cause slow performance. A performance issue looks like this:

  • A performance issue occurs no matter what the admin center Current health is reporting for the service.

  • A behavior that used to be relatively seamless takes a long time to complete or never completes.

  • You can replicate the problem too, or, at least, you know it will happen if you do the right series of steps.

  • If the problem is intermittent, there is still a pattern. For example, you know that by 10:00 AM you will have calls from users who can't reliably access Office 365, and that the calls will die down around noon.

This probably sounds familiar; maybe too familiar. Once you know it's a performance problem, the question becomes, "What do you do next?" The rest of this article helps you determine exactly that.

How to define and test the performance problem

Performance issues often emerge over time, so it can be challenging to define the actual problem. You need to create a good problem statement and get a good idea of the issue's context, and then you need repeatable testing steps to win the day. Otherwise, through no fault of your own, you may be lost. Why? Well, here are some examples of problem statements that don't provide enough information:

  • Switching from my Inbox to my Calendar used to be something I didn't notice, and now it's a coffee break. Can you make it act like it used to?

  • Uploading my files to SharePoint Online is taking forever. Why is it slow in the afternoon, but any other time, it's fast? Can't it just be fast?

There are several large challenges posed by the problem statements above. Specifically, there are a lot of ambiguities to deal with. For example:

  • It's unclear how switching between Inbox and Calendar used to act on the laptop.

  • When the user says, "Can't it just be fast", what's "fast"?

  • How long is "forever"? Is that several seconds, or minutes, or could the user go to lunch and it would finish up ten minutes after the user got back?

All of this is without considering that the admin and troubleshooter can't learn many important details from problem statements like these. For example: when the problem started happening; that the user works from home and only ever sees slow switching while on a home network; that the user must run several other RAM-intensive applications on the local client; or that the user is running an older operating system or hasn't installed recent updates.

When users report a performance problem, there's a lot of information to collect. Collecting this information is part of a process called scoping the issue, or investigating it. The following is a basic scoping list you can use to collect information about your performance issue. This list is not exhaustive, but it's a place to start one of your own:

  • On what date did the issue happen, and around what time of day or night?

  • What kind of client computer were you using, and how does it connect to the business network (VPN, wired, wireless)?

  • Were you working remotely or were you in the office?

  • Did you try the same actions on another computer and see the same behavior?

  • Walk through the steps that are giving you the trouble so that you can write down the actions you take.

  • How slow, in seconds or minutes, is the performance?

  • Where in the world are you located?
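One way to keep scoping answers consistent from one report to the next is to record them in a small structure. The following Python sketch is only an illustration of the list above; the class name, field names, and the idea of flagging vague answers are assumptions made for this example, not part of any Office 365 tooling.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class ScopingRecord:
    """One user's answers to the scoping questions (illustrative field names)."""
    occurred_at: datetime        # date and time the issue was seen
    client_type: str             # for example "laptop" or "tablet"
    connection: str              # "VPN", "wired", or "wireless"
    working_remotely: bool       # remote, or in the office?
    reproduced_elsewhere: bool   # same behavior on another computer?
    location: str                # where in the world the user is located
    slowness_seconds: float      # how slow the action is, in seconds
    repro_steps: List[str] = field(default_factory=list)

    def missing_details(self) -> List[str]:
        """Return the names of answers that are still too vague to act on."""
        gaps = []
        if not self.repro_steps:
            gaps.append("repro_steps")
        if self.slowness_seconds <= 0:
            gaps.append("slowness_seconds")
        if not self.location.strip():
            gaps.append("location")
        return gaps
```

A record whose missing_details() list is empty has at least the exact steps, a measured duration, and a location, which are the details the vague problem statements earlier in this section lacked.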

Some of these questions are more obvious than others. Almost everyone will understand that a troubleshooter needs the exact steps to reproduce the issue. After all, how else can you record what's wrong, and how else can you test whether the issue is fixed? Less obvious are questions like "What date and time did you see the issue?" and "Where in the world are you located?", which provide information that can be used in tandem. Depending on when the user was working, a few hours of time difference may mean maintenance is already underway on parts of your company's network. If, for example, your company has a hybrid implementation, like a hybrid SharePoint Search, which can query search indexes in both SharePoint Online and an on-premises SharePoint Server 2013 instance, updates may be underway in the on-premises farm. If your company is all in the cloud, system maintenance may include adding or removing network hardware, rolling out company-wide updates, or making changes to DNS or other core infrastructure.

When you're troubleshooting a performance problem, it's a bit like a crime scene: you need to be precise and observant to draw any conclusions from the evidence. To do this, you must get a good problem statement by gathering evidence. It should include the computer's context, the user's context, when the problem began, and the exact steps that exposed the performance issue. This problem statement should be, and stay, the topmost page in your notes. By walking through the problem statement again after you work on the resolution, you are taking the steps to test and prove whether the actions you took have resolved the issue. This is critical to knowing when your work there is done.

Do you know how performance used to look when it was good?

If you're unlucky, nobody knows. Nobody had numbers. That means nobody can answer the simple question "About how many seconds did it used to take to bring up an Inbox in Office 365?", or "How long did it used to take when the Executives had a Lync Online meeting?", which is a common scenario for many companies.

What's missing here is a performance baseline.

Baselines give you a context for your performance. You should take a baseline occasionally to frequently, depending on the needs of your company. If you are a larger company, your Operations team may take baselines for your on-premises environment already. For example, if you patch all the Exchange servers on the first Monday of the month, and all your SharePoint servers on the third Monday, your Operations team probably has a list of tasks and scenarios it runs post-patching, to prove that critical functions are operational. For example, opening the Inbox, clicking Send/Receive, and making sure the folders update, or, in SharePoint, browsing the main page of the site, going into the enterprise Search page, and doing a search that returns results.

If your applications are in Office 365, some of the most fundamental baselines you can take measure the time (in milliseconds) from a client computer inside your network to an egress point, that is, the point where you leave your network and go out to Office 365. Here are some helpful baselines that you can investigate and record:

  • Identify the devices between your client computer and your egress point, for example, your proxy server.

    • You need to know your devices so that you have context (IP addresses, type of device, et cetera) for performance problems that arise.

    • Proxy servers are common egress points, so you can check your web browser to see what proxy server it is set to use, if any.

    • There are third-party tools that can discover and map your network, but the safest way to know your devices is to ask a member of your network team.

  • Identify your Internet service provider (ISP), write down their contact information, and ask how many circuits and how much bandwidth you have.

  • Inside your company, identify resources for the devices between your client and the egress point, or identify an emergency contact to talk to about networking issues.

Here are some baselines that simple testing with tools can calculate for you:

  • Time from your client computer to your egress point in milliseconds

  • Time from your egress point to Office 365 in milliseconds

  • Location in the world of the server that resolves the URLs for Office 365 when you browse

  • The speed of your ISP's DNS resolution in milliseconds, inconsistencies in packet arrival (network jitter), and upload and download times in milliseconds

If you're unfamiliar with how to carry out these steps, we'll go into more detail in this article.

What is a baseline?

You'll know the impact when performance goes bad, but if you don't know your historical performance data, it's not possible to have a context for how bad it may have become, and when. So without a baseline, you're missing the key clue to solve the puzzle: the picture on the puzzle box. In performance troubleshooting, you need a point of comparison. Simple performance baselines aren't difficult to take. Your Operations team can be tasked with carrying these out on a schedule. For example, let's say your connection looks like this:

Figure: A basic network diagram that shows the client, the proxy, and the Office 365 cloud.

That means you've checked with your network team and found out that you leave your company for the Internet through a proxy server, and that proxy handles all the requests your client computer sends to the cloud. In this case, you should draw a simplified version of your connection that lists all the intervening devices. Now, insert tools that you can use to test the performance between the client, the egress point (where you leave your network for the Internet), and the Office 365 cloud.

Figure: A basic network with client, proxy, and cloud, and the tool suggestions PsPing, TraceTCP, and network traces.

The options are listed as Simple and Advanced because of the amount of expertise you need in order to find the performance data. A network trace will take a lot of time, compared to running command-line tools like PsPing and TraceTCP. These two command-line tools were chosen because they don't use ICMP packets, which will be blocked by Office 365, and because they give the time in milliseconds that it takes to leave the client computer, or proxy server (if you have access), and arrive at Office 365. Each individual hop from one computer to another will end up with a time value, and that's great for baselines! Just as importantly, these command-line tools allow you to add a port number onto the command. This is useful because Office 365 communicates over port 443, which is the port used by Secure Sockets Layer and Transport Layer Security (SSL and TLS). However, other third-party tools may be better solutions for your situation. Microsoft doesn't support all of these tools, so if, for some reason, you can't get PsPing and TraceTCP working, move on to a network trace with a tool like Netmon.

You can take a baseline before business hours, again during heavy use, and then again after hours. This means you may have a folder structure that looks a bit like this in the end:

Figure: A suggested way to organize your performance data into folders.

You should also pick a naming convention for your files. Here are some examples:

  • Feb_09_2015_9amPST_PerfBaseline_Netmon_ClientToEgress_Normal

  • Jan_10_2015_3pmCST_PerfBaseline_PsPing_ClientToO365_bypassProxy_SLOW

  • Feb_08_2015_2pmEST_PerfBaseline_BADPerf

  • Feb_08_2015_8-30amEST_PerfBaseline_GoodPerf

There are lots of different ways to do this, but using the format <dateTime><what's happening in the test> is a good place to start. Being diligent about this will help a lot when you are trying to troubleshoot issues later. Later, you'll be able to say "I took two traces on February 8th, one showed good performance and one showed bad, so we can compare them". This is extremely helpful for troubleshooting.
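If you generate these names in a script instead of typing them, every file follows the same pattern. The sketch below is one possible way to build names in the <dateTime><what's happening in the test> style of the examples above. The function name and its parameters (tool, path, result) are assumptions made for this illustration.

```python
from datetime import datetime

# Month abbreviations spelled out so the result does not depend on the system locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def baseline_name(taken_at: datetime, tz: str, tool: str, path: str, result: str) -> str:
    """Build a file name like Feb_09_2015_9amPST_PerfBaseline_Netmon_ClientToEgress_Normal."""
    hour = taken_at.hour % 12 or 12                        # 12-hour clock
    minutes = f"-{taken_at.minute:02d}" if taken_at.minute else ""
    meridiem = "am" if taken_at.hour < 12 else "pm"
    date_part = f"{MONTHS[taken_at.month - 1]}_{taken_at.day:02d}_{taken_at.year}"
    stamp = f"{date_part}_{hour}{minutes}{meridiem}{tz}"
    return "_".join([stamp, "PerfBaseline", tool, path, result])
```

For example, baseline_name(datetime(2015, 2, 9, 9, 0), "PST", "Netmon", "ClientToEgress", "Normal") reproduces the first example name in the list above.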

You need to have an organized way to keep your historical baselines. In this example, the simple methods produced three command-line outputs and the results were collected as screen shots, but you may have network capture files instead. Use the method that works best for you. Store your historical baselines and refer to them at points where you notice changes in the behavior of online services.
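Once you have a few stored readings, comparing a new measurement against them can be as simple as the sketch below. It assumes you have copied the average round trip times (in milliseconds) out of your saved baselines into a list. The function names and the 2x threshold are arbitrary choices for illustration, not Microsoft guidance.

```python
from statistics import median


def slowdown_factor(history_ms, current_ms):
    """Return how many times slower the current reading is than the historical median."""
    if not history_ms:
        raise ValueError("no historical baseline to compare against")
    return current_ms / median(history_ms)


def looks_degraded(history_ms, current_ms, threshold=2.0):
    """Flag a reading that is at least `threshold` times the historical median."""
    return slowdown_factor(history_ms, current_ms) >= threshold
```

The median is used instead of the mean so that a single unusually slow historical reading does not skew the comparison. Choose a threshold that fits how much your own readings normally vary.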

Why collect performance data during a pilot?

There is no better time to start making baselines than during a pilot of the Office 365 service. Your office may have thousands of users, hundreds of thousands, or it may have five, but even with a small number of users, you can perform tests to measure fluctuations in performance. In the case of a large company, a representative sample of several hundred users piloting Office 365 can be projected outward to several thousand so you know where issues might arise before they happen.

In the case of a small company, where on-boarding means that all users go to the service at the same time and there is no pilot, keep performance measures so that you have data to show to anyone who may have to troubleshoot a badly performing operation. For example, you might notice that all of a sudden you can walk around your building in the time it takes to upload a medium-sized graphic, where the upload used to happen very quickly.

How to collect baselines

For all troubleshooting plans you need to identify these things at a minimum:

  • The client computer you're using (the type of computer or device, an IP address, and the actions that caused the issue)

  • Where the client computer is located in the world (for example, whether this user is on a VPN to the network, working remotely, or on the company intranet)

  • The egress point the client computer uses from your network (the point at which traffic leaves your business for an ISP or the Internet)

You can find out the layout of your network from the network administrator. If you're on a small network, take a look at the devices connecting you to the Internet, and call your ISP if you have questions about the layout. Create a graphic of the final layout for your reference.

This section is broken into simple command-line tools and methods, and more advanced tool options. We'll cover simple methods first. But if you've got a performance problem right now, you should jump to advanced methods and try out the sample performance-troubleshooting action plan.

Simple methods

The objective of these simple methods is to learn to take, understand, and properly store simple performance baselines over time so that you are informed about Office 365 performance. Here's the very simple diagram for the simple methods, as you've seen before:

Figure: A basic network with client, proxy, and cloud, and the tool suggestions PsPing, TraceTCP, and network traces.

Note

TraceTCP is included in this screen shot because it's a useful tool for showing, in milliseconds, how long a request takes to process, and how many network hops, or connections from one computer to the next, the request takes to reach a destination. TraceTCP can also give the names of servers used during hops, which can be useful to a Microsoft Office 365 troubleshooter in Support. TraceTCP commands can be very simple, such as:

    tracetcp.exe outlook.office365.com:443

Remember to include the port number in the command! TraceTCP is a free download, but relies on WinPcap. WinPcap is a tool that is also used and installed by Netmon. We also use Netmon in the advanced methods section.

If you have multiple offices, you'll need to keep a set of data from a client in each of those locations as well. This test measures latency, which, in this case, is a number that describes the amount of time between a client sending a request to Office 365 and Office 365 responding to the request. The testing originates inside your domain on a client computer, and measures a round trip from inside your network, out through an egress point, across the Internet to Office 365, and back.

There are a few ways to deal with the egress point, in this case, the proxy server. You can trace from 1 to 2 and then from 2 to 3 in the diagram, and then add the numbers in milliseconds to get a final total to the edge of your network. Or, you can configure the connection to bypass the proxy for Office 365 addresses. In a larger network with a firewall, reverse proxy, or some combination of the two, you may need to make exceptions on the proxy server that will allow traffic to pass for a lot of URLs. For the list of endpoints used by Office 365, see Office 365 URLs and IP address ranges. If you have an authenticating proxy, begin by testing exceptions for the following:

  • Ports 80 and 443

  • TCP and HTTPS

  • Connections that are outbound to any of these URLs:

    • *.microsoftonline.com

    • *.microsoftonline-p.com

    • *.sharepoint.com

    • *.outlook.com

    • *.lync.com

    • osub.microsoft.com

All users need to be allowed to get to these addresses without any proxy interference or authentication. On a smaller network, you should add these to your proxy bypass list in your web browser.

To add these to your proxy bypass list in Internet Explorer, go to Tools > Internet Options > Connections > LAN settings > Advanced. The Advanced tab is also where you will find your proxy server and proxy server port. You may need to select the Use a proxy server for your LAN check box to access the Advanced button. You'll want to make sure that Bypass proxy server for local addresses is checked. Once you click Advanced, you'll see a text box where you can enter exceptions. Separate the wildcard URLs listed above with semicolons, for example:

*.microsoftonline.com; *.sharepoint.com
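If you maintain the exceptions in a script or a configuration file, a small helper can build the semicolon-separated text and give you a rough check of which host names the list covers. This is a sketch only: the pattern list is copied from this article, the function names are made up for illustration, and the fnmatch-based matching only approximates how a browser or proxy evaluates its bypass list.

```python
from fnmatch import fnmatch

# Wildcard entries taken from the list earlier in this section.
BYPASS_PATTERNS = [
    "*.microsoftonline.com",
    "*.microsoftonline-p.com",
    "*.sharepoint.com",
    "*.outlook.com",
    "*.lync.com",
    "osub.microsoft.com",
]


def bypass_list_text(patterns):
    """Join the patterns with semicolons, the separator the exceptions box expects."""
    return ";".join(patterns)


def is_bypassed(host, patterns):
    """Approximate check: does the host name match any wildcard pattern?"""
    host = host.lower()
    return any(fnmatch(host, pattern.lower()) for pattern in patterns)
```

Running is_bypassed over the host names you plan to test is a quick way to notice a URL that your exception list does not cover before you start measuring.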

Once you bypass your proxy, you should be able to use ping or PsPing directly on an Office 365 URL. The next step is to test ping outlook.office365.com. Or, if you're using PsPing or another tool that lets you supply a port number to the command, run PsPing against portal.microsoftonline.com:443 to see the average round trip time in milliseconds.

The round trip time, or RTT, is a number that measures how long it takes to send an HTTP request to a server like outlook.office365.com and get a response back that acknowledges the request. This should be a relatively short amount of time.

You have to use PsPing or another tool that does not use ICMP packets, which are blocked by Office 365, in order to do this test.
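To make the idea of a non-ICMP measurement concrete, the following Python sketch times how long it takes to open a TCP connection to a host and port. It is a rough stand-in written for illustration, not a replacement for PsPing, and its numbers will not match PsPing's exactly: the first reading also includes DNS lookup time, for example. The function names and defaults are assumptions.

```python
import socket
import time


def tcp_connect_ms(host, port=443, timeout=5.0):
    """Time one TCP connection to host:port and return the elapsed milliseconds."""
    start = time.perf_counter()
    # create_connection resolves the name and completes the TCP handshake.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def average_connect_ms(host, port=443, samples=4):
    """Average several connection times, similar in spirit to a ping summary."""
    readings = [tcp_connect_ms(host, port) for _ in range(samples)]
    return sum(readings) / len(readings)
```

Calling average_connect_ms("outlook.office365.com", 443) from a client computer gives a number you can record alongside your other baselines, as long as you always compare it with readings taken the same way.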

How to use PsPing to get an overall round trip time in milliseconds directly from an Office 365 URL

  1. Run an elevated command prompt by completing these steps:

    • Click Start.

    • In the Start Search box, type cmd, and then press CTRL+SHIFT+ENTER.

    • If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.

  2. Navigate to the folder where the tool (in this case PsPing) is installed and test these Office 365 URLs:

    • psping portal.office.com:443

    • psping microsoft-my.sharepoint.com:443

    • psping outlook.office365.com:443

    • psping www.yammer.com:443

    Figure: The PsPing command run against microsoft-my.sharepoint.com on port 443.

Be sure to include the port number 443. Remember that Office 365 works on an encrypted channel. If you PsPing without the port number, your request will fail. Once you've pinged your short list, look for the Average time in milliseconds (ms). That is what you want to record!
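If you save the command output as text rather than as a screen shot, you can pull the average out automatically. The sketch below assumes the summary contains a phrase of the form "Average = 2.80ms", which is how ping-style summaries are commonly laid out. Check it against the actual output of your tool and version before relying on it; the function name is made up for this example.

```python
import re

# Matches text such as "Average = 2.80ms" or "Average = 3ms" (assumed format).
_AVERAGE = re.compile(r"Average\s*=\s*([\d.]+)\s*ms", re.IGNORECASE)


def average_ms(tool_output):
    """Return the last average (in ms) found in the output, or None if absent."""
    matches = _AVERAGE.findall(tool_output)
    return float(matches[-1]) if matches else None
```

Returning None when nothing matches makes it obvious when the output format differs from the assumption, rather than silently recording a wrong number.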

Figure: An illustration of a client-to-proxy PsPing with a round trip time of 2.8 milliseconds.

If you're not familiar with proxy bypass, and prefer to take things step-by-step, you need to first find out the name of your proxy server. In Internet Explorer, go to Tools > Internet Options > Connections > LAN settings > Advanced. The Advanced tab is where you will see your proxy server listed. Ping that proxy server at a command prompt by completing this task:

Ping 代理服务器并获取阶段1到2的往返行程值(以毫秒为单位)To ping the proxy server and get a round trip value in milliseconds for stage 1 to 2

  1. Run an elevated command prompt by completing these steps:

  2. Click Start.

  3. In the Start Search box, type cmd, and then press CTRL+SHIFT+ENTER.

  4. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.

  5. Type ping <the name of the proxy server your browser uses, or the IP address of the proxy server> and then press ENTER. If you have PsPing or some other tool installed, you can choose to use that tool instead.

    Your command may look like any of these examples:

  • ping ourproxy.ourdomain.industry.business.com

  • ping 155.55.121.55

  • ping ourproxy

  • psping ourproxy.ourdomain.industry.business.com:80

  • psping 155.55.121.55:80

  • psping ourproxy:80

  6. When the trace stops sending test packets, you'll get a small summary that lists an average in milliseconds; that is the value you're after. Take a screenshot of the prompt and save it using your naming convention. At this point it may also help to fill in the diagram with the value.

Maybe you've taken a trace in the early morning, and your client can get to the proxy (or whatever egress server exits to the Internet) quickly. In this case, your numbers may look like this:

Graphic that shows a round trip time of 2.8 milliseconds from a client to the proxy.

If your client computer is one of the select few with access to the proxy (or egress) server, you can run the next leg of the test by remotely connecting to that computer and running PsPing from a command prompt there against an Office 365 URL. If you don't have access to that computer, you can contact your network resources for help with the next leg and get exact numbers that way. If that's not possible, take a PsPing against the Office 365 URL in question and compare it to the PsPing or Ping time against your proxy server.

For example, if you measure 51.84 milliseconds from the client to the Office 365 URL and 2.8 milliseconds from the client to the proxy (or egress point), then the leg from the egress point to Office 365 accounts for 49.04 milliseconds. Likewise, if you measure 12.25 milliseconds from the client to the proxy during the height of the day and 62.01 milliseconds from the client to the Office 365 URL, then your average value for the proxy egress to the Office 365 URL is 49.76 milliseconds.

Additional graphic that shows the ping in milliseconds from client to proxy beside the ping from client to Office 365, so the values can be subtracted.
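The subtraction described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic; the function name `egress_leg_ms` is hypothetical, and the inputs are the example values from the text.

```python
def egress_leg_ms(client_to_office365_ms: float, client_to_proxy_ms: float) -> float:
    """Estimate the egress-to-Office 365 leg by subtracting the client-to-proxy leg."""
    # Round to two decimals to match the precision of the recorded averages.
    return round(client_to_office365_ms - client_to_proxy_ms, 2)


# Early-morning example from the text.
print(egress_leg_ms(51.84, 2.8))    # 49.04
# Height-of-day example from the text.
print(egress_leg_ms(62.01, 12.25))  # 49.76
```

Both results land close to 49 milliseconds even though the client-to-proxy leg grew from 2.8 to 12.25 milliseconds, which suggests that in this example the busier hour affects the internal network rather than the path from the egress point to Office 365.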

In terms of troubleshooting, you may find something interesting just from keeping these baselines. For example, suppose you generally see about 40 to 59 milliseconds of latency from the proxy or egress point to the Office 365 URL, and about 3 to 7 milliseconds from the client to the proxy or egress point (depending on the amount of network traffic at that time of day). If your last three client-to-proxy or egress baselines then show a latency of 45 milliseconds, you will know that something is wrong.
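One way to apply this kind of check to recorded baselines is sketched below in Python. The function name `outside_baseline` and the rule of three consecutive out-of-range readings are assumptions drawn from the example above, not a prescribed method; the 3 to 7 millisecond range is the client-to-proxy range from that example.

```python
from typing import Sequence


def outside_baseline(recent_ms: Sequence[float], low_ms: float, high_ms: float,
                     required: int = 3) -> bool:
    """Return True if the last `required` readings all fall outside [low_ms, high_ms]."""
    tail = list(recent_ms)[-required:]
    if len(tail) < required:
        return False  # Not enough history to judge.
    return all(not (low_ms <= value <= high_ms) for value in tail)


# Client-to-proxy readings: normal at first, then three readings near 45 ms.
readings = [4.1, 5.0, 45.0, 45.2, 44.8]
print(outside_baseline(readings, low_ms=3, high_ms=7))  # True
```

Requiring several consecutive readings, rather than one, keeps a single noisy measurement from being treated as a problem.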

Advanced methods

If you really want to know what is happening with your Internet requests to Office 365, you need to become familiar with network traces. It does not matter which tool you prefer for these traces: HTTPWatch, Netmon, Message Analyzer, Wireshark, Fiddler, the Developer Dashboard tool, or any other will do, as long as the tool can capture and filter network traffic. You'll see in this section that it's beneficial to run more than one of these tools to get a more complete picture of the problem. When you're testing, keep in mind that some of these tools also act as proxies in their own right. Tools used in the companion article, Performance troubleshooting plan for Office 365, include Netmon 3.4, HTTPWatch, and Wireshark.

Taking a performance baseline is the simple part of this method, and many of the steps are the same as when you troubleshoot a performance issue. The more advanced methods of creating performance baselines require you to take and store network traces. Most of the examples in this article use SharePoint Online, but you should develop a list of common actions to test and record across the Office 365 services to which you subscribe. Here is a baseline example:

  • Baseline list for SPO - Step 1: Browse the home page of the SPO website and do a network trace. Save the trace.

  • Baseline list for SPO - Step 2: Search for a term (such as your company name) via Enterprise Search and do a network trace. Save the trace.

  • Baseline list for SPO - Step 3: Upload a large file to a SharePoint Online document library and do a network trace. Save the trace.

  • Baseline list for SPO - Step 4: Browse the home page of the OneDrive website and do a network trace. Save the trace.

This list should include the most important common actions that users take against SharePoint Online. Notice that the last step, the trace of going to OneDrive for Business, builds in a comparison between the load of the SharePoint Online home page (which companies often customize) and the OneDrive for Business home page, which is seldom customized. This is a very basic test for a slow-loading SharePoint Online site. You can build a record of this difference into your testing.

If you are in the middle of a performance problem, many of the steps are the same as when taking a baseline. Network traces become critical, so we'll cover how to take the important traces next.

To tackle a performance problem right now, you need to take a trace at the time you are experiencing the issue. You need to have the proper tools available to gather logs, and you need an action plan, that is, a list of troubleshooting actions to take to gather the best information you can. The first thing to do is record the date and time of the test so that the files can be saved in a folder that reflects the timing. Next, narrow down to the problem steps themselves. These are the exact steps you will use for testing. Don't forget the basics: if the issue is only with Outlook, make sure to record that the problem behavior happens in only one Office 365 service. Narrowing down the scope of the issue will help you focus on something you can resolve.
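As one possible way to keep saved traces organized by date and time, the Python sketch below builds a folder path from a service name and a timestamp. The layout (root, then service, then timestamp) and the function name `trace_folder` are assumptions for illustration; use whatever naming convention your team already follows.

```python
from datetime import datetime
from pathlib import Path


def trace_folder(root: str, service: str, taken_at: datetime) -> Path:
    """Build a folder path whose name reflects when the trace was taken."""
    stamp = taken_at.strftime("%Y-%m-%d_%H%M")
    return Path(root) / service / stamp


# Example: a SharePoint Online trace taken on 5 March 2024 at 09:30.
folder = trace_folder("traces", "SPO", datetime(2024, 3, 5, 9, 30))
print(folder.as_posix())  # traces/SPO/2024-03-05_0930
```

Calling `folder.mkdir(parents=True, exist_ok=True)` would then create the folder before you save the trace files into it.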

See also

Managing Office 365 endpoints