Use streaming with TraceProcessor

Article
07/13/2023

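By default, TraceProcessor accesses data by loading it into memory as the trace is processed. This buffering approach is easy to use, but it can be expensive in terms of memory usage.

TraceProcessor also provides trace.UseStreaming(), which supports accessing several types of trace data in a streaming manner: the data is processed as it is read from the trace file, rather than being buffered in memory. For example, a syscall trace can be quite large, and buffering the entire list of syscalls in a trace can be very expensive.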

Accessing buffered data

The following code shows how to access syscall data in the normal, buffered manner via trace.UseSyscalls():

using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Processes;
using Microsoft.Windows.EventTracing.Syscalls;
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: <trace.etl>");
            return;
        }

        using (ITraceProcessor trace = TraceProcessor.Create(args[0]))
        {
            IPendingResult<ISyscallDataSource> pendingSyscallData = trace.UseSyscalls();

            // Pending results become available only after Process() returns.
            trace.Process();

            ISyscallDataSource syscallData = pendingSyscallData.Result;

            Dictionary<IProcess, int> syscallsPerCommandLine = new Dictionary<IProcess, int>();

            foreach (ISyscall syscall in syscallData.Syscalls)
            {
                IProcess process = syscall.Thread?.Process;

                if (process == null)
                {
                    continue;
                }

                if (!syscallsPerCommandLine.ContainsKey(process))
                {
                    syscallsPerCommandLine.Add(process, 0);
                }

                ++syscallsPerCommandLine[process];
            }

            Console.WriteLine("Process Command Line: Syscalls Count");

            foreach (IProcess process in syscallsPerCommandLine.Keys)
            {
                Console.WriteLine($"{process.CommandLine}: {syscallsPerCommandLine[process]}");
            }
        }
    }
}

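Accessing streaming data

With a large syscall trace, attempting to buffer the syscall data in memory can be quite expensive, or it may not even be possible. The following code shows how to access the same syscall data in a streaming manner by replacing trace.UseSyscalls() with trace.UseStreaming().UseSyscalls():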

using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Processes;
using Microsoft.Windows.EventTracing.Syscalls;
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: <trace.etl>");
            return;
        }

        using (ITraceProcessor trace = TraceProcessor.Create(args[0]))
        {
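            // Thread data is still buffered; the streaming callback uses it to map each syscall to its process.
            IPendingResult<IThreadDataSource> pendingThreadData = trace.UseThreads();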

            Dictionary<IProcess, int> syscallsPerCommandLine = new Dictionary<IProcess, int>();

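            // SecondPass runs this callback on a second read of the trace, once pendingThreadData.Result is available.
            trace.UseStreaming().UseSyscalls(ConsumerSchedule.SecondPass, context =>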
            {
                Syscall syscall = context.Data;
                IProcess process = syscall.GetThread(pendingThreadData.Result)?.Process;

                if (process == null)
                {
                    return;
                }

                if (!syscallsPerCommandLine.ContainsKey(process))
                {
                    syscallsPerCommandLine.Add(process, 0);
                }

                ++syscallsPerCommandLine[process];
            });

            trace.Process();

            Console.WriteLine("Process Command Line: Syscalls Count");

            foreach (IProcess process in syscallsPerCommandLine.Keys)
            {
                Console.WriteLine($"{process.CommandLine}: {syscallsPerCommandLine[process]}");
            }
        }
    }
}

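How streaming works

By default, all streaming data is provided during the first pass through the trace, and buffered data from other sources is not yet available. The example above shows how to combine streaming with buffering: thread data is buffered before syscall data is streamed. As a result, the trace must be read twice, once to get the buffered thread data and a second time to stream the syscall data with the buffered thread data now available. To combine streaming and buffering in this way, the example passes ConsumerSchedule.SecondPass to trace.UseStreaming().UseSyscalls(), which causes syscall processing to happen in a second pass through the trace. Because it runs in a second pass, the syscall callback can access the pending result from trace.UseThreads() when it processes each syscall. Without this optional argument, syscall streaming would run in the first pass through the trace (there would be only one pass), and the pending result from trace.UseThreads() would not be available yet. In that case, the callback would still have access to the ThreadId from the syscall, but it could not access the process for the thread, because the data that links threads to processes is provided by other events that may not have been processed yet.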
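
The single-pass case described above can be sketched as follows. This is a minimal illustration, not a confirmed sample: it assumes that the callback-only form of UseSyscalls() (with the optional ConsumerSchedule argument omitted) compiles as the text implies, and that Syscall.ThreadId is an int. Because no buffered thread data is available in the first pass, the sketch groups syscalls by thread ID rather than by process.

```csharp
using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Processes;
using Microsoft.Windows.EventTracing.Syscalls;
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: <trace.etl>");
            return;
        }

        using (ITraceProcessor trace = TraceProcessor.Create(args[0]))
        {
            // Assumption: Syscall.ThreadId is an int.
            Dictionary<int, int> syscallsPerThreadId = new Dictionary<int, int>();

            // Assumption: with no ConsumerSchedule argument, the callback runs
            // during the first (and only) pass through the trace.
            trace.UseStreaming().UseSyscalls(context =>
            {
                int threadId = context.Data.ThreadId;

                if (!syscallsPerThreadId.ContainsKey(threadId))
                {
                    syscallsPerThreadId.Add(threadId, 0);
                }

                ++syscallsPerThreadId[threadId];
            });

            trace.Process();

            Console.WriteLine("Thread ID: Syscalls Count");

            foreach (KeyValuePair<int, int> entry in syscallsPerThreadId)
            {
                Console.WriteLine($"{entry.Key}: {entry.Value}");
            }
        }
    }
}
```

The trade-off is that the trace is read only once, at the cost of losing the thread-to-process mapping that the SecondPass example relies on.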

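Some key differences in usage between buffering and streaming:

Buffering returns an IPendingResult<T>, and the result it holds is available only after the trace has been processed. Once the trace has been processed, the results can be enumerated using techniques such as foreach and LINQ.
Streaming returns void and instead takes a callback argument. It calls the callback once as each item becomes available. Because the data is not buffered, there is never a list of results to enumerate with foreach or LINQ; the streaming callback needs to buffer any part of the data that it wants to keep for use after processing has completed.
The code for processing buffered data appears after the call to trace.Process(), when the pending results are available.
The code for processing streaming data appears before the call to trace.Process(), as a callback passed to the trace.UseStreaming().Use...() method.
A streaming consumer can choose to process only part of the stream and cancel future callbacks by calling context.Cancel(). A buffering consumer is always provided with a full, buffered list.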
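
The last point, stopping a stream early with context.Cancel(), might look like the following sketch. It assumes the callback-only form of UseSyscalls() described earlier; the limit maxSyscalls is a hypothetical value chosen only for illustration. The callback keeps its own running count, because streaming never produces a list to enumerate afterward.

```csharp
using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Processes;
using Microsoft.Windows.EventTracing.Syscalls;
using System;

class Program
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: <trace.etl>");
            return;
        }

        // Hypothetical limit, used only for this illustration.
        const int maxSyscalls = 1000;

        using (ITraceProcessor trace = TraceProcessor.Create(args[0]))
        {
            int seen = 0;

            // The callback is registered before trace.Process() is called.
            trace.UseStreaming().UseSyscalls(context =>
            {
                ++seen;

                if (seen >= maxSyscalls)
                {
                    // Ask TraceProcessor to stop delivering further syscalls
                    // to this consumer.
                    context.Cancel();
                }
            });

            trace.Process();

            Console.WriteLine($"Syscalls observed by the callback: {seen}");
        }
    }
}
```

A buffered consumer has no equivalent shortcut: trace.UseSyscalls() always materializes the full list before any of it can be inspected.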

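Correlated streaming data

trace.UseStreaming() provides the following types of correlated data:

Code	Description
trace.UseStreaming().UseContextSwitchData()	Streams correlated context switch data (from compact and non-compact events, with more accurate SwitchInThreadIds than raw non-compact events).
trace.UseStreaming().UseScheduledTasks()	Streams correlated scheduled task data.
trace.UseStreaming().UseSyscalls()	Streams correlated system call data.
trace.UseStreaming().UseWindowInFocus()	Streams correlated window-in-focus data.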

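Standalone streaming events

Additionally, trace.UseStreaming() provides parsed events for a number of different standalone event types:

Code	Description
trace.UseStreaming().UseLastBranchRecordEvents()	Streams parsed last branch record (LBR) events.
trace.UseStreaming().UseReadyThreadEvents()	Streams parsed ready thread events.
trace.UseStreaming().UseThreadCreateEvents()	Streams parsed thread create events.
trace.UseStreaming().UseThreadExitEvents()	Streams parsed thread exit events.
trace.UseStreaming().UseThreadRundownStartEvents()	Streams parsed thread rundown start events.
trace.UseStreaming().UseThreadRundownStopEvents()	Streams parsed thread rundown stop events.
trace.UseStreaming().UseThreadSetNameEvents()	Streams parsed thread set name events.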
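
As a rough sketch of consuming standalone events, the following code counts thread create and thread exit events in a single pass. It assumes that UseThreadCreateEvents() and UseThreadExitEvents() accept a callback-only form in the same way as UseSyscalls(); this article does not show their signatures, so treat that as unverified. The callbacks deliberately avoid reading any members of the event data, since those members are not described here.

```csharp
using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Processes;
using Microsoft.Windows.EventTracing.Syscalls;
using System;

class Program
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: <trace.etl>");
            return;
        }

        using (ITraceProcessor trace = TraceProcessor.Create(args[0]))
        {
            long threadCreates = 0;
            long threadExits = 0;

            // Assumption: these methods take a callback like UseSyscalls() does.
            trace.UseStreaming().UseThreadCreateEvents(context =>
            {
                ++threadCreates;
            });

            trace.UseStreaming().UseThreadExitEvents(context =>
            {
                ++threadExits;
            });

            trace.Process();

            Console.WriteLine($"Thread create events: {threadCreates}");
            Console.WriteLine($"Thread exit events: {threadExits}");
        }
    }
}
```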

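Underlying streaming events for correlated data

trace.UseStreaming() also provides the underlying events that are used to produce the correlated data. The "Included in" column names the correlated method that covers each event:

Code	Description	Included in
trace.UseStreaming().UseCompactContextSwitchEvents()	Streams parsed compact context switch events.	trace.UseStreaming().UseContextSwitchData()
trace.UseStreaming().UseContextSwitchEvents()	Streams parsed context switch events. SwitchInThreadIds may not be accurate in some cases.	trace.UseStreaming().UseContextSwitchData()
trace.UseStreaming().UseFocusChangeEvents()	Streams parsed window focus change events.	trace.UseStreaming().UseWindowInFocus()
trace.UseStreaming().UseScheduledTaskStartEvents()	Streams parsed scheduled task start events.	trace.UseStreaming().UseScheduledTasks()
trace.UseStreaming().UseScheduledTaskStopEvents()	Streams parsed scheduled task stop events.	trace.UseStreaming().UseScheduledTasks()
trace.UseStreaming().UseScheduledTaskTriggerEvents()	Streams parsed scheduled task trigger events.	trace.UseStreaming().UseScheduledTasks()
trace.UseStreaming().UseSessionLayerSetActiveWindowEvents()	Streams parsed session-layer set active window events.	trace.UseStreaming().UseWindowInFocus()
trace.UseStreaming().UseSyscallEnterEvents()	Streams parsed syscall enter events.	trace.UseStreaming().UseSyscalls()
trace.UseStreaming().UseSyscallExitEvents()	Streams parsed syscall exit events.	trace.UseStreaming().UseSyscalls()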
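
To illustrate how the underlying events relate to the correlated data, the following sketch counts raw syscall enter and exit events alongside the correlated syscalls from UseSyscalls(). It assumes that UseSyscallEnterEvents() and UseSyscallExitEvents() accept a callback-only form like UseSyscalls(), which this article does not confirm. No particular relationship between the three counts is claimed; the point is only that all three streams can be consumed in one pass.

```csharp
using Microsoft.Windows.EventTracing;
using Microsoft.Windows.EventTracing.Processes;
using Microsoft.Windows.EventTracing.Syscalls;
using System;

class Program
{
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("Usage: <trace.etl>");
            return;
        }

        using (ITraceProcessor trace = TraceProcessor.Create(args[0]))
        {
            long enterEvents = 0;
            long exitEvents = 0;
            long correlatedSyscalls = 0;

            // Assumption: the underlying-event methods take a callback
            // in the same way as UseSyscalls().
            trace.UseStreaming().UseSyscallEnterEvents(context =>
            {
                ++enterEvents;
            });

            trace.UseStreaming().UseSyscallExitEvents(context =>
            {
                ++exitEvents;
            });

            // Correlated data built from the enter and exit events.
            trace.UseStreaming().UseSyscalls(context =>
            {
                ++correlatedSyscalls;
            });

            trace.Process();

            Console.WriteLine($"Syscall enter events: {enterEvents}");
            Console.WriteLine($"Syscall exit events: {exitEvents}");
            Console.WriteLine($"Correlated syscalls: {correlatedSyscalls}");
        }
    }
}
```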

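Next steps

In this tutorial, you learned how to use streaming to access trace data immediately while using less memory.

The next step is to look into accessing the data you want from your traces. Look at the samples for some ideas. Note that not all traces include all supported types of data.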
