How to: Iterate File Directories with the Parallel Class

In many cases, file iteration is an operation that can be easily parallelized. The topic How to: Iterate File Directories with PLINQ shows the easiest way to perform this task for many scenarios. However, complications can arise when your code has to handle the many types of exceptions that can occur when accessing the file system. The following example shows one approach to the problem. It uses stack-based iteration to traverse all files and folders under a specified directory, and it lets your code catch and handle various exceptions. How you handle those exceptions is up to you.

Example

The following example iterates the directories sequentially, but processes the files in parallel. This is probably the best approach when you have a large file-to-directory ratio. It is also possible to parallelize the directory iteration and access each file sequentially. Parallelizing both loops is probably not efficient unless you are specifically targeting a machine with a large number of processors. However, as in all cases, you should test your application thoroughly to determine the best approach.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Security;
using System.Threading;
using System.Threading.Tasks;

class Program
{
   static void Main()
   {            
      try {
         TraverseTreeParallelForEach(@"C:\Program Files", (f) =>
         {
            // Exceptions are no-ops.
            try {
               // Do nothing with the data except read it.
               byte[] data = File.ReadAllBytes(f);
            }
            catch (FileNotFoundException) {}
            catch (IOException) {}
            catch (UnauthorizedAccessException) {}
            catch (SecurityException) {}
            // Display the filename.
            Console.WriteLine(f);
         });
      }
      catch (ArgumentException) {
         Console.WriteLine(@"The directory 'C:\Program Files' does not exist.");
      }   

      // Keep the console window open.
      Console.ReadKey();
   }

   public static void TraverseTreeParallelForEach(string root, Action<string> action)
   {
      //Count of files traversed and timer for diagnostic output
      int fileCount = 0;
      var sw = Stopwatch.StartNew();

      // Determine whether to parallelize file processing on each folder based on processor count.
      int procCount = System.Environment.ProcessorCount;

      // Data structure to hold names of subfolders to be examined for files.
      Stack<string> dirs = new Stack<string>();

      if (!Directory.Exists(root)) {
         throw new ArgumentException();
      }
      dirs.Push(root);

      while (dirs.Count > 0) {
         string currentDir = dirs.Pop();
         string[] subDirs = {};
         string[] files = {};

         try {
            subDirs = Directory.GetDirectories(currentDir);
         }
         // Thrown if we do not have discovery permission on the directory.
         catch (UnauthorizedAccessException e) {
            Console.WriteLine(e.Message);
            continue;
         }
         // Thrown if another process has deleted the directory after we retrieved its name.
         catch (DirectoryNotFoundException e) {
            Console.WriteLine(e.Message);
            continue;
         }

         try {
            files = Directory.GetFiles(currentDir);
         }
         catch (UnauthorizedAccessException e) {
            Console.WriteLine(e.Message);
            continue;
         }
         catch (DirectoryNotFoundException e) {
            Console.WriteLine(e.Message);
            continue;
         }
         catch (IOException e) {
            Console.WriteLine(e.Message);
            continue;
         }

         // Execute in parallel if there are enough files in the directory.
         // Otherwise, execute sequentially. Files are opened and processed
         // synchronously but this could be modified to perform async I/O.
         try {
            if (files.Length < procCount) {
               foreach (var file in files) {
                  action(file);
                  fileCount++;                            
               }
            }
            else {
               Parallel.ForEach(files, () => 0, (file, loopState, localCount) =>
                                            { action(file);
                                              return (int) ++localCount;
                                            },
                                (c) => {
                                          Interlocked.Add(ref fileCount, c);                          
                                });
            }
         }
         catch (AggregateException ae) {
            ae.Handle((ex) => {
                         if (ex is UnauthorizedAccessException) {
                            // Here we just output a message and go on.
                            Console.WriteLine(ex.Message);
                            return true;
                         }
                         // Handle other exceptions here if necessary...

                         return false;
            });
         }

         // Push the subdirectories onto the stack for traversal.
         // This could also be done before handling the files.
         foreach (string str in subDirs)
            dirs.Push(str);
      }

      // For diagnostic purposes.
      Console.WriteLine("Processed {0} files in {1} milliseconds", fileCount, sw.ElapsedMilliseconds);
   }
}

Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.IO
Imports System.Security
Imports System.Threading
Imports System.Threading.Tasks

Module Example
   Sub Main()
      Try
         TraverseTreeParallelForEach("C:\Program Files", 
                                     Sub(f)
                                        ' Exceptions are No-ops.         
                                        Try 
                                           ' Do nothing with the data except read it.
                                           Dim data() As Byte = File.ReadAllBytes(f)
                                        ' In the event the file has been deleted.
                                        Catch e As FileNotFoundException
                                          
                                        ' General I/O exception, especially if the file is in use.
                                        Catch e As IOException   
                                          
                                        ' Lack of adequate permissions.
                                        Catch e As UnauthorizedAccessException

                                        ' Lack of adequate permissions.
                                        Catch e As SecurityException

                                        End Try
                                        ' Display the filename.
                                        Console.WriteLine(f)
                                     End Sub)
      Catch e As ArgumentException
         Console.WriteLine("The directory 'C:\Program Files' does not exist.")
      End Try
      ' Keep the console window open.
      Console.ReadKey()
   End Sub

   Public Sub TraverseTreeParallelForEach(ByVal root As String, ByVal action As Action(Of String))
      'Count of files traversed and timer for diagnostic output
      Dim fileCount As Integer = 0
      Dim sw As Stopwatch = Stopwatch.StartNew()

      ' Determine whether to parallelize file processing on each folder based on processor count.
      Dim procCount As Integer = System.Environment.ProcessorCount

      ' Data structure to hold names of subfolders to be examined for files.
      Dim dirs As New Stack(Of String)

      If Not Directory.Exists(root) Then Throw New ArgumentException()

      dirs.Push(root)

      While (dirs.Count > 0)
         Dim currentDir As String = dirs.Pop()
         Dim subDirs() As String = Nothing
         Dim files() As String = Nothing

         Try
            subDirs = Directory.GetDirectories(currentDir)
         ' Thrown if we do not have discovery permission on the directory.
         Catch e As UnauthorizedAccessException
            Console.WriteLine(e.Message)
            Continue While
         ' Thrown if another process has deleted the directory after we retrieved its name.
         Catch e As DirectoryNotFoundException
            Console.WriteLine(e.Message)
            Continue While
         End Try

         Try
            files = Directory.GetFiles(currentDir)
         Catch e As UnauthorizedAccessException
            Console.WriteLine(e.Message)
            Continue While
         Catch e As DirectoryNotFoundException
            Console.WriteLine(e.Message)
            Continue While
         Catch e As IOException
            Console.WriteLine(e.Message)
            Continue While
         End Try

         ' Execute in parallel if there are enough files in the directory.
         ' Otherwise, execute sequentially. Files are opened and processed
         ' synchronously but this could be modified to perform async I/O.
         Try
            If files.Length < procCount Then
               For Each file In files
                  action(file)
                  fileCount += 1
               Next
            Else
               Parallel.ForEach(files, Function() 0, Function(file, loopState, localCount)
                                                        action(file)
                                                        localCount = localCount + 1
                                                        Return localCount
                                                     End Function,
                                Sub(c)
                                   Interlocked.Add(fileCount, c)
                                End Sub)
            End If
         Catch ae As AggregateException
            ae.Handle(Function(ex)

                              If TypeOf (ex) Is UnauthorizedAccessException Then

                                  ' Here we just output a message and go on.
                                  Console.WriteLine(ex.Message)
                                  Return True
                              End If
                              ' Handle other exceptions here if necessary...

                              Return False
                          End Function)
         End Try
         ' Push the subdirectories onto the stack for traversal.
         ' This could also be done before handling the files.
         For Each str As String In subDirs
            dirs.Push(str)
         Next

      End While

      ' For diagnostic purposes. Reported once, after the whole tree has been traversed.
      Console.WriteLine("Processed {0} files in {1} milliseconds", fileCount, sw.ElapsedMilliseconds)
   End Sub
End Module

In this example, the file I/O is performed synchronously. When dealing with large files or slow network connections, it might be preferable to access the files asynchronously. You can combine asynchronous I/O techniques with parallel iteration. For more information, see TPL and Traditional .NET Framework Asynchronous Programming.
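
As a rough illustration of asynchronous file access, the sketch below reads several files with a FileStream opened for asynchronous I/O. It is not the approach used in the example above: because Parallel.ForEach does not wait for asynchronous lambdas to finish, this sketch swaps in Task.WhenAll with a SemaphoreSlim to limit the number of reads in flight. The names AsyncReadSketch, ReadLengthAsync, and ReadAllAsync, and the concurrency limit of 4, are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncReadSketch
{
    // Hypothetical helper: reads one file asynchronously and returns its size in bytes.
    private static async Task<long> ReadLengthAsync(string path)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        {
            var buffer = new byte[4096];
            long total = 0;
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                total += read;
            }
            return total;
        }
    }

    // Hypothetical helper: reads all files with at most maxConcurrency reads in flight
    // and returns the total number of bytes read. Unreadable files count as zero bytes.
    public static async Task<long> ReadAllAsync(string[] files, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            Task<long>[] tasks = files.Select(async file =>
            {
                await gate.WaitAsync();
                try { return await ReadLengthAsync(file); }
                catch (IOException) { return 0L; }
                catch (UnauthorizedAccessException) { return 0L; }
                finally { gate.Release(); }
            }).ToArray();

            long[] lengths = await Task.WhenAll(tasks);
            return lengths.Sum();
        }
    }

    public static void Main()
    {
        // Create two small temporary files so the sketch runs anywhere.
        string first = Path.GetTempFileName();
        string second = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(first, new byte[10]);
            File.WriteAllBytes(second, new byte[20]);

            long total = ReadAllAsync(new[] { first, second }, 4).GetAwaiter().GetResult();
            Console.WriteLine("Read {0} bytes", total);
        }
        finally
        {
            File.Delete(first);
            File.Delete(second);
        }
    }
}
```

Whether this performs better than the synchronous version depends on file sizes and the storage involved, so measure both before choosing.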

The example uses the local fileCount variable to maintain a count of the total number of files processed. Because the variable might be accessed concurrently by multiple tasks, access to it is synchronized by calling the Interlocked.Add method.
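
The counting pattern can be isolated in a smaller sketch. Each task keeps its own subtotal, and Interlocked.Add merges that subtotal into the shared total once, when the task finishes, so the loop body never touches shared state. The names LocalCountSketch and CountInParallel are assumptions made for illustration; the item names stand in for file paths.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LocalCountSketch
{
    // Hypothetical helper: counts the items by using per-task subtotals.
    public static int CountInParallel(string[] items)
    {
        int total = 0;

        Parallel.ForEach(
            items,
            // localInit: each task starts its own subtotal at zero.
            () => 0,
            // body: only the task-local subtotal is updated here.
            (item, loopState, localCount) => localCount + 1,
            // localFinally: one atomic add per task merges the subtotal.
            localCount => Interlocked.Add(ref total, localCount));

        return total;
    }

    public static void Main()
    {
        string[] items = Enumerable.Range(0, 1000).Select(i => "file" + i).ToArray();
        Console.WriteLine(CountInParallel(items));
    }
}
```

Compared with calling Interlocked.Increment in every iteration, this keeps contention on the shared variable to one atomic operation per task.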

Note that if an exception is thrown on the main thread, the threads that are started by the ForEach method might continue to run. To stop these threads, you can set a Boolean variable in your exception handlers and check its value on each iteration of the parallel loop. If the value indicates that an exception has been thrown, use the ParallelLoopState variable to stop or break from the loop. For more information, see How to: Stop or Break from a Parallel.For Loop.
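
A minimal sketch of the flag technique follows. For simplicity, the flag here is set by an exception handler inside the loop body; it could equally be set by a handler elsewhere in the program. Each iteration checks the flag first and calls ParallelLoopState.Stop when it is set. The names StopLoopSketch and ProcessUntilError are assumptions made for illustration, and the delegate in Main simulates a failure rather than touching the file system.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class StopLoopSketch
{
    // Hypothetical helper: runs 'action' on each file until an access error is seen.
    public static ParallelLoopResult ProcessUntilError(string[] files, Action<string> action)
    {
        bool errorSeen = false;

        return Parallel.ForEach(files, (file, loopState) =>
        {
            // Check the flag at the start of every iteration.
            if (Volatile.Read(ref errorSeen))
            {
                // Ask the loop not to start any further iterations.
                loopState.Stop();
                return;
            }

            try
            {
                action(file);
            }
            catch (UnauthorizedAccessException ex)
            {
                Console.WriteLine(ex.Message);
                // Record the failure so that later iterations stop the loop.
                Volatile.Write(ref errorSeen, true);
            }
        });
    }

    public static void Main()
    {
        string[] files = Enumerable.Range(0, 1000).Select(i => "file" + i).ToArray();

        // Simulated work that always fails, so the flag is set on the first iteration.
        ParallelLoopResult result = ProcessUntilError(files, f =>
        {
            throw new UnauthorizedAccessException("Access denied: " + f);
        });

        Console.WriteLine("Loop ran to completion: " + result.IsCompleted);
    }
}
```

Stop ends the loop as soon as possible. Break is the alternative when iterations at lower indexes should still be allowed to finish.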

See Also

Data Parallelism