Troubleshooting Netlib that Comes with MDAC2.8 SP1/NET1.0, SNAC, SQL Server 2005 and NET 2.0 with ETW Tracing

Ever run into a GNE (general network error)? ETW tracing can help. For a description of the ETW tracing feature for data access components, please refer to http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/data_access_tracing.asp. Note that all commands used in this blog have shipped with the OS by default since Windows 2000.

[Steps]

1. Set up the registry.

C:\temp>reg add HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\BidInterface\Loader /v ":Path" /t REG_SZ /d msdadiag.dll /f

 

2. Compose a ctrl.guid file to choose which components to trace.

(a) For the netlib driver that comes with MDAC, add the following line to the ctrl.guid file:

 

{BD568F20-FCCD-B948-054E-DB3421115D61} 0x00000000 0 DBNETLIB.1

 

MDAC needs to be version 2.8 to produce ETW traces.

 

(b) For the netlib driver that comes with SNAC, add the following line to the ctrl.guid file:

 

{BA798F36-2325-EC5B-ECF8-76958A2AF9B5} 0x0003F007 0 SQLNCLI.1

 

This turns on all traces, including function entry points and many others. In most cases it is enough to limit tracing by replacing the control bit mask 0x0003F007 with 0x00020002, which collects the trace points that log error codes.

 

(c) For the netlib driver that comes with SqlClient in .NET 2.0, add the following line to your ctrl.guid file:

 

{C9996FA5-C06F-F20C-8A20-69B3BA392315} 0xFFFFFFFF 0 System.Data.SNI.1

 

You can modify the control bit mask the same way as for SNAC; for example, use 0x00020002 for error tracing.

 

(d) For the netlib component of SQL Server 2005, add the following line to the ctrl.guid file:

 

{AB6D5EEB-0132-74AB-C5F5-B23E1644DADA} 0x0003F007 0 SQLSERVER.SNI.1

 

The control bit mask can be modified the same way as for SNAC.

3. Start the trace.

C:\temp>logman start MyTrace -pf ctrl.guid -ct perf -o Out.etl -ets

4. Restart your application process and reproduce the problem.

 

5. Stop the trace.

C:\temp>logman stop MyTrace -ets

 

6. Generate reports.

C:\temp>mofcomp all_data.mof

C:\temp>TraceRPT /y Out.etl

This should generate summary.txt and dumpfile.csv. The dumpfile.csv contains all traced points.

 

7. Clean up the registry.

C:\temp>reg delete HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\BidInterface\Loader /v ":Path" /f
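
If you switch between components or masks often, the ctrl.guid file from step 2 can be generated rather than edited by hand. The following is only a minimal sketch: the GUIDs, masks, and provider names are copied from step 2, while the helper names (PROVIDERS, ctrl_guid_lines, write_ctrl_guid) are made up for illustration and are not part of any shipped tool.

```python
from pathlib import Path

# Provider GUIDs and default masks exactly as listed in step 2 of this post.
PROVIDERS = {
    "DBNETLIB.1": ("{BD568F20-FCCD-B948-054E-DB3421115D61}", 0x00000000),
    "SQLNCLI.1": ("{BA798F36-2325-EC5B-ECF8-76958A2AF9B5}", 0x0003F007),
    "System.Data.SNI.1": ("{C9996FA5-C06F-F20C-8A20-69B3BA392315}", 0xFFFFFFFF),
    "SQLSERVER.SNI.1": ("{AB6D5EEB-0132-74AB-C5F5-B23E1644DADA}", 0x0003F007),
}

# Mask the post suggests for error-only tracing (SNAC, .NET 2.0, SQL Server 2005).
ERROR_ONLY_MASK = 0x00020002


def ctrl_guid_lines(names, mask_override=None):
    """Return one ctrl.guid line per provider name (hypothetical helper)."""
    lines = []
    for name in names:
        guid, mask = PROVIDERS[name]
        if mask_override is not None:
            mask = mask_override
        lines.append(f"{guid} 0x{mask:08X} 0 {name}")
    return lines


def write_ctrl_guid(path, names, mask_override=None):
    """Write the selected provider lines to the given ctrl.guid path."""
    text = "\n".join(ctrl_guid_lines(names, mask_override)) + "\n"
    Path(path).write_text(text, encoding="ascii")


if __name__ == "__main__":
    # Example: trace only error points for SNAC and the SQL Server 2005 netlib.
    for line in ctrl_guid_lines(["SQLNCLI.1", "SQLSERVER.SNI.1"], ERROR_ONLY_MASK):
        print(line)
```

The post only describes changing the mask for SNAC, .NET 2.0, and SQL Server 2005, so the sketch leaves it to the caller to decide which providers get an override.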

[Trace output examples]

(a) MDAC: The trace string is prefixed with DBNETLIB:

 

DBNETLIB, TextA, 0x000013C4, 127991531896572714, 45, 15, 3, "<Connect|ERR> socket 0x274c{WINERR}", 0, 0

(b) SNAC: The trace string is prefixed with SQLNCLI:

SQLNCLI, TextA, 0x00001EF0, 127991236299678815, 45, 105, 2, "<SNI_Packet::SNI_Packet|SNI> 14#{SNI_Packet} created by 7#{SNI_Conn}", 0, 0

(c) .NET 2.0: The trace string is prefixed with System.Data.SNI:

 

System.Data.SNI, TextA, 0x00000510, 127989406145563747, 15, 45, 1, "enter_05 <SNI_Conn::InitObject|API|SNI> ppConn: 05A2ECF0{SNI_Conn**} fServer: 0{BOOL}", 0, 0

(d) SQL Server: The trace string is prefixed with SQLSERVER.SNI:

 

SQLSERVER.SNI, TextW, 0x00001644, 127991236297874490, 30, 30, 1, "enter_03 <SNI_Conn::InitObject|API|SNI> ppConn: 00A6FAFC{SNI_Conn**} fServer: 1{BOOL}", 0, 0
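
For quick post-processing, rows like the ones above can be read with an ordinary CSV parser. The sketch below assumes that every trace row in dumpfile.csv has the shape of the examples in this post (provider name first, time stamp in the fourth field, quoted trace string in the eighth); the names TracePoint and parse_trace_rows are hypothetical. Rows that do not fit that shape, such as a header line, are skipped.

```python
import csv
import io
from typing import Iterator, NamedTuple


class TracePoint(NamedTuple):
    provider: str   # e.g. DBNETLIB, SQLNCLI, System.Data.SNI, SQLSERVER.SNI
    timestamp: int  # raw time stamp value from the fourth field
    message: str    # the quoted trace string


def parse_trace_rows(text: str) -> Iterator[TracePoint]:
    """Yield a TracePoint for each row that looks like the examples above."""
    # skipinitialspace handles the ", " separators used in the dump.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        if len(row) < 8:
            continue  # blank or unexpected line
        try:
            timestamp = int(row[3])
        except ValueError:
            continue  # header or other non-trace row
        yield TracePoint(row[0].strip(), timestamp, row[7].strip())


if __name__ == "__main__":
    sample = (
        'DBNETLIB, TextA, 0x000013C4, 127991531896572714, 45, 15, 3, '
        '"<Connect|ERR> socket 0x274c{WINERR}", 0, 0\n'
    )
    for point in parse_trace_rows(sample):
        print(point.provider, point.timestamp, point.message)
```

To process a real dump, read the file contents first (for example with open("dumpfile.csv").read()) and pass the text to parse_trace_rows.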

 

Note that the time stamp in the fourth column of a trace point is a Windows FILETIME value. For example, “127991236299678815” is “Thu Aug 3 17:07:09.967 2006 (GMT-7)”.
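
A FILETIME value counts 100-nanosecond intervals since January 1, 1601 (UTC), so the time stamp column can be converted with a few lines of code. This is a small sketch; the function name filetime_to_datetime is made up for illustration, and the fixed GMT-7 offset is used only to match the example above.

```python
from datetime import datetime, timedelta, timezone

# FILETIME epoch: midnight, January 1, 1601, UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a FILETIME value (100 ns ticks since 1601) to a UTC datetime."""
    # Ten ticks make one microsecond; the sub-microsecond remainder is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


if __name__ == "__main__":
    utc_time = filetime_to_datetime(127991236299678815)
    # Shift to GMT-7 to compare with the example in this post.
    local_time = utc_time.astimezone(timezone(timedelta(hours=-7)))
    print(local_time.strftime("%a %b %d %H:%M:%S.%f %Y"))
```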

[Inspection]

In most cases, finding the root cause of an issue from the trace output requires extensive diagnosis. In some cases, however, the error code pinpoints the issue directly. The places of interest are the error code immediately before “{WINERR}”, or more generally any error code near “ERR”. If you can correlate that error with the error event in your application, for example by time stamp, you can send or post us the trace snippet that contains the first error, along with a few surrounding trace points as context for diagnosis. In certain scenarios, we might ask for specific trace points by providing a search pattern. Advanced filtering in post-processing is out of scope for this blog.
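
As a simple starting point, the code in front of “{WINERR}” can be pulled out of a trace string with a regular expression. The sketch below assumes the code is always written in hexadecimal with a 0x prefix, as in the MDAC example in this post; other formats would need a different pattern. The names WINERR_PATTERN and winerr_codes are hypothetical.

```python
import re
from typing import List

# Matches a 0x-prefixed hex value immediately followed by the {WINERR} tag.
WINERR_PATTERN = re.compile(r"(0x[0-9A-Fa-f]+)\{WINERR\}")


def winerr_codes(message: str) -> List[int]:
    """Return every {WINERR} code found in a trace string, as integers."""
    return [int(code, 16) for code in WINERR_PATTERN.findall(message)]


if __name__ == "__main__":
    message = "<Connect|ERR> socket 0x274c{WINERR}"
    for code in winerr_codes(message):
        print(f"0x{code:x} = {code}")
```

For the MDAC example, 0x274c is decimal 10060, which is the Winsock error WSAETIMEDOUT (the connection attempt timed out).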

Apart from recovering error codes, we very often use ETW traces to identify performance bottlenecks or to discover synchronization bugs in multithreaded applications.

 

Nan Tu

Software Design Engineer, SQL Server Protocols

Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights