Examine very large database migration best practices

The following guidelines are based on real customer projects and the lessons learned from them. They identify scenarios that have been unsuccessful in the past. An example is the recommendation not to use UNIX servers or virtualized servers as R3load export servers:

  • Export performance is often the gating factor for the overall downtime, and the current hardware is often more than 4-5 years old and prohibitively expensive to upgrade.
  • It's therefore important to get the maximum export performance that is practical to achieve.
  • Previous projects have spent man-weeks or even man-months trying to tune R3load export performance on UNIX or virtualized platforms, before giving up and using Intel R3load servers.
  • Dual-socket commodity Intel servers are inexpensive and deliver substantial performance gains, in some cases orders of magnitude greater than minor tuning improvements possible on UNIX or virtualized servers.
  • Customers often have existing VM farms, but these usually don't support modern offload or SR-IOV technologies. Often the VMware version is old, unpatched, or not configured for high network throughput and low latency. R3load export servers require very fast thread performance and extremely high network throughput, and may run for 10-15 hours at nearly 100% CPU and network utilization. This isn't the typical use case for a VMware farm, and most VMware deployments were never designed to handle a workload such as R3load.

Tip

Do not invest time in optimizing R3load export performance on UNIX or virtualized platforms. Doing so not only wastes time but also costs much more than buying low-cost Intel servers at the start of the project. VLDB migration customers are therefore asked to ensure that the project team has fast, modern R3load export servers available at the start of the project. This lowers the total cost and risk of the project.

Best practices

  • Survey and inventory the current SAP landscape. Identify the SAP Support Pack levels and determine if patching is required to support the target DBMS. In general, the operating system compatibility is determined by the SAP kernel and the DBMS compatibility is determined by the SAP_BASIS patch level.
  • Build a list of SAP OSS Notes that need to be applied in the source system, such as updates for SMIGR_CREATE_DDL. Consider upgrading the SAP kernels in the source systems to avoid a large change during the migration to Azure. For example, if a system is running an old 7.41 kernel, update to the latest 7.45 on the source system first.
  • Develop and document the high availability and disaster recovery solution. The documentation should break up the solution into the DB layer, ASCS layer, and SAP application server layer. Separate solutions might be required for standalone solutions such as TREX or liveCache.
  • Develop a sizing and configuration document that details the Azure VM types and storage configuration: the number of premium disks, the number of datafiles, how datafiles are distributed across disks, the use of storage spaces, and the NTFS format (allocation unit) size of 64 KB. Also document backup/restore and DBMS configuration such as memory settings, max degree of parallelism, and trace flags.
  • Develop a network design document including VNet, Subnet, NSG and UDR configuration.
  • Document and implement the security and hardening concept. Remove Internet Explorer, create an Active Directory container for SAP service accounts and servers, and apply a firewall policy blocking all but a limited number of required ports.
  • Create an OS/DB migration design document detailing the package and table splitting concept, number of R3loads, SQL Server trace flags, sorted/unsorted, Oracle RowID setting, SMIGR_CREATE_DDL settings, Perfmon counters (such as BCP rows/sec, BCP throughput KB/sec, CPU, and memory), RSS settings, Accelerated Networking settings, log file configuration, BPE settings, and TDE configuration.
  • Create a “Flight Plan” graph showing progress of the R3load export/import on each test cycle. This allows the migration team to validate if tunings and changes improve R3load export or import performance. The X axis is the number of packages complete, and the Y axis is the elapsed time. This flight plan is also critical during the production migration so that the planned progress can be compared against the actual progress and any problem identified early.
  • Create a performance testing plan. Identify the top ~20 online reports, batch jobs, and interfaces. Document the input parameters (such as date range, sales office, plant, and company code) and the runtimes on the original source system, then compare them to the runtimes on Azure. If there are performance differences, run SAT, ST05, and other SAP tools to identify inefficient statements.
  • Audit deployment and configuration, and ensure that cluster timeouts, kernels, network settings, and NTFS format size are all consistent with the design documents. Set Perfmon counters on important servers to record basic health parameters every 90 seconds. Verify that the SAP servers are in a separate AD container and that the container has a group policy applied to it with firewall configuration.
  • Check that the lead OS/DB migration consultant is licensed! Request the consultant name, s-user, and certification date. Open an OSS message to BC-INS-MIG and ask SAP to confirm the consultant is current and licensed.
  • If possible, have the entire project team associated with the VLDB migration project within one physical location and not geographically dispersed across several continents and time zones.
  • Make sure that a proper fallback plan is in place and that it's part of the overall schedule.
  • Select Intel CPU models with fast per-thread performance for the R3load export servers. Don't use “Energy Saver” CPU models, as they have lower performance, and don't use 4-socket servers. Intel Xeon Platinum 8158 is a good example.
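
The “Flight Plan” comparison described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a tool from SAP or Microsoft: the function names (flight_plan, packages_done_at, lag_report) and the package finish times are hypothetical, and in practice the finish times would come from the R3load logs of each test cycle.

```python
from bisect import bisect_right


def flight_plan(finish_hours):
    """Return (packages_complete, elapsed_hours) points for one cycle.

    The points follow the axes described for the flight plan graph:
    X is the number of packages complete, Y is the elapsed time.
    """
    return [(i + 1, t) for i, t in enumerate(sorted(finish_hours))]


def packages_done_at(plan, hour):
    """Count the packages that finished at or before the given hour."""
    times = [t for _, t in plan]
    return bisect_right(times, hour)


def lag_report(planned, actual, checkpoints):
    """Compare planned and actual package counts at each checkpoint hour.

    Each row is (hour, planned_count, actual_count, actual - planned).
    A negative last value means the run is behind the plan.
    """
    rows = []
    for hour in checkpoints:
        p = packages_done_at(planned, hour)
        a = packages_done_at(actual, hour)
        rows.append((hour, p, a, a - p))
    return rows


# Hypothetical finish times in hours since the start of the export.
planned_finish = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
actual_finish = [0.6, 1.4, 1.9, 2.8, 3.9]  # sixth package still running

planned = flight_plan(planned_finish)
actual = flight_plan(actual_finish)

for hour, p, a, delta in lag_report(planned, actual, [1, 2, 3, 4]):
    print(f"hour {hour}: planned {p}, actual {a}, delta {delta:+d}")
```

The points returned by flight_plan can be passed to any charting tool to draw the graph itself. The checkpoint report gives the migration team an early numeric signal during the production run when the actual progress falls behind the planned progress.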

Best practices to avoid problems

  • Don't subcontract one consulting organization to do the export and subcontract another consulting organization to do the import. Occasionally the source system is outsourced and managed by one consulting organization or partner and a customer wishes to migrate to Azure and switch to another partner. Due to the tight coupling between export and import tuning and configuration, it's unlikely assigning these tasks to different organizations will produce a good result.
  • Don't economize on Azure hardware resources during the migration and go-live. Azure VMs are charged per minute and can be reduced in size easily. During a VLDB migration, use the most powerful VM available. Customers have successfully gone live on systems oversized by 200-250% and then stabilized while running on them. After you monitor system utilization for 4-6 weeks, reduce the size of VMs with excess capacity, or shut them down, to lower costs.