EMC Corp. has announced the EMC Data Domain Global Deduplication Array (GDA), a fast inline deduplication storage system for enterprise backup applications. The Global Deduplication Array, based on a new multi-controller extension of the Data Domain architecture, offers inline global deduplication and a global namespace for all data stored in the dual-controller system. With throughput of up to 12.8 terabytes per hour (TB/hour), it establishes consistently high benchmarks across the spectrum of common data center backup metrics. The Global Deduplication Array provides up to 14.2 petabytes (PB) of logical backup capacity, driving new levels of simplicity for data center backup consolidation across workloads as diverse as very large databases, VMware images, and unstructured data.
Unlike most multi-controller deduplication systems, the inline Global Deduplication Array is tightly coupled with backup software, enabling inline deduplication performance, dynamic distribution of load, and simplicity of operation. The Global Deduplication Array distributes parts of the deduplication process to the backup servers to reduce network load and increase the throughput of the GDA controllers. It offers more than 3x faster backup throughput per controller than competitive deduplication configurations and is the fastest inline deduplication system available. This distributed deduplication processing builds on the native speed advantages of the Intel Xeon multi-core CPUs in the GDA controllers and on the Data Domain SISL (Stream-Informed Segment Layout) scaling architecture, which minimizes the number of disk accesses required in the deduplication process. At initial release, the platform supports Symantec NetBackup and Backup Exec through OpenStorage plug-in software that runs on the backup server. Later in 2010, it will also support EMC NetWorker using integrated software.
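The idea of moving part of the deduplication work onto the backup server can be illustrated with a minimal Python sketch. It is not the GDA or OpenStorage plug-in implementation: the `Controller` class, the `backup` and `restore` functions, the fixed 4 KB segment size, and the use of SHA-256 fingerprints are all assumptions made for illustration. The backup server fingerprints segments locally, asks the controller which fingerprints it lacks, and sends only those segments over the network.

```python
import hashlib

# Hypothetical fixed segment size, chosen only to keep the sketch simple.
SEGMENT_SIZE = 4096


def split_segments(data: bytes, size: int = SEGMENT_SIZE):
    """Split a backup stream into fixed-size segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class Controller:
    """Toy stand-in for a deduplication controller (hypothetical API).

    Stores each unique segment once, keyed by its fingerprint.
    """

    def __init__(self):
        self.store = {}          # fingerprint -> segment bytes
        self.bytes_received = 0  # payload bytes that crossed the "network"

    def missing(self, fingerprints):
        """Return the set of fingerprints not stored yet."""
        return {fp for fp in fingerprints if fp not in self.store}

    def put(self, fingerprint, segment):
        """Store a new segment and count its size as network traffic."""
        self.bytes_received += len(segment)
        self.store[fingerprint] = segment


def backup(data: bytes, controller: Controller):
    """Backup-server side: fingerprint locally, send only unseen segments.

    Returns the recipe (ordered fingerprints) needed to restore the stream.
    """
    segments = split_segments(data)
    recipe = [hashlib.sha256(s).hexdigest() for s in segments]
    needed = controller.missing(recipe)
    for fp, seg in zip(recipe, segments):
        if fp in needed:
            controller.put(fp, seg)
            needed.discard(fp)  # do not resend a duplicate within this stream
    return recipe


def restore(recipe, controller: Controller) -> bytes:
    """Rebuild the original stream from its ordered fingerprints."""
    return b"".join(controller.store[fp] for fp in recipe)
```

In this sketch, a second backup that repeats most of the first sends only the changed segments, which is the source of the reduced network load. The fingerprinting cost falls on the backup server rather than on the controller.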
The Global Deduplication Array presents a single inline deduplication storage pool to the backup application across two EMC Data Domain DD880 controllers. Large data center backup jobs are dynamically and transparently load balanced across the controllers, simplifying capacity management, performance management, and backup administration.
-- CPU improvements to increase deduplication speed inline while minimizing reliance on disk accesses for performance. Data Domain systems have consistently improved over the last six years, with throughput rising nearly 90 times and capacity more than 225 times. Based on Intel's CPU roadmap, throughput is expected to continue growing significantly in the future.
-- High-performance inline deduplication for simplicity, to minimize system resources, administration, and internal system process contention.
-- Green storage efficiency for a smaller system footprint and lower power consumption.
-- Data Domain Data Invulnerability Architecture defends against data integrity issues by providing continuous verification during storage and recovery of data.