Importance of Storage
Storage was once a peripheral – a mere feature of the server. Those days are long gone, and that's a good thing because information shouldn't always be hidden behind, or bottled up inside a single server. It should be positioned so that it can be made quickly but securely available to other applications and/or departments within the enterprise. So, now that storage is out from behind the server's shadow, we need to recognize its true value. Here's why:
The value of data
Data – translated by applications and infrastructure into information – has grown in value to the point where, for many enterprises, it is the most valuable corporate asset. It's a competitive differentiator – the underpinning of all customer-facing applications like CRM and CSM. And with the advent of the Web, it has expanded in importance to become mission-critical to the very enterprise as viewed through the portal. In this environment, storage is now the single most important IT resource for assuring:
An enterprise core competency
It would be a gross oversimplification to say that storage was once a simple matter of plugging an array into a SCSI port. Then again, compared to the array of alternatives available now, from FC-AL to FC Fabric, the IP derivatives and soon InfiniBand and beyond, the days of one-size-fits-all SCSI are also long gone. Data storage as an enterprise core competency is becoming exceedingly complex. Here's a brief, and by no means exhaustive, list of the technologies now directly involved in or significantly touching upon data storage:
Growing storage staff
There are benefits to be realized today from implementing storage networks, but because of the shortage of IT personnel with storage expertise, enterprises are hesitant to move forward. Recruiting, training and retaining skilled storage management staff must become a core IT competency if the real benefits from storage innovation are to be realized.
Conclusion
The nature of the storage environment has changed radically in the last few years. It is now characterized by unprecedented growth in the volume of data to be managed and a quantum leap in complexity and the sheer number of available combinations and permutations. Add to that the growing value of data to the enterprise, and the overwhelming importance of storage and storage networking becomes obvious.
NAS/SAN Convergence
Vendors are expected to make significant strides in merging network-attached storage (NAS) and storage area network (SAN) environments, according to a recent report on network storage by the Yankee Group, a market research firm in Boston.
Vendors will expand NAS features to ease interoperability with SANs, provide content delivery management, and improve management tools. Meanwhile, next-generation NAS systems, being developed by start-ups aiming for higher performance than currently available systems offer, will integrate global distributed file systems and provide dedicated data management capabilities, according to the report. "I think this is the year that the SAN versus NAS debate goes away," says Jamie Gruener, senior analyst at the Yankee Group, who adds that end users will be able to use both technologies to get the best of both architectures.
The network storage market (including NAS and SAN) is expected to grow from $9.4 billion in 2001 to $24.2 billion in 2005, according to the Yankee Group's predictions. In turn, NAS will jump from about $2 billion in 2001 to $8.6 billion in 2005. However, predictions for 2002 show NAS increasing only slightly to an estimated $2.46 billion (see figure).
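To put the forecast figures above on a common footing, the following minimal sketch computes the compound annual growth rate they imply. The `cagr` helper and the variable names are illustrative assumptions, not part of the Yankee Group report, and the 2001 NAS figure is taken as exactly $2.0 billion even though the text says "about $2 billion".

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start value, an end value and a span in years."""
    return (end / start) ** (1 / years) - 1


# Forecast figures quoted in the text, in billions of US dollars.
total_2001, total_2005 = 9.4, 24.2
nas_2001, nas_2002, nas_2005 = 2.0, 2.46, 8.6  # 2001 NAS value is approximate in the source

total_rate = cagr(total_2001, total_2005, 2005 - 2001)
nas_rate = cagr(nas_2001, nas_2005, 2005 - 2001)
nas_2002_growth = nas_2002 / nas_2001 - 1  # single-year growth, 2001 to 2002

print(f"Network storage overall, 2001-2005: {total_rate:.1%} per year")
print(f"NAS, 2001-2005:                     {nas_rate:.1%} per year")
print(f"NAS, 2001-2002 only:                {nas_2002_growth:.1%}")
```

Under these assumptions the 2002 NAS growth comes out well below the four-year average rate, which is consistent with the remark that the steeper growth is expected in 2003 and 2004.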
"When you take the ease-of-use associated with NAS and add it to a SAN, life gets a lot easier. It also accelerates the overall network storage market, so we're bullish. But I don't predict an overly aggressive growth path for 2002; that will come in 2003 and 2004," says Gruener.
NAS vendors will also expand the use of NAS connected to back-end disk arrays, deliver the ability to handle file and block semantics, and feature distributed global file systems as part of content delivery over WANs.
NAS + SAN
Large vendors such as EMC and Network Appliance are already bridging the gap between NAS and SAN. EMC's Celerra NAS device attaches to its Symmetrix arrays and uses those arrays for storing files. In February, EMC announced the latest version of its NAS head, the Celerra Data Mover 510, which scales to 52TB and supports up to 224 network connections (see sidebar, "EMC enhances Celerra NAS," p. 13). Celerra is a NAS head that connects to back-end storage; EMC's Clariion IP4700 is an integrated NAS head and storage.
"Separating the NAS head from the storage gives you better scalability," says Paul Ross, director of network storage marketing for EMC. "The tradeoff is that it costs more." He says that an integrated NAS device—with the NAS head and storage in the same box—offers better price/performance. "The problem is that when you scale outside of what the box can hold, you have to buy a new box."
Last October, Compaq announced its NAS-SAN bridge, the StorageWorks NAS Executor E7000, which connects a NAS head to Compaq's disk arrays. Compaq dubbed the entire system "Universal Network Storage," because it has one network storage pool serving both blocks and files, with the E7000 providing file access to the SAN. The E7000 is traditional NAS-type storage, providing multi-protocol file support and also storage virtualization through pooling and snapshots. This functionality is complemented at the SAN level with Data Resource Manager and Enterprise Volume Manager for snapshots at the controller level.
This month, Compaq announced that it will expand its NAS-SAN fusion to include non-Compaq SAN connections to its NAS head (see sidebar, "Compaq expands NAS, eyes NAS-SAN convergence," p. 14).
IBM is also attaching its NAS 300G gateway product to other vendors' SANs. So far, IBM has tested the 300G with Hitachi Data Systems' Freedom Storage Lightning 9900 arrays and plans to extend interoperability to Hitachi's Thunder 9200. Acting as an IP gateway to a SAN, the 300G allows clients and servers on the IP network to access files directly from their SAN.
By year-end, Storage Computer plans to introduce a new product that will converge its CyberBorg direct-attached/SAN system with NAS functionality. The new system will also have iSCSI connectivity for block-level management.
Auspex Systems is converging NAS and SAN in its NSc3000 controller, which began shipping this month. The NSc3000 is essentially a NAS system that sits between the network and a switch. It plugs into the switch and communicates with the SAN to provide file services.
The 5U controller works with most leading SAN storage subsystems. The unit also leverages the virtual storage capabilities of Auspex's Xtreme Virtual Partitions to stripe LUNs from multiple controllers into a single logical volume, which increases throughput by providing more SAN paths to a given file system.
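Striping several LUNs into one logical volume comes down to an address mapping. The sketch below shows a generic round-robin stripe layout, not Auspex's actual implementation; the function name `locate`, the `Location` type and the stripe size are all assumptions chosen for illustration.

```python
from typing import NamedTuple


class Location(NamedTuple):
    lun: int        # index of the LUN that holds the block
    lun_block: int  # block offset inside that LUN


def locate(logical_block: int, num_luns: int, stripe_blocks: int) -> Location:
    """Map a block of the striped logical volume to a (LUN, block-within-LUN) pair.

    Consecutive stripes of `stripe_blocks` blocks are dealt out to the LUNs
    in round-robin order, so sequential I/O is spread across all of them.
    """
    stripe_index, offset = divmod(logical_block, stripe_blocks)
    lun = stripe_index % num_luns    # which LUN receives this stripe
    row = stripe_index // num_luns   # how many stripes already sit on that LUN
    return Location(lun, row * stripe_blocks + offset)


# Walk the start of a volume striped over 4 LUNs with 8-block stripes.
for block in (0, 7, 8, 16, 24, 32, 35):
    print(block, locate(block, num_luns=4, stripe_blocks=8))
```

Because neighbouring stripes land on different LUNs, a large sequential read can be serviced over several SAN paths at once, which is the throughput effect described above.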
Vendors are also extending the ability of NAS devices to understand file and block semantics. For example, Network Appliance has been shipping its block-oriented SnapManager device for Microsoft Exchange since last September. In addition, Procom Technology's Duet software enables both file- and block-level data access in the same storage system. The software is an option for Procom's Netforce 3000 NAS filers, which were certified by Microsoft in January to work with Windows 2000 and NT 4.0.
A number of start-ups, such as Pirus Networks, are poised to enter the NAS-SAN convergence market this year to compete with market leaders such as Compaq, EMC and Network Appliance.
Storage Area Network (SAN)
A SAN, or storage area network, is a dedicated network that is separate from LANs and WANs. It generally serves to interconnect the storage-related resources that are connected to one or more servers. It is often characterised by its high interconnection data rates (Gigabits/sec) between member storage peripherals and by its highly scalable architecture. Though typically spoken of in terms of hardware, SANs very often include specialised software for their management, monitoring and configuration. SANs can provide many benefits.
Centralising data storage operations and their management is certainly one of
the chief reasons that SANs are being specified and deployed today.
Administrating all of the storage resources in high-growth and
mission-critical environments can be daunting and very expensive. SANs
can dramatically reduce the management costs and complexity of these
environments while providing significant technical advantages.
Network Attached Storage (NAS)
Network Attached Storage is designed to separate storage resources from network and application servers, in order to simplify storage management and improve the reliability, performance and efficiency of the network, thus increasing the overall productivity of the organization. Network Attached Storage servers are self-contained, intelligent devices that attach directly to your existing LAN. A file system is located and managed on the NAS device and data is transferred to clients over industry standard network protocols (TCP/IP or IPX) using industry standard file sharing protocols (SMB/CIFS, NCP, NFS, AFP or HTTP). This intelligence on the NAS device enables true data sharing among heterogeneous network clients.
Network Attached Storage (NAS) vs Storage Area Network (SAN)
A NAS (Network Attached Storage) appliance is a self-contained,
intelligent storage device that attaches directly to the LAN and transfers
data over network protocols (TCP/IP or IPX) using industry standard file
sharing protocols (SMB, CIFS, NCP, AFP, NFS, HTTP). Network clients
communicate directly with the storage server.
A SAN (Storage Area Network) is a discrete network of servers
and storage devices (RAID, Tape Libraries, etc.) attached together via a high
speed I/O interconnect, such as Fibre Channel. Data is transferred via serial
I/O rather than network protocols, and raw data requests are made directly to
disk and not over the LAN. All storage transactions are processed on a
separate network with dedicated bandwidth for data.
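The practical difference between the two models is the unit a client asks for: a NAS client requests a named file, while a SAN host requests numbered blocks and runs its own file system on top. The sketch below illustrates that contrast only. It assumes a local directory as a stand-in for a NAS share and a flat image file as a stand-in for a SAN LUN; the function names and the 512-byte block size are hypothetical, and no real NFS, SMB or Fibre Channel traffic is involved.

```python
import os
import tempfile
from typing import Tuple

BLOCK_SIZE = 512  # bytes per block; an assumed value for this sketch


def read_file(share_root: str, name: str) -> bytes:
    """File-level access (NAS style): ask for a file by name.

    The storage side owns the file system and decides where the bytes live.
    """
    with open(os.path.join(share_root, name), "rb") as f:
        return f.read()


def read_block(device_path: str, block_number: int) -> bytes:
    """Block-level access (SAN style): ask for a numbered block.

    The host must interpret the raw bytes itself, normally via its own file system.
    """
    with open(device_path, "rb") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)


def demo() -> Tuple[bytes, bytes]:
    with tempfile.TemporaryDirectory() as root:
        # Stand-in for a NAS share: an ordinary directory holding a file.
        with open(os.path.join(root, "report.txt"), "wb") as f:
            f.write(b"quarterly numbers")
        # Stand-in for a SAN LUN: a flat image of four blocks, each filled with its own index.
        lun = os.path.join(root, "lun0.img")
        with open(lun, "wb") as f:
            for i in range(4):
                f.write(bytes([i]) * BLOCK_SIZE)
        return read_file(root, "report.txt"), read_block(lun, 2)


file_data, block_data = demo()
print(len(file_data), "bytes read by file name")
print(len(block_data), "bytes read by block number")
```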
Storage Management
Recent announcements from Computer Associates and VERITAS have focused on storage management products that are delivered to users in a services fashion. As Web services continue to mature and gain credibility within enterprises, storage management vendors have an opportunity to maintain the availability and recoverability of Web services, because these services often carry transactional and business-critical data that will have to be stored for legal, audit, or reference reasons. In the future, this will be accomplished through the use of integrated storage management tools that tie enterprise management and storage management together to form a robust and dynamic platform for IT users. Because Web services are dynamic and operate in real time, the storage management tools that manage the data they generate will have to provide redundant data access and high-performance backup in near real time.
Computer Associates International (CA) and VERITAS are just two of a plethora of storage vendors that have recently announced architectures that may provide a future platform for Web services storage management. With its dominance in the systems management world, CA has developed the BrightStor architecture, which includes an extensible architecture, open interfaces, non-CA solution integration, and an open SDK built on top of CA common services. CA defines its common services as event management, automation, single console, enterprise discovery, business process views, and instrumentation technology. VERITAS announced its Adaptive Software Architecture, which contains software services for data protection, storage infrastructure, high availability, and active SRM (Storage Resource Management).
Glossary
A to Z of Storage Terms
ACCESS CONTROL: General term for a group of security techniques, such as using passwords or smart cards, to limit access to a computer or network to authorized users only.
AVAILABILITY: The accessibility of a computer system or network resource.
BACKUP/RESTORE: The act of copying files and databases to protect them in the event of a system failure or similar catastrophe and retrieving them at a later date.
BBU (Battery Backup Unit): A battery-operated power supply used as an auxiliary source of electricity in the event of power failure. The battery guarantees no lost writes and orderly transitions or shutdowns during power outages.
BCV (Business Continuance Volumes): Copies of active production volumes that can be used to run simultaneous tasks in parallel with one another. This gives customers the ability to do concurrent operations, such as data warehouse loads and refreshes or point-in-time backups, without affecting production systems.
BUS: A transmission channel in a computer or on a network that carries signals to and from devices attached to the channel.
BUSINESS CONTINUANCE: The technique of ensuring that a business is able to weather a natural or man-made catastrophe through the deployment of fault-tolerant and redundant hardware and software systems.
CACHING: A method of temporarily storing frequently accessed data in RAM or a special area of a hard disk drive to speed processing. With sufficient storage-processor and backup memory, a storage system also supports write caching: temporary storage where data is held for a short time before being written to disk for permanent storage.
CHANNEL: A high bandwidth connection between a processor and other processors or devices.
CHECKSUM: A value computed from the data and transmitted with it so that the receiving device can verify the accuracy of the data that it received. If the checksum computed from the data that arrives matches the one that was sent, the transmission is believed to be complete.
CLUSTER: A collection of high-performance, interconnected computer servers working together as a single processing resource in an application environment to provide scalable, high availability to both users and applications.
CONNECTIVITY: The ability of hardware devices or software to communicate with other hardware or software.
CROSS-PLATFORM: Systems that are operating-system independent and can operate across different system platforms.
DATA INTEGRITY: The accuracy of data after being transmitted or processed.
DATA MART: A repository of data, often a scaled-down data warehouse, usually tailored to the needs of a specific group within an organization.
DATA MINING: Using advanced statistical tools to identify commercially useful patterns in databases.
DATA WAREHOUSE: A very large repository of data comprising nearly all of a company’s information.
DEBUG: To detect, locate, and correct problems in a program or malfunctions in software. (Troubleshoot in a hardware context.)
DEFRAG: To improve file access by rearranging data so that whole files are stored in contiguous sectors on a hard disk.
DEVICE: A computer subsystem such as a printer, serial port, disk drive, or video adapter. Frequently, devices require their own controlling software (device drivers) to communicate with the computer system.
DISASTER RECOVERY: Preventative measures using redundant hardware, software, data centers and other facilities to ensure that a business can continue operations during a natural or man-made disaster and, if not, to restore business operations as quickly as possible when the calamity has passed.
DISK CONTROLLER: The hardware that controls the writing and reading of data to and from a disk drive. It can be used with floppy disks or hard drives. It can be hard-wired or built into a plug-in interface board.
DISK MIRRORING: Disk mirroring provides the highest data availability for mission-critical applications by creating two copies of data on separate disk drives. The technique ensures both the highest availability and highest system performance.
DISK STRIPING: Combining a set of same-size disk partitions from 2 to 32 separate disks into a single volume that virtually "stripes" these disks in a way that the operating system recognizes as a single drive. Disk striping enhances performance by enabling multiple I/O operations in the same volume to proceed simultaneously.
DISK STRIPING with PARITY: Preserving parity information across a disk stripe so that if one disk partition fails, its data can be re-created with information stored across the remaining portions of the disk stripe.
E-INFOSTRUCTURE: A shared foundation of technologies, tools, services, and intellectual capital that enables an uninterrupted flow of information.
EMC PROVEN E-INFOSTRUCTURE: An EMC program that recognizes leading corporations that operate in the 24-hour Internet workday and that adhere to the highest levels of information availability and customer satisfaction.
ENTERPRISE STORAGE: A combination of intelligent storage systems, software and services. Together, these products and services enable an enterprise to store, retrieve, manage, protect and share information from all major computing environments, including UNIX, Windows 2000 and mainframe platforms.
ERROR CORRECTION CODING (ECC): An encoding method that detects and corrects errors at the receiving end of data transmission. ECC is used by most modems.
ESN (Enterprise Storage Network): It’s a specialized, open network that is designed to offer universal data access for every major computing platform, operating system, and application in the world across any combination of SCSI, Ultra SCSI, Fibre Channel, and ESCON® technologies. It integrates Symmetrix Enterprise Storage systems, EMC Connectrix, advanced, highly resilient network technology, and enterprise storage software with consulting and services into one complete package. An EMC ESN enables corporations to accelerate data access, boost network performance, automate storage management, and fully exploit the power of information regardless of its location.
FABRIC: A Fibre Channel topology with one or more switching devices.
FAILOVER: Data is immediately and nondisruptively routed to an alternate data path or device in the event of the failure of an adapter, cable, channel controller or other device.
FAST DUMP/RESTORE (FDR): A family of mainframe-based backup/restore utilities that use Symmetrix with existing mainframe infrastructures to provide a comprehensive suite of fast, nondisruptive information protection solutions for both mainframe and open systems environments.
FAULT TOLERANCE: A computer or operating system’s ability to respond to a catastrophic event like a power outage or hardware failure so that no data is lost or corrupted.
FIBREALLIANCE: The FibreAlliance is an open association of industry-leading Fibre Channel vendors committed to accelerating the adoption rate of storage area networks (SANs). Members are working to develop a framework specification within which multiple vendors can develop integrated management environments for enterprise SAN customers.
FIBRE CHANNEL ARBITRATED LOOP (FC-AL): FC-AL places up to 126 devices on a loop to share bandwidth. Typically, this is done using a star layout that is logically a loop, employing a Fibre Channel hub. This allows IT managers to add or remove devices without having to bring the entire loop down.
FIBRE CHANNEL (FC): Fibre Channel is nominally a one-gigabit-per-second data transfer interface technology, although the specification allows data transfer rates from 133 megabits per second up to 4.25 gigabits per second. Data can be transmitted and received at one gigabit per second simultaneously. Common transport protocols, such as Internet Protocol (IP) and Small Computer System Interface (SCSI), run over Fibre Channel. Consequently, high-speed I/O and networking can stem from a single connectivity technology.
HARD DISK: A mass storage device for computer data that consists of a hermetically sealed enclosure that holds stacked, rotating, magnetizable disks accessed by multiple read/write heads.
HARDWARE RAID: Dual storage processors create data protection information and transfer it to the disk drives, improving data availability and performance. They are located in an external storage subsystem, freeing the CPU from performing RAID parity, striping, and rebuild overhead calculations. This intelligent circuit board controls the disk drives.
HBA (Host Bus Adapter): An SCSI-2 adapter that plugs into a host and lets the host communicate with a device. The HBA usually performs the lower level of the SCSI protocol and normally operates in the initiator role.
HOST: A computer server, typically networked, that runs applications used by or from other computers (e.g., web servers, file servers, and application servers).
HOT SPARE: In RAID systems, a spare drive in the disk array that is configured as a backup for rebuilding data in the event another drive fails.
HOT SWAPPING: The process of removing and replacing a failed system component while the system remains online.
HUB: A device joining communications lines at a central location, providing a common connection to all devices on the network.
INFORMATION MANAGEMENT: The entire process of defining, evaluating, protecting, and distributing data within an organization.
INFRASTRUCTURE: The basic, fundamental architecture of a computer system. The infrastructure determines how the system functions and how flexible it is in meeting future demands.
INTELLIGENT: A device is intelligent when it is controlled by one or more processors integral to the device.
ISA (Intelligent Storage Architecture): EMC's Intelligent Storage Architecture consolidates information management functions including backup/restore, disaster recovery, migration, and information sharing into a single enterprise storage system. This provides a single consistent platform from which to manage, access, and share information.
INTEROPERABILITY: The ability of hardware and software made by a variety of different manufacturers to work seamlessly together.
JBOD (Just a Bunch of Disks): A group of hard disks, usually without intelligence (processors).
LINK: A connection between two Fibre Channel ports.
LUN (Logical Unit Number): An encoded 3-bit identifier used on an SCSI bus to distinguish among up to eight devices (logical units) with the same SCSI ID. An LUN is an indivisible unit presented by a storage device to its host. LUNs are assigned to each disk drive in an array so the host can address and access the data on those devices.
LUN Masking: An array security feature that lets a server access only its own LUNs, and no others, on a Fibre Channel network. Each LUN can specify what host or combination of hosts has access to that LUN.
MAINFRAME: A computer primarily used by Global 2000 corporations for large-scale commercial applications. A mainframe is capable of supporting many users from many terminals.
MODULARITY: An approach to developing hardware or software that breaks projects into smaller units (or modules) that are deliberately designed as standalone units that can work with other sections of the program. The same module can perform the same task in another or several other programs or components. Modifying the way that module works will have no adverse effects on the other components of a program.
MULTIPATHING: Multipathing allows for two or more data paths to be simultaneously used for read/write operations, enhancing performance by automatically and equally dispersing data access across all the available paths.
NAME SERVICES LOGIN: Worldwide-exclusive names that allow a device to log into the switch.
NONVOLATILE: Data in memory, cache and other electronic repositories are protected by a battery backup system to prevent their loss in the event of a power failure.
OLTP: Online Transaction Processing is a system that processes transactions the instant the computer receives them and updates master files immediately. OLTP is essential for good financial record keeping and inventory tracking.
PARITY: A data-error-checking procedure where the number of 1s must always be the same (either even or odd) for each group of bits transmitted without error. Parity information is saved and compared with each subsequent calculation of whether the number is odd or even.
PARITY BIT: An extra bit used in checking for errors in transferred groups of data bits. In modem communications, it is used to check the accuracy of each transmitted character. In RAM, a parity bit is used to check the accuracy with which each byte is stored.
PB (PetaByte): 1 quadrillion bytes or one thousand terabytes.
PORT: On a computer, it is a physical connecting point to which a device is attached.
PROTOCOL: A set of rules or standards intended to enable computers to communicate.
RAID (Redundant Array of Independent Disks): Data is stored on multiple magnetic or optical disk drives to increase output performance and storage capacities and to provide varying degrees of redundancy and fault tolerance. Instead of storing valuable data on a single hard disk that could fail at any time, RAID makes sure a backup copy of all information always exists by spreading data among multiple hard disks.
RAID Levels: Different levels offer trade-offs among speed, reliability, and cost.
Level 0 is disk striping only, for better performance. Its data transfer and I/O rates are very high, but it provides no safeguards against data failure.
Level 1 uses disk mirroring. All data is duplicated on two drives, offering the highest data reliability. Its data transfer rate is higher than single disk for read and similar for write. Its I/O rate is twice that of single disk for read but similar for write.
Level 1/0 is a combination of Levels 1 and 0, mirroring and striping. It offers the same data reliability as RAID 1. Its data transfer and I/O rates are very high, but slower than RAID 0 for writes.
Level 3 stripes data across three or more drives. All drives operate in parallel to achieve the highest data transfer rate. Parity bits are stored on a separate, dedicated drive. Its I/O rate is similar to single disk.
Level 5 is the most widely used. Data is striped across three or more drives for high performance. Parity is distributed across all of the drives rather than kept on a dedicated drive: in a three-drive set, for example, the parity for data on two drives is stored on the third. Its data reliability is similar to RAID 3. Its data transfer and I/O rates are very high for read, but slower than single disk for write.
READ-ONLY: Data can be retrieved (read) but not altered (written).
REDUNDANT: Backup arrays, drives, disks or power supplies that duplicate functions performed elsewhere.
ROBUST: Able to function or continue to function well in a variety of unanticipated situations.
SCALABILITY: The capacity of hardware, software and networks to change size according to the number of users that they accommodate. Most often, scalability refers to the capacity to expand rather than shrink.
SCSI: Small Computer System Interface. The standard set of protocols for host computers communicating with attached peripherals. SCSI allows connection to as many as seven peripherals per narrow (8-bit) bus, including printers, scanners, hard drives, zip drives, and CD-ROM drives.
SCSI-2: An enhanced ANSI standard for SCSI buses. It offers increased data width, increased speed, or both.
SCSI bus: A parallel bus that carries data and control signals from SCSI devices to an SCSI controller.
SOFTWARE RAID: Uses the server processor to perform RAID calculations. Host CPU cycles that read and write data from and to disk are taken away from applications. Software RAID is less costly than dedicated hardware RAID storage processors, but its data protection is less efficient and reliable.
SWITCH: A network device that selects a path or circuit for sending data between destinations.
TERABYTE (TB): A thousand billion bytes or one thousand gigabytes.
THROUGHPUT: In computers, it is a measurement of the amount of work that can be processed within a set time period. In networking, it is a measurement of the amount of data that can be successfully transferred within a set time period.
VOLUME: A virtual disk into which a file system, database management system or other application places data. A volume can be a single disk partition or multiple partitions on one or more physical drives.
WORKLOAD BALANCING: It’s a technique that ensures no one data path can become overloaded while others have underutilized bandwidth, causing an I/O bottleneck. When one or more paths become busier than others, I/O traffic shifts from the busy paths to the others, further enhancing throughput over the already efficient multipathing method.
WRITE-CACHE: A form of temporary storage in which data is stored (or cached) in memory before being written to a hard disk for permanent storage. Caching enhances overall system performance by decreasing the number of times the central processor reads and writes to a hard disk.
WRITE-MODE: The state in which a program can write (record) information in a file. In the write-mode, the user is permitted to make changes in existing information.
ZONING: Several devices are grouped by function or by location. The devices connected to a connectivity product may be configured into one or more zones. Devices in the same zone can see each other; devices in different zones cannot.
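The glossary entries on parity and on disk striping with parity can be made concrete with a small sketch. It assumes byte-wise XOR parity over three equal-size data blocks, which is a common way to implement the idea but is not attributed to any product named in this document; the helper name `xor_blocks` and the sample data are hypothetical.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks, each imagined to live on a different drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)  # stored on a further drive

# Simulate losing the drive that held data[1]: XOR the survivors with the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print("rebuilt block matches original:", rebuilt == data[1])
```

The rebuild works because XOR-ing a value with itself cancels out, so combining the parity with every surviving block leaves only the missing one. The same property explains the write penalty mentioned for parity-based RAID levels: every data write also requires the parity to be updated.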
Updated on July 23, 2002
© Copyright 2002 Allan Low. All rights reserved. Reproduction of
this Web Site, in whole or in part, in any form or medium without express
written permission from the author is prohibited.