Tuesday, August 11, 2015

Big Data: Store data properly

How can content be stored so that it lasts forever and can be accessed at any time, without delay? Wide Area Storage is a promising approach.
Comprehensive data analysis has long been a reality in some parts of the corporate world. Internet-based marketing systems "auto-magically" record masses of information about potential customers and their preferences. Digital movie cameras with rewritable media can be offloaded overnight and reused, light years away from the days when every single frame was burned onto an extremely expensive medium.

And that is not all: back then, footage also had to be processed and edited manually, with incomparably more effort. Companies increasingly generate, store, and analyze HD video rather than text, which yields a hundredfold more data per user and per product. For example, when 18 HD cameras capture the action on the racetrack at an American NASCAR race, they allow direct data access, search, and analysis.

What is Big Data?
Only 14 percent of people know what the term "Big Data" means, as the industry association Bitkom has found. Analyst Carlo Velten sums up the phenomenon in five theses:

- Big Data is more than IT.
- Ownership and exploitation rights to databases are becoming a decisive factor for competitiveness and innovation.
- The market is still in its infancy; it will take years until clearly defined market categories emerge.
- In the coming two to three years, infrastructure providers, analytics specialists, and consultants will do big business.
- Success or failure depends not only on the legal framework and public investment, but also on confidence-building handling of customer data.

But where is all this data supposed to be stored?

In light of technological advances such as reusable recording media, higher-resolution cameras, and fine-grained collection and analysis of video, "Big Data" is reaching immense volumes. Conventional storage technologies quickly hit their limits when it comes to the long-term retention of this data, especially since efficient access to the full potential of the data must be guaranteed at all times. The terabytes of 3D seismic data from an oil field may turn out to mark a major oil reservoir in the next decade, and a genomic profile recorded today may provide the crucial clue for a cancer cure tomorrow.

The limitations of traditional storage solutions can be pushed back with object and cloud storage technologies. However, these can also create new operational and functional constraints. A new storage generation combines the strengths of Object Storage with operational and functional flexibility: Wide Area Storage enables a broader use of Big Data while maintaining the integrity and longevity of the data.

The natural limitations of RAID
But where exactly do traditional storage systems reach the limits that Object Storage solutions are meant to overcome? RAID is the basis of traditional storage systems and has proven particularly effective at safeguarding data integrity within a single group of four to twelve disks.

Data sets of petabyte size, however, require either disk groups of more than twelve disks, or the data must be distributed across multiple RAID groups. The former increases the risk of data loss due to hardware failure; the latter causes a rapid increase in cost and in the complexity of managing data consistency and integrity across multiple disk units.

Data growth also turns the bit error rate of disk drives into a real problem. If, for example, a full RAID array of ten 3-TByte disks is read in its entirety, there is a critical probability of data loss due to a random bit error, on the order of one in three. RAID has no mechanism for proactively detecting such bit errors. In addition, with RAID all disks are local, normally attached to the same controller. RAID therefore offers only limited protection against node failures and none against a disaster at the company's site.
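The risk can be made concrete with a back-of-the-envelope calculation. The unrecoverable-read-error (URE) rates below are typical datasheet figures, i.e. assumptions, not numbers from the text:

```python
# Sketch: probability of hitting at least one unrecoverable read error
# (URE) while reading every bit of a RAID array once. The URE rates are
# typical drive datasheet figures (assumptions, not measured values).

def p_data_loss(ure_per_bit: float, disks: int, tb_per_disk: float) -> float:
    """P(at least one URE) for a full read of the array."""
    bits = disks * tb_per_disk * 1e12 * 8   # total bits read
    return 1 - (1 - ure_per_bit) ** bits

# Ten disks of 3 TByte each, as in the example above
print(f"consumer disks   (1e-14/bit): {p_data_loss(1e-14, 10, 3):.0%}")
print(f"enterprise disks (1e-15/bit): {p_data_loss(1e-15, 10, 3):.0%}")
```

With consumer-class error rates a full read is almost certain to hit a bad bit; with enterprise-class rates the risk is roughly one in five, the same order of magnitude as the one-in-three figure above.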

Replication offsets RAID's weaknesses

Replication is a common response to RAID's problem areas. Put simply, replication is the cross-copying of data between two or more locations to ensure that the data remain usable in case of failure. This significantly increases the integrity, recoverability, and accessibility of the data.

Unfortunately, replication brings its own downsides: it lowers the proportion of usable capacity and introduces new complications that drive the cost of the storage environment up tremendously. Replicas must always be kept far enough away from the primary data to enjoy adequate protection in the event of a disaster.

"The farther, the better", however, would be a fallacy: files must stay synchronized to meet the recovery point objective (RPO), and that requires network bandwidth sufficient for replication, which is unfortunately substantial and therefore costly. All in all, replication adds disaster recovery protection, but in extreme cases it can double the cost of the storage infrastructure.
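The bandwidth trade-off can be sketched with some rough arithmetic. The change rate, burst factor, and RPO below are illustrative assumptions, not figures from the article:

```python
# Sketch: rough sizing of the WAN link needed for replication.
# All input figures are illustrative assumptions.

daily_change_gb = 500    # assumed: data changed at the primary site per day
burst_factor = 4         # assumed: peak change rate vs. the daily average
rpo_minutes = 15         # assumed: tolerated data-loss window (RPO)

avg_mbps = daily_change_gb * 8 * 1000 / (24 * 3600)  # sustained Mbit/s
peak_mbps = avg_mbps * burst_factor                  # link must absorb bursts

# Data at risk if the link stalls for one full RPO window at peak rate
at_risk_gb = peak_mbps * rpo_minutes * 60 / 8 / 1000

print(f"average: {avg_mbps:.1f} Mbit/s, peak: {peak_mbps:.1f} Mbit/s")
print(f"data at risk per RPO window: {at_risk_gb:.1f} GB")
```

Even this modest change rate already calls for a dedicated long-haul link in the hundreds of Mbit/s once bursts are accounted for, which is where the cost doubling mentioned above comes from.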

Does Object Storage balance everything out?

Object Storage offers a completely different approach to storage management. While traditional storage systems represent data in a hierarchical directory of folders and files, Object Storage presents data in a flat object namespace of simple key-value pairs. This approach allows administrators to scale digital repositories almost without limit.

Data access is handled via simple network-based protocols such as HTTP. This type of data retrieval can be offloaded to high-performance network switches and routers, with the effect that data can be distributed across numerous storage nodes without any virtual "overload". In addition, the capacity of the systems can be expanded without downtime, performance degradation, modification, or migration measures.
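The flat key/value access model can be illustrated with a minimal in-memory sketch. The class and its content-derived key scheme are illustrative assumptions, not any vendor's API; real object stores expose comparable put/get semantics over HTTP:

```python
# Minimal sketch of an object store's flat key/value model: objects are
# stored and retrieved by key only, with no directory hierarchy.
# Illustrative code, not a real object-store API.

import hashlib

class ObjectStore:
    def __init__(self):
        self._objects = {}   # flat namespace: key -> bytes

    def put(self, data: bytes) -> str:
        """Store data; return the key needed to retrieve it later."""
        key = hashlib.sha256(data).hexdigest()   # one possible key scheme
        self._objects[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Retrieve an object by its key - the only access path."""
        return self._objects[key]

store = ObjectStore()
key = store.put(b"3D seismic survey, block 42")
print(key[:12], "->", store.get(key))
```

Note that the key is the sole handle on the data, a point the valet-parking analogy below returns to.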

Another advantage of the network-friendly protocols and the distribution logic is the simple transfer of data to different data centers around the globe. While data access over long distances inevitably introduces latency, the network protocols of Object Storage systems are optimized for long distances, including network-level compression, geographical load balancing, and local caching.

Security algorithms: Erasure Codes

While the first generation of Object Storage already offered simple forms of data protection by keeping simultaneous copies of data across three or more nodes, younger implementations are equipped with much more sophisticated protection algorithms, known as "erasure codes". The space industry has used them for decades to preserve the integrity of communication transmissions.

Where RAID separates data into a fixed number of data blocks and checksums, these algorithms convert the data into a set of distinct codes, which are separated for storage and reassembled on retrieval. Since each code is unique, any sufficiently large subset of codes can be used to restore the data. These algorithms allow policies that protect against the loss of disks, nodes, or even entire data centers - on a single system and with far less redundancy overhead than RAID or replication solutions. Data integrity here rests on individual codes rather than on whole rows of disks, with adjustable protection levels within the same storage system.

Companies can thus adapt their rules for data longevity to their varying protection requirements - without hardware changes and without copying data out of the system.

Limits pure Object Storage solutions
Object Storage resembles the valet parking service of a fancy hotel. An employee conveniently parks the car, and only the employee knows exactly where it stands, so the available parking space is used as efficiently as possible. The parking ticket is the key to retrieving the car. If you lose the ticket, you must at least show your vehicle documents, including an ID card, to identify yourself as the owner.

What the parking ticket is to the car, the key is to the object store. Alternative ways of addressing the data (for example, paths or a search index) must be stored by the application outside the object store. That makes it very difficult to share data across multiple applications unless they use the same object index. Ad hoc use of data by end users is also made enormously difficult by the key mechanism, because the data cannot be accessed through an ordinary file and folder structure. To make matters worse:

Object Storage is not compatible with the fastest-growing segment of data - unstructured data.
To put it drastically, but not unjustly: Object Storage systems leave the underlying applications wanting when it comes to file access and information lifecycle management.


The future of Object Storage
Essential for the successful use of Object Storage is the ability to manage unstructured data in the object store. The most common way for companies to manage unstructured data is by means of Network Attached Storage (NAS) systems.

Bringing NAS's advantages to Object Storage leads to interesting results. First, companies can migrate unstructured data into the object store by providing a classic filesystem namespace; this increases durability and reduces provisioning costs. Second, the CIFS and NFS protocols are compatible with numerous operating systems, which guarantees access to the object store: users can reach their data independently and as needed. Last but not least, IT administrators can carry over many traditional operational best practices for data management and security, because the object store is presented as a file system. This new type of Object Storage goes by the name Wide Area Storage (WAS).
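The layering idea can be sketched in a few lines. The gateway class, its key scheme, and the method names are illustrative assumptions, not an actual WAS implementation:

```python
# Sketch: a filesystem-style path namespace layered on a flat object
# store - the core idea behind Wide Area Storage as described above.
# Names and key scheme are illustrative, not any vendor's API.

class FileGateway:
    def __init__(self):
        self.objects = {}    # flat object store: key -> bytes
        self.namespace = {}  # file namespace: path -> object key

    def write(self, path: str, data: bytes):
        key = f"obj-{len(self.objects):08d}"   # assumed key scheme
        self.objects[key] = data
        self.namespace[path] = key             # users never see the key

    def read(self, path: str) -> bytes:
        return self.objects[self.namespace[path]]

gw = FileGateway()
gw.write("/archive/2015/seismic.dat", b"survey data")
print(gw.read("/archive/2015/seismic.dat"))
```

The application keeps its familiar path-based view, while the object store underneath retains its durability and scaling properties.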

Another crucial feature of Wide Area Storage is active data lifecycle management through the use of established storage management strategies. Solutions for Hierarchical Storage Management (HSM) have already proven this capability on traditional storage systems. Companies that use HSM see Object Storage as a logical extension of their existing policies.

Wide Area Storage as an archive solution
WAS can also serve as an attractive long-term archive, because Object Storage offers the same protection levels as tape - but with much lower latency. It can likewise serve as a gateway to Object Storage-based cloud solutions. This operational flexibility opens up a wide range of off-site options for long-term data retention.

Beyond the archiving potential, Object Storage architectures are designed to be inherently capable of multi-site recovery. Since data is distributed across nodes using standard networks, the nodes can be spread across a mix of local, regional, and branch-office locations. If an entire data center fails, the data can still be recovered at a branch office.

As a result, users get automatic multi-site backup without having to install, configure, and coordinate dedicated replication capabilities. The multi-site distribution also offers affordable data access from any location: users access data from their local node, and written data is distributed across all sites - without the administrative headaches of managing bidirectional replication.

Last but not least, Wide Area Storage supports both file-system-based clients and applications developed specifically for Object Storage. This guarantees the widest possible data access within an enterprise.
With the best of both worlds, Wide Area Storage opens up new application scenarios for a more extensive use of Big Data - without diluting either approach.
