
Efficient data layouts for cost-optimized Map-Reduce operations

The MapReduce programming model, adopted by Hadoop and other Big Data technologies, is a powerful tool for Big Data analysis problems. It is becoming ubiquitous, but concerns remain about its performance and efficiency: it offers high scalability and fault tolerance in large-scale data processing, yet its efficiency is low. Improving efficiency while retaining a high level of scalability and fault tolerance is therefore a major challenge. The efficiency problem, especially I/O cost, can be addressed in two ways: by optimizing I/O operations in MapReduce, and by exploiting features of modern hardware such as SSDs (solid-state disks), which can considerably reduce the computation in MapReduce. This paper explores existing data layout structures that can improve the efficiency of MapReduce operations and help overcome its pitfalls.
