Xiaojun Ruan
Prospective students: If you would like to work with me on your capstone project, thesis, or independent study (CS490, CS690, CS693, or CS699), email your resume and transcripts to xiaojun.ruan@csueastbay.edu for more details. I look forward to working with highly motivated, diligent, and passionate students.

My research interests lie in the following fields:


  • Parallel and Distributed Computing
  • Cloud Computing
  • Computer Security
  • Data Storage Systems
  • Machine Learning and AI

Past Research Projects

Cloud Computing
  • Virtual Machine (VM) Migration and Allocation in Clouds
         The last decade has witnessed dramatic advances in cloud computing research and techniques. One of the key challenges in this field is reducing the massive energy consumption of cloud computing data centers. To address this issue, many power-aware virtual machine (VM) allocation and consolidation approaches have been proposed, but most existing solutions save energy at the price of significant performance degradation. We propose a mathematical model to calculate the optimal utilization levels at which host computers should operate. Because performance and power data must be measured on real platforms, we designed a practical strategy named "PPRGear" that samples utilization levels with distinct performance-to-power ratios. In addition, we present a framework for VM allocation and migration that leverages the performance-to-power ratios of various host types. By achieving the optimal balance between host utilization and energy consumption, our framework ensures that host computers run at their most power-efficient utilization levels, i.e., the levels with the highest performance-to-power ratios, so that energy consumption is greatly reduced with negligible performance sacrifice. Extensive experiments with real-world traces show that, compared with three baseline energy-efficient VM allocation and selection algorithms, our framework reduces energy consumption by up to 69.31% for various host computer types, with fewer migrations and shutdowns and little performance degradation for cloud computing data centers. A simplified sketch of the allocation idea appears after this project list.
  • Adaptive Preshuffling in Hadoop Clusters
           MapReduce has become an important distributed processing model for large-scale data-intensive applications such as data mining and web indexing. Hadoop, an open-source implementation of MapReduce, is widely used for short jobs requiring low response time. We proposed a new preshuffling strategy in Hadoop to reduce the high network load imposed by shuffle-intensive applications. Designing new shuffling strategies is appealing for Hadoop clusters because the network interconnects become a performance bottleneck, and likely a scarce resource, when a cluster is shared among a large number of shuffle-intensive applications. We implemented the push model and a 2-stage pipeline along with the preshuffling scheme in the Hadoop system. Using two Hadoop benchmarks running on a 10-node cluster, we conducted experiments showing that preshuffling-enabled Hadoop clusters are faster than native Hadoop clusters. For example, the push model and the preshuffling scheme powered by the 2-stage pipeline shorten the execution times of the WordCount and Sort Hadoop applications by an average of 10% and 14%, respectively. A toy sketch of the push-based shuffle also follows this list.
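
The sketch below is a rough illustration of the PPRGear placement idea, not the published implementation. The host fields (load, capacity, profile) are hypothetical: each host type is assumed to have been profiled offline by sampling (utilization, performance, power) triples, and the allocator favors the placement that keeps a host closest to its sampled utilization level with the highest performance-to-power ratio.

```python
# Minimal sketch of a PPRGear-style allocation policy (illustrative only).

def best_utilization(profile):
    """profile: list of (utilization, performance, power) samples."""
    # The most power-efficient level is the sample with the highest
    # performance-to-power ratio.
    return max(profile, key=lambda s: s[1] / s[2])[0]

def place_vm(vm_load, hosts):
    """Pick the host whose post-placement utilization is nearest to
    its most power-efficient level. Host fields are hypothetical."""
    best, best_gap = None, float("inf")
    for h in hosts:
        new_util = (h["load"] + vm_load) / h["capacity"]
        if new_util > 1.0:
            continue  # host cannot accommodate the VM
        gap = abs(new_util - best_utilization(h["profile"]))
        if gap < best_gap:
            best, best_gap = h, gap
    if best is not None:
        best["load"] += vm_load
    return best

if __name__ == "__main__":
    hosts = [
        {"load": 20, "capacity": 100,
         "profile": [(0.3, 30, 60), (0.7, 70, 90), (0.9, 90, 130)]},
        {"load": 50, "capacity": 100,
         "profile": [(0.5, 50, 70), (0.8, 80, 95)]},
    ]
    chosen = place_vm(10, hosts)  # picks the second host (gap 0.2 vs 0.4)
    print(chosen)
```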
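
The following toy illustrates the push model behind preshuffling, assuming a single mapper and in-memory queues rather than real Hadoop machinery: intermediate pairs are pushed to per-reducer queues while the map is still running, so the shuffle stage overlaps the map stage instead of waiting for it to finish.

```python
# Toy push-based shuffle (not Hadoop code): map output is pushed to
# per-reducer queues as it is produced, overlapping map and shuffle.

import queue
import threading

NUM_REDUCERS = 2
reducer_queues = [queue.Queue() for _ in range(NUM_REDUCERS)]

def map_task(records):
    for key, value in records:
        # Partition by key and push immediately instead of materializing
        # map output for reducers to pull later.
        reducer_queues[hash(key) % NUM_REDUCERS].put((key, value))
    for q in reducer_queues:
        q.put(None)  # end-of-stream marker (single mapper in this toy)

def reduce_task(rid, out):
    counts = {}
    while (item := reducer_queues[rid].get()) is not None:
        key, value = item
        counts[key] = counts.get(key, 0) + value
    out[rid] = counts

if __name__ == "__main__":
    results = {}
    workers = [threading.Thread(target=reduce_task, args=(r, results))
               for r in range(NUM_REDUCERS)]
    for w in workers:
        w.start()
    map_task([("a", 1), ("b", 1), ("a", 1)])  # shuffle overlaps the map
    for w in workers:
        w.join()
    print(results)
```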


Computer Security
  • ES-MPICH2: Message Passing Interface with Enhanced Security
           An increasing number of commodity clusters are connected to each other by public networks, which have become a potential threat to security-sensitive parallel applications running on the clusters. To address this security issue, we developed a Message Passing Interface (MPI) implementation that preserves the confidentiality of messages communicated among cluster nodes over an unsecured network. We focus on MPI rather than other protocols because MPI is one of the most popular communication protocols for parallel computing on clusters. Our MPI implementation, called ES-MPICH2, was built on MPICH2, developed by Argonne National Laboratory. Like MPICH2, ES-MPICH2 aims to support a large variety of computation and communication platforms, such as commodity clusters and high-speed networks. We integrated encryption and decryption algorithms into the MPICH2 library behind the standard MPI interface; thus, the data confidentiality of MPI applications can be preserved without changing their source code. MPI application programmers can fully configure the confidentiality services, because a secured configuration file in ES-MPICH2 lets them choose among the cryptographic schemes and keys seamlessly incorporated in ES-MPICH2. We used the Sandia Micro Benchmark and Intel MPI Benchmark suites to evaluate and compare the performance of ES-MPICH2 with the original MPICH2 version. Our experiments show that the overhead incurred by the confidentiality services in ES-MPICH2 is marginal for small messages and becomes more pronounced for larger messages. Our results also show that the security overhead can be significantly reduced on high-performance clusters. The executable binaries and source code of the ES-MPICH2 implementation are freely available at http://www.eng.auburn.edu/~xqin/software/es-mpich2/.
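
ES-MPICH2 itself performs the cryptography inside the MPICH2 C library. The sketch below is only a Python analogue of the idea, assuming the mpi4py and cryptography packages, with a hard-coded key standing in for the secured configuration file: point-to-point calls are wrapped so payloads are encrypted before they cross the wire and decrypted on receipt, leaving application code unchanged.

```python
# Python analogue of transparent MPI message encryption (illustrative).
# Run under an MPI launcher, e.g.: mpiexec -n 2 python secure_mpi.py

import os
from mpi4py import MPI
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

KEY = b"\x00" * 32  # placeholder; ES-MPICH2 reads keys from a secured config file

def secure_send(comm, data: bytes, dest: int, tag: int = 0):
    nonce = os.urandom(12)                      # unique nonce per message
    cipher = AESGCM(KEY).encrypt(nonce, data, None)
    comm.send(nonce + cipher, dest=dest, tag=tag)

def secure_recv(comm, source: int, tag: int = 0) -> bytes:
    blob = comm.recv(source=source, tag=tag)
    return AESGCM(KEY).decrypt(blob[:12], blob[12:], None)

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    if comm.rank == 0:
        secure_send(comm, b"hello from rank 0", dest=1)
    elif comm.rank == 1:
        print(secure_recv(comm, source=0))
```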

Storage Systems
  • Flash Translation Layer (FTL) Caching Algorithm for Flash-Based Storage Systems
           Most research on Solid State Drive (SSD) architectures focuses on Flash Translation Layer (FTL) algorithms and wear-leveling; however, the internal parallelism of SSDs has not been well explored. In this research, we proposed a new strategy to improve SSD write performance by enhancing internal parallelism inside SSDs. An SDRAM buffer is added to the design for buffering and scheduling write requests. Because the same logical block number may be translated to different physical block numbers at different times by the FTL, the on-board SDRAM buffer is used to buffer requests below the FTL. When the buffer is full, the same amount of data is assigned to each storage package in the SSD to enhance internal parallelism. To evaluate performance accurately, we use both synthetic workloads and real-world applications in our experiments. Since it would be unfair to compare an SSD that has a buffer with one that does not, we compare the enhanced internal-parallelism scheme with the traditional LRU caching strategy given the same buffer size. The simulation results demonstrate that the write performance of our design is significantly improved compared with the LRU-cache strategy.
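
A toy model of the buffered write path described above, with all names and sizes hypothetical: writes accumulate in an SDRAM buffer, and when it fills, an equal share of pages is dispatched to each flash package so the packages can be programmed in parallel.

```python
# Toy buffered, parallelism-aware SSD write path (illustrative only).

NUM_PACKAGES = 4
BUFFER_CAPACITY = 64  # pages held in the SDRAM buffer

buffer_pages = []

def write(page):
    buffer_pages.append(page)
    if len(buffer_pages) >= BUFFER_CAPACITY:
        flush()

def flush():
    # Assign an equal share of buffered pages to every package so all
    # packages are kept busy; a real SSD would issue these concurrently.
    share = len(buffer_pages) // NUM_PACKAGES
    for pkg in range(NUM_PACKAGES):
        batch = buffer_pages[pkg * share:(pkg + 1) * share]
        program_package(pkg, batch)
    del buffer_pages[:share * NUM_PACKAGES]

def program_package(pkg, batch):
    print(f"package {pkg}: programming {len(batch)} pages")

if __name__ == "__main__":
    for page in range(BUFFER_CAPACITY):
        write(page)  # the final write fills the buffer and triggers a striped flush
```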

Energy-Efficient Computing
  • Buffer-Disk Architecture for Energy Conservation in Parallel Disk Systems
           Cluster storage systems are essential building blocks for many high-end computing infrastructures. Although energy conservation techniques have been intensively studied in the context of clusters and disk arrays, improving the energy efficiency of cluster storage systems remains an open issue. To address this problem, we describe an approach to implementing an energy-efficient cluster storage system, ECOS for short. ECOS relies on a cluster storage architecture in which each I/O node manages multiple disks: one buffer disk and several data disks. Within an I/O node, the key idea behind ECOS is to redirect disk requests from the data disks to the buffer disk; to balance I/O load among I/O nodes, ECOS may also redirect requests from one I/O node to others. Redirecting requests drives the energy savings for two reasons. First, ECOS keeps the buffer disks active while placing the data disks in standby for long periods to conserve energy. Second, ECOS reduces the number of disk spin-downs and spin-ups in I/O nodes. The idea of ECOS was implemented in a Linux cluster, where each I/O node contains one buffer disk and two data disks. Experimental results show that ECOS improves the energy efficiency of traditional cluster storage systems in which buffer disks are not employed. Although adding an extra buffer disk to each I/O node might seem to hurt energy savings, our results interestingly indicate that ECOS equipped with extra buffer disks is more energy efficient than the same cluster storage system without them. The implication is that using existing data disks in I/O nodes as buffer disks can achieve even higher energy efficiency. A simplified sketch of the redirection policy appears after this list.
  • Energy-Efficient Scheduling Algorithm Using Dynamic Voltage and Frequency Scaling for Parallel Applications
           Scheduling parallel applications on large-scale clusters is technically challenging due to communication latencies and high energy consumption. As such, shortening schedule lengths and saving energy are two major concerns in the design of economical and environmentally friendly clusters. Although the existing dynamic voltage scaling (DVS) technique can be employed to reduce the energy consumption of parallel applications running on clusters, lowering processor voltages inevitably increases the execution times of parallel tasks. To solve this performance problem while improving the energy efficiency of clusters, we propose a scheduling algorithm called TADVS that judiciously exploits processor idle times among parallel tasks to save energy on both high-performance clusters and mobile clusters. The TADVS algorithm first discovers the idle time intervals incurred by tasks with precedence constraints. TADVS then applies DVS to lower voltages when processors sit idle due to these precedence constraints. Therefore, TADVS uses DVS to conserve energy without increasing the schedule lengths of parallel applications. Experimental results clearly show that TADVS reduces energy dissipation in large-scale clusters without adversely affecting system performance.
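
The sketch below is a simplified, hypothetical model of the ECOS redirection policy inside one I/O node, not the Linux implementation: writes land on the always-active buffer disk so the data disks can remain in standby, and a data disk spins up only on a read that misses the buffer.

```python
# Simplified ECOS-style request redirection within one I/O node (illustrative).

class IONode:
    def __init__(self, num_data_disks):
        self.buffer = {}                      # (disk_id, block) -> data on the buffer disk
        self.data_disks = [dict() for _ in range(num_data_disks)]
        self.standby = [True] * num_data_disks

    def write(self, disk_id, block, data):
        # Redirect the write to the buffer disk; the data disk stays in standby.
        self.buffer[(disk_id, block)] = data

    def read(self, disk_id, block):
        if (disk_id, block) in self.buffer:
            return self.buffer[(disk_id, block)]
        self.standby[disk_id] = False         # spin up only on a buffer miss
        return self.data_disks[disk_id].get(block)

    def flush(self):
        # Batch dirty blocks back so each data disk wakes once per batch,
        # not once per request, reducing spin-downs/ups.
        for (disk_id, block), data in self.buffer.items():
            self.data_disks[disk_id][block] = data
        self.buffer.clear()

if __name__ == "__main__":
    node = IONode(num_data_disks=2)
    node.write(0, 42, b"payload")   # buffered; disk 0 remains in standby
    print(node.read(0, 42), node.standby)
```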
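
As a minimal sketch of the TADVS idea, with a made-up two-processor schedule, the code below finds the idle gaps that precedence constraints create on each processor; a real scheduler would lower the voltage and frequency during exactly those gaps so the schedule length is unchanged.

```python
# Sketch of idle-interval discovery for DVS scheduling (numbers hypothetical).

def idle_intervals(schedule):
    """schedule: {proc: [(task, start, finish), ...]} sorted by start time."""
    gaps = []
    for proc, tasks in schedule.items():
        for (_, _, fin), (_, start, _) in zip(tasks, tasks[1:]):
            if start > fin:  # processor idles, waiting on a predecessor elsewhere
                gaps.append((proc, fin, start))
    return gaps

if __name__ == "__main__":
    schedule = {
        "p0": [("t0", 0, 4), ("t2", 9, 12)],  # t2 must wait for t1 on p1
        "p1": [("t1", 0, 9)],
    }
    for proc, begin, end in idle_intervals(schedule):
        print(f"{proc}: scale voltage/frequency down during [{begin}, {end})")
```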

