Server-Push Data Access Architecture

What is Server-Push Data Access Architecture?
The main idea behind the Server-Push Data Access architecture is to use a dedicated server to push data from its source to its destination in time, before the data is requested. The dedicated server predicts the future data accesses of a CPU or of a computing node and delivers that data. The prediction can be performed by observing the past history of data accesses, by running a thread ahead of the main computing thread, or by using programmer- or compiler-generated hints. The server adjusts when to push data by observing the time delays between the source and destination of the predicted data. Data can be pushed either from memory to cache memories or from disks to memory.
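As a concrete illustration, the history-based variant of this idea can be sketched as follows. This is a minimal, hypothetical sketch, not the actual DPS/FAS implementation (the `PushServer` class and the `storage`/`cache` names are our illustrative assumptions): a push server observes a client's accesses, detects a constant stride, and pushes the predicted next block to the faster destination before it is requested.

```python
# Hypothetical sketch of server-push prefetching: a dedicated "push server"
# observes a client's past accesses, predicts the next one with a simple
# stride detector, and delivers the predicted block ahead of the request.
# Names here are illustrative, not the actual DPS/FAS interfaces.

class PushServer:
    def __init__(self, storage, cache):
        self.storage = storage      # slower source (e.g., memory or disk)
        self.cache = cache          # faster destination (e.g., CPU cache)
        self.history = []           # observed access addresses

    def observe(self, addr):
        """Record a client access, then push the predicted next block."""
        self.history.append(addr)
        nxt = self._predict()
        if nxt is not None and nxt in self.storage:
            # Push ahead of the request, so the client finds it in cache.
            self.cache[nxt] = self.storage[nxt]

    def _predict(self):
        """Predict the next address assuming a constant-stride pattern."""
        if len(self.history) < 2:
            return None
        stride = self.history[-1] - self.history[-2]
        return self.history[-1] + stride

storage = {addr: f"block-{addr}" for addr in range(0, 100, 10)}
cache = {}
server = PushServer(storage, cache)
for addr in (0, 10, 20):            # client reads a strided sequence
    server.observe(addr)
print(30 in cache)                  # the next block was pushed in advance
```

A real push server would, of course, select among multiple predictors (history-based, pre-execution, hints) rather than relying on a single fixed stride detector.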
Motivation
The advance of computing and data access technologies is unbalanced. Following Moore's law, CPU performance has improved rapidly (52% per year until 2004 and about 25% since then), while memory performance has improved only about 7% per year on average over the last 20 years. Disk access performance is even worse: disk seek time is more than 10^6 times slower than the CPU cycle time. Caching and
prefetching are the commonly used methods to mask this disparity between computing and data access performance. Caching holds data temporarily, while prefetching fetches data into a cache closer to the computing processor before it is requested. Various prefetching strategies have been proposed over the past two decades; however, their performance varies widely from application to application and is generally poor. One notable reason for this poor performance is that current prefetching is client-initiated: the computing processor or node initiates the prefetch itself. While letting a computing processor/node prefetch the data it needs sounds like a straightforward solution, client-initiated prefetching has many limitations. For instance, predicting what data to fetch requires computing power, and aggressive (accurate) prediction algorithms require even more, which leads either to untimely prefetching (poor prefetching accuracy) or to degraded performance of the computing process. In some cases, prefetching instructions are given lower priority than demand data accesses, and false or late predictions are useless. To address these limitations, we propose using a dedicated server to perform predictions for the computing processor/node and then push the data at the predicted locations to its destination by the time it is required. The purpose of the dedicated server is to provide a data-push service and to prefetch based on data access prediction. This separates computing and data access
operations effectively, which has several benefits. First, a dedicated server can adopt complex prediction algorithms for more aggressive prediction and can push data to multiple computing processors/nodes. This is especially beneficial for high-end computing (HEC), where parallel processing is often achieved with the SPMD model. Second, the dedicated server is free to choose prediction algorithms dynamically, predicting future accesses based on data access history or on pre-execution. Instead of looking for a single algorithm that predicts all data access patterns, which does not exist, the server can adaptively choose a prediction method. This, again, is very beneficial to HEC, where a few so-called "grand challenge applications" often run repeatedly. Third, the server uses temporal data access information to predict when to push data. This makes it possible to adjust the prefetch distance adaptively and to avoid delivering data too late.
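The timing decision behind the third benefit can be illustrated with a small, hypothetical calculation (the `schedule_push` function and its parameters are our assumptions, not the actual DPS scheduling algorithm): the server starts the push early enough that, given the observed source-to-destination delay, the data arrives just before the predicted access.

```python
# Illustrative push-timing calculation (not the actual DPS algorithm):
# the server schedules a push so the data arrives just before the
# predicted access time, based on the observed transfer delay.
def schedule_push(predicted_access_time, transfer_delay, safety_margin=0.0):
    """Return the time at which the server should start pushing the data.

    predicted_access_time: when the client is expected to request the data
    transfer_delay: observed source-to-destination delivery latency
    safety_margin: extra slack so the data is never late
    """
    return predicted_access_time - transfer_delay - safety_margin

# If the client will need a block at t=100us and pushing takes 15us,
# the push must start no later than t=85us (80us with a 5us margin).
print(schedule_push(100.0, 15.0))        # 85.0
print(schedule_push(100.0, 15.0, 5.0))   # 80.0
```

Adjusting the prefetch distance then amounts to re-running this calculation as the observed transfer delay changes.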
We are exploring the application of the Server-Push Data Access Architecture at two levels: the memory level and the disk level. For memory-level prefetching we use a Data Push Server (DPS), and for disk-level prefetching in parallel computing we use a File Access Server (FAS) to perform prediction and push services. More details on DPS are available here. More details on FAS are available at the Push-IO Wiki.
Publications
Surendra Byna, Yong Chen, William Gropp, Xian-He Sun, and Rajeev Thakur, "POSTER: The Server-Push I/O Architecture for High-End Computing", to appear in the International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing 2007), Nov. 2007.
Yong Chen, Surendra Byna, and Xian-He Sun, "Data Access History Cache and Associated Data Prefetching Mechanisms", to appear in the International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing 2007), Nov. 2007.
Xian-He Sun, Surendra Byna, and Yong Chen, "Server-based Data Push Architecture for Multi-processor Environments", Journal of Computer Science and Technology (JCST), Sept. 2007.
Xian-He Sun, Surendra Byna, and Yong Chen, "Improving Data Access Performance with Server Push Architecture", in Proceedings of the NSF Next Generation Software Program Workshop (in conjunction with IPDPS '07), March 2007.
Surendra Byna, Xian-He Sun, and Yong Chen, "Server-based Data Push for Multi-processor Environments", IIT CS TR-2006-031, September 2006 (contact).
Xian-He Sun and Surendra Byna, "Data-access Memory Servers for Multi-processor Environments", IIT CS TR-2005-001, November 2005 (contact).
Related Links
Push-IO Wiki is available here.
Publication and Paper database: Parallel I/O
Parallel IO Benchmarks
Trace files
Contact:
Xian-He Sun
Department of Computer Science
Illinois Institute of Technology
Chicago, IL 60616
sun@iit.edu
Back to SCS Home Page