
[Condor-users] CFP: IEEE Transactions on Parallel and Distributed Systems, Special Issue on Many-Task Computing on Grids and Supercomputers



Call for Papers

---------------------------------------------------------------------------------------
IEEE Transactions on Parallel and Distributed Systems
Special Issue on Many-Task Computing on Grids and Supercomputers
http://dsl.cs.uchicago.edu/TPDS_MTC/  

=======================================================================================
The Special Issue on Many-Task Computing (MTC) will provide the scientific community with 
a dedicated forum, within the prestigious IEEE Transactions on Parallel and Distributed 
Systems journal, for presenting new research, development, and deployment efforts of 
loosely coupled large scale applications on large scale clusters, Grids, Supercomputers, 
and Cloud Computing infrastructure. MTC, the focus of the special issue, encompasses 
loosely coupled applications, which are generally composed of many tasks (both 
independent and dependent tasks) to achieve some larger application goal.  This special 
issue will cover challenges that can hamper efficiency and utilization in running 
applications on large-scale systems, such as local resource manager scalability and 
granularity, efficient utilization of the raw hardware, parallel file system contention 
and scalability, data management, I/O management, reliability at scale, and application 
scalability. We welcome paper submissions on all topics related to MTC on large scale 
systems.  For more information on this special issue, please see 
http://dsl.cs.uchicago.edu/TPDS_MTC/.

Scope
---------------------------------------------------------------------------------------
This special issue will focus on the ability to manage and execute large scale 
applications on today's largest clusters, Grids, and Supercomputers. Clusters with tens 
of thousands of processor cores, Grids (e.g. TeraGrid) with a dozen sites and 100K+ 
processors, and supercomputers with up to 200K processors (e.g. IBM BlueGene/L and 
BlueGene/P, Cray XT5, Sun Constellation) are all now available to the broader 
scientific community for open science research. Large clusters and 
supercomputers have traditionally been high performance computing (HPC) systems, as 
they are efficient at executing tightly coupled parallel jobs within a particular 
machine with low-latency interconnects; the applications typically use the Message Passing 
Interface (MPI) to achieve the needed inter-process communication. On the other hand, 
Grids have been the preferred platform for more loosely coupled applications that tend 
to be managed and executed through workflow systems, commonly known to fit in the 
high-throughput computing (HTC) paradigm.

Many-task computing (MTC) aims to bridge the gap between two computing paradigms, HTC 
and HPC. MTC is reminiscent of HTC, but it differs in its emphasis on using many 
computing resources over short periods of time to accomplish many computational tasks 
(including both dependent and independent tasks), where the primary metrics are 
measured in seconds (e.g. FLOPS, tasks/s, MB/s I/O rates), as opposed to operations 
(e.g. jobs) per month. MTC denotes high-performance computations comprising multiple 
distinct activities, coupled via file system operations. Tasks may be small or large, 
uniprocessor or multiprocessor, compute-intensive or data-intensive. The set of tasks 
may be static or dynamic, homogeneous or heterogeneous, loosely coupled or tightly 
coupled. The aggregate number of tasks, quantity of computing, and volumes of data may 
be extremely large. MTC includes loosely coupled applications that are generally 
communication-intensive but not naturally expressed using the standard message passing 
interface commonly found in HPC, drawing attention to the many computations that are 
heterogeneous but not "happily" parallel.

There is more to HPC than tightly coupled MPI, and more to HTC than embarrassingly 
parallel long running jobs. Like HPC applications, and science itself, applications 
are becoming increasingly complex, opening new doors to many opportunities to apply 
HPC in new ways if we broaden our perspective. Some applications have just so many 
simple tasks that managing them is hard. Applications that operate on or produce 
large amounts of data need sophisticated data management in order to scale. There 
exist applications that involve many tasks, each composed of tightly coupled MPI 
tasks. Loosely coupled applications often have dependencies among tasks, and typically 
use files for inter-process communication. Efficient support for these sorts of 
applications on existing large scale systems will involve substantial technical 
challenges and will have a big impact on science.

Today's existing HPC systems are a viable platform to host MTC applications. However, 
some challenges arise when large scale applications are run on large scale systems, 
which can hamper the efficiency and utilization of these systems.  These 
challenges include local resource manager scalability and granularity, efficient 
utilization of the raw hardware, parallel file system contention and scalability, data 
management, I/O management, reliability at scale, application scalability, and 
understanding the limitations of the HPC systems in order to identify good candidate 
MTC applications. Furthermore, owing to its loosely coupled nature, the MTC paradigm can 
be naturally applied to the emerging Cloud Computing paradigm, which is being adopted by 
industry as the next wave of technological advancement to reduce operational costs while 
improving efficiencies in large scale infrastructures.   

For an interesting discussion by Ian Foster of the difference between MTC and 
HTC, please see his blog at http://ianfoster.typepad.com/blog/2008/07/many-tasks-comp.html.  
The proposed editors have also published several papers highly relevant to this special issue. 
One paper is titled "Toward Loosely Coupled Programming on Petascale Systems", and was 
published at the IEEE/ACM Supercomputing 2008 (SC08) Conference; the second paper is titled 
"Many-Task Computing for Grids and Supercomputers", which was published in the IEEE 
Workshop on Many-Task Computing on Grids and Supercomputers 2008 (MTAGS08). To see last 
year's workshop program agenda, and accepted papers and presentations, please see 
http://dsl.cs.uchicago.edu/MTAGS08/. To see this year's workshop web site, see 
http://dsl.cs.uchicago.edu/MTAGS09/.

Topics
---------------------------------------------------------------------------------------
Topics of interest include, but are not limited to:
*       Compute Resource Management in large scale clusters, large Grids, Supercomputers, 
        or Cloud Computing infrastructure 
        o       Scheduling
        o       Job execution frameworks
        o       Local resource manager extensions
        o       Performance evaluation of resource managers in use on large scale systems
        o       Challenges and opportunities in running many-task workloads on HPC systems
        o       Challenges and opportunities in running many-task workloads on Cloud 
        Computing infrastructure
*       Data Management in large scale Grid and Supercomputer environments 
        o       Data-Aware Scheduling
        o       Parallel File System performance and scalability in large deployments
        o       Distributed file systems
        o       Data caching frameworks and techniques
*       Large-Scale Workflow Systems
        o       Workflow system performance and scalability analysis
        o       Scalability of workflow systems
        o       Workflow infrastructure and e-Science middleware
        o       Programming Paradigms and Models
*       Large-Scale Many-Task Applications
        o       Large-scale many-task applications
        o       Large-scale many-task data-intensive applications
        o       Large-scale high throughput computing (HTC) applications
        o       Quasi-supercomputing applications, deployments, and experiences 

Paper Submission and Publication
---------------------------------------------------------------------------------------
Authors are invited to submit papers presenting unpublished, original work of not more 
than 14 pages of double-column, single-spaced text in 9.5 point type on 8.5 x 11 inch 
pages with 0.5 inch margins 
(http://www2.computer.org/portal/c/document_library/get_file?uuid=02e1509b-5526-4658-afb2-fe8b35044552&groupId=525767). 
Papers will be peer-reviewed, and accepted papers will be published in the IEEE digital 
library. Submitted articles must not have been previously published or be currently 
under submission for journal publication elsewhere. As an author, you are responsible for 
understanding and adhering to our submission guidelines. You can access them at 
http://www.computer.org/mc/tpds/author.htm. Please read these thoroughly 
before submitting your manuscript. 

Please submit your paper to Manuscript Central at http://cs-ieee.manuscriptcentral.com/. 
Please feel free to contact the Peer Review Publications Coordinator, Annissia Bryant, at 
tpds@xxxxxxxxxxxx or the guest editors at foster@xxxxxxx, iraicu@xxxxxxxxxxxxxxx, or 
yozha@xxxxxxxxxxxxx if you have any questions. For more information on this special issue, 
please see http://dsl.cs.uchicago.edu/TPDS_MTC/. 


Important Dates
---------------------------------------------------------------------------------------
*       Abstract Due:                   December 14th, 2009
*       Papers Due:                     December 21st, 2009
*       First Round Decisions:          February 22nd, 2010
*       Major Revisions if needed:      April 19th, 2010
*       Second Round Decisions:         May 24th, 2010
*       Minor Revisions if needed:      June 7th, 2010
*       Final Decision:                 June 21st, 2010
*       Publication Date:               November, 2010



Guest Editors and Potential Reviewers
---------------------------------------------------------------------------------------
Special Issue Guest Editors
*       Ian Foster, University of Chicago & Argonne National Laboratory
*       Ioan Raicu, Northwestern University
*       Yong Zhao, Microsoft


-- 
=================================================================
Ioan Raicu, Ph.D.
NSF/CRA Computing Innovation Fellow
=================================================================
Center for Ultra-scale Computing and Information Security (CUCIS)
Department of Electrical Engineering and Computer Science
Northwestern University
2145 Sheridan Rd, Tech M384 
Evanston, IL 60208-3118
=================================================================
Cel:   1-847-722-0876
Tel:   1-847-491-8163
Email: iraicu@xxxxxxxxxxxxxxxxxxxxx
Web:   http://www.eecs.northwestern.edu/~iraicu/
       https://wiki.cucis.eecs.northwestern.edu/
=================================================================
=================================================================

