Data Challenge Definitions

Data Challenge


The purpose of our Data Challenge is to deploy a CPU-intensive application that generates large data flows (millions of files totalling a few TB) in order to test the grid infrastructure and services. It is also a means to tackle a scientific challenge that could not be solved without the grid. Indeed, the added value of the grid lies not only in the computing resources it makes available, but also in the permanent storage of the data. The WISDOM Data Challenge is the first biomedical data challenge in the EGEE project.
 



Computing Element


The Computing Element (CE) is the service representing a computing resource. Its main functionality is job management (job submission, job control, etc.). The CE may be used by a generic client: either an end user interacting directly with the Computing Element, or the Workload Manager, which submits a given job to an appropriate CE found by a matchmaking process. For job submission, the CE can work in a push model (where the job is pushed to a CE for execution) or in a pull model (where the CE asks the Workload Management Service for jobs). Besides its job management capabilities, a CE must also provide information describing itself. A CE is identified by a string of the form 'hostname':'port'/'batch_queue_name'.
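As an illustration of the identifier format described above, the following minimal Python sketch splits a CE identifier into its three parts. The names `CEIdentifier` and `parse_ce_id`, as well as the sample identifier, are hypothetical and chosen only for this example; they are not part of any grid middleware API.

```python
from typing import NamedTuple


class CEIdentifier(NamedTuple):
    """The three parts of a CE identifier (hypothetical helper type)."""
    hostname: str
    port: int
    queue: str


def parse_ce_id(ce_id: str) -> CEIdentifier:
    """Split 'hostname:port/batch_queue_name' into its three parts."""
    # Everything before the first '/' is the endpoint, the rest is the queue.
    endpoint, sep, queue = ce_id.partition("/")
    if not sep or not queue:
        raise ValueError(f"missing batch queue name in {ce_id!r}")
    # The port is whatever follows the last ':' of the endpoint.
    hostname, sep, port = endpoint.rpartition(":")
    if not sep or not hostname or not port.isdigit():
        raise ValueError(f"expected 'hostname:port' before '/' in {ce_id!r}")
    return CEIdentifier(hostname, int(port), queue)


# Hypothetical identifier, for illustration only.
print(parse_ce_id("ce01.example.org:2119/jobmanager-pbs-short"))
```

A malformed string (for example one without a queue name) raises `ValueError` rather than returning a partial result.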
 



Storage Element


The Storage Element (SE) is the service that allows a user or an application to store data for future retrieval. Although it is foreseen for the future, there is currently no policy, and no enforcement, distinguishing volatile from permanent space. All data in an SE must therefore be considered permanent, and it is the user's responsibility to manage the available space in an SE. The SE may control simple disk servers, large disk arrays, or Mass Storage Systems (MSS).
 



Resource Broker


The Resource Broker (RB) is a middleware component that provides distributed clients with job execution at the most suitable Computing Element (CE) in a heterogeneous computing environment. Client applications are provided with a set of APIs for sending requests to, and receiving responses from, RB servers. The RB server is responsible for carrying out the tasks needed to satisfy client requests. These tasks include interacting with the Replica Catalog (RC) to resolve logical data set names and to find a preliminary set of sites where the required data are stored, performing job submission and cancellation by interacting with the Job Submission Service (JSS), listing the most suitable resources on which to execute a job, and retrieving job outputs on behalf of the clients.
 



Instance


An instance is an independent and unique set of jobs defined by a software package, a target, a parameter setting and a ligand database. Two software packages, 10 targets, 4 parameter settings and 3 ligand databases are used during the data challenge.
Possible abbreviations are S[1-2]T[1-10]P[1-4]D[1-3].
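To make the abbreviation scheme concrete, the short Python sketch below enumerates every label that matches the pattern. It assumes that every combination of software, target, parameter setting and database is a valid instance, which the definition above does not state explicitly; the function name `instance_labels` is hypothetical.

```python
from itertools import product

# Index ranges taken from the pattern S[1-2]T[1-10]P[1-4]D[1-3].
SOFTWARE = range(1, 3)
TARGETS = range(1, 11)
PARAMETER_SETTINGS = range(1, 5)
DATABASES = range(1, 4)


def instance_labels():
    """Return all labels of the form S<s>T<t>P<p>D<d>."""
    return [
        f"S{s}T{t}P{p}D{d}"
        for s, t, p, d in product(SOFTWARE, TARGETS, PARAMETER_SETTINGS, DATABASES)
    ]


labels = instance_labels()
print(len(labels))            # 240, i.e. 2 * 10 * 4 * 3
print(labels[0], labels[-1])  # S1T1P1D1 S2T10P4D3
```

Under that assumption the pattern covers 240 distinct instance labels.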
 



Job status


A job can be in one of several possible states, and only some transitions between states are allowed. The states are the following:

  • Submitted : The job has been submitted by the user but not yet processed by the Network server
  • Waiting : The job has been accepted by the Network server but not yet processed by the Workload Manager
  • Ready : The job has been assigned to a Computing Element but not yet transferred into it
  • Scheduled : The job is waiting in the Computing Element's queue
  • Running : The job is running
  • Done : The job is finished
  • Aborted : The job has been aborted by the WMS (e.g. because it ran too long, or the proxy certificate expired, ...)
  • Cancelled : The job has been cancelled by the user
  • Cleared : The Output Sandbox has been transferred to the User Interface
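The states listed above can be modelled as a small state machine. The Python sketch below is only an illustration: the list does not spell out which transitions are allowed, so the transition table here is an assumption (a linear progression in the listed order from Submitted to Done to Cleared, with Aborted and Cancelled reachable from any state before Done). The names `JobState`, `ALLOWED` and `can_transition` are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "Submitted"
    WAITING = "Waiting"
    READY = "Ready"
    SCHEDULED = "Scheduled"
    RUNNING = "Running"
    DONE = "Done"
    ABORTED = "Aborted"
    CANCELLED = "Cancelled"
    CLEARED = "Cleared"


# ASSUMPTION: a job normally progresses through the states in this order.
_NORMAL_PATH = [
    JobState.SUBMITTED,
    JobState.WAITING,
    JobState.READY,
    JobState.SCHEDULED,
    JobState.RUNNING,
    JobState.DONE,
    JobState.CLEARED,
]

# Build the transition table: each state maps to the set of allowed next states.
ALLOWED = {state: set() for state in JobState}
for current, following in zip(_NORMAL_PATH, _NORMAL_PATH[1:]):
    ALLOWED[current].add(following)

# ASSUMPTION: a job can be aborted or cancelled at any point before it is Done.
for state in _NORMAL_PATH[:_NORMAL_PATH.index(JobState.DONE)]:
    ALLOWED[state] |= {JobState.ABORTED, JobState.CANCELLED}


def can_transition(current: JobState, target: JobState) -> bool:
    """Return True if the assumed table allows moving from current to target."""
    return target in ALLOWED[current]


print(can_transition(JobState.RUNNING, JobState.DONE))  # True
print(can_transition(JobState.DONE, JobState.RUNNING))  # False
```

A real workload management system may allow additional transitions (for example, resubmission of a failed job), which this sketch deliberately leaves out.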



Copyright © 2005 - LPC Clermont-Ferrand Webmaster: M. Reichstadt