{"corpus_id":23188126,"paper_sha":"50c035b9230afb642df81e7053f3300b137879d6","doi":"10.1088/1742-6596/608/1/012040","arxiv_id":null,"pmid":null,"pmcid":null,"mag_id":2284364308,"dblp_id":null,"acl_id":null,"title":"Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing","year":2015,"publication_date":"2015-05-22","venue":"","journal":{"name":"Journal of Physics: Conference Series","pages":null,"volume":"608"},"journal_issn":null,"journal_title":null,"publication_types":[],"pubmed_pub_types":null,"s2_fields_of_study":["Physics","Computer Science","Engineering"],"reference_count":9,"citation_count":26,"influential_citation_count":0,"is_open_access":true,"arxiv_categories":null,"arxiv_license":null,"arxiv_journal_ref":null,"mesh_headings":null,"chemicals":null,"comments_corrections":null,"source_flags":1,"s2_open_access_pdf_url":"https://iopscience.iop.org/article/10.1088/1742-6596/608/1/012040/pdf","s2_open_access_landing_url":"https://www.semanticscholar.org/paper/50c035b9230afb642df81e7053f3300b137879d6","s2_open_access_license":"CCBY","s2_open_access_status":"GOLD","pmc_open_access_pdf_url":null,"pmc_open_access_landing_url":null,"pmc_open_access_license":null,"pmc_open_access_status":null,"unpaywall_open_access_pdf_url":null,"unpaywall_open_access_landing_url":null,"unpaywall_open_access_license":null,"unpaywall_open_access_status":null,"abstract":"The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. 
To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System (WMS) to manage the workflow for all data processing at hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, O(10^3) users, and ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled ‘Next Generation Workload Management and Analysis System for Big Data’ (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center \"Kurchatov Institute\" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. 
We will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.","claims":[{"public_id":"cl_c47cbdab3f809ce2bed4e17b1f645587","status":"active","text":"Integrating PanDA with HPC platforms introduces new challenges in managing heterogeneity and supporting workflow.","confidence":0.86,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_c47cbdab3f809ce2bed4e17b1f645587"},{"public_id":"cl_030a50f30cc0f56ea75861e5b325d5e3","status":"active","text":"PanDA can operate as a portal independent of the underlying computing-facility infrastructure for high-energy and nuclear physics and other data-intensive science applications.","confidence":0.9,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_030a50f30cc0f56ea75861e5b325d5e3"},{"public_id":"cl_ca7ce43615d35bac61358b9258add000","status":"active","text":"PanDA manages a heterogeneous distributed infrastructure spanning about 10^2 sites, 10^5 cores, 10^8 jobs per year, 10^3 users, and roughly 10^17 bytes of ATLAS data.","confidence":0.97,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_ca7ce43615d35bac61358b9258add000"},{"public_id":"cl_6c3a33bff191cfc1f11953359bf0d58e","status":"active","text":"The BigPanDA project was launched to expand PanDA to all available computing resources, including commercial and academic clouds and Leadership Computing 
Facilities.","confidence":0.93,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_6c3a33bff191cfc1f11953359bf0d58e"},{"public_id":"cl_e368e7c75fdbe0f2efbc27bdfa1bca81","status":"active","text":"The integration approach reuses existing PanDA components as much as possible when adapting the system to OLCF and other supercomputers.","confidence":0.92,"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/claims/cl_e368e7c75fdbe0f2efbc27bdfa1bca81"}],"concepts":[{"public_id":"co_35c5c0cf497eb2a14059bb4331f476d8","status":"active","name":"supercomputers","description":"Very large computing systems used to run PanDA beyond traditional distributed resources.","types":["computing platform"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_35c5c0cf497eb2a14059bb4331f476d8"},{"public_id":"co_50a9cbd7ba451b6253629809ba889b30","status":"active","name":"ATLAS experiment","description":"A large high-energy physics experiment at the Large Hadron Collider that generates and processes very large data volumes.","types":["scientific experiment"],"aliases":["ATLAS"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_50a9cbd7ba451b6253629809ba889b30"},{"public_id":"co_5ed96949142cd9b5b35c6379ebcd7592","status":"active","name":"commercial and academic clouds","description":"Cloud computing resources from commercial providers and academic institutions that are targeted for PanDA integration.","types":["computing 
resource"],"aliases":["clouds"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_5ed96949142cd9b5b35c6379ebcd7592"},{"public_id":"co_69e3f8496a89bcf1ca5aced7ba1a7a9d","status":"active","name":"workflow","description":"The sequence and coordination of data-processing tasks managed by the system.","types":["process"],"aliases":["workflows"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_69e3f8496a89bcf1ca5aced7ba1a7a9d"},{"public_id":"co_7d4ed45613ee728550298c90bf3b720a","status":"active","name":"heterogeneous distributed computational infrastructure","description":"A computing environment composed of geographically distributed resources with diverse hardware and administrative settings.","types":["computational infrastructure"],"aliases":["heterogeneous distributed infrastructure"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_7d4ed45613ee728550298c90bf3b720a"},{"public_id":"co_8f6bfa2d42df52508cde598f57dd4c4c","status":"active","name":"heterogeneity","description":"Variation in resource types and execution environments that complicates workload management.","types":["property"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_8f6bfa2d42df52508cde598f57dd4c4c"},{"public_id":"co_a99a30b57385ee369ef7d08ed0220099","status":"active","name":"HPC platforms","description":"High-performance computing systems that provide large-scale parallel computing 
resources.","types":["computing platform"],"aliases":["high-performance computing platforms"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_a99a30b57385ee369ef7d08ed0220099"},{"public_id":"co_c23488118bb927353cb36f911f1b3ebd","status":"active","name":"data-intensive science applications","description":"Scientific applications characterized by large-scale data processing and movement.","types":["application domain"],"aliases":[],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_c23488118bb927353cb36f911f1b3ebd"},{"public_id":"co_c79fbb5902e8e5a623263543bfd2500b","status":"active","name":"Leadership Computing Facilities","description":"Large high-performance computing facilities used as part of the expanded PanDA resource environment.","types":["computing facility"],"aliases":["LCF"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_c79fbb5902e8e5a623263543bfd2500b"},{"public_id":"co_cd92183848a2180c88edfe23b50d6d28","status":"active","name":"PanDA","description":"A workload management system used to schedule and manage large-scale data processing workflows.","types":["workload management system","software system"],"aliases":["Production and Data Analysis Workload Management System","PanDA WMS"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_cd92183848a2180c88edfe23b50d6d28"},{"public_id":"co_e61bd7c2369246f24126f5916ee0cdc5","status":"active","name":"Oak Ridge Leadership Computing 
Facility","description":"A leadership computing site where PanDA is being tailored and run.","types":["computing facility"],"aliases":["OLCF"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_e61bd7c2369246f24126f5916ee0cdc5"},{"public_id":"co_e9e348df065bab3ada1ec76f4295a9b3","status":"active","name":"High Energy and Nuclear Physics","description":"A scientific application domain that uses PanDA for data-intensive computing.","types":["research field"],"aliases":["HEP"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous (12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_e9e348df065bab3ada1ec76f4295a9b3"},{"public_id":"co_faad3a8538f01d7a053f7399315e85b0","status":"active","name":"BigPanDA project","description":"The project titled Next Generation Workload Management and Analysis System for Big Data, intended to extend PanDA to broader computing resources.","types":["project"],"aliases":["Next Generation Workload Management and Analysis System for Big Data"],"contributors":[{"id":1,"public_id":"12632b8b5f","public_label":"Anonymous 
(12632b8b5f)","roles":["extraction"],"url":"https://sah.borca.ai/u/12632b8b5f"}],"url":"https://sah.borca.ai/concepts/co_faad3a8538f01d7a053f7399315e85b0"}],"external_ids":{"DOI":"10.1088/1742-6596/608/1/012040","ArXiv":null,"PubMed":null,"PubMedCentral":null,"MAG":2284364308,"DBLP":null,"ACL":null},"open_access":{"is_open_access":true,"pdf_url":"https://iopscience.iop.org/article/10.1088/1742-6596/608/1/012040/pdf","landing_url":"https://www.semanticscholar.org/paper/50c035b9230afb642df81e7053f3300b137879d6","source":"semantic_scholar","pdf_url_source":"semantic_scholar_open_access_pdf","license":"CCBY","status":"GOLD","reason":null},"reference_availability":{"status":"available","references_indexed":true,"full_text_available":false,"full_text_source":null,"count_basis":"semantic_scholar_metadata","extraction_status":"not_applicable","reason":null},"source":{"provider":"episteme2","base_corpus":"semantic_scholar_dump","freshness_mode":"unknown","basis":["semantic_scholar_metadata","postgres_metadata"],"limits":["paper metadata is based on indexed upstream scholarly datasets","claims and concepts are available only for extracted papers","absence of claims or concepts means no extracted graph data is available in this response"],"status":"available","degraded":false,"degraded_reasons":[],"diagnostics":{"status":"available","degraded":false,"degraded_reasons":[],"metadata_status":"available","graph_status":"available","abstract_status":"available"},"source_flags":1},"paper_id":630557,"paper_uid":"d1cecd74-9c58-4c04-93e9-1b0fb85f989f","canonical_identity":{"paper_id":630557,"paper_uid":"d1cecd74-9c58-4c04-93e9-1b0fb85f989f","identity_status":"available","lookup_basis":"semantic_scholar_external_id","compatibility_path":"corpus_id"},"url":"https://sah.borca.ai/papers/23188126"}