Smart Data Placement for Big Data Pipelines: An Approach based on the Storage-as-a-Service Model

A. Khan, Nikolay Nikolov, M. Matskin, R.-C. Prodan, Hui Song, D. Roman, A. Soylu

Published 2022 in International Conference on Utility and Cloud Computing

ABSTRACT

The development of big data pipelines is a challenging task, especially when data storage is considered as part of the data pipelines. Local storage is expensive, hard to maintain, and comes with several challenges (e.g., data availability, data security, and backup). The use of cloud storage, i.e., Storage-as-a-Service (StaaS), instead of local storage has the potential to provide more flexibility in terms of scalability, fault tolerance, and availability. In this paper, we propose a generic approach to integrating StaaS with data pipelines, i.e., computation runs on an on-premise server or on a specific cloud while storage is provided through StaaS, and develop a ranking method for available storage options based on five key parameters: cost, proximity, network performance, the impact of server-side encryption, and user weights. The evaluation demonstrates the effectiveness of the proposed approach in terms of data transfer performance and the feasibility of dynamic selection of a storage option based on four primary user scenarios.
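The abstract names the ranking inputs but not the scoring formula, so the following is only a minimal sketch of how such a ranking could look, assuming a weighted sum over min-max normalized criteria. The `StorageOption` fields, their units, the `CRITERIA` table, and the `rank` function are all hypothetical names introduced for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class StorageOption:
    """Hypothetical description of one StaaS candidate (fields are assumptions)."""
    name: str
    cost: float          # assumed unit: USD per GB-month, lower is better
    distance_km: float   # assumed proximity measure, lower is better
    throughput: float    # assumed network performance in MB/s, higher is better
    sse_overhead: float  # assumed fractional slowdown from server-side encryption


# Criterion name -> True if a higher raw value is better.
CRITERIA = {
    "cost": False,
    "distance_km": False,
    "throughput": True,
    "sse_overhead": False,
}


def normalize(values, higher_is_better):
    """Min-max normalize to [0, 1], where 1 is always the best value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All options are equal on this criterion, so it cannot separate them.
        return [1.0] * len(values)
    if higher_is_better:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def rank(options, weights):
    """Return (name, score) pairs sorted best-first by weighted sum.

    `weights` maps criterion names to non-negative user weights; criteria
    that are missing from the mapping get weight 0.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    scores = {o.name: 0.0 for o in options}
    for criterion, higher_is_better in CRITERIA.items():
        normed = normalize([getattr(o, criterion) for o in options],
                           higher_is_better)
        w = weights.get(criterion, 0.0) / total
        for option, value in zip(options, normed):
            scores[option.name] += w * value
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    candidates = [
        StorageOption("cheap-far", 0.01, 2000.0, 50.0, 0.10),
        StorageOption("fast-near", 0.03, 100.0, 200.0, 0.05),
    ]
    # A cost-focused user and a performance-focused user get different rankings.
    print(rank(candidates, {"cost": 1.0}))
    print(rank(candidates, {"throughput": 0.7, "distance_km": 0.3}))
```

Changing only the user weights changes which option ranks first, which is the kind of scenario-driven selection the abstract describes; the paper's actual scoring may differ.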

PUBLICATION RECORD

  • Publication year

    2022

  • Venue

    International Conference on Utility and Cloud Computing

  • Publication date

    2022-12-01

  • Fields of study

    Computer Science, Engineering

  • Source metadata

    Semantic Scholar
