Data-intensive computing (DIC) provides a high-performance computing approach to processing large volumes of data. In this study, a new formalization is introduced to represent two-stage DIC task execution in a streaming manner. Owing to the NP complexity of the scheduling problem, a novel heuristic algorithm is proposed. The theoretical approximation ratio bounds of the heuristic are analyzed and confirmed by experimental evaluation. Overall, we observe that the proposed method yields a makespan that is on average 1.2 times the theoretical bound of the optimal solution. In addition, the proposed method outperforms the GA and FIFO scheduling schemes overall.
Keywords
- Data intensive computing
ASJC Scopus subject areas
- Theoretical Computer Science
- Computer Networks and Communications
- Computational Theory and Mathematics
- Applied Mathematics