Abstract
This study analyses the role of distributed resource management in enhancing the scalability and reliability of interconnected systems. It presents a detailed analysis of the architectures, benefits, and inherent drawbacks of the Hadoop Distributed File System (HDFS) and Yet Another Resource Negotiator (YARN). HDFS offers fault-tolerant, scalable storage through a block-based, replicated, and locality-optimized design, while YARN offers flexible resource scheduling through its Fair and Capacity schedulers. Despite this robustness, limitations remain, such as resource contention in YARN and the NameNode as a single point of failure in HDFS. To address evolving challenges in modern computing, the study also explores potential research directions such as serverless architectures for dynamic scaling, latency-aware edge computing, and AI-based resource forecasting.
