MapReduce is a simple, scalable distributed programming paradigm for data-intensive computation, and it plays a fundamental role in big data processing and analytics stacks such as the Apache Hadoop and Spark ecosystems. Most existing MapReduce implementations assume a homogeneous computing environment, such as a cloud platform. However, a growing number of applications use many sensors and actuators as the front end and cloud platforms as the back end, forming an emerging environment known as fog computing. Because the computing nodes in such an environment are heterogeneous, extending MapReduce to fog computing for efficient big data processing and analytics remains a challenge.
This project explores this problem by implementing a basic MapReduce platform in a simulated fog computing environment. A very basic Java implementation of MapReduce will be provided as the starting point. Students will extend the functionality of this basic version and apply it in a small-scale fog computing environment consisting of a few sensors or actuators and cloud services. A web-based interface should also be developed to visualise the data processing progress and some statistics of the results.
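For orientation, the map, shuffle, and reduce phases of the paradigm can be sketched in a few lines of Java. This is only an illustrative, single-process sketch and is not the starting-point implementation that will be provided; the class name `MiniMapReduce`, the methods `run` and `wordCount`, and the sample sensor-style input are all assumptions made for this example. In a fog setting, the map step would presumably run on heterogeneous edge nodes rather than in one loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical, in-memory illustration of the MapReduce phases.
final class MiniMapReduce {

    // Runs map, shuffle (group by key), and reduce sequentially in one process.
    static <I, K extends Comparable<K>, V, R> Map<K, R> run(
            List<I> inputs,
            Function<I, List<Map.Entry<K, V>>> mapper,
            BiFunction<K, List<V>, R> reducer) {
        // Map + shuffle: apply the mapper to each input and group values by key.
        Map<K, List<V>> groups = new TreeMap<>();
        for (I input : inputs) {
            for (Map.Entry<K, V> kv : mapper.apply(input)) {
                groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                      .add(kv.getValue());
            }
        }
        // Reduce: collapse each key's list of values into a single result.
        Map<K, R> results = new TreeMap<>();
        for (Map.Entry<K, List<V>> group : groups.entrySet()) {
            results.put(group.getKey(),
                        reducer.apply(group.getKey(), group.getValue()));
        }
        return results;
    }

    // Classic word count expressed with the generic driver above.
    static Map<String, Integer> wordCount(List<String> lines) {
        Function<String, List<Map.Entry<String, Integer>>> mapper = line -> {
            List<Map.Entry<String, Integer>> out = new ArrayList<>();
            for (String word : line.toLowerCase().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(Map.entry(word, 1)); // emit (word, 1)
                }
            }
            return out;
        };
        BiFunction<String, List<Integer>, Integer> reducer =
                (word, ones) -> ones.stream().mapToInt(Integer::intValue).sum();
        return run(lines, mapper, reducer);
    }

    public static void main(String[] args) {
        // Made-up input lines, standing in for records from two sources.
        List<String> lines = Arrays.asList("fog cloud fog", "cloud sensor");
        System.out.println(wordCount(lines)); // prints {cloud=2, fog=2, sensor=1}
    }
}
```

Keeping the mapper and reducer as plain functions separates the application logic from the driver, so a distributed driver that assigns map tasks according to node capability could, in principle, replace `run` without changing `wordCount`.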
Upon completion of the project, students are expected to have produced:
• A basic implementation of the MapReduce paradigm in a small-scale fog computing environment.
• A web-based user interface that shows the status of the system and performance statistics while applications are executing.
Lab allocations have not been finalised