The University of Auckland

Project #87: Data Analytics with Apache Spark MLlib and FPGA-based Accelerators for IoT-based Applications



The amount of data generated in the Internet of Things (IoT) has increased considerably and now falls into the category of big data. This calls for efficient methods for handling big data analytics tasks. Machine learning (ML)-based methods have been applied to data analytics in both research and industry; however, studies of ML in large-scale analytics remain limited owing to the challenges posed by the characteristics of big data, such as high volume, variety, velocity and veracity.

Spark MLlib, a package of the Apache Spark (Spark) programming framework, offers fast, flexible, and scalable implementations of numerous ML algorithms. It also supports parallel and distributed processing on clusters of computers in big data analytics systems, which helps significantly improve the performance of these systems. Moreover, hardware (e.g. FPGA-based) accelerators have proved to be one of the most effective ways to accelerate ML-based analytics tasks and to reduce the energy consumption of these systems.

The project will look specifically at big data analytics in an IoT context, where the processing is performed using Spark MLlib deployed on a cluster of commodity computers together with FPGA-based hardware accelerators.


Identify problems that need to be resolved with IoT-based big data analytics.

Develop a formal big data analytics model with the use of Spark MLlib.

Demonstrate the proposed model executing on a parallel/distributed computing platform on a real cluster of commodity computers.

Investigate how processing can be accelerated by extending the above-mentioned platform with hardware (FPGA-based) accelerators.


Knowledge of the basics of machine learning; knowledge of FPGAs if the project goes in the direction of hardware/FPGA accelerators.







Lab allocations have not been finalised