INF 618 Big Data and Analytics: Reading list

INF 618 Big Data and Analytics


Department: PhD in Computer Science

Module Description: This module gives students an in-depth understanding of the theories and issues surrounding analytics and big data. In addition to big data technologies such as MapReduce concepts, Hadoop, and HDFS, the course covers how big data is collected, stored (relational algebra operators vs SQL syntax, data mining using SQL), and analysed (statistical, visualization, classification, and clustering techniques). Students will also be exposed to special types of datasets, including graphs and time series, and will learn about the main challenges faced when dealing with big data. Practical case studies will be used for illustration.


Module texts

  • Leskovec, J., Rajaraman, A. and Ullman, J. D. (2020). Mining of massive datasets. 3rd edn. Cambridge: Cambridge University Press. 

Recommended readings

  • http://jakevdp.github.io/blog/2014/03/11/frequentism-and-bayesianism-a-practical-intro/

  • http://jakevdp.github.io/blog/2014/06/06/frequentism-and-bayesianism-2-when-results-differ/

  • http://jakevdp.github.io/blog/2014/06/12/frequentism-and-bayesianism-3-confidence-credibility/

  • http://jakevdp.github.io/blog/2014/06/14/frequentism-and-bayesianism-4-bayesian-in-python/
