Programme Structure for 2024/2025
Curricular Courses | Credits | |
---|---|---|
Big Data | 6.0 ECTS | Mandatory Courses |
Artificial Intelligence | 6.0 ECTS | Mandatory Courses |
Data Analysis Programming | 6.0 ECTS | Mandatory Courses |
Big Data
By the end of the course, students should be able to:
LO1. Understand and identify the problems associated with processing large amounts of information.
LO2. Understand the concepts and ecosystem of Big Data.
LO3. Know how to design and implement fault-tolerant data storage solutions in a distributed environment.
LO4. Know how to extract, manipulate and load large amounts of information from unstructured data sources.
LO5. Know how to manipulate and process non-relational databases.
LO6. Understand and apply distributed programming and computing models.
LO7. Understand and know how to apply techniques for handling JSON structures and real-time data flow.
LO8. Develop creativity, technological innovation and critical thinking.
LO9. Develop self-learning, peer review, teamwork, written and oral expression.
S1. The concept of Big Data, the applicable problems and the respective ecosystem.
S2. Introduction to non-relational databases and MongoDB.
S3. Computing architecture for Big Data: (1) redundant and fault-tolerant and (2) distributed to support large volumes of data. Example of the Hadoop platform and its distributed file system.
S4. The MapReduce programming model.
S5. Database design in MongoDB.
S6. Handling JSON structures and real-time data.
S7. The ETL (Extract, Transform and Load) process applied to datasets with denormalized real data and the development of Big Data processing applications in Spark and MongoDB environments.
This course follows the model of assessment throughout the semester (ATS), which does not include a final exam with a weighting of 100%.
The ATS consists of the following elements:
- 8 weekly assignments [2.5% * 8 = 20% in total]
- 2 mini-tests [15% each * 2 = 30% in total]
- Laboratory project [50%]
The laboratory project can be carried out individually or in groups. It consists of drawing up a practical project which will then be the subject of an individual oral discussion.
If the student does not pass the assessment throughout the semester (mark below 10), they may sit the 1st or 2nd season exam, which is worth 50% of the final mark. The remaining 50% comes from the laboratory project, which must be passed, or from an individual project carried out in its place. A student who does not pass the laboratory project or the individual project (if applicable) fails the course.
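The ATS weighting listed above can be checked with a short calculation. The following is a minimal sketch, assuming every element is marked on a 0 to 20 scale (the scale is not stated in this section and is inferred from the pass mark of 10); the function name `ats_mark` and the sample marks are hypothetical.

```python
# Weights taken from the ATS description: 8 assignments (20% in total),
# 2 mini-tests (30% in total), laboratory project (50%).
ATS_WEIGHTS = {"assignments": 0.20, "mini_tests": 0.30, "project": 0.50}


def ats_mark(assignments, mini_tests, project):
    """Return the weighted ATS mark (assumed 0-20 scale).

    assignments: list of 8 marks, each contributing 2.5%
    mini_tests:  list of 2 marks, each contributing 15%
    project:     single laboratory project mark, contributing 50%
    """
    if len(assignments) != 8 or len(mini_tests) != 2:
        raise ValueError("expected 8 assignment marks and 2 mini-test marks")
    assignments_avg = sum(assignments) / len(assignments)
    mini_tests_avg = sum(mini_tests) / len(mini_tests)
    total = (
        ATS_WEIGHTS["assignments"] * assignments_avg
        + ATS_WEIGHTS["mini_tests"] * mini_tests_avg
        + ATS_WEIGHTS["project"] * project
    )
    return round(total, 2)


# Hypothetical student: full marks on assignments, 10 on both mini-tests,
# 15 on the laboratory project.
example = ats_mark([20] * 8, [10, 10], 15)
```

Averaging each group before applying the group weight is equivalent to weighting each assignment at 2.5% and each mini-test at 15%.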
1. Nudurupati, S. (2021). Essential PySpark for Scalable Data Analytics: A beginner’s guide to harnessing the power and ease of PySpark 3. Packt Publishing.
2. Sardar, T. H. (2023). Big data computing: Advances in technologies, methodologies, and applications. CRC Press.
3. Tandon, A., Ryza, S., Laserson, U., Owen, S., & Wills, J. (2022). Advanced analytics with PySpark: Patterns for learning from data at scale using Python and Spark. O’Reilly Media.
Artificial Intelligence
Upon completion of the course, students should:
LO1: Recognize the advantages and challenges of using Artificial Intelligence (AI) techniques and approaches, demonstrating critical awareness of informed and uninformed search methods.
LO2: Select and justify the most appropriate technological approaches and algorithms, including search methods, representation, and reasoning logics.
LO3: Apply the concepts and techniques discussed in the design and development of AI-based systems, as well as in the modeling of examples based on real scenarios.
LO4: Develop, implement, and evaluate solutions involving predicate logic and logic programming.
LO5: Understand the fundamentals of genetic algorithms, being able to implement and adapt them to solve specific problems.
LO6: Work autonomously and in groups to develop projects that apply the acquired knowledge, demonstrating the ability to adapt and solve complex problems in the AI field.
S1: Fundamental notions of AI with emphasis on the search-based approach.
S2: Search algorithms: depth-first and breadth-first search, A*, greedy best-first search, Dijkstra.
S3: Fundamental notions relating to knowledge, representation and the architecture of knowledge-based systems.
S4: First-order predicate logic: representation and deduction.
S5: Declarative knowledge represented in Logic Programming.
S6: Genetic algorithms.
Assessment throughout the semester consists of 3 assessment blocks (AB), and each AB consists of one or more assessment moments. It is organised as follows:
- AB1: 4 mini-tasks [7.5% each mini-task * 4 = 30%]
- AB2: 2 mini-tests [20% each mini-test * 2 = 40%]
- AB3: 1 project in Artificial Intelligence [30%]
Assessment by exam:
- 1st Season [100%]
- 2nd Season [100%]
All assessment blocks (AB1, AB2 and AB3) have a minimum mark of 8.5. In any AB, it may be necessary to hold an individual oral discussion to assess knowledge.
Assessment by examination consists of a written exam covering all the knowledge set out in the syllabus of the course, with a weighting of 100 per cent.
Attendance at classes is not compulsory.
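The block weights and the minimum-mark rule for assessment throughout the semester can be illustrated with a small calculation. This is a sketch under assumptions: marks are taken to be on a 0 to 20 scale, the function name `periodic_assessment` is hypothetical, and the consequence of missing a block minimum is not stated in this section, so the code only reports whether the minimums are met.

```python
# Block weights from the assessment description: AB1 30%, AB2 40%, AB3 30%.
AB_WEIGHTS = {"AB1": 0.30, "AB2": 0.40, "AB3": 0.30}
MIN_BLOCK_MARK = 8.5  # minimum mark required in every assessment block


def periodic_assessment(block_marks):
    """Return (weighted mark, whether every block meets the minimum).

    block_marks: dict mapping "AB1", "AB2", "AB3" to the mark obtained
    in that block (assumed 0-20 scale).
    """
    meets_minimums = all(
        block_marks[block] >= MIN_BLOCK_MARK for block in AB_WEIGHTS
    )
    weighted = sum(
        weight * block_marks[block] for block, weight in AB_WEIGHTS.items()
    )
    return round(weighted, 2), meets_minimums


# Hypothetical marks for one student.
mark, ok = periodic_assessment({"AB1": 12, "AB2": 10, "AB3": 16})
```

Returning the flag separately from the mark keeps the sketch neutral about what the course does when a block minimum is missed.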
Bishara, M. H. A., & Bishara, M. H. A. (2019). Search algorithms types: Breadth and depth first search algorithm.
Brachman, R., & Levesque, H. (2004). Knowledge representation and reasoning. Morgan Kaufmann.
Clocksin, W. F., & Mellish, C. S. (2003). Programming in Prolog. Springer Berlin Heidelberg.
Russell, S., & Norvig, P. (2010). Artificial Intelligence: A Modern Approach (3rd ed.). Prentice Hall.
S., V. C. S., & S., A. H. (2014). Artificial intelligence and machine learning (1st ed.). PHI Learning.
Data Analysis Programming
LO1 Introduction to the Python programming language.
LO2 Understand the principles of data science and data mining and understand and be able to apply the Cross-Industry Standard Process for Data Mining (CRISP-DM) in practical cases, including understanding the business and data and preparing the data for modeling.
LO3 Be able to execute and debug Python applications and use the fundamental libraries in practical cases of data preparation, exploration, visualization, and analysis of data features.
LO4 Understand machine learning algorithms in supervised prediction problems (classification, regression, time series) and unsupervised clustering problems, and be able to apply and evaluate their performance in practical problems, using the Python language, in the context of the CRISP-DM methodology.
LO5 Understand and be able to apply ethical and privacy considerations in data analysis and discuss future trends in this domain.
S1 Introduction to the programming language (Python 3)
S2 Python development environments
S3 Control primitives, variables, expressions, and declarations. Objects and object classes. Functions, modules, and packages
S4 File operations. Interpretation of JSON and XML data. Database operations. Web scraping
S5 Introduction to data science, the data cycle, and data mining. CRISP-DM model
S6 Data preparation and cleaning techniques
S7 Exploratory analysis and data visualization
S8 Selection and engineering of relevant data features
S9 Prediction methods in machine learning (classification, regression, time series). Essential algorithms, including their evaluation with performance metrics
S10 Clustering methods in machine learning. Essential algorithms
S11 Ethical considerations: privacy, security, and responsible data handling
S12 Emerging technologies and their impact on data analysis
Course unit with Periodic Assessment, not including a Final Exam. Weight of assessment:
- Individual assignments, of which at least 80% must be submitted [25%]
- Laboratory project (group of 2), with individual oral discussion [50%]
- 2 multiple-choice mini-tests [25%]
If students fail in the regular period (grade < 10), they can take the 1st or 2nd term exam, which will count for 50% of the grade. It is mandatory to pass the group project or to carry out an individual project (50%).
Larose, D., Larose, C., Data Mining and Predictive Analytics, 2015, 2nd Edition, John Wiley & Sons
Hastie, T., Tibshirani, R., Friedman, J., The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2017, 2nd ed. New York: Springer
Alpaydin, E., Introduction to Machine Learning, 2010, MIT Press, ISBN 026201243X
Reitz, K., Schlusser, T., The Hitchhiker's Guide to Python: Best Practices for Development, 2016, 1st Edition, ISBN-13: 978-1491933176, https://docs.python-guide.org/
Martins, J. P., Programação em Python: Introdução à programação utilizando múltiplos paradigmas, 2015, IST Press, ISBN: 9789898481474
Zelle, J., Python Programming: An Introduction to Computer Science, 2016, Franklin, Beedle & Associates Inc, ISBN-13: 978-1590282755
Matthes, E., Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming, 2019, No Starch Press, US, ISBN-13: 978-1593279288
Beazley, D., Jones, B., Python Cookbook: Recipes for Mastering Python 3, 2013, O'Reilly Media, ISBN-13: 978-1449340377
Neto, J. P., Programação, Algoritmos e Estruturas de Dados, 2016, Escolar Ed., 3ª Edição, ISBN: 9789725924242