
The Big Data Lab provides comprehensive education in large-scale and distributed information systems, preparing students for highly skilled roles in emerging IT industries such as cloud computing, healthcare informatics, finance, data integration, and data analytics. The lab focuses on system development, testing, maintenance, data security and privacy, data integration, networking, cybersecurity, and application development. It also incorporates machine learning techniques, artificial intelligence applications, text mining, and natural language processing, reflecting the evolving landscape of big data. This work is complemented by courses such as Machine Learning, Artificial Intelligence, Text Mining, and Data Communications & Networking, which equip students for the challenges and opportunities in the rapidly growing field of big data and analytics. The lab is equipped with Lenovo P3 Tower workstations (i7-13700 vPro, NVIDIA T1000 8GB GPU, 32GB RAM, 3x 512GB M.2 SSD).

Courses
CSC111: Assembly-Language Programming
Faculty: Dr. Simon Shamoun
Spring 2025
This course offers a comprehensive study of computer organization, covering memory and addressing, number systems and conversion, assemblers, base registers, and relocation. It explores fixed-point and floating-point arithmetic, string processing, indexing, iteration, and Boolean operations. Additionally, it delves into subroutines, macros, and I/O channel programming.
CSC156: Introduction to Machine Learning
Faculty: Professor Corey Elwosky
Fall 2025
The course introduces the mathematical, algorithmic, and practical aspects of machine learning. Students will learn how to design applications that learn from data and past experience. Applications include classification, clustering, prediction, and decision making. Topics covered include regression, neural networks, decision trees, support vector machines, model and feature selection, ensemble methods, boosting, clustering, and graphical models.
CSC158: Introduction to Artificial Intelligence
Faculty: Dr. Simona Doboli
Spring 2025
This course explores how computers perform tasks traditionally requiring human intelligence, covering heuristic search, robotics, pattern recognition, game playing, theorem proving, question-answer systems, and natural language processing.
CSC149: Introduction to Text Mining
Faculty: Dr. Simona Doboli
Fall 2025
The course covers techniques used in text retrieval and text analysis applications such as search engines, text categorization and clustering, topic extraction, summarization, and sentiment analysis. Topics include natural language processing techniques for extracting relevant terms from text data, vector space and probabilistic methods for computing similarity between documents, document ranking, and clustering and classification methods for text analysis.
CSC175: Data Communications & Networking
Faculty: Dr. Simon Shamoun
Spring 2025
This course provides a technical introduction to data communication, covering the OSI Reference Model, layer services, protocols, LANs, packet switching, X.25, ISDN, file transfer, virtual terminals, system management, and distributed processing.
Lab Resources
VMware vSphere
The datacenter hosts a cloud computing infrastructure supported by a VMware vSphere cluster of 21 servers. VMware vSphere is a virtualization platform for running virtual machines (VMs) in a large-scale computing environment. The vSphere Client connects to a vCenter Server, through which the virtual machines are accessed and managed.
VPN remote access
The Computer Science VPN allows students and faculty to access resources of the Big Data Lab from off campus over the Internet. An OpenVPN client and a client profile from csconnect.hofstra.edu are required to connect to the VPN.