The full deployment of sixth-generation (6G) networks is inextricably linked to a holistic network redesign that can address various emerging challenges, such as the integration of heterogeneous technologies and devices and the support of latency- and bandwidth-demanding applications. In such a complex environment, resource optimization and the enhancement of security and privacy can be quite demanding, owing to the vast number and diversity of data generation endpoints and associated hardware elements. Efficient data collection mechanisms that can be deployed on any network infrastructure are therefore needed. In this context, the network data analytics function (NWDAF), which can collect data from various network functions (NFs), has been defined in the fifth-generation (5G) architecture since 3GPP Release 15. When combined with advanced machine learning (ML) techniques, the NWDAF can support full-scale network optimization according to traffic demands and service requirements. In addition, the data collected by the NWDAF can be used for anomaly detection and, thus, for security and privacy enhancement. The main goal of this paper is to present the current state of the art on the role of the NWDAF in data collection, resource optimization, and security enhancement in next-generation broadband networks. Furthermore, various key enabling technologies for data collection and threat mitigation in the 6G framework are identified and categorized, along with advanced ML approaches. Finally, a high-level architectural approach based on the NWDAF is presented and discussed for efficient data collection and ML model training in large-scale heterogeneous environments.