Description
Social media has become part of everyday life. A large number of people are active on these platforms, and the result is a large, continuously refreshed stream of data. This user-generated content carries significant information about events, knowledge, updates, news, and more, and it can take the form of text, images, or video. Such content can be useful for developing applications that serve human welfare. The proposed work therefore aims to apply machine learning algorithms to social media data to detect natural disaster events. The Natural Disaster Detection system combines machine learning and text-processing techniques for this task.
Two publicly available disaster-related social media datasets are considered. The first is the Disaster Response Message dataset and the second is the Fake News Detection dataset. These datasets can be downloaded from link 1 and link 2. The simulation is implemented in Python. Text preprocessing techniques are first used to clean the data. Next, the Term Frequency-Inverse Document Frequency (TF-IDF) technique is applied to select features from the dataset. The dataset is then divided into two parts: a training set (70%) and a validation set (30%).
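The cleaning, TF-IDF weighting, and 70/30 split described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the project's actual code: the example messages, the function names (`tokenize`, `tfidf_vectors`, `split_70_30`), the regular-expression cleaning rule, and the smoothed IDF formula `log(N / df) + 1` are all assumptions made for the example.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simple cleaning step)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using tf * (log(N / df) + 1) weighting."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(documents)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log(n_docs / doc_freq[tok]) + 1.0 for tok in vocabulary}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocabulary])
    return vocabulary, vectors


def split_70_30(items, seed=0):
    """Shuffle a copy of the items and split it into 70% / 30% parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical messages, invented for illustration only.
messages = [
    "Flood waters rising near the river bridge!",
    "Earthquake felt downtown, buildings shaking",
    "Great concert last night with friends",
    "Need water and food after the flood",
    "New phone review coming soon",
    "Wildfire smoke is spreading across the valley",
    "Traffic is slow on the highway today",
    "Storm damaged the power lines, no electricity",
    "Happy birthday to my best friend",
    "Rescue teams searching after the earthquake",
]

vocab, vectors = tfidf_vectors(messages)
train, valid = split_70_30(vectors)
print(len(vocab), len(train), len(valid))
```

Terms that occur in many messages receive a lower IDF weight than terms concentrated in a few messages, which is why TF-IDF tends to highlight disaster-specific words.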
An unsupervised learning algorithm is chosen for the experiments because, in the real world, social media data arrives without category labels, and unsupervised learning can be applied to such data directly. A modified Fuzzy C-Means (FCM) clustering algorithm is implemented to cluster the data, with the selected centroids modified to improve detection quality. In addition, a self-adaptive learning concept is implemented to improve the learning of the FCM algorithm for detecting the target events. Natural Disaster Detection performance is measured in terms of accuracy, precision, recall, and F-score. Finally, the results are visualized, conclusions drawn from the experiments are discussed, and future work is reported.
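For orientation, the sketch below shows the standard (unmodified) Fuzzy C-Means update rules together with the four evaluation metrics. It does not reproduce the modified centroid selection or the self-adaptive learning step, whose details are not given in this description. The toy 2-D points, the labels, the function names, and the majority-vote mapping from clusters to labels are assumptions made for illustration; the labels are used only for evaluation, never for clustering.

```python
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard FCM: returns (centroids, memberships) for a list of vectors."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centroids = []
    for _ in range(n_iter):
        # Centroid update: mean of the points weighted by membership ** m.
        centroids = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centroids.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum for d in range(dim)]
            )
        # Membership update: proportional to distance ** (-2 / (m - 1)).
        new_u = []
        for p in points:
            dists = [
                max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                for c in centroids
            ]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            inv_sum = sum(inv)
            new_u.append([v / inv_sum for v in inv])
        shift = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(n_clusters))
        u = new_u
        if shift < tol:  # memberships have stopped changing
            break
    return centroids, u


def evaluate(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f_score) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f_score


# Toy data invented for illustration: two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
labels = [0, 0, 0, 1, 1, 1]  # 1 = disaster-related (hypothetical)

centroids, memberships = fuzzy_c_means(points)
# Hard assignment: the cluster with the highest membership for each point.
hard = [max(range(len(row)), key=row.__getitem__) for row in memberships]
# Map each cluster to the majority label among its members (evaluation only).
cluster_to_label = {}
for j in set(hard):
    members = [labels[i] for i in range(len(labels)) if hard[i] == j]
    cluster_to_label[j] = max(set(members), key=members.count)
predicted = [cluster_to_label[j] for j in hard]
accuracy, precision, recall, f_score = evaluate(labels, predicted)
print(accuracy, precision, recall, f_score)
```

Unlike hard clustering, each point keeps a degree of membership in every cluster, which suits short social media messages that may only partly relate to a disaster.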