Subject Specific Stream Classification Preprocessing Algorithm for Twitter Data Stream

Nisansa de Silva, Danaja Maldeniya, Chamilka Wijeratne
arXiv preprint arXiv:1705.09995

The micro-blogging service Twitter is a lucrative source for data mining applications on global sentiment. However, because of the wide variety of subjects mentioned in the data items, it is inefficient to run a data mining algorithm on the raw data. This paper discusses an algorithm that accurately classifies the entire stream into a given number of mutually exclusive, collectively exhaustive streams, on each of which the data mining algorithm can be run separately, yielding more relevant results with high efficiency.

Keywords: Natural Language Processing | Machine Learning / Deep Learning | Big Data | Data Mining | Twitter | WordNet | Social Media