Wednesday, September 14, 2011

Proposal Defense at CDCSIT

Text clustering with the standard vector space model cannot capture the
semantic relatedness of words. Because the traditional vector space model lacks
this capability, an enhanced vector method is proposed. No such research has yet
been carried out for opinion mining in Nepal, which makes it a leading task for
Nepali researchers who want to work on text clustering in the Nepali language.
Algorithms that work for English may not work for other languages. Clustering
lets the analyst focus on the clusters that contain the largest number of
documents, which saves time when the opinions have to be analyzed.
None of the papers seems to focus on the classical vector space model, even
though it is simple and easy to compute, because it cannot detect semantic
similarity.
If the classical Vector Space Model is used for clustering, only the syntactic
structure is taken into consideration. In the example below, the model cannot
clearly determine that the first two sentences have the same meaning, because
the classical vector space model is concerned only with syntax. In this thesis,
semantically related texts are grouped together using fuzzy set theory.
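
To make the limitation concrete, here is a minimal sketch of the classical
model using raw term counts and cosine similarity. The three English sentences
and the function names term_vector and cosine_similarity are illustrative
assumptions, not data or code from the thesis.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: maps each lowercased token to its count."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sentences: s1 and s2 mean roughly the same thing,
# s3 means the opposite of s1 but shares more surface words with it.
s1 = term_vector("the movie was good")
s2 = term_vector("the film was great")
s3 = term_vector("the movie was bad")

sim_same_meaning = cosine_similarity(s1, s2)
sim_opposite_meaning = cosine_similarity(s1, s3)

print(f"{sim_same_meaning:.2f}")      # 0.50
print(f"{sim_opposite_meaning:.2f}")  # 0.75
```

Because only shared surface words count, the pair with opposite meaning scores
higher than the pair with the same meaning.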
Whether the Enhanced Vector Space Model works properly has to be verified. The
combination of fuzzy sets and the classical Vector Space Model is another study
to be carried out in this thesis.
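
The post does not say how fuzzy sets and the vector space model are combined,
so the following is only one possible reading, sketched under loud assumptions:
the FUZZY_RELATED table, its membership degrees, and the max-based weighting are
all hypothetical and are not taken from the thesis.

```python
import math

# Hypothetical fuzzy relation between terms: a degree in [0, 1] stating
# how strongly two different words are related in meaning.
FUZZY_RELATED = {
    ("movie", "film"): 0.9,
    ("good", "great"): 0.8,
}


def relatedness(t1, t2):
    """Symmetric fuzzy relatedness; identical terms are fully related."""
    if t1 == t2:
        return 1.0
    return FUZZY_RELATED.get((t1, t2), FUZZY_RELATED.get((t2, t1), 0.0))


def enhanced_vector(text, vocabulary):
    """Weight of each vocabulary term = its highest relatedness to any
    token of the text (the max plays the role of a fuzzy union)."""
    tokens = text.lower().split()
    return [
        max((relatedness(term, tok) for tok in tokens), default=0.0)
        for term in vocabulary
    ]


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = ["the movie was good", "the film was great", "the movie was bad"]
vocabulary = sorted({tok for d in docs for tok in d.lower().split()})
vectors = [enhanced_vector(d, vocabulary) for d in docs]

sim_same_meaning = cosine(vectors[0], vectors[1])
sim_opposite_meaning = cosine(vectors[0], vectors[2])
print(sim_same_meaning > sim_opposite_meaning)  # True
```

With these assumed degrees, the two sentences that share a meaning end up
closer than the pair that merely shares words. Whether this holds on real
Nepali text is exactly the check described above.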