
CLUSTERING NEWS ARTICLES USING K-MEANS AND N-GRAMS

  • Project Research
  • 1-5 Chapters
  • Abstract : Available
  • Table of Content: Available
  • Reference Style: APA
  • Recommended for : Student Researchers
  • NGN 3000

ABSTRACT

Document clustering is an unsupervised machine learning technique that groups related items into clusters or subsets. The goal is to create clusters that are internally coherent yet substantially different from one another: items within the same cluster should be highly similar to each other and highly dissimilar to items in other clusters. Automatic document clustering plays a significant role in many fields, including data mining and information retrieval. This thesis aims to improve the overall efficiency of a document clustering technique by using N-grams and an efficient similarity measure, and in doing so improves the purity and accuracy of the resulting clusters. The preprocessing method is based on N-grams (sequences of N consecutive characters), which give no special consideration to stop-words or punctuation but create an overlap across the content of a document. This overlap makes the representation tolerant of errors, which considerably increases the quality of the clusters. The approach clusters news articles based on their N-gram representation, thereby reducing noise and increasing the probability of occurrence of the sequences within the articles. The proposed clustering technique has parameters that can be adjusted at the document representation level to improve the efficiency and quality of the generated clusters. Experiments were carried out in the R programming environment on the real-world Reuters-21578 and 20 Newsgroups datasets. The results demonstrate the effectiveness of the proposed technique at different levels of N-grams in terms of the accuracy and purity of the generated clusters. They also show that, on average, the proposed technique performs better than the baseline technique in both accuracy and purity, with the best results obtained when the N-gram window is N = 3.
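The character N-gram representation and the purity measure described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation, which was written in R. The function names (char_ngrams, ngram_profile, cosine_similarity, purity), the lowercasing and whitespace normalization, and the use of cosine similarity are all assumptions made for illustration, since the abstract does not name the similarity measure it uses. The K-means step itself is omitted.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a text.

    Lowercasing and whitespace collapsing are illustrative assumptions,
    not steps confirmed by the abstract.
    """
    normalized = " ".join(text.lower().split())
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


def ngram_profile(text, n=3):
    """Represent a document as a bag of character n-gram counts."""
    return Counter(char_ngrams(text, n))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count profiles.

    Cosine is an assumed stand-in for the similarity measure,
    which the abstract leaves unnamed.
    """
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def purity(cluster_ids, true_labels):
    """Fraction of documents that carry the majority label of their cluster."""
    label_counts = {}
    for cluster_id, label in zip(cluster_ids, true_labels):
        label_counts.setdefault(cluster_id, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in label_counts.values())
    return majority_total / len(true_labels)


# A misspelled word still shares most of its trigrams with the correct form,
# which is one way overlapping n-grams can absorb errors in the text.
correct = ngram_profile("government")
misspelled = ngram_profile("goverment")
unrelated = ngram_profile("football")
print(cosine_similarity(correct, misspelled) > cosine_similarity(correct, unrelated))

# Cluster 0 holds two "a" documents; cluster 1 holds one "b" and one "a".
print(purity([0, 0, 1, 1], ["a", "a", "b", "a"]))
```

In a fuller pipeline, each article's profile would be turned into a vector and passed to a K-means implementation, with n treated as the tunable parameter at the document representation level.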



