New Feature Selection Method of Uyghur Text Classification

Journal Title: 河南科技大学学报(自然科学版) - Year 2016, Vol 37, Issue 3

Abstract

To address the traditional chi-square statistic method's insufficient consideration of the frequency and category distribution of feature items, a new chi-square statistic feature selection method combined with cosine similarity was proposed. First, the mean term frequency-inverse document frequency (TF-IDF) was used to represent the features, and the selected feature items were balanced by introducing an adjustment formula, thereby modifying the traditional chi-square statistic method. Next, noisy texts were further eliminated using cosine similarity. Finally, a validation experiment was carried out on a collected Uyghur data set. The results show that the improved chi-square method is more robust and that its classification performance is superior to that of the traditional chi-square statistic method.
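For orientation, the two standard building blocks named in the abstract can be sketched in Python as follows. This is a minimal illustration of the textbook chi-square score for a (term, category) pair and of cosine similarity between two feature vectors. It does not reproduce the authors' adjustment formula or their mean TF-IDF weighting, neither of which is given in the abstract. The function names and the contingency-count parameters `a`, `b`, `c`, `d` are assumptions chosen for this example.

```python
import math


def chi_square(a, b, c, d):
    """Textbook chi-square score for one (term, category) pair.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that do not contain the term
    d: documents outside the category that do not contain the term
    (Parameter names are illustrative, not taken from the paper.)
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table: the term or category never (or always) occurs.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        # An all-zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)
```

One plausible use, stated here only as an assumption, is to rank terms by `chi_square` to pick features and then discard documents whose TF-IDF vector has a low `cosine_similarity` to a reference vector for their category. The exact procedure and any threshold would have to come from the full paper.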

Authors and Affiliations

Yan HE, HALIDAN• Abudureyimu, ALIYA• Aierken, Bingbing WU

  • EP ID EP461277
  • DOI 10.15926/j.cnki.issn1672-6871.2016.03.010

How To Cite

Yan HE, HALIDAN• Abudureyimu, ALIYA• Aierken, Bingbing WU (2016). New Feature Selection Method of Uyghur Text Classification. 河南科技大学学报(自然科学版), 37(3), 42-46. https://europub.co.uk./articles/-A-461277