hetongguo posted on 2011-9-20 11:42
What kind of help? Please specify.
I am currently working for a software company.
I need to do the data mining for their new project, and I am the only one who has a statistics background. The new project is something like this: it searches web sites and discussion forums for content, then searches that content for keywords. The problem is that we have a huge amount of data, and I think the toughest part is how to select the data. For example, we need to extract the keyword "what is the best sushi in Beijing" from a discussion forum, and we use Java to do it.
There are many people talking about sushi in Beijing on the forum, so it is hard to figure out what they are actually saying, because someone may write "good" but actually mean bad.
My duty is data mining, and I find it really hard to figure out what kind of data we need (for example, some of the data are just outliers, etc.).
That's why I am here, trying to get help from somebody with a lot of experience in this area, because I just graduated from university. My background is in statistics, but I never did any data mining while I was at university.
Thanks