Original poster: kissky

The Big Problem with Big Data: A Severe Talent Shortage

Big Data's Big Problem: Little Talent

    By BEN ROONEY

It seems that the markets are as much in love with "Big Data"—the ability to acquire, process and sort vast quantities of data in real time—as the technology industry.

The first Big Data initial public offering hit the market last week to roaring approval. Splunk Inc., which helps businesses organize and make sense of all the information they gather, soared 109% on its first day of trading. Big Data, big price.

And this week, in cities in the U.S. and the U.K., Big Data Week events are being held to proselytize the unbelievers.

Big Data refers to the idea that an enterprise can mine all the data it collects right across its operations to unlock golden nuggets of business intelligence. And whereas companies in the past have had to rely on sampling, Big Data, or so the promise goes, means you can use your entire corpus of digitized corporate knowledge. It is, by all accounts, the next big thing.

However, according to a report published last year by McKinsey, there is a problem. "A significant constraint on realizing value from Big Data will be a shortage of talent, particularly of people with deep expertise in statistics and machine learning, and the managers and analysts who know how to operate companies by using insights from Big Data," the report said. "We project a need for 1.5 million additional managers and analysts in the United States who can ask the right questions and consume the results of the analysis of Big Data effectively." What the industry needs is a new type of person: the data scientist.

According to Pat Gelsinger, president and chief operating officer of EMC Corp., the giant U.S. data company, this isn't an unprecedented problem. "IBM started a generation of Cobol programmers," he said, referring to one of the first dominant programming languages. "Thirty years ago we didn't have computer-science departments; now every quality school on the planet has a CS department. Now nobody has a data-science department; in 30 years every school on the planet will have one."

Hilary Mason, chief scientist for the URL shortening service bit.ly, says a data scientist must have three key skills. "They can take a data set and model it mathematically and understand the math required to build those models; they can actually do that, which means they have the engineering skills…and finally they are someone who can find insights and tell stories from their data. That means asking the right questions, and that is usually the hardest piece."

It is this ability to turn data into information, and information into action, that presents the most challenges. It requires a deep understanding of the business to know the questions to ask. The problem that a lot of companies face is that they don't know what they don't know, as former U.S. Defense Secretary Donald Rumsfeld would say. The job of the data scientist isn't simply to uncover lost nuggets, but to discover new ones and, more importantly, to turn them into actions. Providing ever-larger screeds of information doesn't help anyone.

One of the earliest tests for biggish data was applying it to the battlefield. The Pentagon ran a number of field exercises of its Force XXI—a device that allows commanders to track forces on the battlefield—around the turn of the century. The hope was that giving generals "exquisite situational awareness" (i.e. knowing everything about everyone on the battlefield) would turn the art of warfare into a science. What they found was that just giving bad generals more information didn't make them good generals; they were still bad generals, just better informed.

At a conference in London this week on the subject, the data scientist was called, only half-jokingly, "a caped superhero."

So where can companies find these superheroes? Not from universities, it seems. Nigel Shadbolt, who doubles up as the professor of artificial intelligence at the University of Southampton as well as co-director (along with Tim Berners-Lee) of the U.K.'s Open Data Institute, said the courses don't yet exist. "Bits of it do exist in various departments around the country, and also in businesses, but as an integrated discipline it is only just starting to emerge."

Nor can they be found in recruitment agencies. Rob Grimsey, a director of IT recruitment agency Harvey Nash, said they had limited experience in recruiting data scientists—"which might be a statement in itself about how common these kind of roles are," he added.

One of the problems with Big Data is that it has to deal with real data from the real world, which tends to be messy and difficult to represent. Conventional relational databases are excellent at handling data that comes in discrete packets, such as your social security number or a stock price. They are less useful when it comes to, say, the content of a phone call, a video, or an email. Out in the real world, most data is unstructured. Handling this sort of real, messy, scrappy data isn't so simple.

"People have been doing data mining for years, but that was on the premise that the data was quite well behaved and lived in big relational databases," said Mr. Shadbolt. "How do you deal with data sets that might be very ragged, unreliable, with missing data?"

In the meantime, companies will have to be largely self-taught, said Nick Halstead, CEO of DataSift, one of the U.K. start-ups actually doing Big Data. He said that, when recruiting, the ability to ask questions about the data is the key, not mathematical prowess. "You have to be confident at the math, but one of our top people used to be an architect."

But Fernando Lucini, chief architect for Autonomy Corp., a U.K. software maker recently acquired by Hewlett-Packard Co., is much more optimistic. Mr. Lucini said the industry is fretting unnecessarily and should have more confidence in its own abilities. Most of these problems can be tackled through algorithms, he said, which coincidentally is the promise of Autonomy. "The problem can be solved by better tools. The tools need to help you understand the data. They can do the heavy lifting for you so that anyone in a business can use them and ask the questions they need to answer."