Fei-Fei Li's Immigrant Story and the Birth of ImageNet
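Professor Fei-Fei Li recently published her memoir, The Worlds I See. It is the story of her own coming of age and also the story of an immigrant family's struggle.
Professor Li has made outstanding contributions to computer vision. The ImageNet dataset she created gave deep learning a benchmark for image recognition, which is where it later shone.
Professor Li was born in Beijing and grew up in Chengdu. Her father moved to the United States first, in 1989; three years later she and her mother were able to make the trip and reunite with him there.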
In the initial phase, which I quickly realized was spearheaded by my mother, my father would find work and secure a place to live. In the second phase, to follow shortly thereafter, we would join him.
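Because there were no flights from Chengdu to the United States, she and her mother first traveled to Shanghai and flew out from there. Before leaving Shanghai, Li made a point of visiting the Pujiang Hotel, where Einstein was rumored to have stayed. It was around the time of his visit to Shanghai that Einstein won the Nobel Prize. For the young Li there was a further source of encouragement: Einstein was an immigrant too.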
Maybe this won’t be so bad, I thought. Einstein was an immigrant, too, after all.
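When they landed in the United States, they waited a long time and her father did not show up to meet them. She and her mother grew frantic: they had no way to reach him and no money for tickets back to China, and her spoken English was nowhere near good enough to communicate with anyone.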
My mother had exactly twenty U.S. dollars in her pocket, we had no return ticket, and I quickly found that the couple of years I’d spent learning basic English in school were all but useless in practice.
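Her father finally arrived. He was so late because his secondhand car had broken down on the way, something that happens often to immigrants short on money.
She had barely caught her breath in America before she was off to school. As it turned out, her very first day there made her fall in love with America.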
On an otherwise imposing first day, I instantly knew one thing: I would love American teachers.
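Later, Li was admitted to Princeton to study physics on a full scholarship. For the first time, this financially strained family could breathe a sigh of relief.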
Every milestone of her life had been a reminder that she was on the wrong side of divides she had no hope of bridging, conditioning her over decades to feign a confidence I knew she never truly felt.
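By the time she graduated from Princeton, the first dot-com bubble was in full swing. Even Wall Street was hiring frantically to catch the wave. Li received a number of job offers from Wall Street investment banks, so she went home and talked it over with her mother. A Wall Street salary would have lifted the immigrant family out of hardship at once. But it took her mother only two questions to make Li see that this was not what she really wanted.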
“Fei-Fei, is it what you want?”
“You know what I want, Mom. I want to be a scientist.”
“So what are we even talking about?”
https://letters.acacess.com/weekly-123
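In the previous issue of the DPS weekly, we covered Professor Fei-Fei Li's immigrant story, The Worlds I See. Her best-known work, though, is ImageNet. This milestone of computer vision had a rocky birth and nearly did not survive.
When Li first floated the idea of ImageNet to the people around her, many thought it was a fantasy, including even her academic grandfather, Jitendra Malik: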
He paused for a moment, then continued. “Frankly, I think you’ve taken this idea way too far.” I took a shallow breath. “The trick to science is to grow with your field. Not to leap so far ahead of it.”
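Fortunately, Li met Professor Kai Li and his student Jia Deng. Their focus was distributed computing, and they helped her with the engineering problems; building a dataset of more than a million images was no small engineering feat at the time.
At first they had students search for images one at a time and download them by hand. When Jia Deng worked out the effort involved, he found it would take 19 years to finish the downloads. So he wrote a crawler that automatically retrieved images from Google and downloaded them. Before long Google blocked the crawler, until he got around the block by using dynamic IP addresses.
With image downloading solved, labeling the data became the second major problem. Another graduate student, Min, learned of the challenge while chatting with Jia Deng and suggested to Li that they use Amazon's crowdsourcing service, Mechanical Turk (MTurk), to farm the labeling out to MTurk workers around the world.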
I instantly forgot about my haste as my ears perked up. Jia has a social life?
After two years of unrelenting effort, ImageNet was finally built, containing nearly 15 million images.
After two more years on the knife-edge of our finances—an agonizing stretch in which even a minor bump in the road might have sunk us for good—ImageNet was finally maturing into the research tool Jia and I had always envisioned.
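In 2009 they submitted the paper ImageNet: A Large-Scale Hierarchical Image Database to one of the top computer vision conferences, the Conference on Computer Vision and Pattern Recognition (CVPR), only to be given a poster slot. Can you believe it? One of the most important papers in the history of computer vision was given a poster session rather than an oral presentation.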
Our first setback was also the most consequential: that ImageNet was relegated to a poster session.
ImageNet drew little attention at CVPR 2009. But Li and her team were not discouraged; on the contrary, she believed firmly in their work:
“I don’t think ImageNet will make today’s algorithms better,” I said. “I think it will make them obsolete.”
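Nor did they shelve the dataset. Instead they turned it into a challenge that anyone could enter, teaming up with an existing competition, PASCAL VOC, so that ImageNet became one of its tracks.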
Mark was a rising star in the world of computer vision in his own right, and kindly allowed ImageNet to begin its life as a new track within the PASCAL VOC competition, then in its sixth year. It was an especially gracious offer, giving us the chance to learn the ropes within the confines of something already established.
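Of course, the "old-school" machine learning algorithms of the day, such as random forests and support vector machines, could do little with a dataset this large, so participants backed away: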
Worst of all, participation was already dropping, and precipitously: registrations fell from 150 to 96 in the second year, and the entries themselves fell from 35 to just 15.
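It was not until 2012 that Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton used AlexNet, a neural network, to bring the recognition error rate on ImageNet down to 15.3%, 10.8 percentage points better than the runner-up. Li and Jia Deng could hardly believe their eyes, because neural networks were regarded as an antique algorithm at the time, and many machine learning textbooks gave them only a passing mention :)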
“All right. Well, first of all, they’re using a really unorthodox algorithm. It’s a neural network, if you can believe it.” My ears perked up even more. If he didn’t have the entirety of my focus a moment ago, he certainly did now. “It’s like… ancient.”
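It was also pure chance that Hinton learned about ImageNet. Neural networks had long since been written off by most researchers, and he was among the few who quietly kept at them. He had been struggling to find a dataset large enough to test his algorithms, until one day he complained to his old friend Jitendra, and Jitendra brought up ImageNet. Yes, the same skeptical academic grandfather from earlier.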
“You really want to impress me, Geoff? Show me they can handle something serious.”
“Like?”
“Like object recognition. In the real world.” Whatever Jitendra thought about ImageNet, I’d known since my days at Caltech that he was a believer in the power of visual categorization. “Have you tried PASCAL VOC?”
“Yeah. No luck. It’s just too small. There aren’t enough examples, so the network doesn’t generalize very well when we show it something new.”
“All right, so you need something bigger. Have you been following Fei-Fei’s lab, by any chance? When you’re ready for a real challenge, take a look at what they’re up to.”
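So ImageNet and AlexNet made each other: without data at ImageNet's scale, AlexNet would not have achieved its startling breakthrough; without AlexNet, ImageNet would not have become so widely known. Everything after that is familiar history.
Looking back, ImageNet had a truly troubled path. Had Li been just a little less persistent, the whole pace of computer vision's development would have been different.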