The Biggest Challenge for Big Data Comes from Climate Change
Date: 2014-07-02 17:12

大数据的最大挑战来自气候变化

Global sea levels are about eight inches higher today than they were in 1880, and they are expected to rise another two to seven feet during this century. At the same time, some 5 million people in the U.S. live in 2.6 million coastal homes situated less than 4 feet above high tide.
你知道吗,今天的全球海平面要比1880年的时候高出8英寸,而就在本世纪内,全球海平面预计还将上涨2到7英尺。另外,美国沿海地区有260万户家庭的500余万人口的住宅,在海水满潮时,只高出海平面不到4英尺。
Do the math: Climate change is a problem, whatever its cause.
毫无疑问,气候变化是个大问题,不管导致它的原因是什么。
The problem? Actually making those complex calculations is an extremely challenging proposition. To understand the impact of climate change at the local level, you’ll need more than back-of-the-napkin mathematics.
那么如何计算气候对环境的影响呢?事实上,要进行这些复杂的计算,是一个极具挑战性的课题。要想了解气候变化对一国一地的影响水平,绝对不是在一张餐巾纸上写写画画就能算得出来的。
You’ll need big data technology.
这时你就需要大数据技术了。
Surging Seas is an interactive map and tool developed by the nonprofit Climate Central that shows in graphic detail the threats from sea-level rise and storm surges to all of the 3,000-plus coastal towns, cities, counties and states in the continental United States. With detail down to neighborhood scale—search for a specific location or zoom down as necessary—the tool matches areas with flooding risk timelines and provides links to fact sheets, data downloads, action plans, embeddable widgets, and other items.
“上升的海平面”(Surging Seas)是由非盈利组织“气候中心”(Climate Central)开发的一款互动式地图工具,它用图形的形式详细描绘了海平面上升和风暴潮给美国大陆沿海3000多个城市、城镇和农村造成的威胁。它的细节可以精确到每一个街区——你可以搜索一个特定的地理位置,或是按照需要继续缩小目标范围。这个工具会与存在洪泛风险的地区进行匹配,并且提供相关实时报道、数据下载、行动计划、内嵌小工具和其它相关事项的链接。
It’s the kind of number-crunching that was all but impossible only a few years ago.
这种数据处理方式仅仅在几年前还是不可能实现的。
‘Just as powerful, just as big’
能力有多大,困难就多大
“Our strategy is to tell people about their climate locally in ways they can understand, and the only way to do that is with big data analysis,” said Richard Wiles, vice president for strategic communications and director of research with Climate Central. “Big data allows you to say simple, clear things.”
气候中心的战略沟通副总裁兼研究主任理查德·怀尔斯表示:“我们的战略是以人们能够理解的方式告诉他们当地的气候情况,唯一能实现这个目标的方法就是通过大数据分析。大数据让你能够简单、清晰地表达。”
There are actually two types of big data in use today to help understand and deal with climate change, Wiles said. The first is relatively recently collected data that is so voluminous and complex that it couldn’t be effectively manipulated before, such as NASA images of heat over cities. This kind of data “literally was too big to handle not that long ago,” he said, “but now you can handle it on a regular computer.”
怀尔斯指出,目前主要有两种大数据形式可以用来帮助人们了解和应对气候变化。第一类是某些在近期才收集到的数据,但它们往往数据量极大且非常复杂,搁在以前很难对其进行有效分析,比如美国国家航空航天局(NASA)对各大城市的热成像绘图。怀尔斯表示,这种数据“一直到不久之前,还因为数据量过大而基本上没法处理,但是现在你已经可以在一台普通的电脑上处理它们了。”
The second type of big data is older datasets that may be less-than-reliable. This data “was always kind of there,” Wiles said, such as historic temperature trends in the United States. That kind of dataset is not overly complex, but it can be fraught with gaps and errors. “A guy in Oklahoma may have broken his thermometer back in 1936,” Wiles said, meaning that there could be no measurements at all for two months of that year.
第二类大数据是一些相对较老但可能不那么可靠的数据。怀尔斯表示,这些数据“基本上一直都在那儿”,比如美国的历史气温趋势。这种数据一般不太复杂,但有可能存在不少缺口和误差。比如怀尔斯就指出:“1936年,俄克拉荷马州的某个负责量气温的家伙有可能不小心把温度计弄坏了。”这样的话,当年可能就有两个月根本没有气温记录。
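The broken-thermometer case can be made concrete with a small sketch. The monthly readings below are hypothetical, and linear interpolation is only one simple way to bridge a gap; it is not presented as Climate Central's method, and real station records are typically repaired with more careful techniques such as comparison against neighboring stations.

```python
from typing import List, Optional


def fill_gaps(values: List[Optional[float]]) -> List[Optional[float]]:
    """Linearly interpolate interior runs of missing readings (None).

    Gaps at the very start or end are left as None, because there is
    no reading on one side to interpolate from.
    """
    filled = list(values)
    last_known = None  # index of the most recent non-missing reading
    for i, v in enumerate(values):
        if v is None:
            continue
        if last_known is not None and i - last_known > 1:
            start, end = values[last_known], v
            span = i - last_known
            for k in range(1, span):
                filled[last_known + k] = start + (end - start) * k / span
        last_known = i
    return filled


# Hypothetical monthly mean temperatures (degrees F) for one station in 1936,
# with two consecutive months missing.
monthly_1936 = [38.0, 42.0, 51.0, 60.0, None, None,
                84.0, 83.0, 74.0, 62.0, 49.0, 40.0]
filled_1936 = fill_gaps(monthly_1936)
missing_months = [i + 1 for i, v in enumerate(monthly_1936) if v is None]
```

Here `missing_months` flags May and June as the gap, and `filled_1936` carries estimated values for those two months while every measured value stays unchanged.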
Address those issues, and existing data can be “just as powerful, just as big,” Wiles said. “It makes it possible to make the story very local.”
怀尔斯表示,要解决这些问题,现有的数据可以说“能力有多大,困难就有多大。但是大数据技术使得揭示一城一地的气候变化成为可能。”
Climate Central imports data from historical government records to produce highly localized graphics for about 150 local TV weather forecasters across the U.S., illustrating climate change in each station’s particular area. For example, “Junes in Toledo are getting hotter,” Wiles said. “We use these data all the time to try to localize the climate change story so people can understand it.”
气候中心从政府的历史记录中获取原始数据,然后为美国各地的150余家地方电视台的天气预报节目制作高度本地化的气候图形,以阐释该地区的气候变化。比如怀尔斯指出:“今年六月,托雷多市变热了。我们一直利用这些数据试图让当地人了解气候变化趋势。”
‘One million hours of computation’
100万小时的计算
Though the Climate Central map is an effective tool for illustrating the problem of rising sea levels, big data technology is also helping researchers model, analyze, and predict the effects of climate change.
气候中心的地图是阐释海平面上升情况的一个非常有效的工具。此外,大数据技术还能帮助研究人员模拟、分析和预测气候变化的影响。
“Our goal is to turbo-charge the best science on massive data to create novel insights and drive action,” said Rebecca Moore, engineering manager for Google Earth Engine. Google Earth Engine aims to bring together the world’s satellite imagery—trillions of scientific measurements dating back almost 40 years—and make it available online along with tools for researchers.
谷歌地球引擎(Google Earth Engine)的工程经理瑞贝卡·摩尔介绍道:“我们的目标是助力最好的大数据分析技术,以催生新颖的见解并且促进行动。”谷歌地球引擎旨在将全球的卫星图像进行汇总,其中包括近40年来数以万亿计的科学观测数据,并将其与为研究人员开发的工具一道放在网上。
Global deforestation, for example, “is a significant contributor to climate change, and until recently you could not find a detailed current map of the state of the world’s forests anywhere,” Moore said. That changed last November when Science magazine published the first high-resolution maps of global forest change from 2000 to 2012, powered by Google Earth Engine.
比如在全球森林砍伐问题上,摩尔表示:“全球森林砍伐是气候变化的一个重要推手,直到不久之前,你在任何地方都找不到一份详细的、反映全球森林现状的地图。”但现在情况不同了,去年11月,《科学》(Science)杂志在谷歌地球引擎的帮助下,发布了首批2000至2012年的高分辨率全球森林变化图。
“We ran forest-mapping algorithms developed by Professor Matt Hansen of University of Maryland on almost 700,000 Landsat satellite images—a total of 20 trillion pixels,” she said. “It required more than one million hours of computation, but because we ran the analysis on 10,000 computers in parallel, Earth Engine was able to produce the results in a matter of days.”
摩尔介绍道:“我们运行的森林测绘算法是由马里兰大学(University of Maryland)的马特·汉森教授开发的,总共处理了近70万张美国陆地资源卫星的图像,加起来大约有20万亿个像素点。它需要超过100万小时的计算时间,但由于我们是在10,000台计算机上并行计算的,因此谷歌地球引擎才得以在几天内就得出了结果。”
On a single computer, that analysis would have taken more than 15 years. Anyone in the world can view the resulting interactive global map on a PC or mobile device.
如果只用一台计算机计算的话,完成这样一次分析大概需要超过15年的时间。但现在全球各地的任何人都可以在电脑或移动设备上查看这次分析得到的这张互动式全球地图。
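The figures quoted above can be checked with a few lines of arithmetic. This is a rough sketch that assumes the work divides evenly across machines with no coordination overhead. It also shows that one million hours on a strictly serial machine would be far more than 15 years; one possible reading, which is an assumption and not something the article states, is that the hours are CPU-core hours and the single computer has several cores.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 24 * 365

total_compute_hours = 1_000_000  # "more than one million hours of computation"
machines = 10_000                # "10,000 computers in parallel"


def wall_clock_hours(compute_hours: float, workers: int) -> float:
    """Idealized elapsed time, assuming perfectly even division of work."""
    return compute_hours / workers


parallel_hours = wall_clock_hours(total_compute_hours, machines)
parallel_days = parallel_hours / HOURS_PER_DAY

# The same total on one strictly serial worker, expressed in years.
serial_years = total_compute_hours / HOURS_PER_YEAR

# Degree of parallelism a single machine would need to finish in 15 years.
implied_cores = total_compute_hours / (15 * HOURS_PER_YEAR)
```

Under these assumptions the parallel run takes 100 hours, a little over four days, which matches "a matter of days". The 15-year single-machine figure corresponds to roughly seven or eight units of work running at once on that machine.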
‘We have sensors everywhere’
传感器无所不在
Rapidly propelling such developments, meanwhile, is the fact that data is being collected today on a larger scale than ever before.
在这些项目取得快速进展的背后离不开这样一个事实:如今我们对数据的收集程度已经远超以往任何时候。
“Big data in climate first means that we have sensors everywhere: in space, looking down via remote sensing satellites, and on the ground,” said Kirk Borne, a data scientist and professor at George Mason University. Those sensors are continually recording information about weather, land use, vegetation, oceans, ice cover, precipitation, drought, water quality, and many more variables, he said. They are also tracking correlations between datasets: biodiversity changes, invasive species, and at-risk species, for example.
乔治梅森大学的数据学家柯克·波恩教授指出:“大数据技术在气候研究领域的发展,首先意味着传感器已经无所不在。首先是太空中的遥感卫星,其次是地面上的传感器。”这些传感器时刻记录着地球各地的天气、土地利用、植被、海洋、冰层、降水、干旱、水质等信息以及许多变量。同时它们也在跟踪各种数据之间的关联,比如生物多样性的变化、入侵物种和濒危物种等等。
Two large monitoring projects of this kind are the National Ecological Observatory Network (NEON) and the Ocean Observatories Initiative (OOI).
在这一类监控项目中有两个比较有代表性的大型项目,一个是美国国家生态观测站网络(NEON),一个是海洋观测计划(OOI)。
“All of these sensors also deliver a vast increase in the rate and the number of climate-related parameters that we are now measuring, monitoring, and tracking,” Borne said. “These data give us increasingly deeper and broader coverage of climate change, both temporally and geospatially.”
波恩指出:“这些传感器令我们现在正在观测和追踪的气候参数无论在等级还是数量上都有了极大的提高。另外无论是在时间上还是在地理空间上,这些数据对气候变化的覆盖都变得越来越深、越来越广。”
Climate change is one of the largest examples of scientific modeling and simulation, Borne said. Efforts are focused not on tomorrow’s weather but on decades and centuries into the future.
波恩表示,气候变化是科学建模仿真应用得最广泛的例子之一。科学家不仅利用建模仿真来预测明天的天气,而且还用它来预测几十年甚至几百年后的气候。
“Huge climate simulations are now run daily, if not more frequently,” he said. These simulations have increasingly higher horizontal spatial resolution (tens of kilometers, versus hundreds of kilometers in older simulations); higher vertical resolution, referring to the number of atmospheric layers that can be modeled; and higher temporal resolution, zeroing in on minutes or hours as opposed to days or weeks, he added.
他还表示:“大规模的气候模拟现在每天都在运行,有些甚至可能更为频繁。”这些模拟的水平空间分辨率越来越高,达到几十公里,而过去的模拟只能达到几百公里。同时它们的垂直分辨率也变得更高,这也就表示可以对大气层中更多的层进行建模。另外时间分辨率也更高,可以精确到几分钟或几个小时,而不是几天或几个星期。
The output of each daily simulation amounts to petabytes of data and requires an assortment of tools for storing, processing, analyzing, visualizing, and mining.
每天的气候模拟都会生成PB(拍字节,即千万亿字节)量级的数据,并且需要一系列工具进行存储、处理、分析、图像化和挖掘。
‘All models are wrong, but some are useful’
所有模型都是错的,但有些很有用
Interpreting climate change data may be the most challenging part.
气候变化数据的解读可能是最具有挑战性的部分。
“When working with big data, it is easy to create a model that explains the correlations that we discover in our data,” Borne said. “But we need to remember that correlation does not imply causation, and so we need to apply systematic scientific methodology.”
波恩指出:“搞大数据时,要建立一个模型来解释我们在数据中发现的某种关联是很容易的。但我们得记住,这种关联并不代表原因,所以我们需要应用系统化的科学方法。”
It’s also important to heed the maxim that “all models are wrong, but some are useful,” Borne said, quoting statistician George Box. “This is especially critical for numerical computer simulations, where there are so many assumptions and ‘parameterizations of our ignorance.’
波恩还指出,搞大数据最好要记住统计学家乔治·博克斯的名言:“所有模型都是错的,但有些很有用。”他表示:“这对数字计算机模拟尤为重要,因为其中有很多假设和‘代表了我们的无知的参数’”。
“What fixes that problem—and also addresses Box’s warning—is data assimilation,” Borne said, referring to the process by which “we incorporate the latest and greatest observational data into the current model of a real system in order to correct, adjust, and validate. Big data play a vital and essential role in climate prediction science by providing corrective actions through ongoing data assimilation.”
波恩表示:“要想解决这个问题,以及解决博克斯警告我们的问题,最重要的是做好数据同化。”也就是“把最新最好的观测数据纳入一个真实系统的实时模型中,以对数据进行纠正、调整、确认。通过以不间断的数据同化作为校正措施,大数据在气候预测科学中扮演了至关重要且不可或缺的角色。”
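To illustrate the idea of data assimilation described above, here is a minimal sketch of a single scalar update that blends a model forecast with a new observation, weighting each by how uncertain it is. The function name, the variance-weighted rule and all numbers are illustrative assumptions; operational climate and weather systems apply far more elaborate schemes to millions of variables, and nothing here is taken from Borne's own work.

```python
from typing import Tuple


def assimilate(forecast: float, forecast_var: float,
               observation: float, obs_var: float) -> Tuple[float, float]:
    """Blend one forecast value with one observation.

    The gain is the fraction of the forecast-observation mismatch that is
    applied as a correction: close to 1 when the model is much less certain
    than the observation, close to 0 in the opposite case.
    """
    gain = forecast_var / (forecast_var + obs_var)
    analysis = forecast + gain * (observation - forecast)
    analysis_var = (1.0 - gain) * forecast_var
    return analysis, analysis_var


# Hypothetical example: the model predicts 15.0 degrees C with variance 4.0,
# and a sensor reports 17.0 degrees C with variance 1.0.
analysis, analysis_var = assimilate(15.0, 4.0, 17.0, 1.0)
```

Because the observation is assumed to be more trustworthy than the forecast here, the corrected estimate lands closer to the sensor reading than to the model value, and its variance is smaller than either input. Repeating this step as each new observation arrives is the "ongoing" part of the process.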
‘We are in a data revolution’
我们已经在一场数据革命之中
Earlier this year, the Obama administration launched Climate.data.gov with more than 100 curated, high-quality data sets, Web services, and tools that can be used by anyone to help prepare for the effects of climate change. At the same time, NASA invited citizens to help find solutions to the coastal flooding challenge at an April mass-collaboration event.
今年早些时候,奥巴马政府推出了官方的气象研究网站Climate.data.gov,上面有100多种精心编辑的高质量数据以及网页服务和工具,任何人都可以利用这些数据与工具来研究气候变化的影响。与此同时,NASA也在今年四月的一次大型协作活动上,邀请普通民众协助其寻找应对沿海洪灾的解决方案。
More recently, UN Global Pulse launched a Big Data Climate Challenge to crowdsource projects that use big data to address the economic dimensions of climate change.
最近,联合国“全球脉动”行动(UN Global Pulse)推出了一项“大数据气候挑战”项目,将一些用大数据研究气候变化对经济的影响的项目通过众包的形式进行了发布。
“We’ve already received submissions from 20 countries in energy, smart cities, forestry and agriculture,” said Miguel Luengo-Oroz, chief scientist for Global Pulse, which focuses on relief and development efforts around the world. “We also hope to see submissions from fields such as architecture, green data centers, risk management and material sciences.”
“全球脉动”行动主要致力于全球各地的扶贫救灾与发展事业,该行动的首席科学家卢恩戈·奥罗兹表示:“我们已经收到了来自20个国家的在能源、智能城市、林业和农业等领域的意见书。我们也希望收到建筑、绿色数据中心、风险管理和材料科学等领域的意见书。”
Big data can allow for more efficient responses to emerging crises, distributed access to knowledge, and greater understanding of the effects personal and policy decisions have on the planet’s climate, Luengo-Oroz added.
卢恩戈·奥罗兹补充道,大数据还可以用于提高突发灾害的应急工作效率,提供更广泛地获取知识的渠道,以及帮助我们更好地了解私人与政府的决策会对地球的气候造成哪些影响。
“But it’s not the data that will save us,” he said. “It’s the analysis and usage of the data that can help us make better decisions for climate action. Just like with climate change, it is no longer a question of, ‘is this happening?’ We are in a data revolution.”
奥罗兹表示:“然而拯救我们的不是那些数据,而是那些让我们能做出更好的决策来应对气候变化的数据分析与使用方法。这就像气候变化本身一样,现在已经不是‘它开始了吗’的问题。我们已经在一场数据革命之中。”
