机器学习换脸视频背后的黑暗推手(2)

日期:2020-01-19 11:36

At SRI International, Aaron and his team have been working across disciplines to create a multi-pronged approach for detecting deep fakes.
在斯坦福国际研究院，亚伦和他的团队一直在开展跨学科研究，试图发明一种多管齐下检测深度造假视频的工具。
The system they've developed is called SAVI.
他们开发的系统名为SAVI。
"So our group focused on speech.
“我们组重点关注的就是言语。
And in the context of this SAVI program, we worked with people in the artificial intelligence center who are doing vision.
就这个SAVI项目而言，我们和人工智能中心视觉部门的同事达成了合作。
And put our technologies together to collaborate on coming up with a set of tools that can detect things like, here's the face.
通过将我们两边的技术结合到一起，来研发一套工具，能够检测到，比如，这是一张脸。
Here's the identity of the face.
这是这张脸的身份信息。
It's the same person that was earlier in the video.
他和视频前面出现的那个人其实是同一个人。
The lips are moving, okay.
他的嘴唇在动。好吧。
And then we use our speech technology and say,
那我们就用我们的言语分析技术看看
"Can we verify that this piece of audio and this piece of audio came from the same speaker or a different speaker?"
“我们能不能核实这段音频和这段音频是否源自同一个人，还是不同的人？”
And then put those together as a tool that would say,
然后，我们把它们放到一起，组成一个检测工具，它会提示：
"If you see a face and you see the lips moving, the voice should be the same or you wanna flag something."
“看到了一张脸，还看到它的嘴唇在动，那它的声音应该是前后一致的，否则就对其进行标记。”
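
The check described in this quote can be illustrated with a minimal sketch. It is not SRI's SAVI implementation: the speaker embeddings, the cosine-similarity comparison, the function names, and the 0.7 threshold are all assumptions chosen for illustration. The idea is only that when the same face is on screen and the lips are moving, two audio segments should verify as the same speaker, and a mismatch gets flagged.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_speaker(emb_a, emb_b, threshold=0.7):
    """Hypothetical verification rule: embeddings that are similar enough
    are treated as the same speaker. The threshold is an assumed value."""
    return cosine_similarity(emb_a, emb_b) >= threshold


def should_flag(same_face, lips_moving, emb_earlier, emb_now, threshold=0.7):
    """Flag when the face is unchanged and the lips are moving,
    but the voice no longer verifies as the earlier speaker."""
    if not (same_face and lips_moving):
        return False  # this rule only applies to a visible, talking, known face
    return not same_speaker(emb_earlier, emb_now, threshold)
```

In a real system the embeddings would come from a trained speaker model; here they are plain lists of floats so the rule itself is easy to follow.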
However, there is always a worry that making these detection systems more available could unintentionally provide deep fake creators with workarounds.
然而，人们总是担心，降低这些检测系统的使用门槛，反而会无意中正中深度造假视频制作者的下怀。
If released, the methods meant to catch the altered media could potentially drive the next generation of deep fakes.
这些用来识别被篡改媒体的方法一旦公开，反而可能催生出下一代深度造假视频。
As a result, these detection systems have to evolve.
因此，这些检测系统必须升级。
In its newest iteration, Aaron gave us a run through of how various aspects of the system work, without giving too much away.
亚伦向我们介绍了最新版本的检测系统在保证避免透露太多信息的前提下，它的各个方面是如何运作的。
"This is an explicit lip sync detection.
“这是一个显式的口型同步检测。
What we're doing here is we're learning from audio and visual tracks what the lip movement should be given some speech and vice versa.
我们当前的工作就是对照音视频轨道了解特定话语的唇部变化应该是什么样以及特定的唇部变化对应的应该是什么话语。
And we're detecting when that deviates from what you would expect to see and hear."
接下来要检测的便是音视频材料的哪些部分偏离了我们的预期，不是我们应该看到或听到的内容。”
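
As a rough illustration of the deviation check described above, the sketch below compares an expected per-frame lip signal (what a model would predict from the audio) with the observed one (measured from the video) and reports the windows where they disagree. The single "mouth openness" value per frame, the window size, and the threshold are assumptions made for this example; the transcript does not say how the actual system represents lip movement.

```python
def flag_lip_sync_deviation(expected, observed, window=3, threshold=0.3):
    """Return the start indices of sliding windows whose mean absolute
    difference between expected and observed lip signals exceeds threshold.

    expected: per-frame values predicted from the audio (assumed 0..1)
    observed: per-frame values measured from the video (assumed 0..1)
    """
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same length")

    # Per-frame disagreement between what we hear and what we see.
    errors = [abs(e - o) for e, o in zip(expected, observed)]

    flagged = []
    for start in range(len(errors) - window + 1):
        mean_error = sum(errors[start:start + window]) / window
        if mean_error > threshold:
            flagged.append(start)
    return flagged
```

Averaging over a window rather than judging single frames is one simple way to avoid flagging isolated tracking noise.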
While some techniques can work well on their own, most fare better when combined into a larger detection system.
虽然有些技术单独使用时效果就很好，但大多数还是要整合到更大的检测系统里才能发挥更好的效果。
"So in this video you'll see Barack Obama giving a speech about Tom Vilsack, one of his departing cabinet members.
“下面这段视频是奥巴马围绕即将离任的内阁成员汤姆·维尔萨克发表演讲的视频。
And we're running this live through our system here, which is processing basically to identify two kinds of information.
我们正在用我们的系统实时处理这段视频，处理的目的主要是识别两种信息。
The top one where it says natural is a model that's detecting is this natural or some type of synthesized or generated speech, essentially a deep fake.
上面的长条检测的是‘视频内容是否自然，是否是合成的或者用工具生成的语音，换句话说就是是不是深度造假视频’，检测结果显示这条视频是自然的。

In the bottom, is detecting identity based on voice, so we have a model of Barack Obama
下面的长条是根据声音检测声音来源的身份的，因为我们有奥巴马声音的样本，
so it's saying this continues to verify as Obama and this will continue like this until now we get Jordan Peele imitating Barack Obama."
下面的检测结果进一步佐证了确实是奥巴马本人，这种情况一直持续到这里，也就是乔丹·皮尔开始模仿奥巴马这里。”
"We're entering an era in which our enemies can make it look like anyone is saying anything at any point in time."
“我们已经进入了这样一个时代，我们的敌人能给大家造成一种‘任何人在任何时候都能说任何话’的错觉。”
"And that whole section here was Jordan Peele. He's natural, but he's not Obama."
“这一整段都是乔丹·皮尔在说话，虽然声音是自然的，但他依然不是奥巴马。”
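
The demo reads two independent signals: whether the speech sounds natural, and whether the voice matches the enrolled speaker. A minimal sketch of how two such scores might be turned into a label is shown below. The score ranges, thresholds, and label strings are hypothetical and are not taken from the system shown in the video.

```python
def describe_segment(natural_score, identity_score,
                     natural_threshold=0.5, identity_threshold=0.5):
    """Combine two assumed scores in [0, 1] into a readable label.

    natural_score:  higher means more likely natural (not synthesized) speech
    identity_score: higher means more likely the enrolled target speaker
    """
    is_natural = natural_score >= natural_threshold
    is_target = identity_score >= identity_threshold

    if not is_natural:
        return "synthetic speech"
    if is_target:
        return "natural, verified speaker"
    # The impersonation case from the demo: a real human voice,
    # but not the person it claims to be.
    return "natural, different speaker"
```

Keeping the two scores separate is what lets an impersonation be told apart from both genuine speech and synthesized speech.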
"I would say for detection of synthesis or voice conversion, we're in the sub 5% error rate for what I would call laboratory conditions.
“我想说的是，就检测合成语音或转换语音而言，我们已经将误差控制在5%的范围以内了，前提是在实验室条件下。
And probably in the real world, it would be higher than that.
要是在现实世界中的话，误差可能会大一些。
That's why having these multi-pronged things is really important."
这就是为什么多管齐下的检测手段非常重要。”
However, technology is only part of the equation.
然而，技术并非这个方程式的全貌。
How we as a society respond to these altered pieces of content is as important.
同样重要的是，作为一个社会群体，我们如何应对这些被修改的内容。
The media tends to focus on the technological aspects of things rather than the social.
媒体重点关注的往往是这些工具采用的技术，而不是它们在社会层面的含义。
The problem is less the deep fakes and more the people who are very willing to believe something that is probably not well done
问题更多的不在于这些深度造假视频本身，而在于有些人很愿意相信这些可能做的并不是特别好的东西，
because it confirms something that they already believe.
只因为这些东西符合他们已经相信的某些东西。
Reality becomes an opinion rather than fact.
这样一来，现实就不再是事实，而变成一种主观看法了。
And it gives you license to misbelieve reality.
它让你有了光明正大地怀疑事实的借口。
It's really hard to predict what will happen.
这样下去会带来怎样的后果我们很难预测。
You don't know if this is going to be something that five years from now people actually nail down or if it's 40 years from now.
你根本就不知道那些后果是五年后的事还是四十年后的事。
It's one of those things that is still sort of exciting, interesting and new and you don't know what the limitations are yet.
有很多东西都是你当下依然觉得激动人心，觉得有趣，觉得新鲜，同时又不知道它的边界究竟在哪里的东西，深度造假技术也是一样。
