1 00:00:00,000 --> 00:00:11,169 *32C3 preroll music* 2 00:00:11,169 --> 00:00:15,140 M.C.: Hey! So, can you hear me OK? Yeah. 3 00:00:15,140 --> 00:00:19,779 I am M.C. and I work on Transparency Toolkit along with Brennan Novak 4 00:00:19,779 --> 00:00:25,799 and Kevin Gallagher. Basically, what we try to do is “Watch the Watchers”. 5 00:00:25,799 --> 00:00:31,460 Back in May we released a database of over 27,000 people in the Intelligence 6 00:00:31,460 --> 00:00:37,340 Community called ICWATCH. And these are people who are talking about their work on 7 00:00:37,340 --> 00:00:41,780 classified programs on the public internet. So we collected it using 8 00:00:41,780 --> 00:00:46,310 search terms like the code words mentioned in the Snowden documents. 9 00:00:46,310 --> 00:00:50,710 And today we’re releasing an update to ICWATCH 10 00:00:50,710 --> 00:00:55,970 doubling the data in the database. 11 00:00:55,970 --> 00:01:00,920 *applause* 12 00:01:00,920 --> 00:01:07,309 And that’s already live, if anyone wants to look at it. 13 00:01:07,309 --> 00:01:12,159 For the people who aren’t familiar with this project and the sorts of things 14 00:01:12,159 --> 00:01:16,810 available and the research methods I’d like to go through an interesting example of 15 00:01:16,810 --> 00:01:20,350 the things that can be found in this database. 16 00:01:20,350 --> 00:01:26,449 So this is Lauren Russell, and she works at L-3, a major intelligence contractor. 17 00:01:26,449 --> 00:01:30,679 But she started her career as an army interrogator in Iraq. She says that 18 00:01:30,679 --> 00:01:36,900 the information that she collected was used to capture dozens of people. 19 00:01:36,900 --> 00:01:40,190 But part of her job was also to assure safe and humane treatment of hundreds 20 00:01:40,190 --> 00:01:45,379 of detainees. So that’s good at least. 
But then, a few years after that, she went and 21 00:01:45,379 --> 00:01:50,389 worked for a different company called Exelis in Afghanistan. And this job 22 00:01:50,389 --> 00:01:55,580 was quite different. It involved finding people to kill. So she says as part 23 00:01:55,580 --> 00:01:59,840 of this work that she “utilized F3EA methodology to conduct analysis on raw and 24 00:01:59,840 --> 00:02:05,320 fused HUMINT, SIGINT, and COMINT helping to create 125 Targeting Support Packets 25 00:02:05,320 --> 00:02:09,299 then nominated to the Joint Priority Effects List (JPEL) for kinetic targeting.” 26 00:02:09,299 --> 00:02:14,280 So there are a lot of not very obvious terms and gibberish there. And this is a pretty 27 00:02:14,280 --> 00:02:17,750 common problem when going through these résumés. So I want to break down how you 28 00:02:17,750 --> 00:02:22,849 would interpret that sentence. SIGINT – Signals Intelligence – is what the NSA does. 29 00:02:22,849 --> 00:02:28,129 It’s collecting data from intercepted communications. COMINT – Communications 30 00:02:28,129 --> 00:02:31,449 Intelligence – is specifically Signals Intelligence from communication data. 31 00:02:31,449 --> 00:02:35,420 So what the NSA does when they read your email. 32 00:02:35,420 --> 00:02:38,580 HUMINT, Human Intelligence, is intelligence from human sources. 33 00:02:38,580 --> 00:02:45,650 So things like data gained from informers or from torture. 34 00:02:45,650 --> 00:02:50,210 The “Joint Priority Effects List” is a list of people the US military and its allies are 35 00:02:50,210 --> 00:02:54,720 trying to kill and capture in Afghanistan. 36 00:02:54,720 --> 00:02:58,740 F3EA stands for “Find, Fix, Finish, Exploit and Analyze”. It’s a rapid 37 00:02:58,740 --> 00:03:02,990 intelligence collection and analysis methodology used for targeting. And 38 00:03:02,990 --> 00:03:06,670 we recently found out in the Drone Papers that this is often used for 39 00:03:06,670 --> 00:03:12,869 drone targeting. 
And “Kinetic Targeting” simply means attacking a moving target. 40 00:03:12,869 --> 00:03:16,800 So looking at her profile again: she says that she “utilized F3EA methodology 41 00:03:16,800 --> 00:03:20,819 to conduct analysis on raw and fused HUMINT, SIGINT and COMINT helping to 42 00:03:20,819 --> 00:03:24,899 create 125 Targeting Support Packets then nominated to the Joint Priority 43 00:03:24,899 --> 00:03:28,670 Effects List (JPEL) for kinetic targeting.” Basically what she means is that based on 44 00:03:28,670 --> 00:03:32,759 intercepted communications and information from human sources, possibly gained under 45 00:03:32,759 --> 00:03:38,560 duress or from torture, she is deciding who should be killed and captured. 46 00:03:42,755 --> 00:03:48,659 The Intelligence Community has long had an attitude of “Collect It All”. 47 00:03:48,659 --> 00:03:52,670 And General [Keith B.] Alexander started trying to collect all the data 48 00:03:52,670 --> 00:03:58,400 that they could from every source. One of the first projects to this end 49 00:03:58,400 --> 00:04:02,700 was something called Real Time Regional Gateway (RT-RG). It’s a massive project to 50 00:04:02,700 --> 00:04:07,949 store, combine, search and analyze data from many different sources at once. 51 00:04:07,949 --> 00:04:11,530 Everything from intercepted communications to data from drones to data from 52 00:04:11,530 --> 00:04:17,930 interrogations to even mundane things like traffic patterns and the price of potatoes. 53 00:04:17,930 --> 00:04:22,970 They started this program in 2005. The initial version was built by SAIC 54 00:04:22,970 --> 00:04:27,270 for use in Iraq. And these days it’s mostly used in Afghanistan. 55 00:04:27,270 --> 00:04:31,520 It’s also relevant here because according to documents published in “Der SPIEGEL” 56 00:04:31,520 --> 00:04:38,479 last year Germany is the 3rd largest contributor to RT-RG. 
These sorts 57 00:04:38,479 --> 00:04:41,400 of collection and analysis tools are used for some programs that you might have 58 00:04:41,400 --> 00:04:47,130 heard of too, like CoTraveller – the program the NSA has to figure out who is 59 00:04:47,130 --> 00:04:52,380 going places with who else. And there is a specific analytic tool that’s part of 60 00:04:52,380 --> 00:04:57,579 RT-RG, called SIDEKICK, that uses relative velocities to calculate this from many 61 00:04:57,579 --> 00:05:01,590 different data sources, so that they can calculate that for people across networks. 62 00:05:01,590 --> 00:05:04,030 Unfortunately, this is really computationally intensive because they 63 00:05:04,030 --> 00:05:09,459 need to pre-compute all of the travel behaviour for all the pairs of selectors. 64 00:05:09,459 --> 00:05:12,500 But it’s feasible for them to do computationally intensive things 65 00:05:12,500 --> 00:05:18,199 because it’s built on Hadoop and Accumulo for distributed data 66 00:05:18,199 --> 00:05:27,380 processing and storage. So they’re quite serious about this. The goals for RT-RG 67 00:05:27,380 --> 00:05:33,150 are quite lofty. One of the creators, in an interview with “Defence News”, described 68 00:05:33,150 --> 00:05:37,240 their aim as being able to use intercepted communications and integrate those 69 00:05:37,240 --> 00:05:42,000 signals with geolocation, so that they can instantly find people and target them. 70 00:05:42,000 --> 00:05:47,200 Another counter-terrorism official told the Wall Street Journal that RT-RG 71 00:05:47,200 --> 00:05:53,079 literally allows them to predict the future, and that it’s the 72 00:05:53,079 --> 00:05:56,890 strongest correlation tool ever. So their goals for this seem to be two-fold: First 73 00:05:56,890 --> 00:06:02,990 of all, to be able to kill or smite any potential enemies. And second, to be 74 00:06:02,990 --> 00:06:07,970 omniscient. To know everything that’s happening at once. 
And to correlate it and 75 00:06:07,970 --> 00:06:13,300 use that to predict what will happen in the future. And these goals sound a little bit beyond 76 00:06:13,300 --> 00:06:18,560 what you would expect from someone who is trying to simply protect people or 77 00:06:18,560 --> 00:06:21,569 stop terrorism. It sounds more like they’re trying to become some sort 78 00:06:21,569 --> 00:06:26,539 of God. Who by collecting and analyzing everything knows everything that’s 79 00:06:26,539 --> 00:06:32,280 happening everywhere and can just smite any enemies from above. Instantly. 80 00:06:32,280 --> 00:06:37,330 But the thing is, they aren’t a God. The people working on these things are 81 00:06:37,330 --> 00:06:40,289 normal people. And they have crazy resources and they intercept 82 00:06:40,289 --> 00:06:44,460 a lot of data. But they also use data that’s freely available to anyone for 83 00:06:44,460 --> 00:06:49,860 a lot of their work. Open Source Intelligence. This is a pamphlet from 84 00:06:49,860 --> 00:06:55,270 a startup called ZeroFox that uses data from Social Media to track ISIS. 85 00:06:55,270 --> 00:07:00,019 And tools like this are quite common. There’s another tool called “LM Wisdom” 86 00:07:00,019 --> 00:07:03,620 that’s made by Lockheed Martin. And they have a wonderful promotional video 87 00:07:03,620 --> 00:07:08,699 on their website explaining exactly how it works – that I’d like to play. 88 00:07:08,699 --> 00:07:11,960 *with lowered voice:* Hopefully this’ll work… 89 00:07:11,960 --> 00:07:15,819 *audio/video starts* Female Narrator: Social Media content has the power 90 00:07:15,819 --> 00:07:19,300 to incite organized movements and sway political outcomes. 
91 00:07:19,300 --> 00:07:22,879 Person in Video: “It’s an opposition terrorist organization in Iran.” 92 00:07:22,879 --> 00:07:26,259 Female Narrator: Monitoring and analyzing the massive and rapidly changing 93 00:07:26,259 --> 00:07:31,210 open source intelligence data, or OSINT, and turning it into actionable intelligence 94 00:07:31,210 --> 00:07:37,180 for decision-makers is an imperative. Lockheed Martin’s Wisdom software suite 95 00:07:37,180 --> 00:07:42,199 offers an advanced capability to collect, manage and analyze vast amounts 96 00:07:42,199 --> 00:07:47,620 of open source data. Enabling analysts to understand, measure and anticipate 97 00:07:47,620 --> 00:07:52,039 real-world events through Social Media. Person in Video: “Think of Wisdom as your 98 00:07:52,039 --> 00:07:58,520 eyes and ears on the web. Wisdom is that tool that would allow you to do this 99 00:07:58,520 --> 00:08:00,400 at scale!” Female Narrator: Wisdom’s advanced 100 00:08:00,400 --> 00:08:05,319 Big Data collection capability and data store automatically identify and harvest 101 00:08:05,319 --> 00:08:09,479 online Social Networking data of operational interest. As well as 102 00:08:09,479 --> 00:08:14,810 socio-cultural data from standard online open sources like newspaper feeds and 103 00:08:14,810 --> 00:08:20,110 structured databases. Wisdom’s high- performance analytic algorithms analyze 104 00:08:20,110 --> 00:08:25,510 the content in near realtime distinguishing noise from high-value information. 105 00:08:25,510 --> 00:08:30,980 Capturing trends, sentiment and influence; turning open source data into predictive, 106 00:08:30,980 --> 00:08:36,030 actionable intelligence. *audio/video stops* 107 00:08:36,030 --> 00:08:37,210 M.C.: Yeah, so… *applause* 108 00:08:37,210 --> 00:08:41,259 …that’s what they’re doing. And they’re not just using this to target terrorists. 
109 00:08:41,259 --> 00:08:46,450 It was recently revealed that they are helping Walmart use this to find employees 110 00:08:46,450 --> 00:08:50,230 that are organizing for better working conditions and find the main organizers 111 00:08:50,230 --> 00:08:53,820 and fire them. Using data from Social Media. 112 00:08:53,820 --> 00:08:59,320 So it’s used for corporate purposes as well. And LM Wisdom wasn’t even made 113 00:08:59,320 --> 00:09:02,620 for surveillance in the first place. I tracked down one of the people 114 00:09:02,620 --> 00:09:09,020 who created it. And at that time he worked for General Electric and was hoping to 115 00:09:09,020 --> 00:09:14,320 make a… to help NBC make tools so that they can figure out which sites 116 00:09:14,320 --> 00:09:19,740 to partner with to make their videos go viral. So it’s not just governments that 117 00:09:19,740 --> 00:09:22,959 are using Open Source Intelligence because there are no barriers to accessing it and 118 00:09:22,959 --> 00:09:27,510 there are many applications. There are even many people-search databases that 119 00:09:27,510 --> 00:09:31,120 have information like people’s address, and phone number, and relatives, 120 00:09:31,120 --> 00:09:35,320 and how old they are. And these include many, many people. Probably everyone 121 00:09:35,320 --> 00:09:39,230 in the US. And they’re used by many people for all sorts of purposes from private 122 00:09:39,230 --> 00:09:47,839 detectives to people that are selling advertisements. If this data is available 123 00:09:47,839 --> 00:09:53,459 already and it’s used for everything from figuring out who to kill to stopping unions 124 00:09:53,459 --> 00:09:57,440 from organizing to trying to sell things to people – why can’t we use it to 125 00:09:57,440 --> 00:10:00,529 understand surveillance programs, too? Why can’t we use it to understand human 126 00:10:00,529 --> 00:10:05,170 rights abuses? Why not use it for accountability? 
So we started to build 127 00:10:05,170 --> 00:10:09,940 tools to do this and in the near future we’d like to make it possible for anyone 128 00:10:09,940 --> 00:10:14,400 to make something like ICWATCH or other databases in less than a day and without 129 00:10:14,400 --> 00:10:19,560 programming. Our long-term goal is to build software similar to what the Intelligence 130 00:10:19,560 --> 00:10:24,310 Community has. Things similar to LM Wisdom, things similar to Real Time Regional Gateway. 131 00:10:24,310 --> 00:10:29,779 So that people can collect all this information in one place and analyze it. 132 00:10:29,779 --> 00:10:33,389 I’d like to show a demo of some of the tools that we’ve been working on. It’s 133 00:10:33,389 --> 00:10:41,110 possible that this won’t work at all, but we’ll see. So this is Harvester. It’s 134 00:10:41,110 --> 00:10:48,660 a tool for collecting data from online sources in an automated fashion. You can 135 00:10:48,660 --> 00:10:53,200 choose different data sources, say “Indeed” – this is a résumé website – and 136 00:10:53,200 --> 00:10:58,240 say you want to find anyone who mentioned XKeyscore and for the sake of timing let’s 137 00:10:58,240 --> 00:11:08,160 just get people in Maryland. And “start collecting”, and it might take a second 138 00:11:08,160 --> 00:11:12,920 because it’s still a bit rough. But it opens a browser, goes and finds people 139 00:11:12,920 --> 00:11:19,069 who mention XKeyscore in Maryland and it goes and downloads all of their résumés 140 00:11:19,069 --> 00:11:24,149 in one place… you can kind of see them as they download because this is being 141 00:11:24,149 --> 00:11:48,709 slowed down a bit right now. That’s just XKeyscore, so it’s fairly small. 
142 00:11:48,709 --> 00:11:57,699 *Something shouted from the audience* M.C.: *laughs* 143 00:11:57,699 --> 00:12:02,060 *applause* 144 00:12:05,800 --> 00:12:12,350 Takes a second to load, still kind of rough… 145 00:12:12,350 --> 00:12:18,930 Yeah, so we’re hoping to add many different data sources, so that people can collect 146 00:12:18,930 --> 00:12:22,690 data from sources online as well as just take a pile of PDFs on their computer, 147 00:12:22,690 --> 00:12:26,570 point it at the directory and it will load them and OCR them and people will be able 148 00:12:26,570 --> 00:12:31,470 to search through them in a searchable database. 149 00:12:31,470 --> 00:12:35,549 So while this is loading why don’t I go and walk through some of the rest of the 150 00:12:35,549 --> 00:12:40,020 pipeline. So our goal is to have tools for collecting data, loading it into 151 00:12:40,020 --> 00:12:46,770 a database; and then tools for matching data across various sources on the same 152 00:12:46,770 --> 00:12:50,220 person or the same company. So it should take someone’s résumés and Social Media 153 00:12:50,220 --> 00:12:54,130 profiles and everything and link it together and then also link that to the 154 00:12:54,130 --> 00:12:57,180 companies they work(ed) for, the other people they know, the locations they’ve 155 00:12:57,180 --> 00:13:01,540 lived. As well as tools for extracting things from data. So to be able to go 156 00:13:01,540 --> 00:13:04,330 through a résumé, extract all the code words mentioned, to be able to go through 157 00:13:04,330 --> 00:13:08,019 a document and extract all the companies mentioned and generate 158 00:13:08,019 --> 00:13:13,190 entities that way. And tools for searching through data in databases where you can 159 00:13:13,190 --> 00:13:17,699 run search queries and browse by categories. And for viewing data in 160 00:13:17,699 --> 00:13:23,649 network graphs and maps. 
Let’s see if this is done… Right now it just saves the 161 00:13:23,649 --> 00:13:32,540 raw JSON. The connection between tools is a bit rough. But we should be able to 162 00:13:32,540 --> 00:13:41,240 index the data and load it into a search tool. Will take a second. Hopefully this 163 00:13:41,240 --> 00:14:05,760 works. Oh, it’s going! Yeah… So it takes a little bit. Index… And you can see… 164 00:14:05,760 --> 00:14:13,699 The data will be at… It kind of just loaded into a searchable list… 165 00:14:13,699 --> 00:14:17,310 So there’s a searchable database of the people who are working on XKeyscore 166 00:14:17,310 --> 00:14:27,400 in Maryland! *applause, cheers from audience* 167 00:14:27,400 --> 00:14:33,100 So I think that in doing this, Free Software and open data are really key 168 00:14:33,100 --> 00:14:38,070 because we have far, far fewer resources than the Intelligence Community. And we 169 00:14:38,070 --> 00:14:41,240 don’t even have the resources that a company like Lockheed Martin has. We can’t 170 00:14:41,240 --> 00:14:45,269 internally build all of this software. We can’t hope to anticipate every future 171 00:14:45,269 --> 00:14:50,609 use or to be able to help people adapt to that. Having people be able to take our 172 00:14:50,609 --> 00:14:54,199 data, take our tools and adapt them to their own situations is absolutely key to 173 00:14:54,199 --> 00:14:58,380 actually ensuring that they’re useful. And there are also a lot of open source tools 174 00:14:58,380 --> 00:15:01,269 that the Intelligence Community has released. Like Accumulo, the thing 175 00:15:01,269 --> 00:15:05,399 that’s used in Real Time Regional Gateway. It was released by the NSA and made open 176 00:15:05,399 --> 00:15:11,029 source. And Gaffer, which is a graph database recently released by GCHQ. 177 00:15:11,029 --> 00:15:15,660 So we can sort of take those and possibly also build on those in some cases. 
178 00:15:15,660 --> 00:15:17,940 As well as using the same tools *chuckles* 179 00:15:17,940 --> 00:15:22,050 And it’s appropriate because our goal is to enable people to collect and use 180 00:15:22,050 --> 00:15:27,529 information in the same way that the Intelligence Community can. 181 00:15:27,529 --> 00:15:31,880 But, while I think that we should aim to collect it all and collect all the 182 00:15:31,880 --> 00:15:35,009 information that we can, I think we also need to be careful to avoid a lot of the 183 00:15:35,009 --> 00:15:39,740 mistakes that the Intelligence Community has made. Because some of the effects are 184 00:15:39,740 --> 00:15:45,550 quite bad and lead to people being killed for no reason at all. And – it’s quite 185 00:15:45,550 --> 00:15:49,729 absurd. And the main one of these, I think, is de-humanizing people. 186 00:15:49,729 --> 00:15:53,370 Torture techniques are specifically designed to de-humanize people. 187 00:15:53,370 --> 00:15:56,100 When people are looking at data that they’ve intercepted, they’re not looking 188 00:15:56,100 --> 00:15:59,569 at a person, they’re looking at meta-data, they’re looking at numbers on a screen. 189 00:15:59,569 --> 00:16:05,819 It’s not something that’s easy to find a way around. When I was working on ICWATCH 190 00:16:05,819 --> 00:16:11,410 I was grappling with this problem quite a bit. So I decided to try to see who some 191 00:16:11,410 --> 00:16:15,649 of these people are and try to put faces to these issues. So I started going to 192 00:16:15,649 --> 00:16:19,440 Intelligence conferences. Many of these conferences are quite open and you can 193 00:16:19,440 --> 00:16:24,490 just go in. And I wasn’t that out of place either, I just told people that I made 194 00:16:24,490 --> 00:16:27,430 tools to collect and analyze Open Source Intelligence. 195 00:16:27,430 --> 00:16:29,139 *laughter and applause* 196 00:16:29,139 --> 00:16:35,590 (?) many people doing (?). 
197 00:16:35,590 --> 00:16:38,080 There are many people doing similar things out there, too. Like I met this 198 00:16:38,080 --> 00:16:45,409 year a (?) who (?) one of these conferences. They are actually very, very nice. And 199 00:16:45,409 --> 00:16:48,139 there were also some people who were quite interested in what I was doing. There was 200 00:16:48,139 --> 00:16:50,970 one recruiter from Northrop Grumman who seemed somewhat interested in hiring me 201 00:16:50,970 --> 00:16:54,300 and I looked her up later and found a bunch of job listings where she was 202 00:16:54,300 --> 00:16:59,159 trying to hire people who… to work on programs related to XKeyscore. It wasn’t 203 00:16:59,159 --> 00:17:03,639 all good, I got kicked out of one conference. I got some strange requests, like there was 204 00:17:03,639 --> 00:17:09,690 one guy who was trying to figure out how to use open data to help venture capitalists 205 00:17:09,690 --> 00:17:15,170 figure out what porn the founders of the startups they funded watched. I’m not sure 206 00:17:15,170 --> 00:17:18,109 that’s even possible. But it was really weird and he was asking me for help and 207 00:17:18,109 --> 00:17:20,260 I was like “I don’t think I can help with that, sorry!” 208 00:17:20,260 --> 00:17:27,160 *laughter and applause* 209 00:17:27,160 --> 00:17:30,940 Of course there were some negative comments on things like Manning and Snowden 210 00:17:30,940 --> 00:17:33,990 and some confusion, like there was someone who was making insider threat detection 211 00:17:33,990 --> 00:17:39,130 software, who was talking about how it would stop a situation like when Snowden 212 00:17:39,130 --> 00:17:43,070 leaked documents to WikiLeaks and things like that. So people don’t actually 213 00:17:43,070 --> 00:17:46,280 know what’s going on. But generally most of them were decent people and some of 214 00:17:46,280 --> 00:17:49,250 them were quite nice, some of them were quite funny. 
And some of them really 215 00:17:49,250 --> 00:17:52,570 seemed to think that what they were doing is saving lives. So they’re not evil people 216 00:17:52,570 --> 00:17:57,540 who want to hurt others but they’re not infallible either. They’re human beings. 217 00:17:57,540 --> 00:18:02,800 And our strategy – looking at individuals – scares a lot of people. But what you 218 00:18:02,800 --> 00:18:09,810 have to realize is that institutions are made up of people. It’s easier to just 219 00:18:09,810 --> 00:18:12,810 look at the institution. It’s easier to just look at an abstract program. Just 220 00:18:12,810 --> 00:18:15,590 like it’s easier not to think of the person who you just decided to kill in a 221 00:18:15,590 --> 00:18:21,430 drone strike as a person. That’s why these things continue to happen. I think that 222 00:18:21,430 --> 00:18:24,520 there’s a lot of benefit to looking at people as people, both to avoid some of 223 00:18:24,520 --> 00:18:28,970 the problems the Intelligence Community has and because people’s data trails 224 00:18:28,970 --> 00:18:31,780 are part of the data trails of the institutions. And if we’re only looking at 225 00:18:31,780 --> 00:18:36,490 institutions we’re missing part of the data trail the people leave. 226 00:18:36,490 --> 00:18:40,690 Though, of course, no one person is responsible for the wrong-doings of the 227 00:18:40,690 --> 00:18:46,900 Intelligence Community. So we shouldn’t demonize any one person. But… 228 00:18:46,900 --> 00:18:49,650 these are the people who go to work every day and perpetuate the actions of the 229 00:18:49,650 --> 00:18:54,810 Intelligence Community. So I think everyone involved is a little bit at fault. 230 00:18:54,810 --> 00:18:57,950 And the other benefit of looking at people as people is that we can start to 231 00:18:57,950 --> 00:19:01,220 understand them. Because you have to understand what their hopes are, what 232 00:19:01,220 --> 00:19:05,330 their fears are. 
How they see the world. What upsets them. And what might cause 233 00:19:05,330 --> 00:19:08,920 them to change their behaviour. And from that we can start to maybe come up with 234 00:19:08,920 --> 00:19:13,150 alternatives. So let’s look at some of these people and look at some of their 235 00:19:13,150 --> 00:19:21,960 stories. This is Jason Epperson. He works on Intelligence collection for Special 236 00:19:21,960 --> 00:19:27,420 Operations. In his spare time he enjoys coaching children’s sports. He currently 237 00:19:27,420 --> 00:19:32,050 works at the US Special Ops Command (USSOCOM) helping different agencies 238 00:19:32,050 --> 00:19:35,190 collect data, share it, and figure out what data they need, just generally 239 00:19:35,190 --> 00:19:39,340 helping them integrate it. But he started his career back in 1998, also 240 00:19:39,340 --> 00:19:43,950 working on collecting data for Special Operations. Then later, in 2004, he went 241 00:19:43,950 --> 00:19:49,650 to work at the US Central Command in the NSA Cryptologic Services Group and he was 242 00:19:49,650 --> 00:19:53,330 focused on tracking down high-value targets and individuals. And he claimed 243 00:19:53,330 --> 00:19:56,710 that as a result of his work, numerous high-value individuals were captured 244 00:19:56,710 --> 00:20:03,990 or killed. It is especially interesting because he was working on this in 2007 245 00:20:03,990 --> 00:20:09,330 when PRISM was launched and at the top of his résumé he lists in his specialties 246 00:20:09,330 --> 00:20:14,620 PRISM. It’s possible that that’s a different program, but based on his background it 247 00:20:14,620 --> 00:20:20,640 might not be. So I think it probably is actually PRISM. 
248 00:20:20,640 --> 00:20:27,530 Then after he was working there he went and started working on counter-radicalization 249 00:20:27,530 --> 00:20:31,030 efforts – things like boosting the capacity of Muslim Faith Leaders to win 250 00:20:31,030 --> 00:20:33,910 hearts and minds and establishing competing social networks to counter 251 00:20:33,910 --> 00:20:37,150 Al Qaeda ideology and he’s very clear in his job description that he’s not killing 252 00:20:37,150 --> 00:20:43,480 people, he’s just helping allies of the US figure out who to set Interpol notices for. 253 00:20:43,480 --> 00:20:46,790 But the most interesting thing about him isn’t any of his jobs. It’s this 254 00:20:46,790 --> 00:20:50,940 publication that he has at the bottom of his résumé called “An Examination of the 255 00:20:50,940 --> 00:20:55,980 Effect of Government Data Mining on US Citizens”. And this is clearly an area where 256 00:20:55,980 --> 00:21:00,470 he has a lot of expertise. And he presented this at a conference back in 257 00:21:00,470 --> 00:21:04,810 2010. I don’t have a copy yet. It’s not easily available. I think it might be 258 00:21:04,810 --> 00:21:09,630 possible to get either by buying it from the company directly or by going to the 259 00:21:09,630 --> 00:21:14,820 Library of Congress that seems to have some copies of the conference proceedings. 260 00:21:14,820 --> 00:21:19,670 That could be quite interesting. Both because he was relatively high up, he was 261 00:21:19,670 --> 00:21:23,700 in command of nearly 400 people back when PRISM started and he was working with the 262 00:21:23,700 --> 00:21:27,840 NSA. It’s possible that he had some role early on in the program and this might 263 00:21:27,840 --> 00:21:33,790 provide some clues. 
And then also the little “data mining on US Citizens” 264 00:21:33,790 --> 00:21:36,910 bit in the title is kind of interesting because that’s supposed to be the last 265 00:21:36,910 --> 00:21:40,500 protection – I think that’s kind of a stupid protection because most US citizens 266 00:21:40,500 --> 00:21:43,200 wouldn’t find it very comforting if the Chinese Government said: “Oh yeah, we have 267 00:21:43,200 --> 00:21:47,420 a mass surveillance program but we only spy on people who aren’t Chinese citizens.” 268 00:21:47,420 --> 00:21:50,680 That’s not really comforting to them, so I don’t see why it would be. But it’s been 269 00:21:50,680 --> 00:21:54,800 the one thing that people were repeating. “We don’t collect it on US citizens”. And 270 00:21:54,800 --> 00:21:59,960 just seeing that in the title of a paper is like a tiny admission that maybe they 271 00:21:59,960 --> 00:22:08,240 do. So some of these profiles tell other interesting stories about people’s lives. 272 00:22:08,240 --> 00:22:11,760 If you’ve seen any of my other talks, this is someone you’ve heard me talk about 273 00:22:11,760 --> 00:22:15,920 a lot. Solomon Varnado. He spent most of his life in the military intelligence 274 00:22:15,920 --> 00:22:20,190 community, focused on Signals Intelligence and Geolocation. He took down his résumé 275 00:22:20,190 --> 00:22:25,960 after ICWATCH launched. But I actually recently found another résumé of his on 276 00:22:25,960 --> 00:22:31,070 another website that has additional information, like that on the side in the 277 00:22:31,070 --> 00:22:35,580 military he ran diversity programs and a sexual assault prevention program and 278 00:22:35,580 --> 00:22:39,070 things like that. I first came across this profile because he mentions a lot of 279 00:22:39,070 --> 00:22:45,010 interesting code words. This is probably the first known mention of XKeyscore back 280 00:22:45,010 --> 00:22:54,610 in 2004/2005. 
But these aren’t the most interesting part of his résumé. Later on 281 00:22:54,610 --> 00:22:58,230 he… after he works on Intelligence Collection Management – just standard 282 00:22:58,230 --> 00:23:05,170 Signals Intelligence Collection – he goes and he works for L-3 Stratus. And there he 283 00:23:05,170 --> 00:23:08,550 says that he identified, collected and performed direction-finding of (?) 284 00:23:08,550 --> 00:23:13,730 target signals using PENNANTRACE, DISPLAYVIEW and (?). But I wasn’t sure what 285 00:23:13,730 --> 00:23:17,200 “PENNANTRACE” was – so I found a definition very conveniently located in 286 00:23:17,200 --> 00:23:21,800 another résumé. That said it was an airborne collection platform for 287 00:23:21,800 --> 00:23:27,500 “PENNANTRACE”. That sounds like some sort of Signals Intelligence collection platform. 288 00:23:27,500 --> 00:23:31,760 And the other interesting thing about this job is that he said that he called for 289 00:23:31,760 --> 00:23:35,720 external review of intelligence management processes which is not something I see 290 00:23:35,720 --> 00:23:39,590 normally. And he was there for a fairly short time, only a couple of months. 291 00:23:39,590 --> 00:23:43,340 After staying at most of his other jobs for over a year. And then at his next job 292 00:23:43,340 --> 00:23:47,170 he was also there for only a couple of months. … international, also on drone 293 00:23:47,170 --> 00:23:50,650 intelligence; this time definitely drone intelligence on Predator drones because he 294 00:23:50,650 --> 00:23:54,370 mentions Airhandler which we now know more about thanks to the catalogue 295 00:23:54,370 --> 00:24:00,780 released by The Intercept. It’s a geo-processing system for geolocation data 296 00:24:00,780 --> 00:24:05,800 from Predator drones. And the update to ICWATCH includes all the data on all of 297 00:24:05,800 --> 00:24:13,610 the words mentioned in that catalogue. 
And then he leaves the Intelligence Community 298 00:24:13,610 --> 00:24:19,090 entirely after that job. And he goes and works as a used car salesman at this used 299 00:24:19,090 --> 00:24:23,610 car dealership. And it turns out, as I actually found on this other résumé that I just found, 300 00:24:23,610 --> 00:24:27,860 he’s actually quite a successful used car salesman. He’s … he’s one of the best 301 00:24:27,860 --> 00:24:32,140 salesmen in the region. So he’s doing quite well. And he … is the military … 302 00:24:32,140 --> 00:24:35,730 seems like he’s very committed to what he does. But still that’s quite a huge career 303 00:24:35,730 --> 00:24:39,880 change and it sounds like maybe he was starting to get upset with some of how 304 00:24:39,880 --> 00:24:42,840 things are really being done and he couldn’t figure out a way to fix it after 305 00:24:42,840 --> 00:24:49,010 calling for external review so he just left. 306 00:24:49,010 --> 00:24:54,190 *applause* 307 00:24:54,190 --> 00:25:02,360 And then, this is Michael Dial. Michael Dial is a pipe fitter and a plumber. And 308 00:25:02,360 --> 00:25:08,400 this is him with his family. He’s actually a pipe fitter and a plumber. But he’s not 309 00:25:08,400 --> 00:25:13,780 just any pipe fitter. He has security clearance. And he goes and he fits pipes 310 00:25:13,780 --> 00:25:17,990 in secure facilities. As you might expect he does a lot of pipe fitting for naval 311 00:25:17,990 --> 00:25:27,080 ships. He also does things like he goes to embassies and other secret locations in 312 00:25:27,080 --> 00:25:38,170 Afghanistan and Iraq, Ecuador, Serbia and sets up their pipes. He also did some 313 00:25:38,170 --> 00:25:43,620 pipe fitting in Djibouti at some sort of Homeland Security facility which 314 00:25:43,620 --> 00:25:50,170 coincidentally is also where many of the drone programs are run out of.
So there’s 315 00:25:50,170 --> 00:25:54,640 some interesting cases like that, where there are people like Michael Dial who 316 00:25:54,640 --> 00:25:59,020 aren’t involved in Intelligence at all, directly. But the information in the 317 00:25:59,020 --> 00:26:04,960 résumés still provides very interesting useful details about where secret 318 00:26:04,960 --> 00:26:07,880 facilities are located and other aspects of the Intelligence Community. Because 319 00:26:07,880 --> 00:26:11,090 secret facilities don’t just materialize out of thin air. They need people to build 320 00:26:11,090 --> 00:26:15,750 them, they need people to operate them. So from tracking down these people we can 321 00:26:15,750 --> 00:26:18,740 start to map them. And then there’re other useful things like we can figure out which 322 00:26:18,740 --> 00:26:25,740 companies clean the NSA. I’m sure that has all sorts of useful applications. 323 00:26:25,740 --> 00:26:33,850 This is Eleana Costa. He lives in D.C. and he works for the DOD. And this is Emeda’s 324 00:26:33,850 --> 00:26:38,340 High School Graduation back in 1988. He has been working in Military and 325 00:26:38,340 --> 00:26:45,240 Intelligence for nearly 20 years. And back in 2003, he worked on PsyOps programs. 326 00:26:45,240 --> 00:26:50,880 Specifically he worked on PsyOps programs in Paraguay, Colombia and Bolivia. And 327 00:26:50,880 --> 00:26:55,970 these were in support of the DEA, the Drug Enforcement Administration, and the CIA. 328 00:26:55,970 --> 00:26:59,260 And there are a few other résumés in ICWATCH that mention involvement in PsyOps in 329 00:26:59,260 --> 00:27:04,480 Latin America for the DEA. It seems to me quite an extensive thing especially since 330 00:27:04,480 --> 00:27:08,900 I didn’t collect any data on this specifically, and I had just suddenly a bunch 331 00:27:08,900 --> 00:27:13,950 of people on the database on this, so: maybe worth looking into a bit.
And then 332 00:27:13,950 --> 00:27:17,320 after that he went and he worked on PsyOps programs in Iraq. So it’s kind of 333 00:27:17,320 --> 00:27:22,120 interesting. Then he went and worked at the DOD on Human Intelligence. 334 00:27:22,120 --> 00:27:27,240 The other interesting thing about Kiliana Costa is that he’s one of the people who 335 00:27:27,240 --> 00:27:34,010 deleted his résumé after ICWATCH launched and that was how I found him. 336 00:27:34,010 --> 00:27:41,090 *laughter and applause* 337 00:27:41,090 --> 00:27:46,050 So after ICWATCH launched a lot of people were positively interested in it, but we 338 00:27:46,050 --> 00:27:49,180 also got a lot of threats, which… it’s really absurd, because all we’re doing is 339 00:27:49,180 --> 00:27:52,670 collecting information that people explicitly, independently, willingly 340 00:27:52,670 --> 00:27:56,720 posted online about their profession; we’re not posting addresses or 341 00:27:56,720 --> 00:28:02,930 anything like that. And making it more searchable. Just like Google does. 342 00:28:02,930 --> 00:28:07,200 But a lot of people in the Intelligence Community contacted us and for the first 343 00:28:07,200 --> 00:28:11,730 few weeks, we saw a new response every day. Some of these were kind of 344 00:28:11,730 --> 00:28:17,580 interesting and reveal some of the nonsensical mindsets of people in the 345 00:28:17,580 --> 00:28:25,330 Intelligence Community. Like this guy. This is Alexander Irinovitch. He sent me 346 00:28:25,330 --> 00:28:29,380 a…, actually a nice email, a very nice email. It was really nice. Saying that he 347 00:28:29,380 --> 00:28:32,740 couldn’t understand why he was in ICWATCH because he wasn’t involved in surveillance. 348 00:28:32,740 --> 00:28:36,610 He was working at a private company that had nothing to do with surveillance.
349 00:28:36,610 --> 00:28:42,750 So I looked at his profile and I saw that he was working at Unit 8200, the Israeli 350 00:28:42,750 --> 00:28:46,930 Intelligence unit which, okay, there is mandatory military service, so that’s not that 351 00:28:46,930 --> 00:28:50,810 weird, though he was there for several years, not just the mandatory portion, 352 00:28:50,810 --> 00:28:57,800 and this is the Intelligence unit that spies on Palestinians. And then I looked 353 00:28:57,800 --> 00:29:02,700 at where he works now. And he works for a company called Verint. According to their 354 00:29:02,700 --> 00:29:09,160 website they make software for analyzing data from wiretaps. So I think that has to 355 00:29:09,160 --> 00:29:13,220 do with surveillance. I’m not sure why he interpreted that as “nothing to do with 356 00:29:13,220 --> 00:29:16,940 surveillance”. But it’s kind of an interesting interpretation. I think it makes sense for him 357 00:29:16,940 --> 00:29:20,220 to be in the database, but of course, for any particular profile, there is 358 00:29:20,220 --> 00:29:23,140 some noise. So it’s up to whoever is looking at it to make the call 359 00:29:23,140 --> 00:29:26,050 and do the research. 360 00:29:26,050 --> 00:29:30,040 And sometimes other people who complained also helped us find interesting details. 361 00:29:30,040 --> 00:29:34,420 Like this guy, Joshua Lively. He’s one of the people who reported us to the FBI for 362 00:29:34,420 --> 00:29:43,120 domestic terrorism. He worked as a linguist at this company. I looked at 363 00:29:43,120 --> 00:29:48,490 his profile and he mentions a lot of interesting code words in it. 364 00:29:48,490 --> 00:29:51,750 Some of them didn’t make so much sense at the time. This thing called ZB. 365 00:29:51,750 --> 00:29:55,740 And then a few weeks later the Intercept released this article on a thing called 366 00:29:55,740 --> 00:30:03,830 Skynet.
It uses machine learning to analyze travel data from telecom 367 00:30:03,830 --> 00:30:08,130 providers. And ZB is one of the databases they use and he, coincidentally, has a lot 368 00:30:08,130 --> 00:30:12,130 of the databases that are used in this listed in his skills. And as a linguist 369 00:30:12,130 --> 00:30:14,860 he’s proficient in the language that’s used in the region that’s mainly targeted 370 00:30:14,860 --> 00:30:18,510 in this… So I’m not sure if he’s involved in this particular program. But it seems 371 00:30:18,510 --> 00:30:22,860 like he’s involved in something similar. 372 00:30:22,860 --> 00:30:28,160 So it’s quite interesting. Generally there are a lot of angry people in the 373 00:30:28,160 --> 00:30:31,750 Intelligence Community. Some are nicer than others and were just asking questions 374 00:30:31,750 --> 00:30:35,910 being like “Can you please take my profile down!”, some others were more afraid, and some others 375 00:30:35,910 --> 00:30:40,640 were more violent and sending things like death threats. Our server started getting 376 00:30:40,640 --> 00:30:44,440 hit pretty hard and ICWATCH kept going down. We wanted to be sure that we weren’t 377 00:30:44,440 --> 00:30:48,090 going to be compelled to take the data down in some way. And the easiest way not 378 00:30:48,090 --> 00:30:52,130 to be compelled to take the data down is to make it so you can’t really take the 379 00:30:52,130 --> 00:30:55,700 data down yourself. Then people have much less incentive to go after you. 380 00:30:55,700 --> 00:31:00,970 So we moved ICWATCH to Wikileaks which has been great, and they’ve been wonderful 381 00:31:00,970 --> 00:31:03,940 helping with all this. So thank you, Wikileaks! 382 00:31:03,940 --> 00:31:09,720 *applause* 383 00:31:09,720 --> 00:31:11,610 *from the audience:* You’re welcome!
384 00:31:11,610 --> 00:31:13,760 M.C.: *chuckles* *laughter* 385 00:31:13,760 --> 00:31:17,500 As I mentioned earlier a lot of people are taking down their résumés in response to 386 00:31:17,500 --> 00:31:24,700 ICWATCH. Specifically 1.030 people have, out of the original 27.000. And others have 387 00:31:24,700 --> 00:31:29,120 edited them and made them private. So as part of the update in addition to doubling 388 00:31:29,120 --> 00:31:35,050 the number of résumés available we also recollected all of the initial résumés 389 00:31:35,050 --> 00:31:39,750 and you can go on the site and see which ones are removed, which ones are made 390 00:31:39,750 --> 00:31:43,590 private, which ones have been modified and all of that is flagged so you can easily see 391 00:31:43,590 --> 00:31:50,540 how that’s changed. *applause* 392 00:31:50,540 --> 00:31:55,330 And some of these revealed details that people hadn’t posted… that many wish that 393 00:31:55,330 --> 00:32:00,760 they hadn’t posted in the first place. But they also provide useful updates on where 394 00:32:00,760 --> 00:32:05,480 people are working. Because they let us track people as they move from job to job. 395 00:32:05,480 --> 00:32:10,840 E.g. there’s this guy, Michael Acosta, from the original ICWATCH. From 2011 396 00:32:10,840 --> 00:32:15,750 to 2012 he worked at Guantanamo. He was primarily trying to find out about 397 00:32:15,750 --> 00:32:21,690 potential attacks on Guantanamo itself. He monitored various detainees and 398 00:32:21,690 --> 00:32:27,660 collaborated with the Behavioural Science Team and was trying to figure out if 399 00:32:27,660 --> 00:32:32,790 detainees were planning some sort of coup, I guess. And then he started working for 400 00:32:32,790 --> 00:32:41,030 the Air Force.
And here he was working on Drone Intelligence and targeting and such 401 00:32:41,030 --> 00:32:44,230 things like how he was responsible for “the production made instant upgrade of 402 00:32:44,230 --> 00:32:47,960 DGS2 mission critical Intelligence databases which include high value target 403 00:32:47,960 --> 00:32:52,550 development folders” like the things used for JPEL targeting, regional fairbriefs, 404 00:32:52,550 --> 00:32:57,980 mission storyboards and mission target logs with document FMV mission rollups. 405 00:32:57,980 --> 00:33:00,520 But the most interesting thing on this résumé isn’t any of those things. 406 00:33:00,520 --> 00:33:05,510 It’s the thing that changed between the original launch of ICWATCH and now. 407 00:33:05,510 --> 00:33:08,980 And that’s that he moved and started working for a different company. 408 00:33:08,980 --> 00:33:14,160 He started working for this company called… called SOS International 409 00:33:14,160 --> 00:33:20,780 as an All Source Analyst. He unfortunately had to leave the position that he had 410 00:33:20,780 --> 00:33:24,880 on the side coaching High School Baseball which he seemed to really like. 411 00:33:24,880 --> 00:33:27,630 And he kind of liked it because right now he’s looking for Baseball opportunities 412 00:33:27,630 --> 00:33:31,610 in Germany. So he seems to be in Germany working for this company called SOS 413 00:33:31,610 --> 00:33:34,730 International that I never heard of before. So I went on the website and they 414 00:33:34,730 --> 00:33:38,040 have a list of the cities that they operate in in Germany. These 6 cities, 415 00:33:38,040 --> 00:33:43,870 along with Guantanamo and a number of other sketchy locations. And based on 416 00:33:43,870 --> 00:33:47,610 Michael Acosta’s past record of working at Guantanamo and on Drone targeting and 417 00:33:47,610 --> 00:33:50,130 things like that it sounds like this company is probably doing something quite 418 00:33:50,130 --> 00:33:56,450 sketchy.
By tracking changes to where people work we can start to find things 419 00:33:56,450 --> 00:34:00,360 like this we might not otherwise think to look at. That we might not otherwise think of 420 00:34:00,360 --> 00:34:03,070 as interesting. 421 00:34:03,070 --> 00:34:10,219 But it’s not just open data that we collect. Because the same tools for 422 00:34:10,219 --> 00:34:13,549 collecting and analyzing open data are also useful for other data sets. 423 00:34:13,549 --> 00:34:18,510 Like we made a search tool in collaboration with the Courage Foundation 424 00:34:18,510 --> 00:34:22,149 for all of the published Snowden documents that allows you to search the full text of 425 00:34:22,149 --> 00:34:26,280 the documents, browse which code words are in these documents, see documents that 426 00:34:26,280 --> 00:34:33,139 mention particular countries, see the full PDFs and articles. And we also made a… 427 00:34:33,139 --> 00:34:37,230 when the Hacking Team data came out this summer we mirrored the data and became one 428 00:34:37,230 --> 00:34:41,659 of the primary mirrors of the data. We had a torrent that was almost downing the server 429 00:34:41,659 --> 00:34:44,350 with a lot of space and figured that none of the other people had that, so we put it 430 00:34:44,350 --> 00:34:51,510 up. And that got a lot of traffic, it got about 57 million hits in the first 2 days. 431 00:34:51,510 --> 00:34:54,300 And soon we realized there was a problem where our server charged a lot for 432 00:34:54,300 --> 00:34:59,370 bandwidth and it cost us $48 every time someone decided to download the 400GB 433 00:34:59,370 --> 00:35:07,480 with wget. So that was interesting but it’s been resolved now. It hopefully made 434 00:35:07,480 --> 00:35:11,030 the data more accessible to people who don’t have 400GB of hard drive space 435 00:35:11,030 --> 00:35:15,990 available or enough internet connectivity to download that.
So then we’ve also made 436 00:35:15,990 --> 00:35:21,240 a search tool for all of the Hacking Team emails; that has a search interface that 437 00:35:21,240 --> 00:35:25,400 lets you browse them like you would in a normal email client with threading, and a 438 00:35:25,400 --> 00:35:28,870 network graph so that you can see the connections between senders and 439 00:35:28,870 --> 00:35:39,860 recipients. The Intelligence Community has a variety of collection disciplines: 440 00:35:39,860 --> 00:35:45,350 SIGINT, OSINT, HUMINT, Measurement and Signature Intelligence, Imagery 441 00:35:45,350 --> 00:35:49,080 Intelligence. They have all these different sources that they’re gathering 442 00:35:49,080 --> 00:35:55,780 data from. I think that we should try to duplicate this. Because there are a lot 443 00:35:55,780 --> 00:35:58,230 of different sources that we can gather data from as well, and we need to find 444 00:35:58,230 --> 00:36:01,600 ways to better collect data from all these sources and to fuse them together. 445 00:36:01,600 --> 00:36:06,300 These are some other ones that I’ve been spending a lot of time looking at. 446 00:36:06,300 --> 00:36:10,170 And there’s open source Intelligence, things like ICWATCH where you’re 447 00:36:10,170 --> 00:36:13,060 collecting data from purely public sources. But this is just part of the (?) 448 00:36:13,060 --> 00:36:17,950 ecosystem that we can draw on. This is mostly information that people and 449 00:36:17,950 --> 00:36:21,230 institutions make about themselves publicly, either intentionally or 450 00:36:21,230 --> 00:36:25,840 unintentionally. And it’s really difficult to use because there’s a lot of it and it 451 00:36:25,840 --> 00:36:29,940 needs to be collected and matched up and pulled together in a browsable way for 452 00:36:29,940 --> 00:36:33,390 people to be able to use it. So you can’t really just manually go and use it at scale.
453 00:36:33,390 --> 00:36:39,900 You can do it a little bit but not nearly enough. And so we’re working on making 454 00:36:39,900 --> 00:36:44,540 this easier to use. The other sort of data is anonymously leaked documents, 455 00:36:44,540 --> 00:36:47,370 documents that were (?) sent to journalists, that they think should be 456 00:36:47,370 --> 00:36:51,700 public and these often pretty explicitly reveal corruption, human rights abuses 457 00:36:51,700 --> 00:36:56,480 or other issues. But this can also be used to collect more data. Like we used the 458 00:36:56,480 --> 00:37:00,800 published Snowden documents very heavily to find code words that we could use to 459 00:37:00,800 --> 00:37:05,240 collect the data in ICWATCH. And once we start to collect data on secret things 460 00:37:05,240 --> 00:37:10,800 that were recently not known at all, but now are, and we can find data on that, we 461 00:37:10,800 --> 00:37:14,140 can start to find data on unknown code words and unknown things that we might not 462 00:37:14,140 --> 00:37:20,560 otherwise recognize. And then there’s data released by governments, from FOIA 463 00:37:20,560 --> 00:37:25,400 requests or through open data initiatives. This, of course, can be spun or things can 464 00:37:25,400 --> 00:37:31,370 be held back. So it’s not ideal to use on its own. But it can be used like the other 465 00:37:31,370 --> 00:37:34,740 2 types, in combination with each other. You can use that to provide context, you 466 00:37:34,740 --> 00:37:42,540 can use open source data to frame FOIA requests and things like that. So the goal 467 00:37:42,540 --> 00:37:46,730 of Transparency Toolkit is to make it easier to collect all these (?) data 468 00:37:46,730 --> 00:37:50,950 in one place and to start to use this data in the same ways that the Intelligence 469 00:37:50,950 --> 00:37:55,330 Community uses the data collected from all the various collection disciplines.
470 00:37:55,330 --> 00:38:00,400 Except our goal isn’t to kill people or be some sort of omniscient God-like being 471 00:38:00,400 --> 00:38:04,370 but we just want to build some sort of external structure of accountability. 472 00:38:04,370 --> 00:38:09,690 To make it easier to uncover and understand things like surveillance programs or human 473 00:38:09,690 --> 00:38:14,520 rights abuses or corruption. And when we can find the people and companies that are 474 00:38:14,520 --> 00:38:18,290 involved in things like surveillance we can start to map who’s doing what. 475 00:38:18,290 --> 00:38:21,870 And we can start to request information about specific contracts. And we know who 476 00:38:21,870 --> 00:38:24,580 we can ask questions about particular programs. And then we can start to use the 477 00:38:24,580 --> 00:38:30,020 data to start legal cases against specific companies. And we can start to take more 478 00:38:30,020 --> 00:38:34,850 concrete actions than we would be able to, otherwise, if we were dealing simply in 479 00:38:34,850 --> 00:38:38,820 theory or in guesses as to what’s going on. 480 00:38:38,820 --> 00:38:42,310 So open source intelligence lets us be more pro-active and more direct with 481 00:38:42,310 --> 00:38:49,280 our techniques. And it also lets us find some of this information earlier, because 482 00:38:49,280 --> 00:38:52,490 many of the programs mentioned in the Snowden documents were mentioned first 483 00:38:52,490 --> 00:38:58,890 in other open data sources. And if we can start to figure out where these are 484 00:38:58,890 --> 00:39:02,390 and start to figure out what they are, then we know what data we’re missing and 485 00:39:02,390 --> 00:39:05,410 we can start to go after it with FOIA requests or trying to find it by other 486 00:39:05,410 --> 00:39:14,060 means. But all of this is a really, really big project and we can’t… this is not 487 00:39:14,060 --> 00:39:17,220 going to work if it’s just us working on it.
We need to work with other people. 488 00:39:17,220 --> 00:39:20,650 We need to work with activists who have ideas of how they want to use the data. 489 00:39:20,650 --> 00:39:23,640 We need to work with journalists that collect the data and write stories about 490 00:39:23,640 --> 00:39:27,130 it. We need to work with human rights lawyers to help them with their research and 491 00:39:27,130 --> 00:39:30,430 help them build legal cases based on the findings. We need to work with NGOs and 492 00:39:30,430 --> 00:39:34,800 human rights researchers who want to collect and use open data in their work. 493 00:39:34,800 --> 00:39:38,330 And we need more people going through databases like ICWATCH. This doesn’t 494 00:39:38,330 --> 00:39:42,340 require any special expertise. You gain the knowledge that you need as you’re 495 00:39:42,340 --> 00:39:46,490 going through them looking up terms. It’s not easy but it can be quite interesting 496 00:39:46,490 --> 00:39:52,040 once you combine all of these obscure terms and it’s like “Oh, that’s what 497 00:39:52,040 --> 00:39:56,840 they’re doing!” and oftentimes what they’re doing is something entirely absurd 498 00:39:56,840 --> 00:40:01,300 like reading all your email or killing people. 499 00:40:01,300 --> 00:40:05,870 And we also need software developers to help develop software and help us figure 500 00:40:05,870 --> 00:40:11,130 out how all of these tools should fit together. So if anyone’s interested in 501 00:40:11,130 --> 00:40:14,770 working with us to take on the Intelligence Agencies of the world and 502 00:40:14,770 --> 00:40:18,430 figure out what they’re doing please let us know. I think it sounds a bit insane 503 00:40:18,430 --> 00:40:23,130 and I know that, but (?)
far more resources and far more experience but if 504 00:40:23,130 --> 00:40:27,720 we keep ignoring the situation and we continue as we are now making scattered 505 00:40:27,720 --> 00:40:30,640 attempts to change things that aren’t coordinated, that are based on limited 506 00:40:30,640 --> 00:40:36,290 information, nothing is going to change long-term. So I think we need to collect 507 00:40:36,290 --> 00:40:40,800 all the information we can and figure out how to effectively combine it and use it 508 00:40:40,800 --> 00:40:45,510 for concrete goals. And I think we need to do this with free software and open 509 00:40:45,510 --> 00:40:49,100 data, because against such powerful adversaries they’re probably the best 510 00:40:49,100 --> 00:40:51,490 hopes we have. 511 00:40:51,490 --> 00:41:01,940 *applause* 512 00:41:01,940 --> 00:41:05,960 Herald: Thank you, thank you so much! Now we have the round of Q&A, 513 00:41:05,960 --> 00:41:11,630 for anyone who’d like to ask a question, please come forward to the mics on both sides 514 00:41:11,630 --> 00:41:17,070 of this Saal (Hall). Start taking the question from… 515 00:41:17,070 --> 00:41:18,440 *is nodding towards first person asking* …yeah. 516 00:41:18,440 --> 00:41:24,610 Q: So I’d like to ask about documents which are scans. Which are sometimes 517 00:41:24,610 --> 00:41:30,010 released as official open source information. What kind of workflow do you 518 00:41:30,010 --> 00:41:35,950 have or even if you have any kind of workflow for some OCR on these…!? 519 00:41:35,950 --> 00:41:40,870 M.C.: A serious (?) that depends on the document. There’s some open source 520 00:41:40,870 --> 00:41:46,960 software called Tesseract that’s quite good, but it doesn’t always work in cases 521 00:41:46,960 --> 00:41:51,260 where there needs to be more specialized parsing. I (?)
to use something that’s 522 00:41:51,260 --> 00:41:54,830 called Abbyy (FineReader) which is, unfortunately, not open source and we are 523 00:41:54,830 --> 00:41:59,220 looking for an alternative. For the published Snowden documents, because we 524 00:41:59,220 --> 00:42:03,560 needed to extract the classification headers and that wasn’t working so well with 525 00:42:03,560 --> 00:42:07,150 Tesseract. But Tesseract works for most things. 526 00:42:07,150 --> 00:42:10,030 *listens to unrecorded comment from the audience* 527 00:42:10,030 --> 00:42:15,190 Yeah. 528 00:42:15,190 --> 00:42:19,720 Herald: Thank you. Do we have a question from… [the internet]? Yeah, oui! 529 00:42:19,720 --> 00:42:24,310 Signal Angel: Yes, rooty is asking on IRC: What would you recommend the NSA to 530 00:42:24,310 --> 00:42:27,540 develop towards a future of Social Usefulness!?? 531 00:42:27,540 --> 00:42:35,780 E.g. what value have databases from 2015, people cell phone sensors in 2115!?? 532 00:42:35,780 --> 00:42:40,550 Could you give the NSA, maybe CEO there, useful work!?? 533 00:42:40,550 --> 00:42:42,760 M.C.: Can you (?), sorry !?? 534 00:42:42,760 --> 00:42:50,010 Signal Angel: *naively repeats first of the apparent Troll questions* 535 00:42:50,010 --> 00:42:52,290 M.C.: *laughs* Social Usefulness… 536 00:42:52,290 --> 00:42:56,070 Probably the most useful thing they could do is stop collecting the data in the 537 00:42:56,070 --> 00:43:01,760 first place, especially the data that’s being intercepted or illegally collected. 538 00:43:01,760 --> 00:43:07,250 There’s probably some amounts of useful tracking they could do, but I’m not sure 539 00:43:07,250 --> 00:43:10,300 that’s the best approach using the tactics that they were to collect the data at that 540 00:43:10,300 --> 00:43:12,670 time. 541 00:43:12,670 --> 00:43:16,070 Herald: Thank you. So, next question from you, please!
542 00:43:16,070 --> 00:43:20,490 Question: Hello, thanks for the talk, that was one of the best ones I’ve seen at this 543 00:43:20,490 --> 00:43:26,740 congress. I was wondering what you think about the question you’re raising about 544 00:43:26,740 --> 00:43:30,840 “we shouldn’t make the same mistakes”. Because I’m not totally sure that’s 545 00:43:30,840 --> 00:43:34,780 possible because of things I’ve seen in other communities. All communities have 546 00:43:34,780 --> 00:43:41,100 their extremists and they will abuse this data. And then that allows a political 547 00:43:41,100 --> 00:43:46,610 attack on you, because they say you made that happen, it’s not true. But it will celd 548 00:43:46,610 --> 00:43:50,230 people. So how do you protect against that? 549 00:43:50,230 --> 00:43:53,660 M.C.: I think it’s hard to entirely protect against it because we can’t 550 00:43:53,660 --> 00:43:57,330 control the actions of other people. But people could also go off and use this data 551 00:43:57,330 --> 00:44:01,530 negatively by collecting it on their own, independently of us. I was actually quite 552 00:44:01,530 --> 00:44:05,280 impressed, after we launched ICWATCH, I haven’t heard of anyone complaining of 553 00:44:05,280 --> 00:44:07,380 threats that they’ve gotten from people… 554 00:44:07,380 --> 00:44:10,040 People in the Intelligence Community: I haven’t heard of anyone in the 555 00:44:10,040 --> 00:44:11,980 Intelligence Community complaining about threats that they’ve gotten as the results 556 00:44:11,980 --> 00:44:16,450 of ICWATCH being launched. All of the complaints have been theoretical. The only 557 00:44:16,450 --> 00:44:19,340 threats I’ve heard of resulting from ICWATCH are that from the Intelligence 558 00:44:19,340 --> 00:44:21,940 Community to us. I haven’t heard of anything, so I’ve been very impressed with 559 00:44:21,940 --> 00:44:27,190 the civility of the internet in that case. 
And I think that maybe, by framing it, and 560 00:44:27,190 --> 00:44:30,400 actually bringing it down to the individual level, and making it clear that 561 00:44:30,400 --> 00:44:35,460 these are people, that makes it a little bit less likely that people will go after 562 00:44:35,460 --> 00:44:37,610 them in a vicious way. 563 00:44:37,610 --> 00:44:43,260 Q: Have you thought of creating a kind of usage guidelines? I mean that's not gonna change what 564 00:44:43,260 --> 00:44:48,270 anyone does. But if someone does something you can then say “That’s against our usage 565 00:44:48,270 --> 00:44:52,170 guidelines” and it’s a political defence against someone accusing it… 566 00:44:52,170 --> 00:44:56,040 M.C.: Yeah, I don’t think there’s any way that we can enforce something like that. 567 00:44:56,040 --> 00:44:59,830 But we do try to be very careful with how we’re framing it in saying – like I (?) 568 00:44:59,830 --> 00:45:02,920 all the time in this talk saying these are people that are not evil people. They’re 569 00:45:02,920 --> 00:45:06,570 normal people that you should look at as such. So I think being very careful of 570 00:45:06,570 --> 00:45:09,140 framing it and we’ll be developing some sort of guidelines. That’s definitely a 571 00:45:09,140 --> 00:45:11,230 good idea. 572 00:45:11,230 --> 00:45:13,740 Herald: Thank you. Your question, please! 573 00:45:13,740 --> 00:45:19,590 Troll: Hi! First, thank you very much for this tool that makes it possible to fight 574 00:45:19,590 --> 00:45:27,750 back against, legally. For people who try to punish or yeah… 575 00:45:27,750 --> 00:45:34,020 What I have to say, or my question is: I worked in the last 3 1/2 years, let’s say, 576 00:45:34,020 --> 00:45:39,530 in the field of IT Forensics. 
And I worked with Maltego and stuff, and so I know what 577 00:45:39,530 --> 00:45:45,210 a lot of work it is to collect data and bring it into good conditions, so others 578 00:45:45,210 --> 00:45:57,480 could read it or you can get a goal, or see a goal. And what I personally think: 579 00:45:57,480 --> 00:46:04,700 it’s very important, this could be very sensible data to people and my question 580 00:46:04,700 --> 00:46:12,620 is: How do you [take] care that this data which you will offer to download will keep 581 00:46:12,620 --> 00:46:20,470 safe? That’s the first question, and the second is: Did you think about 582 00:46:20,470 --> 00:46:27,830 verifications? So you are collecting a lot of data, and in a few years another person 583 00:46:27,830 --> 00:46:34,650 wants to see if this data was correct. So do you verify the sources like MD5 sum 584 00:46:34,650 --> 00:46:44,230 or so you can say “This fingerprint taken at this-day and this-time is correct?” 585 00:46:44,230 --> 00:46:51,220 M.C.: For the first question: I don’t think there’s really… I’m not sure (?) 586 00:46:51,220 --> 00:46:56,220 protected because this is a version that people posted publicly themselves. So they 587 00:46:56,220 --> 00:47:00,720 sort of said that they don’t want it to be protected or secured because they’re 588 00:47:00,720 --> 00:47:07,250 posting it on the public internet. So I’m not sure there’s really any reason to try 589 00:47:07,250 --> 00:47:11,510 to protect it when it’s something that they’ve published very publicly. 590 00:47:11,510 --> 00:47:16,050 And on the second one, for verification, that’s quite tricky with some of the data 591 00:47:16,050 --> 00:47:18,990 especially around the Intelligence Community because all of these things 592 00:47:18,990 --> 00:47:22,320 are secretive and it’s hard to confirm them. 
We can confirm them against each 593 00:47:22,320 --> 00:47:26,760 other, like, now we have multiple résumé sites on ICWATCH, so sometimes we can find 594 00:47:26,760 --> 00:47:31,020 the same person’s résumé on another site and compare over time, and we can go 595 00:47:31,020 --> 00:47:34,410 find other profiles they have and try to combine as much data on the same 596 00:47:34,410 --> 00:47:36,310 person as possible and have it over time. 597 00:47:36,310 --> 00:47:41,790 Q: What I did: when I downloaded a website, I made a 598 00:47:41,790 --> 00:47:45,790 fingerprint, and then I can say OK, this is… yeah. 599 00:47:45,790 --> 00:47:48,730 M.C.: Oh, for verifying when it was actually collected, then. Yeah, (?) harder to 600 00:47:48,730 --> 00:47:54,980 absolutely do that, but we have all of the full text of the web page saved, and 601 00:47:54,980 --> 00:48:01,350 we have it all published on GitHub so you can verify what was collected then, but, yeah. 602 00:48:01,350 --> 00:48:03,980 Herald: We’ll take the questions from up there. 603 00:48:03,980 --> 00:48:10,390 Jake Appelbaum: Hi, community extremist here… So I wanted to say something which 604 00:48:10,390 --> 00:48:13,380 is that I think what Julian did for leaking documents you’re doing for 605 00:48:13,380 --> 00:48:17,800 analysis. Which is really great! Because transparency is not enough – you need action! 606 00:48:17,800 --> 00:48:21,310 And so I just wanted to say that I hope that everyone can give M.C. and 607 00:48:21,310 --> 00:48:28,000 Transparency Toolkit a lot of material support. And maybe a round of applause! 608 00:48:28,000 --> 00:48:33,750 *applause* 609 00:48:33,750 --> 00:48:37,940 Definitely the best talk at the congress and I had a couple of suggestions. But 610 00:48:37,940 --> 00:48:41,640 one of them is: I think it would be great if you could focus on American Domestic 611 00:48:41,640 --> 00:48:43,060 Police Agencies.
M.C.: Hmm-mhm… 612 00:48:43,060 --> 00:48:48,110 Jake: In particular collecting the images of Police Academy Graduation photographs. 613 00:48:48,110 --> 00:48:53,340 And to be able to move in the direction of facial recognition, so that we can find 614 00:48:53,340 --> 00:48:56,440 Undercover Police Officers that are in our midst… 615 00:48:56,440 --> 00:49:01,740 *applause* 616 00:49:01,740 --> 00:49:06,640 And I think it would be great if you could create a FOIA wizard, essentially, ’cause 617 00:49:06,640 --> 00:49:10,720 everybody likes wizards, and who doesn’t like UNIX… So it’d be great if you could 618 00:49:10,720 --> 00:49:14,290 create a FOIA wizard where you could say: “I wanna know about these terms” and it 619 00:49:14,290 --> 00:49:19,310 would just generate automatically – maybe by partnering with MuckRock, e.g. – 620 00:49:19,310 --> 00:49:22,890 interesting things, where there’s a kind of “Wait!”. Where you realize there’s a lot 621 00:49:22,890 --> 00:49:26,630 of people working on this classified program and it’s at this agency and they 622 00:49:26,630 --> 00:49:29,350 have a contract with this company and these are the people involved and just 623 00:49:29,350 --> 00:49:34,020 automatically generate those FOIAs and then get people to sort of sign up to put 624 00:49:34,020 --> 00:49:38,440 their name down and sort of sponsor a little transparency and to say “Oh, that’s 625 00:49:38,440 --> 00:49:41,610 the FOIA I wanna get behind, I’m gonna check on it, you know, once a week, I’m 626 00:49:41,610 --> 00:49:45,170 gonna do this thing. Through MuckRock.” I think that would be a way to take this 627 00:49:45,170 --> 00:49:49,410 information in a legal manner and to make it actionable. And I think there’s lots of 628 00:49:49,410 --> 00:49:53,869 other interesting things you could do that are not about the law. But I leave that to 629 00:49:53,869 --> 00:49:57,270 the imagination of other people.
It should be legal but it doesn’t need to be through 630 00:49:57,270 --> 00:50:02,090 legal channels like, say, FOIA. So thanks for the work that you’re doing, M.C. and 631 00:50:02,090 --> 00:50:06,170 I hope that you will expand it to, basically, all of the pigs of the whole 632 00:50:06,170 --> 00:50:10,190 world. And I would really encourage you to read Hannah Arendt’s “Eichmann in 633 00:50:10,190 --> 00:50:15,760 Jerusalem”, because you described a fundamental thing: these people aren’t 634 00:50:15,760 --> 00:50:21,280 evil. But actually, Evil itself doesn’t exist. These people are the Banality of 635 00:50:21,280 --> 00:50:26,040 Evil. They’re people who have soccer practice, and they have a dog, and they 636 00:50:26,040 --> 00:50:29,540 like to go home and fuck their wife, and they’re regular people who do drone 637 00:50:29,540 --> 00:50:31,520 strikes. 638 00:50:31,520 --> 00:50:36,340 *applause* 639 00:50:36,340 --> 00:50:40,150 Herald: Thank you. We have a question on mike 1. 640 00:50:40,150 --> 00:50:46,540 Q: How easy is it to add support for new databases or new sources of information? 641 00:50:46,540 --> 00:50:51,050 M.C.: It depends on the source and how that site is structured. But generally 642 00:50:51,050 --> 00:50:55,110 it’s not too difficult. Adding new sources does require 643 00:50:55,110 --> 00:51:00,060 programming at this point. But it’s not particularly complex programming and we 644 00:51:00,060 --> 00:51:03,350 have some libraries that make some parts of it easier, as well. And if you’re 645 00:51:03,350 --> 00:51:05,700 interested in adding a data source we’re more than happy to help with that. 646 00:51:05,700 --> 00:51:10,980 Q: Awesome! My favourite is the list of… the report of when people were denied 647 00:51:10,980 --> 00:51:16,440 security clearance and why and if their appeal was then, like, removed. 648 00:51:16,440 --> 00:51:18,280 M.C.: Yeah, that would be quite interesting!
649 00:51:18,280 --> 00:51:24,490 Q: Okay! 650 00:51:24,490 --> 00:51:29,050 Herald: If there’s no further questions… moment… 651 00:51:29,050 --> 00:51:34,140 yeah, okay! Please! 652 00:51:34,140 --> 00:51:44,010 Q: Yesterday it was said that we have to make sure that they know that we watch 653 00:51:44,010 --> 00:51:50,900 them and make sure that they know that we watch them. Because some day they will get 654 00:51:50,900 --> 00:51:57,680 prosecuted. So, in some way, I think you are doing exactly this. So this is 655 00:51:57,680 --> 00:52:12,350 brilliant. Are you already at the stage where you’re thinking you can start 656 00:52:12,350 --> 00:52:18,390 concrete legal actions against some individuals that you are getting 657 00:52:18,390 --> 00:52:24,590 information on with your tools? M.C.: We’ve been working with some lawyers towards that. 658 00:52:24,590 --> 00:52:29,230 We are looking to do more in this, so if you know… if you have any ideas for 659 00:52:29,230 --> 00:52:32,080 particular situations where this may be applicable, or lawyers that we should 660 00:52:32,080 --> 00:52:37,150 work with, let us know! But we’re working towards that and making some progress. 661 00:52:37,150 --> 00:52:41,730 Q: Thanks! 662 00:52:41,730 --> 00:52:44,690 Herald: Getting a question from up there, please! 663 00:52:44,690 --> 00:52:49,840 Q: I just wanna say that you are a visionary who is more passionate than 664 00:52:49,840 --> 00:52:53,420 anybody I have ever collaborated with and it’s a total honor. 665 00:52:53,420 --> 00:52:54,369 *applause* 666 00:52:54,369 --> 00:52:57,220 Herald: Thank you. 667 00:52:57,220 --> 00:53:02,780 M.C.: Yeah, and just to everyone, that’s Brennan who also works on Transparency 668 00:53:02,780 --> 00:53:06,710 Toolkit. He made the awesome UI for Harvester and Lookingglass that you saw 669 00:53:06,710 --> 00:53:09,470 in the Tabs of all this.
670 00:53:09,470 --> 00:53:14,780 *applause* 671 00:53:14,780 --> 00:53:17,900 Jake: If no one else is gonna ask a question, I’d like to ask a question which 672 00:53:17,900 --> 00:53:21,260 I know the answer to but no one else in the room does. And I think it’s very 673 00:53:21,260 --> 00:53:25,210 fascinating. I wonder if you could talk about lessons that you’ve learned from 674 00:53:25,210 --> 00:53:28,490 studying the South African Resistance to Apartheid. 675 00:53:28,490 --> 00:53:30,020 *M.C. is laughing* Jake: And maybe you could talk about the 676 00:53:30,020 --> 00:53:34,880 things that drive you to work on these things. E.g. what inspires you to justice? 677 00:53:34,880 --> 00:53:39,310 E.g. experiences at MIT and maybe – I mean if you don’t want to talk about it, I’m 678 00:53:39,310 --> 00:53:42,940 sorry for asking it. But if you do wanna talk about it I think you can inspire 679 00:53:42,940 --> 00:53:48,930 everyone else here to raise their fist with you! In solidarity. 680 00:53:48,930 --> 00:53:57,150 M.C.: Yeah… Okay… I guess it’s been nearly 3 years now, so maybe that’s okay 681 00:53:57,150 --> 00:54:06,480 to talk about. 3 years ago there was this case at MIT… everyone has probably heard 682 00:54:06,480 --> 00:54:13,930 of Aaron Swartz and he was being prosecuted for downloading documents from 683 00:54:13,930 --> 00:54:22,480 JSTOR. And I was brought in trying to figure out MIT’s role in this situation, and if we 684 00:54:22,480 --> 00:54:26,400 might be able to sway public opinion, with a few people in Boston. I think some of 685 00:54:26,400 --> 00:54:31,110 them are in this room. And we were trying to help him. And eventually, part way into 686 00:54:31,110 --> 00:54:35,770 the process, he became afraid and decided that it would be more risky for us to help 687 00:54:35,770 --> 00:54:38,890 him, with the prosecutor who might lash back, so we stopped.
But one of the things 688 00:54:38,890 --> 00:54:45,650 that I did in this process was, I sent out a survey to all of the professors at MIT 689 00:54:45,650 --> 00:54:54,450 asking their opinion on his case. And whether they identified with his actions. 690 00:54:54,450 --> 00:54:59,280 And I got a lot of responses to this survey. Some were quite nice and were 691 00:54:59,280 --> 00:55:03,560 quite supportive. Some were very vicious, saying that he should go to jail and that 692 00:55:03,560 --> 00:55:09,040 he is a waste of humanity and he works at this Harvard Center for Ethics, so how is 693 00:55:09,040 --> 00:55:13,390 this ethical. And things like that. They were quite horrible. And initially he had 694 00:55:13,390 --> 00:55:17,540 access to this database and somehow over the next year, when we weren’t doing much, 695 00:55:17,540 --> 00:55:21,970 he lost access to this database. And he emailed me asking for access again. And 696 00:55:21,970 --> 00:55:26,800 back then I was on some stupid kick about research ethics and redaction and thought 697 00:55:26,800 --> 00:55:30,570 that there’s no reason to… I really said something like “I can give you the answers, 698 00:55:30,570 --> 00:55:34,770 but not the names”. I was just stupid because the names are the most useful part of that 699 00:55:34,770 --> 00:55:42,470 data. And I kind of abandoned him, along with a lot of other people in that. And I 700 00:55:42,470 --> 00:55:50,119 feel like if I had given him the names that might have been something that could 701 00:55:50,119 --> 00:55:53,490 be used to find supporters within MIT or people who were rallying against him. And 702 00:55:53,490 --> 00:55:56,050 I don’t think it would have made a huge difference but it might have made just a 703 00:55:56,050 --> 00:56:02,140 little bit. And that was one of the things that really showed me the power of data on 704 00:56:02,140 --> 00:56:06,190 individuals and the role of individuals within institutions.
And I feel like I 705 00:56:06,190 --> 00:56:10,780 really failed there. So I don’t want to do that again. 706 00:56:10,780 --> 00:56:16,270 *applause* 707 00:56:16,270 --> 00:56:20,540 Herald: Thank you. Unfortunately, we need to wrap up because we are out of time. 708 00:56:20,540 --> 00:56:26,900 Thank you for attending this very interesting lecture and, quite touching 709 00:56:26,900 --> 00:56:28,230 in the end. 710 00:56:28,230 --> 00:56:33,780 *postroll music* 711 00:56:33,780 --> 00:56:38,350 Subtitles created by c3subtitles.de in 2016. Join and help us do more!