Uber’s Massive Scraping Program Collected Data About Competitors Around The World

For years, Uber systematically scraped data from competing ride-hailing companies all over the world, harvesting information about their technology, drivers, and executives. Uber gathered information from these firms using automated collection systems that ran constantly, amassing millions of records, and sometimes conducted physical surveillance to complement its data collection.

Uber’s scraping efforts were spearheaded by the company’s Marketplace Analytics team, while the Strategic Services Group gathered information for security purposes, Gizmodo learned from three people familiar with the operations of these teams, from court testimony, and from internal Uber documents. Until Uber’s data scraping was discontinued this September in the face of mounting litigation and multiple federal investigations, Marketplace Analytics gathered information on Uber’s overseas competitors in an attempt to advance Uber’s position in those markets. SSG’s mission was to protect employees, executives, and drivers from violence, which sometimes involved tracking protesters and other groups that were considered threatening to Uber. An Uber spokesperson declined to comment for this story.

It’s possible Uber’s data gathering did not violate any laws – much of it occurred internationally, and the data was often collected from publicly-available websites and apps – but the work of Marketplace Analytics and SSG has attracted the attention of US federal investigators and the judge presiding over ongoing civil litigation against Uber for theft of trade secrets.

Marketplace Analytics and SSG’s work was dragged into the sunlight in recent weeks as part of Waymo’s lawsuit against Uber, which alleges Uber stole trade secrets from the self-driving car company for use in its own autonomous vehicles. In a pair of letters written earlier this year, Richard Jacobs, a former Uber employee, accused the company of using its competitive intelligence teams to steal trade secrets from Waymo and other companies; those letters became central in Waymo’s lawsuit after they were disclosed to Waymo in late November.

The trial, initially scheduled to begin this month, has been postponed until February to allow Waymo to investigate the claims included in the letters – that members of the Marketplace Analytics and SSG teams used secret servers, devices that couldn’t be traced to Uber, ephemeral messaging services, and physical surveillance to extract secrets from other ride-hailing companies and keep the information hidden from the prying eyes of competitors and the courts.

Uber’s intelligence agency

The Marketplace Analytics team traces its roots to a previous group within Uber that was known as Competitive Intelligence, or COIN. COIN also set up non-attributable servers to store information on competitors, and oversaw Hell, a program Uber used to track the location of Lyft drivers and offer them deals to switch to Uber. By scraping data from Lyft’s app, Uber was able to collect driver ID numbers and therefore track Lyft drivers’ locations. The existence of Hell, and COIN’s role in deploying it, were first reported in April by The Information.

COIN used an Amazon server that, according to domain records, was registered by a security engineer at Uber in March 2015, using his personal address and phone number. The secret server used by COIN was shut down in late 2016, sources said, and the team began to develop new, more tightly-held infrastructure under a new code name.

As the COIN team grew, it was merged into Marketplace Analytics. A small team of about a dozen software engineers and data scientists made up the Marketplace Analytics team, while its non-technical sibling, the Strategic Services Group, was staffed by several former government employees. The Marketplace Analytics team began to focus its efforts exclusively on overseas competitors and abandoned its investigations of Lyft in late 2015, two sources said. Lyft declined to comment on whether it was aware of Uber’s intelligence collection efforts, or whether it had conducted similar data gathering against Uber.

Marketplace Analytics conducted extensive surveillance of several of Uber’s overseas competitors, including Ola, a major ride-hailing platform in India, and Didi Chuxing, Uber’s one-time rival in China that bought out Uber’s business in the country for $US35 billion ($47 billion). A Didi spokesperson declined to comment on Uber’s data collection practices or its own competitive intelligence efforts. A spokesperson for Ola did not respond to a request for comment.

Marketplace Analytics focused on gathering competitor data online, while SSG hired contractors to travel overseas and glean information about threat groups and other companies in person — in at least one instance, obtaining a surreptitious recording of a conversation between executives at Grab and Didi Chuxing, according to court testimony from an Uber employee.

The team stored the information it collected about competitors on its more secure server, detached from Uber’s corporate infrastructure and kept hidden from most of the company’s employees, two sources said. Members of the Marketplace Analytics team were issued new computers to access the server and were expected to use them only for competitive intelligence work, so that no data on the devices could be formally traced back to Uber. They also used the encrypted, ephemeral messaging app Wickr to communicate with each other.

According to sources, the server needed to remain invisible to hackers and competitors. Even if Uber’s own systems were hacked, the company wanted to make sure that this system remained hidden. In addition to being issued non-attributable laptops that couldn’t be traced back to the company, employees also had access to pre-paid phones and Mi-Fi wireless internet devices.

The secretive nature of Marketplace Analytics and SSG has raised suspicions in the Waymo case, and in the offices of the US Attorney in both the Northern District of California and the Southern District of New York.

But sources said the Marketplace Analytics and SSG teams felt secrecy was necessary to protect the company from hackers, as well as competitors who were engaged in the same kinds of intelligence collection Uber was doing. One person said there was an overwhelming sense of distrust that pervaded their work, and that team members were on the lookout for threats.

In court, Jacobs testified that Uber used its non-attributable devices to surveil protesters and other groups Uber considered a threat, so those groups wouldn’t be able to trace the surveillance back to Uber.

Although paranoia pervaded the Marketplace Analytics and SSG teams, it wasn’t entirely unwarranted. Employees needed to be warned about protests outside Uber offices, and drivers sometimes needed to be protected from violence in new markets, a fourth source explained. In one extreme instance last December, SSG discovered a WhatsApp group in which an individual threatened to attend an event Uber’s former CEO Travis Kalanick was attending in India and set themselves on fire.

Marketplace Analytics used its clandestine setup to scrape data from competitors’ websites, Github accounts, Pastebin posts, and APIs to gather information about how they operated. That data was siloed away from Uber’s network, analysed, and fed to various teams within Uber to help them gain an edge on their competition, two sources familiar with Marketplace Analytics explained. Marketplace Analytics also focused its tactics on Uber itself in an effort to detect vulnerabilities and understand how feasible it would be for competitors to collect intelligence on Uber, using web monitoring techniques to scour the internet for leaks.

This work led to an incident that Jacobs described as theft of a competitor’s proprietary code.

“There was a discussion around the acquisition of data about a rival firm overseas that had been a success,” Jacobs testified in court, noting Marketplace Analytics’ efforts to find private code accidentally posted to public Github accounts. Those efforts, he said, yielded “a way to understand more about how rival platforms functioned by finding sort of spilled data, so to speak, on the Internet.”

However, one source familiar with Marketplace Analytics downplayed the incident, explaining that an employee for a European competitor, Gett, posted proprietary code on Github and the Marketplace Analytics team merely looked at it. Gett did not respond to a request for comment about the incident.

Uber scraped pricing information from competitors’ websites and apps, and used a technique called “eyeballing” to gather vehicle location data from apps. In some instances, as was the case with Hell, Uber was able to collect unique ID numbers companies used to identify their drivers and use that data to track them. Uber supplemented the data it collected with other information purchased from data brokers.

“If an app has rudimentary function overseas, especially when it’s first launched, a rider could simply pull up their app, request a ride. And as soon as they request a ride, it identifies a driver, their name, and their vehicle and licence plate number or phone number. Any rider, anybody who has an email address could register and get that data about drivers,” Jacobs explained in court.

Uber then used machine learning to analyse this wealth of information in an attempt to discover how many trips competitors were making, which would reveal the market share dominated by the competitor. The data was also used to infer how much cash its competitors were burning on driver promotions. Ride-hailing companies regularly spent millions to increase their market share by just a few percentage points, one source explained, and scraped data helped inform how that money should be spent.

The collection was sanctioned by in-house lawyers at Uber, who designated documents reviewed by Gizmodo that outlined Marketplace Analytics’ techniques as attorney-client privileged. Uber’s competitive intelligence teams were also egged on by then CEO Travis Kalanick’s example — SSG threat analyst Ed Russo testified in November that he used Kalanick’s secretive meetings with Anthony Levandowski, a former Waymo employee accused of stealing trade secrets and bringing them to Uber, as an example of successful tradecraft during a staff meeting with security employees.

“I then said words to the effect that, you know, if our CEO is willing to go to these kinds of length to help protect information while he’s trying to negotiate a deal, what does that say about the contributions we can make in protecting information?” Russo testified.

But internal support for the teams’ competitive intelligence efforts has waned. In August, Uber announced a new policy governing the collection of competitive intelligence.

Uber’s new chief legal officer Tony West warned employees this month that any ongoing physical surveillance efforts had to end, and the company’s current CEO, Dara Khosrowshahi, publicly announced that Uber employees had been directed to stop using Wickr as of September 27. “My understanding is that this behaviour no longer occurs at Uber; that this truly is a remnant of the past,” West wrote in an email to Uber’s security teams. “But, to be crystal clear, to the extent anyone is working on any kind of competitive intelligence project that involves the surveillance of individuals, stop it now.”

West’s message emphasised individual surveillance, but Jacobs seemed more concerned with the digital mass surveillance Uber conducted. “I suppose because of my personal metrics, it felt overly aggressive and invasive and inappropriate,” he testified in November.

Although the team’s secret server system has been widely criticised, compartmentalising sensitive data in order to secure it isn’t an unusual practice in cybersecurity, in the broader tech industry, or in government — for instance, the National Institute of Standards and Technology recommends compartmentalising sensitive data for all federal computer systems.

Uber’s web monitoring and scraping didn’t seem unusual to employees either, three sources said, and they believed it was common throughout the industry because they often caught competitors scraping data from Uber’s services. Uber considers this kind of behaviour abusive and tries to block it from happening on its own apps, but it doesn’t regularly pursue legal action against scrapers.

Scrapers in other parts of the tech industry often operate in a legally grey area. The data startup HiQ recently sued LinkedIn after LinkedIn alleged that HiQ’s scraping of user profiles violated the Computer Fraud and Abuse Act. A federal judge ruled in August that HiQ’s activity didn’t violate the anti-hacking law, and it’s possible that Uber’s web monitoring would be similarly protected.

“Using automated scripts to access publicly available data is not ‘hacking,’ and neither is violating a website’s terms of use,” Jamie Williams, a staff attorney at the Electronic Frontier Foundation, wrote in defence of HiQ.

Still, Marketplace Analytics and SSG members are being dragged into depositions for the Waymo litigation and being questioned about Jacobs’ letters, experiences that have left them worried about their own legal exposure.

The blowback has also hit Wickr, the encrypted messaging service used by Uber. Two sources explained Uber was drawn to the app for its security features, which could help Uber avoid an embarrassing data breach like the 2014 Sony hack.

Wickr CEO Joel Wallenstrom told Gizmodo that the company is currently developing a product that would be able to retain some messages for pending litigation, a change that’s intended to placate executives like Khosrowshahi who are worried that their company’s use of encryption will be viewed as nefarious by courts and regulators.

“I would go to them and say, ‘We have a really cool way to minimise data.’ They would get tripped up in optics rather than practicalities,” Wallenstrom said, recalling his conversations with lawyers at large tech firms. “There was a moment when I paused and said, ‘Wow, this community sees things differently than the information security community.’”

“I think it’s really important that people see the contradiction between yelling and screaming and stomping their feet about how bad Equifax is at protecting our data, and then when someone uses a smart approach to data minimization to protect data, they’re skewered for that,” he added. It begs the question, he said, “Which one is it you’re going to fire me for today?”

The legal challenge ahead

Of all its competitive intelligence projects, Hell in particular has come back to haunt Uber. This fall, the company confirmed the federal investigation into the program. That investigation has stalled Marketplace Analytics’ work completely. The group’s automated scraping systems were turned off in September, two people familiar with the situation said.

Its work may have ended, but the repercussions of Uber’s scraping efforts are just beginning to emerge. Jacobs’ letters, which are expected to be released publicly on Wednesday, threaten to expand the investigations into Marketplace Analytics and deepen the damage that their work might do to a company already scrambling to save its reputation. The letters are certainly crucial to the Waymo case — the entire trial has been delayed because of them — and it’s not yet clear how pertinent they will become in the other investigations Uber faces.

The first letter was Jacobs’ resignation email, fired off on April 14th to Kalanick, the company’s former top lawyer Salle Yoo, its human resources lead Liane Hornsey, and its brand-new PR chief Jill Hazelbaker, who had been promoted to the role just days prior. Jacobs’ chosen subject line: “Criminal and Unethical Activities in Security.”

Jacobs, then a 37-year-old security analyst at the company, had worked for Uber’s global intelligence team for just over a year. He had just been caught forwarding company emails to his personal email address, and claimed that he’d done so in order to blow the whistle on Uber’s illicit activity.

The email, paired with a 37-page demand letter sent to Uber by Jacobs’ attorney Clayton Halunen three weeks later, set off a sequence of worst-case-scenario events for a company already mired in multiple legal fights and public scandals. Although Uber paid $US7.5 million ($10 million) to keep his claims quiet, Jacobs’ email and his lawyer’s letter eventually made their way into the hands of lawyers for Waymo.

Since Jacobs’ claims were revealed to Waymo’s lawyers on Thanksgiving, they have been tantalized by the idea that Uber used secret, secondary systems to store information about competitors. In its trade secrets lawsuit, Waymo needs to prove not only that its former engineer Anthony Levandowski stole its technology, but also that he introduced that technology into Uber’s self-driving cars when he went to work there last year. So far, Waymo’s searches of Uber servers haven’t turned up the 14,000 documents Levandowski allegedly stole. Although Uber’s own lawyers also admitted that they uncovered videos Uber employees had filmed of a competitor’s self-driving cars, it’s not clear whether that competitor was Waymo or another company. (Waymo declined to comment.)

Judge Alsup acknowledged the implications Marketplace Analytics’ work could have on the case. “You stood up so many times and said, Judge, we searched our servers; these documents never hit a Uber server,” he said in a rebuke of Uber’s lawyers. “You never told me that there was a surreptitious, parallel, nonpublic system that relied upon messages that evaporated after six seconds or after six days. You never mentioned any of that stuff. You never mentioned that there were these offline company-sponsored laptops.”

There are references to Waymo in both Jacobs’ resignation email and Halunen’s demand letter, according to portions of the letters that were read aloud and discussed in court. Halunen’s letter specifically accused Marketplace Analytics of stealing trade secrets from Waymo and using its shadowy network of secret servers, ephemeral messages, and devices that couldn’t be traced to Uber to cover it up. “Jacobs is aware that Uber used the MA team to steal trade secrets at least from Waymo in the U.S.,” Halunen wrote.

But — somewhat bizarrely — Jacobs testified that he and his lawyer were wrong. The Marketplace Analytics team never focused its efforts on Waymo, he said. A source familiar with the team’s work also told Gizmodo that the team didn’t target Waymo. Halunen did not return a request for comment about his former client’s claims.

“I don’t think I did as thorough a job as I wish I could have,” Jacobs said in court, describing his fact-checking of his attorney’s letter. He only reviewed Halunen’s work for 20 minutes during a vacation, he added. When asked about the part of the letter that accused Marketplace Analytics of stealing Waymo’s trade secrets, Jacobs testified, “I don’t stand by that statement.”

It’s easy to assume that Jacobs has been bought off. In late August, Uber reached a $US4.5 million settlement with Jacobs. He received $US2 million upfront and $US1.5 million in stock. As part of the settlement, Jacobs agreed to be a consultant with Uber’s internal investigation into Marketplace Analytics, spearheaded by the law firm WilmerHale, for which he will receive another $US1 million over the course of a year. Uber also paid an additional $US3 million to Halunen, Jacobs’ lawyer.

But it’s worth considering what information Jacobs had access to while he worked at Uber, and what conclusions he might have drawn from it.

Jacobs started at Uber in March 2016 as a manager of global intelligence. He reported to Mat Henley, Uber’s head of global threat operations, and his work involved reviewing and parsing data from press reports, social media, and crime statistics in an attempt to understand potential threats in the regions where Uber operates, he explained during testimony. Henley also supervised the Marketplace Analytics team, but Jacobs didn’t have access to their data. Jacobs testified that his knowledge of Marketplace Analytics’ activities was mostly based on workplace chatter between Henley and other employees.

Jacobs wanted to create a centralised database of intel collected by the Marketplace Analytics and SSG teams, but claimed that his effort was met with resistance by other Uber employees, who thought the database would create an unnecessary paper trail.

In February of this year, Jacobs sat down for his annual performance review with Henley. It did not go well, and after the performance review, Jacobs was demoted.

“Jacobs experienced this review and demotion as pure retaliation for his refusal to buy into the threat ops culture of achieving business goals through illegal conduct, even though equally aggressive legal means were available to achieve the same end,” Halunen wrote in his demand letter, an assertion that Jacobs left undisputed during his testimony. Both Jacobs’ email to Kalanick and the damning letter written by Halunen contained a mix of Jacobs’ own firsthand knowledge and assumptions he made about Uber’s operations, Jacobs later testified.

Although Jacobs is being paid as a consultant, it doesn’t appear that his advice is being taken very seriously within Uber’s walls. Uber’s deputy general counsel Angela Padilla downplayed Jacobs’ allegations.

“We felt that Halunen was trying to extort the company, and I wanted to take the air out of his extortionist balloon,” Padilla testified, explaining the decision to turn over his letters to the US Attorney’s Office in June. Uber also gave the letters to the Northern District of California and Southern District of New York offices in September. “This case, given the huge sums of money that Mr. Jacobs was demanding at the outset, I felt was clearly extortionist, especially given the low value of his claims.”

Even though Uber didn’t believe Jacobs, it’s clear that Waymo does – and Judge Alsup thinks his claims are credible enough to allow Waymo to investigate the Marketplace Analytics team.

“I know it’s scandalous, but it’s something that the United States Attorney thinks at least is true enough to give to me,” Judge Alsup said. “If even half of what’s in that letter is true, it would be a huge injustice to force Waymo to go to trial and not be able to prove the things that are said in that letter.”
