On April 18, 2022, the United States Court of Appeals for the Ninth Circuit reaffirmed its earlier ruling that scraping public profiles on LinkedIn likely does not violate the Computer Fraud and Abuse Act (CFAA). This ruling has significant implications for web scraping and data aggregation from public websites. In this article, we will examine the background of the case, the court’s reasoning, and the potential impact of the ruling.
Background of the LinkedIn Scraping Case
The case originated in 2017 when LinkedIn sent a cease-and-desist letter to a data analytics company called hiQ Labs. HiQ had been scraping information from LinkedIn user profiles, aggregating the data, and reselling it to employers for talent management insights. LinkedIn argued that this scraping violated the CFAA and its user agreement. When hiQ did not stop scraping, LinkedIn blocked hiQ’s access to its site.
HiQ Labs then filed a lawsuit against LinkedIn, asserting that scraping public data did not violate the CFAA. The district court sided with hiQ at that stage and granted a preliminary injunction preventing LinkedIn from blocking hiQ’s access. LinkedIn appealed to the Ninth Circuit Court of Appeals, which affirmed the injunction in 2019. In June 2021, the Supreme Court granted LinkedIn’s petition, vacated that decision, and sent the case back to the Ninth Circuit for reconsideration in light of Van Buren v. United States. On remand, the Ninth Circuit again affirmed the injunction.
Key Aspects of the Ninth Circuit Ruling
In the ruling, the Ninth Circuit focused on two key issues:
- Whether the scraping of user data from a public website violates the CFAA
- Whether preventing access to this data is permissible under the CFAA
Regarding the first issue, the court ruled that because the LinkedIn profiles were visible to anyone with an internet connection, there was no “breaking and entering” under the CFAA when hiQ scraped this data. The CFAA is intended to punish unauthorized access to non-public computer systems, whereas hiQ was accessing data made available to the general public.
On the second issue, the court ruled that LinkedIn could not invoke the CFAA to justify blocking access to public data simply because it wanted to prevent scraping. Because viewing pages that are open to the general public is unlikely to count as access “without authorization,” the CFAA gave LinkedIn no basis to defeat hiQ’s claims or the injunction. The court did not hold that websites can never restrict access; it noted that sites that consider themselves victims of scraping may have other causes of action, such as breach of contract.
Implications of the LinkedIn Scraping Ruling
This ruling could have far-reaching implications for web scraping and the use of public data on the internet. Here are some potential impacts:
Expansion of Data Scraping and Aggregation
The ruling indicates that scraping public profiles is unlikely to violate the CFAA. This will likely lead to an expansion of scraping and data aggregation from sites like LinkedIn. As long as the data is public, companies face less CFAA risk when they harvest and analyze large datasets. The rise of AI and Big Data depends heavily on aggregating data from across the web.
Challenging Terms of Service Violations
The ruling also means that Terms of Service (ToS) violations alone do not necessarily establish CFAA violations. Scraping public data could still breach the ToS, but companies may not be able to use the CFAA for enforcement. However, ToS can still be enforced through breach of contract or copyright claims.
Using Technical Measures to Block Scraping
The Ninth Circuit upheld an injunction barring LinkedIn from blocking hiQ’s access to public profiles and found that the CFAA gave LinkedIn no grounds to resist it. However, websites may still use general technical countermeasures such as CAPTCHAs, rate limits, or IP blocklists. How far such blocking can go when it targets a particular scraper of public data remains a legal gray area.
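To make the rate-limiting countermeasure concrete, here is a minimal Python sketch of a sliding-window limiter keyed by client identifier. The class name `SlidingWindowRateLimiter`, its parameters, and the example IP address are hypothetical choices for illustration; this is not a description of how LinkedIn or any other site actually throttles traffic.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per client within `window_seconds`.

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps a client identifier (for example, an IP address)
        # to the timestamps of its recent allowed requests.
        self._history = defaultdict(deque)

    def allow(self, client_id, now=None):
        """Return True if the request is allowed, False if it is throttled."""
        now = time.monotonic() if now is None else now
        timestamps = self._history[client_id]
        # Discard timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False
        timestamps.append(now)
        return True


# Usage: two requests per ten seconds for each client.
limiter = SlidingWindowRateLimiter(max_requests=2, window_seconds=10.0)
print(limiter.allow("203.0.113.7", now=0.0))  # True
print(limiter.allow("203.0.113.7", now=1.0))  # True
print(limiter.allow("203.0.113.7", now=2.0))  # False, limit reached
```

A limiter like this applies the same threshold to every visitor, which is a different design from singling out one named company, the situation the court was actually asked about.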
Pressure on Websites to Limit Public Data
Now that public profile pages can be freely scraped, LinkedIn and other sites may decide to reduce the amount of public information available. For example, they may:
- Remove public options for viewing profiles
- Require logins to view more profile data
- Give users options to privatize their information
- Turn off search engine indexing of profile pages
These changes could enable greater user privacy controls but would also limit the public data available for scraping and aggregation.
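One common way a site signals that profile pages should not be crawled or indexed is a `robots.txt` rule. The sketch below uses Python’s standard `urllib.robotparser` to show how a well-behaved crawler would read such a rule. The domain `example.com`, the `/in/` profile path, the `ExampleBot` user agent, and the helper name `is_crawl_allowed` are assumptions made for illustration. Note that `robots.txt` is a voluntary convention, not something the ruling addresses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that closes profile pages to all crawlers
# while leaving the rest of the site open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /in/
Allow: /
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/in/jane-doe"))  # False
print(is_crawl_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/about"))        # True
```

A rule like this only affects crawlers that choose to honor it, so sites that want a firmer limit typically combine it with login requirements or user-level privacy settings, as listed above.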
Other Legal Considerations
While the Ninth Circuit ruling settles some questions around web scraping, a number of legal considerations remain unresolved:
Scraping Private User Accounts
Scraping data behind a login wall rather than on public pages may still violate the CFAA. Courts draw a distinction between public and private data access.
Violating Other Laws
Scrapers may still violate copyright law, data protection regulations, and other statutes. The CFAA does not exempt scrapers from all other laws.
ToS Breach of Contract Claims
As mentioned above, violating a site’s Terms of Service may constitute breach of contract even if it does not violate the CFAA. However, contract claims must meet their own requirements, such as showing that the scraper actually agreed to the terms.
Using Scraped Data for Unlawful Purposes
Using lawfully scraped data for illegal activities, such as identity theft, discrimination, or fraud, remains prohibited. The ruling focused on scraping itself, not downstream data uses.
Jurisdictional Differences
The Ninth Circuit ruling is influential but is binding only in the nine western US states and two Pacific territories (Guam and the Northern Mariana Islands) under that court’s jurisdiction. Other US circuit courts could rule differently on similar cases.
Comparison to Other Data Scraping Laws and Cases
It is helpful to contrast the LinkedIn scraping ruling with laws and cases in other jurisdictions:
United States
The US takes a lighter-touch approach to web scraping regulation. Beyond the CFAA, there are few national laws restricting scraping of public data. However, platforms can use their ToS to discourage scraping, as LinkedIn attempted to do.
European Union
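The EU has more stringent data protection laws than the US. The General Data Protection Regulation (GDPR) requires a lawful basis, such as consent or legitimate interest, for processing personal data, even when that data is publicly visible. This limits the scope of public data scraping.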
PeopleBrowsr vs. Twitter (2012)
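In this earlier US dispute, PeopleBrowsr obtained a temporary restraining order in California state court that kept Twitter from cutting off its access to the full “Firehose” data feed, and the parties later settled. The case differs from the LinkedIn dispute because it concerned contracted API access rather than publicly viewable web pages.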
Max Planck Gesellschaft v. Elsevier (2019)
A German court ruled that Elsevier could not block the Max Planck Institute from scraping bibliographic data. But this data was deemed public sector information.
Nationwide Biweekly v. JP Morgan Chase (2020)
A federal court upheld web scraping of public mortgage data despite the site’s ToS. This supports the LinkedIn precedent.
Jurisdiction | Key Laws and Cases | Implications for Scraping |
---|---|---|
United States | – hiQ Labs v. LinkedIn (2019) – Computer Fraud and Abuse Act (CFAA) | – Public scraping generally allowed – Blocking access faces scrutiny |
European Union | – General Data Protection Regulation (GDPR) | – Enhanced data protections – Stricter limits on scraping |
Germany | – Max Planck v. Elsevier (2019) | – Public sector data scraping allowed |
This comparison shows that laws and precedent concerning data scraping vary considerably across jurisdictions. The LinkedIn decision gives more leeway for public scraping in the US context.
Conclusion
The LinkedIn scraping ruling has opened the door to expanded use of public profile data. While not exempting scrapers from all laws, it makes clear that aggregating public information does not inherently violate the CFAA. However, many legal uncertainties remain surrounding web scraping activities. Data harvesters should stay apprised of relevant laws, court decisions, and website terms of use when scraping online data. With proper precautions, this ruling enables exciting new possibilities for data analysis and AI based on publicly available data.