LinkedIn is one of the world’s largest professional networking platforms, with over 800 million members globally. Recent reports say that data scraped from 500 million LinkedIn profiles is being offered for sale online, with 2 million records leaked as proof.
What is LinkedIn scraping?
LinkedIn scraping refers to the automated collection of public information from LinkedIn profiles without users’ consent. This data can include information like full names, job titles, companies, locations, skills, and connections.
Scrapers use bots to harvest this information at scale. The scraped data is then aggregated into large datasets which can be sold commercially. This allows buyers to access contact information and other details on millions of professionals quickly and easily.
Implications of LinkedIn data scraping
Here are some key implications of mass LinkedIn scraping:
- Privacy concerns – Members may not want their information sold without consent.
- Security risks – Scraped data could enable phishing, fraud and identity theft.
- Spam and unsolicited communications – Contact details can be used to send spam messages.
- Competitive intelligence – Buyers may use the data to research competitors and gain an unfair advantage.
- Reputational risks for LinkedIn – Repeated incidents highlight the limits of the platform’s ability to protect member data.
Details on the LinkedIn scraping incident
In late 2022, a user on a popular hacking forum claimed to be selling scraped information on 500 million LinkedIn users. Key details include:
- Asking price was reportedly 4 bitcoin (around $100,000 at the time).
- The poster stated the data was scraped between December 2021 and September 2022.
- Samples containing 2 million records were shared as proof of the overall data cache.
- The samples contained real LinkedIn profile information like names, addresses, job titles, genders and phone numbers.
Cybersecurity researchers who analyzed the samples confirmed they contained legitimate LinkedIn data. One expert called it one of the largest LinkedIn data leaks to date.
LinkedIn’s response
LinkedIn has disputed the scale of the incident. A spokesperson stated:
“Our initial investigation shows that this purported LinkedIn scraped data is actually an aggregation of data from a number of websites and companies. It does include publicly viewable LinkedIn profile data that appears to have been scraped.”
The company maintains there was no LinkedIn data breach. However, it is investigating misuse of its publicly viewable member profile data.
Past incidents of LinkedIn scraping
This is not the first time large amounts of LinkedIn data have been scraped and sold online. Some notable past incidents include:
- 2021 – 700M records: Data on 700 million LinkedIn users was found for sale, including emails, names and more.
- 2021 – 500M records: Scraped info on over 500 million LinkedIn users posted and sold online.
- 2020 – 500M records: Details on 500 million LinkedIn profiles were sold on hacker forums and websites.
Despite LinkedIn’s efforts to combat scraping, the sheer volume of public information on its platform makes it challenging to prevent abuse at scale.
Scraping prevention best practices
Here are some ways LinkedIn could potentially better protect user data from scraping risks:
- Proactive monitoring – Use AI to detect and shut down scraping bots at scale.
- Rate limiting – Impose stricter limits on API and webpage access.
- Legal action – Pursue serious cases of illegal scraping and of offering scraped data for sale.
- Improved consent flows – Provide clearer options for members to control public visibility.
- Automated profile anonymization – Mask emails, phone numbers and other personal details by default.
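Two of the measures above, rate limiting and masking personal details by default, can be illustrated with a minimal Python sketch. This is not LinkedIn's implementation: the class `SlidingWindowLimiter`, the functions `mask_email` and `mask_phone`, and the specific masking rules are all hypothetical names and choices made for illustration only.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `max_requests` per client in `window_seconds`."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # client id -> timestamps of accepted requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject (or challenge) this request
        hits.append(now)
        return True


def mask_email(email):
    """Keep only the first character of the local part, plus the domain."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a recognizable address: hide it entirely
    return f"{local[0]}***@{domain}"


def mask_phone(phone):
    """Replace every digit except the last two with '*', keeping punctuation."""
    total_digits = sum(ch.isdigit() for ch in phone)
    seen = 0
    out = []
    for ch in phone:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen > total_digits - 2 else "*")
        else:
            out.append(ch)
    return "".join(out)
```

With these assumed rules, `mask_email("jane.doe@example.com")` returns `j***@example.com`, and a limiter created as `SlidingWindowLimiter(2, 10.0)` rejects a client's third request inside any 10-second window. A real deployment would likely key the limiter on more than a single identifier, since scrapers commonly rotate accounts and IP addresses.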
How individuals can protect their LinkedIn data
For concerned LinkedIn members, here are some steps to help minimize your personal data exposure:
- Review profile settings – Limit public visibility of personal details like email, address and phone number.
- Be selective with connections – Only accept invites from people you know and trust.
- Watch for social engineering – Reject suspicious messages requesting personal data.
- Use tools sparingly – Be cautious of third-party apps requesting access to your profile data.
- Report issues – Notify LinkedIn if you see signs of spam, scraping or data misuse related to your account.
The risks of purchasing scraped LinkedIn data
For those considering purchasing scraped LinkedIn data, there are also significant risks to be aware of:
- Legal liability – Purchasing illegally obtained data can violate privacy laws.
- Inaccurate information – Scraped data is often outdated, incorrect or incomplete.
- Reputational risks – Use of illicit data could damage brand reputation if exposed.
- Security threats – Data sets may contain malware, exposing your systems and network.
- Uncertainty – Hard to verify origins and collection methods of scraped data.
Conclusion
The alleged leak of LinkedIn data serves as a reminder of the massive amounts of our personal information circulating online – often without our knowledge or consent. For LinkedIn users, it highlights the need to be more cautious about what profile data is made public.
For LinkedIn itself, more proactive safeguards are needed to detect and halt unauthorized scraping activity that violates its terms of service. Legally obtaining consent before collecting user data, implementing privacy protections by design, and being transparent about third-party data access can help rebuild trust and prevent future abuse.
Key Details on LinkedIn Scraping Incident |
---|
Forum user claims to be selling data on 500M LinkedIn users |
Asking price was reportedly 4 bitcoin (around $100K at the time) |
Stated data was scraped between Dec 2021 – Sept 2022 |
2M record samples shared as proof of overall data |
Samples contained real profile info like names, jobs, contacts |
Experts confirmed legitimacy of data in samples |
LinkedIn disputes scale, says data is an aggregation from multiple websites and companies |
Notable Past LinkedIn Scraping Incidents |
---|
2021 – 700M records containing emails, names etc found for sale |
2021 – 500M records of LinkedIn users’ data posted and sold online |
2020 – 500M LinkedIn profiles’ details sold on hacker forums |
Scraping Prevention Best Practices |
---|
Proactive monitoring to detect scraping bots |
Implement stricter rate limiting on access |
Pursue serious cases of illegal scraping |
Improved consent flows for member visibility controls |
Automated profile anonymization |
How Individuals Can Protect LinkedIn Data |
---|
Limit public visibility of personal details |
Be selective with connections |
Watch for suspicious messages requesting data |
Be cautious of third-party apps accessing profile data |
Report any signs of data misuse on account |
Risks of Purchasing Scraped LinkedIn Data |
---|
Legal liability for buying illegally obtained data |
Inaccurate, outdated or incomplete information |
Reputational damage if use is exposed |
Security threats from potential malware |
Uncertainty around data origins and collection methods |