
Won't someone think of poor little LinkedIn, a subsidiary of one of the largest data brokers in the world?


Why frame what you are trying to say like that? Businesses of all sizes deserve the ability to protect their businesses from abuse.


Do they respect my data? Why do they get to track me across sites when I clearly don't want them to, but someone can't scrape their data when they don't want them to? Why should big companies get the pass but individuals not? They clearly consider internet traffic fair game and are invasive and abusive about it, so it is not only fair to be invasive and abusive back, it is self-defense at this point.


They don’t need to track your web browser when they’re owned by Microsoft, because they track every action at a lower level.


Weird, I don't use Windows as an OS but have linkedin. I'd believe the concern and disregard of Linkedin's concern is fair game.


What lower level? Does Microsoft own the internet?


The operating system. For example see the Windows 11 screenshot debacle/scandal.


Are you talking about Recall, which got such huge negative press they delayed it a year and added a clear opt-in? And never sent anything off the device itself?

If anyone has evidence of constant tracking and reporting then please share it.


Well, I won't touch Windows 11 with a ten-foot pole, and I don't know if what I am referring to is called "Recall". Not that much into the MS terminology. I also read about Windows 11 having all kinds of shenanigans to suddenly upload data into OneDrive. Wouldn't be surprised if that also included screenshots, or could "accidentally" lead to that happening. Screenshotting every few seconds is unacceptable even if it stays on the device per se. Once data exists, it has potential to leak, and we have not even started considering malware infection yet. Huge risk to people's privacy and safety online.

We can stop pretending all is alright at some point, can't we? We don't need more enshittification. Windows 11 is already a disaster that no one wants. It already starts with its idiotic HW requirements, trying to make perfectly fine HW obsolete. $$$


There was a lot of pushback to Recall for a reason, yes. But it's not what you described, and criticism works a lot better when it's accurate.

As for suddenly putting your documents into OneDrive, that's real, but it started years ago in Windows 10.


“They” is an incredibly useful tool.


You do realize anti-scraping measures are one way of protecting your data too?


In this context, "protecting" means the interest of LinkedIn, which aggressively sells the data. Users that give data to LinkedIn are not protecting their data either way.


Because you signed up to a set of terms and conditions saying LinkedIn can use your data in this way


What if I signed up before those ToS said they could use my data in this way?

Oh right, companies change ToS and EULA and "agreements" without notice, without due process, and without recourse.

I have no problem changing how I use "their" data in such situations.


> Oh right, companies change ToS and EULA and "agreements" without notice, without due process, and without recourse.

Companies change their terms of service all the time. They usually send emails about it.

I've responded to decline them a handful of times and asked for my account to be deleted. I chuckle slightly at the work it creates, but sometimes it has been easier to close an account that way.


No one likes paying taxes but they still do it. They could just not work and not have money and therefore not need to pay tax.


Except what you have to pay each year for the privilege of staying in "your" house.


I didn't want the web to turn into monolithic platforms. I abhor this status quo.

You cannot function without these enterprises, but that doesn't mean they're ideal or even ethical.

Microsoft wins because of network effects. It's impossible to compete. So I think it should be allowed to assail their monopoly here by any means. It's maximally fair for consumers and for free markets.

Ideally capitalism remains cutthroat and impossible to grow into undislodgeable titans.

Even more ideally, this would become a distributed protocol rather than a privately owned and guarded database.


That doesn't actually mean anything


I think they framed it this way because they don't consider scraping abuse (to be fair, neither do I, as long as it doesn't overload the site). Botting accounts for spam is clear abuse, however, so that's fair game.


No, I consider all data collection and scraping egregious. From that perspective, LinkedIn is hypocritical when Microsoft discloses every filesystem search I do locally to Bing.


Are you not scraping a site with your eyeballs when you view a site?


By that logic I can charge you for looking at me.


I agree. Maybe that logic (which is your logic) isn't very good.


You’re just making yourself look dumb by drawing invalid comparisons and an inaccurate understanding of my logic.


When they scrape, it’s innovation. When you scrape, it’s a felony.


I'm sure there are issues with fake accounts for scraping, but the core issue is that LinkedIn considers the data valuable. LinkedIn wants to be able to sell the data, or access to it at least, and the scrapers undermine that.

They could stop all the scraping by providing a downloadable data bundle like Wikipedia.


Thinking more about it, I don't think it's a terrible thing that they prevent scraping. Their listings are already suffering from being flooded with garbage applications and having to sift through tons of noise. Allowing scraping would just amplify that and make the platform almost entirely worthless.

I "scrape" LinkedIn in a roundabout way for personal use, and really what I've found is that I should maybe just not bother at all. I can't get through the noise even when I'm applying at places that heavily match my skillset, and I just get automated rejection emails.


LLMs scrape Wikipedia all the time, or at least attempt to.

The data bundle doesn't help that at all.


That's true, the normal scraping would still happen, but it would eliminate this side business of trying to re-sell LinkedIn's data.


What is abuse? Is it anything that reduces my profit margin? Or is it anything that makes the world a worse place? The Flock CEO called Deflock terrorism, is he right?


This exchange -- obvious critical / perhaps insurrection speech versus a stable voice of business economics -- should be within the purview of an orderly and predictable legal environment. BUT things moved quickly in the phone battles. Some people say that the legal system has never caught up to the data brokering, and in fact the surveillance state grew by leaps and bounds.

So, reasonable people may disagree. This is a fine place to mention it: what if individual profiles built at LinkedIn are being combined with illegitimate and even directly illegal surveillance data and sold daily? Everyone stand up and salute when LinkedIn walks in the room? There have to be legal and direct ways to deal with change, and enforcement to complete an orderly and predictable economic marketplace.


>BUT things moved quickly in the phone battles. Some people say that the legal system has never caught up to the data brokering, and in fact the surveillance state grew by leaps and bounds.

Partially by discrepancy in how responsive you can be or comprehensive you must be to win the next round of cat-and-mouse, and partially because a private/corporate surveillance apparatus is useful to a government that might otherwise be hampered by constitutional bounds.


We enjoy the fruits of an LLM or two from time to time, derived from hoards of ill-gotten data. LinkedIn has the resources to attempt to block scraping, but even at the resource scale of LI I doubt the effort is effective.


I am not denying that scraping is useful. If it weren't, people wouldn't do it. But if the site rules say you aren't allowed to scrape, then I don't think people should be hostile towards the people enforcing the rules.


Well, they can try to enforce the rules; that's perfectly fair. At the same time, there are many methods of "trying" which I would not consider valid or acceptable ones. "Enforcing the rules" does not give a carte blanche right to snoop and do "whatever's necessary." Sony tried that with their CD rootkits and got multiple lawsuits.


the abuse>using the information they publish to the public


Yes, until it becomes abusive and malignly affects innocents.


The big social media businesses deserve a Teddy Roosevelt character swooping in and busting their trusts, forcing them to play ball with others even if it destroys their moats. Boo hoo! Good riddance. World's tiniest violin.

This is a popular position across the aisle. Here's hoping the next guy can't be bought, or at least asks for more than a $400M tacky gold ballroom!


I mean, regardless of who they are or even if you don’t like what LinkedIn does themselves with the data people have given them, the random third parties with the extensions don’t additionally deserve to just grab all that data too, do they?


Surely they do! The data is on the public internet, isn't it?


They'd put Widevine or PlayReady DRM on the website if they could, I'm sure.


why can't they?


because they're only for video files?


I say the same thing about my start menu sending every action I perform to Bing.


Eh. I worked at a company which made an extension which scraped LinkedIn. We provided a service to recruiters, who would start a hiring process by putting candidates into our system.

The recruiters all had LinkedIn paid accounts, and could access all of this data on the web. We made a browser extension so they wouldn’t need to do any manual data entry. Recruiters loved the extension because it saved them time.

I think it was a legitimate use. We were making LinkedIn more useful to some of their actual customers (recruiters) by adding a somewhat cursed API integration via a Chrome extension. Forcing recruiters to copy and paste didn't help anyone. Our extension only grabbed content on the page the recruiter had open. It was purely read-only and scoped by the user.


Doesn't sound like your operation was particularly questionable, but I can imagine there must be some of those 3,000 extensions where the data flow isn't just "DOM -> End User" but more of a "DOM -> Cloud Server -> ??? -> Profit!" with perhaps a little detour where the end user gets some value too as a hook to justify the extension's existence.


I started there, but it felt like a dodgy way (as it could be seen to be illegal). We then just went all official and went through Google search APIs with LinkedIn as the target. Worked a treat and was cheaper than recruiter!!!

So when you pay the highest scraper, it's OK! Same data, different manner.



