Hacker News

While you are somewhat correct in that the browser sends the request, it is not a 'can be downloaded' but rather an imperative saying 'get that font from that server'.

In the end, the W3C standards define that browsers execute the commands they receive from the server, and in this case the server tells the browser to download the font. So the site owner configures his website in a way that instructs the browser to share the IP address.



This is the essence of CDNs, though. Every offsite CDN is subject to this same ruling, meaning any developer trying to use a third-party CDN for something as simple as loading jQuery is subject to this. For example, on load, https://evanandkatelyn.com/ grabs stuff from: twitch.tv (embedded player), youtube.com (embedded player), facebook.com (likely just a like button), and what I assume are several wordpress CDNs (c0.wp.com, i0.wp.com, s.w.org, ssl.p.jwpcdn.com).

If this ruling is upheld, either (a) browsers need to immediately stop interpreting these commands, instead providing user prompts for _each offsite load event_, or (b) a very large swath of websites are all open to the same legal issue. As a small example, the Aesop wine company site (https://www.aesopwines.com/), made with Squarespace, uses typekit, squarespace, and google CDN loads. They're subject to the same ruling, right? And so on, and so on...


> browsers need to immediately stop interpreting these commands, instead providing user prompts for _each offsite load event_

No, why should they? The ruling makes the (pretty realistic) assumption that users are in no position to decide about individual load requests. Therefore, those are the responsibility of the site author.

This way of interpreting the events seems most consistent with real-world usage. Meanwhile, pretending the user is responsible for vetting every individual network request seems like a legal fiction - except there is no reason why it should be applied.


By this argument, then, should third party requests always be blocked? If the user "is in no position to decide", that means that the only way to avoid potential liability would be to load everything from the same domain, right? No CDNs, no off-site scripts, no off-site embeds, ever. Seems a bit extreme to me.

Thoughts my own, not those of my employer.


To my knowledge, there are other avenues besides consent under which the GDPR allows data exchange with third parties - in particular if such an exchange is essential for fulfilling the service. The point here, though, was that the data exchange was not "essential" because you could simply self-host the fonts or proxy the request through your own servers.

But yes, it would seem to me that this interpretation of the law sort of communicates that third party requests should be a measure of last resort. That would definitely cause a shift in current web dev practices, but I'm not sure it's a bad thing.

> avoid potential liability

I think "potential" liability is an odd criterion. Any law is a risk of potential liability. If the effort to find out if a law actually applies to you is already too much, then I guess anything less than anarcho-capitalism would be unacceptable.


> self-host the fonts

Do you, as the website operator, have the right to copy and serve these fonts to your visitors? (Actual question; my guess is that you don't according to Google Fonts, but could be wrong.)

> proxy the request through your own servers

Isn't this worse? Assume that your visitor does not want Google contacted at all as part of their visit; isn't, then, the potential leak of an IP address simply a side effect? The website is still leaking timing of when a visitor accessed the site, potentially their usage patterns...

Personally, I think this is a bit of an absurd argument... I think, at most, consent should be enough for third party requests. I was mostly responding to GP's claim that the user can't reasonably consent to such use.


> Do you, as the website operator, have the right to copy and serve these fonts to your visitors?

Good question. I have no idea, but apparently the court thinks self-hosting is ok in this case.

> Isn't this worse? Assume that your visitor does not want Google contacted at all as part of their visit; isn't, then, the potential leak of an IP address simply a side effect? The website is still leaking timing of when a visitor accessed the site, potentially their usage patterns...

There is a specific set of data which is defined as "personally identifiable information". IP address is part of that set, but I don't think timing information or anonymous usage data are. So the question in this case is specifically "does the request leak information defined as PII?". You can prevent that effectively with proxying: Google would only see the IP address of your proxy but not the address of the user.
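A minimal sketch of such a proxy, assuming nginx (the `/fonts/` path prefix is made up for illustration): the visitor's browser only ever talks to your server, and Google only ever sees your server's IP.

```nginx
# Hypothetical reverse proxy: the page references /fonts/... on your own
# domain, and nginx fetches the files from Google on the server side.
location /fonts/ {
    proxy_pass https://fonts.gstatic.com/;
    proxy_set_header Host fonts.gstatic.com;
    # Make sure the visitor's address is not forwarded upstream.
    proxy_set_header X-Forwarded-For "";
    proxy_ssl_server_name on;
}
```

In practice you would probably also cache the upstream responses (e.g. with proxy_cache) so that not every page view triggers a server-side request to Google.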

> Assume that your visitor does not want Google contacted at all as part of their visit

I don't think a user can enforce this under the GDPR. They only have a right to block you from sending their PII to Google, not to block you from talking to Google at all.


> There is a specific set of data which is defined as "personally identifiable information".

I believe there isn't and that is part of the problem with GDPR in my experience.


> Do you, as the website operator, have the right to copy and serve these fonts to your visitors?

All the fonts on Google Fonts are open source. When GDPR came into force in 2018 I downloaded all the fonts I needed, checked their licenses, and uploaded them on my servers along with necessary notices as required by the licenses.

The matter could also be sidestepped if the CDN were to offer a GDPR data processing agreement (DPA) and would make guarantees about the locations of servers. The free public CDNs understandably don't do this, and it seems Google Fonts is not covered by the Google Cloud DPA.


At least in Germany (possibly also other European countries), a design pattern of only loading Facebook/Twitter/Youtube/... content with explicit user consent is nowadays pretty common.
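A bare-bones sketch of that pattern (all IDs and the video URL are illustrative; production solutions such as heise's "two-click" embeds do more): nothing is requested from the third party until the visitor clicks.

```html
<!-- Placeholder shown instead of the third-party iframe. -->
<div id="yt-placeholder">
  <p>This embed loads content from youtube.com and reveals your
     IP address to Google.</p>
  <button onclick="loadEmbed()">Load video</button>
</div>
<script>
  // Only after explicit consent is the third-party request made.
  function loadEmbed() {
    var frame = document.createElement("iframe");
    frame.src = "https://www.youtube-nocookie.com/embed/VIDEO_ID"; // hypothetical ID
    frame.width = 560;
    frame.height = 315;
    document.getElementById("yt-placeholder").replaceWith(frame);
  }
</script>
```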


But shouldn't the site owner pay for a CDN and host the resources themselves? In which case the CDN wouldn't own the IP information. I think the problem here is that the website author is getting free bandwidth in exchange for their users' IP addresses, which in this example Google can then use for tracking and other things.


> But shouldn't the site owner pay for a CDN and host the resources themselves?

Not sure I understand this. Whether you pay for a CDN or not, you'll still be guilty of sending the user's browser to an external domain without consent (because it happens before the page is fully loaded). The only GDPR-compliant solution seems to be self-hosting everything.


That's true, but the mitigation is that it would have been OK if the user had consented to this "data processing".

The court isn't ruling this sort of technology en bloc but says in its ruling that it is a problem because the user didn't consent to his personal data (IP address) being given to a third party (Google in this case).

Personally I have mixed feelings about this ruling too, because this sort of technical solution is widespread and an army of GDPR vigilantes has the potential to cripple large portions of the web by filing similar suits. Or we won't be able to access websites without having to go through entire multi-page EULAs and consent forms for any and all kinds of similar third-party technology embedding.

Law is a blunt tool and will have unintended consequences, unfortunately :(


A lot of websites won't serve addresses from Germany.

I've seen companies do that with just the GDPR cookie warning: it wasn't worth rewriting code and annoying non-EU people with the warning, so they detect the IP address and redirect to a page saying they don't serve that region.

Let's be honest, what have we gained from the cookie warning?


That is a minority, and mostly only US-centric sites that are otherwise chock full of advertising/tracking technology - exactly what GDPR was meant to deal with.

However, GDPR and this type of ruling have EU-wide impact because of the single market (e.g. a French website can and does serve German customers as well). Businesses (especially the ones from the EU) can't afford to not comply or to not serve customers within the EU.

That is where the problem is.


There is an important point to this ruling that shouldn't be omitted:

> Der Einsatz von Schriftartendiensten wie Google Fonts kann nicht auf Art. 6 Abs. 1 S.1 lit. f DSGVO gestützt werden, da der Einsatz der Schriftarten auch möglich ist, ohne dass eine Verbindung von Besuchern zu Google Servern hergestellt werden muss.

To roughly translate: One can use Google Fonts without forcing users to make a request to google servers (by downloading the fonts and serving them locally) so this doesn't fall under GDPR (which allows sharing/using user data if it is necessary for functionality).

Which would most likely include CDNs but a point could be made for things like youtube and twitch where that isn't really possible/feasible.

Edit: One addition to the "necessary" part: Necessary for what the USER wants to do when visiting your site. Might be arguing semantics but this is law after all, which is all about semantics


Then get Squarespace to stop pinging random third parties on page load. The website owner is paying for Squarespace, why is it loading Google CDN (and Google trackers?)


It's loading fonts. So Squarespace needs to host those fonts, fine. But more to the point, it could be argued even the Squarespace CDN is "different" from the actual website, so we need CDN shims that forward local domain requests to the CDNs and return the results. All to hide an IP number for downloading fonts. Moreover, "host it yourself" is easy if you're technically skilled, but very, very difficult if you aren't.


Technical skill is kind of a requirement if you want to achieve something that's technical by nature - such as website development or web hosting.

And hosting a font file entails dumping it next to your index.html file and adding some very basic CSS. Not exactly difficult.
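For illustration, the "very basic CSS" is essentially one @font-face rule pointing at the locally hosted file (file and family names here are made up):

```css
/* Sketch of self-hosting: the .woff2 file sits next to index.html,
   so no font request ever leaves your own domain. */
@font-face {
  font-family: "Example Sans";      /* hypothetical family name */
  src: url("./example-sans.woff2") format("woff2");
  font-weight: 400;
  font-display: swap;
}

body {
  font-family: "Example Sans", sans-serif;
}
```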


> And hosting a font file entails dumping it next to your index.html file and adding some very basic CSS. Not exactly difficult.

If you are a 60-year-old woodworker living in Appalachia trying to set up an online store to sell hand-carved flutes, this task is essentially impossible.


So they will have outsourced their website to some external entity that does possess the required technical knowledge. This required technical knowledge should include the ability to host a simple file.


So congratulations, you've regulated a middleman into the process. What a great day for humanity and a victory for regulatory capture.


The 60-year-old woodworker living in Appalachia will be relieved to know that browsers are able to display text in their online store without having to add any font files at all. If the 60-year-old woodworker living in Appalachia decides they absolutely must have a custom font on their website then self-hosting that font file is not any more impossible than adding the HTML/CSS required to fetch it from Google.


And then EVERY site needs to serve the same font? The browser can no longer cache it across sites?


How is that relevant here?

Caching was never mentioned as being a requirement. I'm only giving the most basic solution for achieving compliance with regards to hosting a font file.

Scope creeping, on a Sunday no less... where are we headed


I don't think they can right now anyway, since the cache is segregated by origin to prevent leaking cache timings.


ianal, but I think CDNs would not be affected by the ruling, since they serve an important function. Google Fonts was deemed illegal here since it's not necessary and you can easily provide a font in a privacy-preserving way.


Not a lawyer either, but I'm not sure about this. Let's use the "jQuery served by a CDN" example here: You can easily argue that using jQuery is necessary for your site to function, but there is no real benefit to the user in doing this with a CDN when you could just ship jQuery from your own server. AFAIK the benefit of CDNs is largely nullified nowadays by browsers using a different cache for each primary domain anyways, so you can't even really point out a potential benefit for the user (faster load times) here.


Google Fonts is a CDN. The *C*ontent they *D*eliver are fonts.


> the w3c standards define, that browsers execute the commands they receive from the server

I'm no expert in the matter, but this seems a little convoluted to me? To me, the server does not issue instructions, per se; it returns a declarative text/binary response that describes the structure of the website. It is then up to the browser, which the user installed, chooses to use and may configure (and possibly configure to leak their data, even if spec-adhering behaviour of rendering the webpage should not), to attempt to understand the document and retrieve any other resources that may assist in displaying the content correctly.

On the other hand, if one was to send CPU instructions back to the user, I guess it's also their choice to execute them...? Also, it's not possible to determine which resources are for display purposes (fonts) and which are for tracking purposes; the browser will blindly have to retrieve the resource, so websites have a certain responsibility to issue privacy-respecting "instructions".

I'm trying to argue both sides here. I still believe that the user voluntarily chooses to use the browser, visit the webpage, and therefore parse the document and initiate any subsequent requests the document proposes. On the other hand, this is beyond most people; they just want to view a frickin' website. So perhaps the lives of web developers should be made harder to make the life of the average Joe, who is not an IT expert, a little easier? The architecture of the web is inherently not privacy-respecting: to save bandwidth (and for sake of simplicity), we only send fragments and let the browser choose what else it needs, which can be tracked.

It's like walking in a park. You choose to show your face to people, we've come to just accept the fact that by the laws of nature, we cannot prevent other people from seeing our face (unless you use a mask, but then you make them very uneasy), we leak data that others can remember and use to identify us later.


Try making this argument with compiled code instead of HTML:

"The company included the code to do $BAD_THING in the binary executable, but it was the user's choice to run it, and he could have easily modified the binary to ignore $BAD_THING, but didn't. Therefore, it was the user doing $BAD_THING, not the company."

A lot of people in this discussion are splitting hairs here, trying to blame the user or the browser. The technical details of what a browser does are less important than the end effect: Fundamentally, the web developer added "stuff" to the HTML, knowing that this "stuff" will cause most browsers in the field to access these fonts on another computer. The fact that the end user could technically block it, doesn't change the developer's intent.


Correct. The question for law is: what does a reasonable person expect?

This is a problem because people (in general) are not good at understanding or reasoning about what computers do... and the entire purpose of the web is to put a simplifying, abstract model in between what humans want to do and how computers work. Models are always wrong, sometimes useful. The web is very useful because it is wrong.

If the web were more like Gemini -- I'm not advocating for this -- every link would be an explicit change, and the argument that a reasonable person would be aware that different things came from different entities would be solid. If JavaScript existed but a web page could not request any resource from a non-origin domain, the argument would be solid.

It's not reasonable for a random user to have to internalize a model that says that sometimes the font used by a page is local, sometimes it is supplied by the website, and sometimes it is a call to a third party. It's true, but it's not reasonable.


From my experience, I think the average user treats the browser and the internet as a black box anyway; they don't reason about what is happening. As long as they can get to what they want, they don't really care what happens in between. Cookie notices get in the way, and therefore annoy them. Most also just accept the fact that their data gets leaked everywhere and there's not much they can do about it. I genuinely don't believe that the average user can make sense of the TOS that they agree to when signing up for something...

This is definitely not a good thing, and should change, but I also believe that ad-driven companies will continue to find a way, we just continue to rack up operating complexity, which in turn, favours large companies. This ruling seems like a pretty weird and unhelpful way (in the grand scheme of things) of helping protect user privacy, but then again, that was not the goal of the lawsuit.


Yeah, I guess this stands. But HTML is not executable. It has to be parsed, like words in a book, not chemicals in a tube. Who is liable, the person who creates the poison, or the book (encyclopedia) which describes the process (and therefore the person who wrote it/distributes it)?

Again, I'm not saying what is right and wrong, but I think this issue is fundamentally much, much more complex than the court may have thought, and more importantly, may have repercussions on almost every website out there.


> Yeah, I guess this stands. But HTML is not executable. It has to be parsed, like words in a book, not chemicals in a tube. Who is liable, the person who creates the poison, or the book (encyclopedia) which describes the process (and therefore the person who wrote it/distributes it)?

I know as soon as you typed this, you probably thought "oh crap, what about Python?" so I won't go there.

I think the major underlying thing here that makes developers uncomfortable with this court ruling is that the whole industry of software development has a chronic and pervasive problem with the idea of consent. I'm not saying individual software engineers don't know what consent means, but we constantly put out software that does things without giving the user informed consent and control, and resist all efforts to force us to ask for this consent.

Imagine trying to use the Software Industry's idea of consent when dating: "Hey, Alice, do you want to go out on a date with me? I'll only accept the answers [Yes] or [Ask me again later]". Ridiculous! But software regularly does this! "Hey, Bob, I love you and I'm going to keep sending you text messages. Do you want [all my text messages] or [only essential text messages]?" Ridiculous, but look at the "consent" options when it comes to cookies.

Not to be crass, but when software wants to get users to do something, they need to treat it as if the software is trying to get laid: You need to ask for, and receive, informed consent at every step of the way, at every new and different request. This is an uncomfortable idea to developers who are used to just commanding the code to do things.


I didn't, actually, but it's a fun thought experiment :)

Python code is instructions that are executed in a very precise manner, they could, in theory, encrypt a hard-drive. You execute the program the same way you execute binary instructions.

HTML is a description of a problem; there are different interpretations and different solutions, depending on screen size, etc. You don't always get the same result. Technically, JS could mine crypto, but I don't believe that's illegal (correct me if I'm wrong?), just very inconvenient, and there wasn't any JS involved here. You could make a browser that leaks data due to misinterpretation of the HTML. The problem also lies with the eager evaluation of HTML: it's difficult to put in a disclosure or ask for consent, such as the responsibility clause of most software licenses, without greatly annoying the user...

> makes developers uncomfortable with this court ruling

Guilty. At the end of the day, we need to get stuff done, without creating a private internet infrastructure for our customer. The small guy is at a disadvantage here, not big companies.


Not really. You are making this much harder than it has to be by bringing in irrelevant technical arguments. The court mostly cares about intent and effect, not technical minutiae.

Also by your logic machine code isn't executable either since it too has to be parsed (by the CPU which transforms it into microcode instructions).


In your paradigm: shell, Python, Ruby and Java programs are not executable either, since they require interpreters before they become machine code. Java calls this bytecode and wants to run on a whole JVM, which is rather like a browser.


To follow up on the difference here: I think the browser is much more aware of what it is doing than compiled code. It knows whether this is a stylesheet or a font, it knows it is coming from a different origin, and more. Technically the browser is very capable of blocking all third party resources (if you assume that party == domain).
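One existing mechanism that exercises exactly this capability is Content-Security-Policy: when a page declares `default-src 'self'` (shown here as a meta tag; it can also be sent as an HTTP header, or injected by a proxy or extension), the browser refuses every cross-origin load.

```html
<!-- Standard CSP: 'self' restricts scripts, styles, fonts, images etc.
     to the page's own origin, so the browser will refuse to fetch
     e.g. anything from fonts.googleapis.com. -->
<meta http-equiv="Content-Security-Policy" content="default-src 'self'">
```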

With compiled code, stopping it from executing one particular instruction or feature is much harder.

Basically I think the declarative nature is a significant difference.

Of course I don't think the law sees it that way. Otherwise we would have implemented technical measures for limiting cookies, such as permissions (with legal backing), rather than these legal-only ones which are abused more often than implemented as they are supposed to be.


I think you confuse user, the Human, and user, the Programming Idiom.

User, the Human, is not going to be asked whether or not the browser should open every one of the possibly hundreds of references in a web page!!

Now, user, the Programming Idiom, might be configured, programmed, etc. to behave differently, but the reality is that's not how the modern web works. If the browser is not configured to behave the way it comes out of the box from Google, Microsoft or Apple, most pages will not work correctly.

So I think your configuration/technical argument is purely theoretical. The court cannot expect the average user to understand or do any of that.


> User, the Human, is not going to be asked whether or not the browser should open every one of the possibly hundreds of references in a web page

Exactly, because we strive for a balance of convenience and complexity. Most people wouldn't mind downloading a font from Google; they already use it directly anyway. This isn't how law works, though; as an engineer, I'm just trying to find some sanity in what to me seems like an insane court decision with possibly very big repercussions for the rest of the internet.


Just host the font yourself. What is the problem?


The terminology for these concepts in common use is "user" and "user agent".


But User, the Human has the means to command its User Agent, the Browser, to conveniently skip loading whatever they would not like to see, like fonts, executable scripts or ads.


Your post basically amounts to blaming victims of malware and spyware.

"Sure your honor, the victim died by carbon monoxide asphyxiation, but it was his choice to inhale the gas, even though it smells the same as normal air"


I'm not trying to put any blame here, we can twist metaphors to support either side of the argument.

I definitely think that websites have a huge responsibility in keeping the user safe, but this feels to me like an over-extension of GDPR that will make websites much more difficult to develop in the future for the layman without a team of lawyers. It's a font; there was no malicious intent.


But I am trying to put blame: people who wrote the malware/spyware code are to blame. Similarly to people who write website code that leaks user personal information. The choice to embed third-party code was made by them.

It is nowhere near reasonable to ask a common user to protect themselves from such things: they might not have the technical expertise. They might not be using their own computer (library, etc). The browser doesn't provide enough tools for it and requires third-party solutions. Third-party solutions can either cost money (Little Snitch), additional hardware (Pi-Hole), are not available in all browsers (uBlock Origin due to its interface) or require technical knowledge (other ad-blockers that use lists).


This is called plausible deniability - the site owner knows what he is doing, but does not care. He could have downloaded the font or used an analytics service which does not collect IP addresses. Great ruling.



