I started this article several times, trying different approaches to the subject. In the end, I decided to focus on my thoughts on the bill, the potential risks I see, and how I believe these issues should be tackled instead.
A recent article by Dan Luu highlighted an issue I hadn’t thought about before: if someone isn’t an expert in a field, they will have difficulty determining whether someone else is doing shoddy work in that field. While I am aware of the process of crafting legislation in the UK, I do wonder how much of what our legislators are being told they actually understand.
This is not an attack on any of the members who worked on the legislation itself, but rather a question: do they, the architects of this bill, understand HOW some of the measures could be implemented? I am not talking about whether “someone is going to check every picture everyone sends every day on WhatsApp”, but rather how someone will check every picture everyone sends every day on WhatsApp. How will they look at it? How long will they look at it for? Will anyone actually look at it? Will the picture be checked after a user uploads it? Will software check it? If it is software (it will be software), how long will the check take? How accurate is the software that does the checking, AND if the software determines that the picture needs further investigation, how will you inform the user (will you?)? What happens while the investigation occurs? Where is that picture kept? Does it stay with WhatsApp? Does WhatsApp send a private photo that might contain sensitive, obscene or upsetting material to another server or email inbox somewhere? Who checks it? What will they do after they check it?
I mean, I could probably keep on going, but I guess you get my point. At this stage I am working on the assumption that despite the best efforts of organisations highlighting flaws in the legislation, it will pass through Parliament, because who wants to be seen as voting NO on the Online Safety Bill? It’s almost as if you would press Like on a video of a hospital being blown up. You’d have to be a monster, right?
So the first question I want to raise, not against the bill, because that boat seems to have sailed a long time ago, but to the legislators, is this: do you understand how you would implement such checks on systems that currently serve billions of users?
These decisions and this legislation didn’t start in a vacuum, but with concerned parents, and with people harassed, threatened and bullied online. How are they supposed to protect themselves? How can they possibly hold their aggressors to account? In their desperation they turned towards those who should be able to protect them, the providers of these services, but those requests fell on deaf ears, so they kept going until they found someone who would listen.
Little did it matter that a wide variety of problems got lumped into the same boat. Fraud? It’s in the bill. Harassment and threats? It’s in the bill. CSAM? It’s in the bill. Online pornography? It’s in the bill. Horrible issue after horrible issue, the people drafting the bill reached the conclusion that they are all part of the same problem: the Internet.
I covered how the legislation glosses over (and that’s generous) how any of these measures will be implemented, and that neither the people demanding the change nor those in power have sufficient understanding to actually talk about solutions. This puts us in an awkward position: we know we have huge problems with this thing we built in the corner, but we’re not sure how to deal with them, so we’ll put together some rules and then those who DO understand how these things work will figure out a way to fix them. There. Everyone’s protected.
But that’s never the case. When did poorly-thought-through legislation ever yield immensely successful results? When have corporations, for once, acknowledged that the problem they are being asked to tackle cannot be tackled under the status quo? Never, or very rarely, if memory serves. So what will they do? The same thing humans do: look for the easiest way to solve the problem, or at least to give the impression that the problem is solved. And with that, the law is passed, the companies have made their changes, and everyone is protected.
Only it isn’t.
Allow me to highlight some technical difficulties that might become apparent when we’re thinking about solutions for the problems at hand.
The first problem that everyone should be able to understand is the sheer scale of our current online activities. Facebook is estimated to have 2.912 billion active users, who are estimated to send 20 billion messages per month. How would we scan all of those messages for offensive content? At this scale, it is obvious that some automated system would be needed, as human moderators would simply not be able to keep up with this flurry of messages. Not to mention that the people who do this for a living are exposed to trauma daily, and many of these workers end up developing PTSD from the working conditions and the content they have to moderate.
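To put that scale into perspective, here is a rough back-of-envelope calculation, assuming the 20-billion-messages-per-month figure above holds and the load is spread evenly (real traffic is spikier, so peak rates would be higher):

```python
# Back-of-envelope: how many messages per second would a scanner
# need to process for a single service sending ~20bn messages/month?
messages_per_month = 20_000_000_000
seconds_per_month = 30 * 24 * 60 * 60  # ~2.59 million seconds

messages_per_second = messages_per_month / seconds_per_month
print(f"{messages_per_second:,.0f} messages per second")  # ~7,716
```

And that is one platform; any check that adds even a few milliseconds per message has to run thousands of times per second, around the clock.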
And this is just one of the messaging apps that would fall under scrutiny; there are many more, so the total number of messages grows far beyond that. No amount of human moderators would be able to process and verify this many messages, sent out 24/7, 365 days a year.
So we have established that software is needed to help filter this flurry of messages. I believe it’s safe to assume that message text either will not be scanned or, if it is scanned, the general approach will be to look for keywords and flag the account. However, even this isn’t guaranteed to yield 100% accurate results, and percentage points matter when we are talking about billions of users. There are very real difficulties even in something as “simple” as natural language processing, where data that is too heterogeneous can narrow the scope of understanding, and languages with small sample sizes can create real issues for text recognition software. While I know that the legislation’s focus is the UK market, similar laws are being passed in other major economies, and this can be a real problem there as well.
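To make the keyword-flagging problem concrete, here is a deliberately naive filter sketch (the function name and keyword list are mine, purely for illustration). Even this trivial version produces false positives, because keywords match inside innocent words and carry no context:

```python
# A naive keyword filter, sketched to show why substring matching
# produces false positives at scale. The keyword list is illustrative.
FLAGGED_KEYWORDS = ["attack", "bomb"]

def flag_message(text: str) -> bool:
    """Return True if any flagged keyword appears anywhere in the text."""
    lower = text.lower()
    return any(keyword in lower for keyword in FLAGGED_KEYWORDS)

# "bomb" matches inside "bombast": an innocent message gets flagged.
print(flag_message("That speech was pure bombast"))  # True (false positive)
print(flag_message("See you at lunch"))              # False
```

Multiply a tiny false-positive rate by billions of messages and you get a mountain of wrongly flagged accounts; real NLP systems do better than substring matching, but none of them reach 100%.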
But let’s assume that we can mangle software into shape to recognize speech that shows malicious, obscene or harmful intent (talk about a loaded statement). What about images? Those should be clear cut, and we shouldn’t have any trouble recognizing CSAM or pornographic content. Surely this is a slam dunk. The reality is that even a trillion-dollar company such as Apple could not get a solid system off the ground, and the backlash was massive, with an open letter highlighting the very real dangers that such software poses.
While we humans are very good at recognizing what a picture represents, algorithms aren’t there yet. To simplify the explanation: each image is made up of tiny boxes containing colour information (pixels). Each of those is broken down, and algorithms then look for patterns in those arrangements of coloured boxes. Given a large enough sample, we can flag certain patterns as more relevant than others and tune an algorithm to favour finding those. The problem is that because the analysis is done at such a low level (the tiny pixels making up the image), a modification of the pixel structure of an image can easily fool image-scanning tools while keeping the content of the image intact for humans. This means that perpetrators could easily alter their existing collections while preserving the content and still share it. On the other hand, regular users can easily be flagged due to false positives.
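The brittleness is easiest to see with a toy example. Real perceptual hashes (aHash, pHash, PhotoDNA) are far more sophisticated and robust, but they share the basic principle sketched below: derive bits from low-level pixel statistics. Everything here (the function, the 4×4 “image”) is mine, for illustration only:

```python
# A toy "average hash" on a 4x4 grayscale image: one bit per pixel,
# set to 1 if that pixel is brighter than the image's mean brightness.
def average_hash(pixels):
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)

original = [
    [200, 200,  10,  10],
    [200, 200,  10,  10],
    [ 10,  10, 200, 200],
    [ 10,  10, 200, 200],
]

# Nudge a single pixel (one pixel out of millions in a real photo);
# the picture still "reads" the same to a human.
altered = [row[:] for row in original]
altered[0][0] = 90

h1, h2 = average_hash(original), average_hash(altered)
distance = sum(a != b for a, b in zip(h1, h2))
print(distance)  # 1 bit flipped by a single-pixel change
```

An adversary doesn’t nudge one pixel at random; they optimise many such tiny changes until the hash no longer matches the database, while the image looks identical to a human.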
In 2020 we shared 3.2 billion images daily. Even with an error rate of just 0.001%, that is 32,000 misclassified images per day that would need manual review. This number only grows, and different companies will have different resources to manage it, which in turn means that you might find yourself under investigation, have your account locked, or face a host of other issues while one of those false positives is cleared up.
Up to this point, I hope I have highlighted some of the issues that such legislation would need to take into account (but doesn’t). We have a lot of people sending a lot of messages, and we will need to rely on software that may or may not know the difference between animals and people. But assuming it manages to distinguish species and can pinpoint objectionable material, how will this work in terms of privacy?
The first aspect is that any system designed to spot one pattern can generally be retrained to spot a different one. The recognition tech could be swapped out to look for something else entirely: to stop images being shared from conflict zones, say, or to look for particular images shared by groups of people in order to single them out for persecution, arrest or worse.
Then there is, of course, the big issue of encryption and how content scanning isn’t compatible with strong encryption. Any process in which someone can access content outside of the encryption defeats its purpose. The argument that a key can be controlled only by the “good guys” is completely false in today’s world. If keys don’t leak through an accident, we could see outright sharing of those keys between companies and governments, and then we’re in a world where anyone can decide what you can say and where you can say it. The risk of dissidents being caught and murdered after a backdoor makes its way onto a device is very real. So any system that is supposed to scan content will invariably defeat the purpose of strong encryption. You can have the most amazing lock on your front door, but it won’t help if you get robbed right in front of it.
Supporters of similar legislation will claim that this is done in the name of security and that privacy will be protected, but that is not how encryption works. A messaging service is either encrypted end to end, meaning that the content of the message and the recipient/sender are all protected, or it isn’t. Scanning content, even locally, defeats the purpose of end-to-end encryption and turns it into “encrypted here, somewhere in the middle, a bit”. Even former law enforcement professionals have commented on the need for strong encryption.
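The structural problem can be sketched in a few lines. This is a toy model, NOT real cryptography (the keystream construction and every name here are mine, for illustration): the point is simply that wherever the scanning hook lives, it must see the plaintext, so whoever controls the hook reads every message, regardless of how strong the cipher is.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream via repeated SHA-256. Illustrative only, not secure."""
    out, block = b"", key
    while len(out) < length:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:length]

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    return bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))

decrypt = encrypt  # XOR with the same keystream inverts itself

def send_with_scanning(key: bytes, message: bytes, scanner) -> bytes:
    # The scanner hook runs BEFORE encryption: whoever controls it
    # reads every message in the clear, even though only ciphertext
    # ever leaves the device. "End to end" no longer means private.
    scanner(message)
    return encrypt(key, message)

seen_by_scanner = []
ciphertext = send_with_scanning(b"shared-key", b"a private note",
                                seen_by_scanner.append)
assert decrypt(b"shared-key", ciphertext) == b"a private note"
assert seen_by_scanner == [b"a private note"]  # a third party saw the plaintext
```

Moving the hook server-side changes nothing: it just means the server needs a key, which is exactly the backdoor scenario above.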
Well, I hope I have covered the technology aspect (the surface of it, at least), but what about law enforcement? They might rejoice at the thought that all those messages are now scanned and criminals far and wide are out in the open. The problem, however, is that we’re already living in times in which law enforcement is asking tech companies for an increasing amount of data. The quantity of these requests for information is rising rapidly, and adding a new system for scanning billions of images will flood them with even more content.
Police could very well buckle under the pressure of handling these massive volumes of data, which can easily detract from work on real cases, the kind that exposes global child abuse rings. These investigations are essential to keeping us all safe, but they take a long time and involve a large number of agencies working together, collecting evidence and finding connections. Most of these groups use the regular internet as well as the dark web. An open shift to scanning devices would mean these groups dig themselves deeper, greatly complicating investigations and information gathering. Which brings me to…
The consequence of such a massive change is that all of our conversations are now compromised. You can never really speak to anyone without someone listening in, directly or indirectly. You’re no longer free to explore difficult or complex ideas, because the risk of being investigated becomes very real. As we saw, the system can easily be repurposed. You can’t share intimate pictures. I mean, you can, but what happens when one gets flagged? How would you go about having that image removed from the company’s archives or from law enforcement records? How would you prove it’s indeed a picture sent between two consenting adults?
What happens when criminal groups migrate deeper and start using custom software delivered by people who aren’t interested in legislation and just want to make a quick dollar? While this legislation will put pressure on large companies to change how they treat and analyse messages, it doesn’t mean that software development stops in its tracks. Anyone can write software, and anyone can build an app given enough time. It is not a leap to assume that these groups could hire software developers or organisations to create tools that enforce end-to-end encryption and can be self-hosted, or even run without servers at all.
I’m also convinced that such a move would make trading in explicit content even more profitable for the groups involved in human trafficking and misery. I come from a country where trading in misery was the name of the game after the fall of Communism. These are patterns that repeat themselves, and it will always be the most ruthless and vicious who gain the most, while those who have nothing to hide lose the little bit of privacy they had left.
What is the purpose of this bill, in the end, or of bills like it? I have discussed all the nefarious outcomes: the end of encryption, mass surveillance and new tools of oppression. But what did the legislators have in mind? Do they honestly believe this will stop crime in its path? Do they assume that tech giants will easily solve these problems over the summer, and that come Christmas we’ll all be safe and sound?
Or is it meant to assign accountability and hold these companies responsible for the content shared on their platforms? If that’s the case, then there are better ways to go about it that don’t require monitoring the devices of billions of users. There are better ways to break up the monopolies that control the way we communicate, share thoughts and ideas, and keep in touch with loved ones. But all of them require more work, care and attention than what they are prepared to provide.
Children shouldn’t be taught that the Internet is a dark and dangerous place, and we shouldn’t be afraid of constant monitoring. We’re going through a period in which social cooling stifles what little creativity, originality and rebelliousness we have in ourselves. We shouldn’t be afraid to speak our minds; we shouldn’t be afraid to point out when oppressive regimes use Western technology to bomb detention centres. We shouldn’t be afraid to highlight human rights violations, no matter who commits them.
We need to review the way we use and interact with technology. We need smaller online communities that govern and police themselves. We need to teach children and adults (!) how to safely use technology, how to protect themselves and how to protect their details to prevent further pain and suffering. This bill does not give anyone the means to protect themselves or others, it just makes everything unsafe for everyone.
So how will this be enforced, what will it do, and what will the first lawsuit under this new bill look like? I guess time will tell. But if you believe, as I do, that the bill doesn’t take into account the complicated nature of modern technology, and that we need to show care when crafting legislation that affects the lives of so many people, then please find and contact your MP, speak out against the current structure of the bill, and help ensure that our right to privacy is maintained. We don’t have much else.