Will Big Data kill Intellectual Property Rights?

  • With the explosion of data, IPRs are under threat.
  • IPRs aim to create artificial scarcity, but data under IPR protection is growing far more slowly than non-protected (“open”) data.
  • Because of the way they are created and enforced, IPRs cannot keep pace with the growth of Big Data, and will therefore probably drown in a sea of data.

I’ve recently learned what a Zettabyte is. As Wikipedia explains, it’s a unit of information (data), equal to 1,000,000,000,000,000,000,000 (10²¹) bytes.

EMC has calculated that in 2011, humanity and its machines created 1.8 zettabytes of information. That’s more data than was created in the whole of human history up to and including 2009 (roughly 150,000 years), and more than three times what the entire Internet was estimated to contain in 2009.

So, we are drowning in the data we create. Massive amounts of it. And every year, the amount of data created grows. By 2020, the amount we create will have gone up by a factor of 50, and we’ll have to learn what a Yottabyte is (remember, you read it here first – Wikipedia does not even mention a name for 1,000 Yottabytes). This is a huge challenge for businesses – where will they find the people and the tools to deal with all those data, and how should they behave?
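
For the curious, here’s a back-of-the-envelope sketch (in Python) of what those units and that factor of 50 amount to. The 1.8-zettabyte figure and the 50× factor are the ones cited above; the rest is plain arithmetic on SI units.

```python
# Putting the units above into numbers. The 2011 figure (1.8 ZB) and the
# 50x growth factor are taken from the article; the unit definitions are SI.

ZETTABYTE = 10 ** 21  # bytes
YOTTABYTE = 10 ** 24  # bytes; 1,000 zettabytes

data_2011 = 1.8 * ZETTABYTE   # EMC's estimate of data created in 2011
data_2020 = 50 * data_2011    # the projected 50x increase by 2020

print(f"2011: {data_2011 / ZETTABYTE:.1f} ZB")
print(f"2020: {data_2020 / ZETTABYTE:.0f} ZB, i.e. {data_2020 / YOTTABYTE:.2f} YB")
# 2020: 90 ZB, i.e. 0.09 YB -- knocking on the yottabyte's door
```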

So how will all those data impact Intellectual Property Rights (IPRs)?

IPRs create an artificial scarcity. They give their holders the right to prevent other people from manufacturing or distributing certain products or services.

Classic IP theory states that the increased pricing resulting from this artificial scarcity is necessary in order to promote innovation and creativity, because the free market, with its inherent right to copy, would not allow for sufficient reward to obtain the required level of innovation and creativity.

However, the scarcity created by IP rights is a scarcity created by a legal instrument (the IP right). And in order for such an IP right to work, that legal instrument needs to be enforceable in a cost-efficient manner.

One of the things that the Napster and Pirate Bay stories teach us is that enforcement of IP rights does not work very well down the value chain. Put simply, it’s not cost-efficient to sue your customers to enforce an IP right, especially if you have a lot of them; it’s only cost-efficient to sue your competitors, and then only if there aren’t too many of those.
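
To make that asymmetry concrete, here’s a toy calculation. All the numbers are hypothetical, picked only to illustrate why roughly fixed per-case litigation costs make it rational to sue a few competitors but not millions of customers.

```python
# Toy model of IP enforcement economics. Every figure below is hypothetical.

def net_recovery(defendants: int, recovery_per_case: float, cost_per_case: float) -> float:
    """Net result of enforcing an IP right through litigation."""
    return defendants * (recovery_per_case - cost_per_case)

# A handful of competitors: large damages per case, so litigation pays off.
print(net_recovery(defendants=3, recovery_per_case=2_000_000, cost_per_case=500_000))
# 4500000

# Millions of customers: tiny recovery per case, fixed legal costs dominate.
print(net_recovery(defendants=1_000_000, recovery_per_case=50, cost_per_case=500))
# -450000000
```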

So what happens with business models that are based on charging for legally created artificial scarcity, in a world that generates several Zettabytes of data per year?

Will IPRs spread, and cover more and more of those data, or will IPRs become like little atolls in an ever rising ocean of data?

Let’s look at some practical issues that arise out of Big Data and affect IPRs.

First, there’s the issue with patents, patent quality and prior art.

Between 1985 and 2010, the number of patents granted worldwide rose from slightly less than 400,000 to more than 900,000.

That’s an increase of more than 125% over one generation (25 years).

Data grows that much in about two years.
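
A quick sanity check on those two growth rates, in Python. The patent figures are the ones just cited; for data, I assume the roughly 50% annual growth implied by a 50× increase over the nine years from 2011 to 2020.

```python
import math

# Patents: 400,000 -> 900,000 grants over 25 years (figures cited above).
patent_growth = (900_000 / 400_000) ** (1 / 25) - 1
print(f"patents: {patent_growth:.1%} per year")    # ~3.3% per year

# Data: a 50x increase over ~9 years (2011-2020) implies ~54.5% per year.
data_growth = 50 ** (1 / 9) - 1
print(f"data:    {data_growth:.1%} per year")      # ~54.5% per year

# At that rate, data grows 125% (a factor of 2.25) in...
years = math.log(2.25) / math.log(1 + data_growth)
print(f"a 125% increase takes {years:.1f} years")  # ~1.9 years
```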

So let’s be very clear: even if the USPTO, EPO, JPO, SIPO and tutti quanti were to increase their budgets significantly and start hiring every other engineer, it would not be enough. There is no way the combined patent offices of this world can avoid becoming, in terms of the amount of information they process, quite insignificant.

But that is not their only problem.

Not only is it impossible for them to keep patenting significant, it is also impossible for them to keep patenting useful.

As we have seen repeatedly over the last 15 years (effectively since the number of patents, particularly in the US, started to grow much faster than before), the quality of patents has gone down.

Some recent court cases have brought additional evidence in this respect – even in patent trigger-happy Texas, the land of the patent troll, courts are rejecting patents that plainly lack novelty or are obvious.

This is important, because you can only get a patent on an invention that is novel, and that is non-obvious to someone who knows the technology.

To me, it is clear that patent offices will be increasingly unable to verify whether something is novel and non-obvious, because the amount of information to check is simply too large.

In patent terms: the chance that there is prior art is likely to grow to above 99%.
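
That figure is of course a guess, but a toy probability model shows why it is plausible. Assume, purely for illustration, that each existing document independently anticipates a given claim with some tiny probability p; then the chance that at least one piece of prior art exists is 1 − (1 − p)^N, which races toward certainty as the document count N explodes.

```python
# Toy prior-art model: p_single is a purely hypothetical per-document
# probability; the point is what happens as the number of documents N grows.

def p_prior_art_exists(n_documents: int, p_single: float = 1e-9) -> float:
    return 1 - (1 - p_single) ** n_documents

for n in (10**6, 10**9, 10**10):
    print(f"N = {n:.0e}: P(prior art exists) = {p_prior_art_exists(n):.4f}")
# N = 1e+06: P(prior art exists) = 0.0010
# N = 1e+09: P(prior art exists) = 0.6321
# N = 1e+10: P(prior art exists) = 1.0000
```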

As a result, it seems unavoidable that the quality of patents will continue to decline. In the end, this will start to seriously affect society’s tolerance for the enforcement of patents. If they protect “inventions” that are not novel or that are pretty obvious, patents will become even more of a roadblock to innovation than they already are, and society will start to refuse the draconian remedies and powers currently granted to patent holders.

So, yes, Big Data stands a good chance of killing patents.

Second, there’s copyright.

Copyright has the advantage of arising without registration, so the capacity bottleneck of a registration office does not, in theory, restrict its application.

But let’s have a look at what happened with MegaUpload recently.

The file-sharing website was shut down because it was also (allegedly mainly) used for “piracy” – selling content to which the sellers had no rights or licenses.

But immediately after the FBI shut the site down, a cry arose from all those people who had used MegaUpload in the same way you use Dropbox or another file-sharing site. Some of them are suing the FBI, and class-action lawyers are happy to oblige.

In effect, MegaUpload was part of the Cloud. Remember the Cloud? It’s the future of technology and IT.

With the huge amounts of data being generated at the moment, the percentage of those data that is covered by copyright AND that a distributor wants to control (think of music distributors or Hollywood) will shrink every year vis-à-vis the amount of data that is either outside copyright, or for which no-one is interested in harvesting a royalty or enforcing a distribution monopoly. Or does anyone seriously think that all that User Generated Content will be policed on the basis of copyright?

The moment sites that also offer peer-to-peer sharing become too important because of their Cloud function, the possibility of shutting them down because they also commit or enable piracy will disappear.

It’s economics, really. If the value of unprotected data is significantly larger than the value of protected data, the protection becomes largely unenforceable.

So yes, Big Data will probably kill IPRs. And there’s not much patent holders or content distributors can do about it – their scarcity is simply drowning.

Of course, they can try legislation like SOPA or PIPA – but I don’t think it’s a coincidence that this legislation was killed by a clash between two conflicting industries – and it’s clear which one is growing faster. It’s the one that won this time, and it will most likely continue to win.

Finally, a short word on trade secrets. While technically not IPRs (although some might contest that), it is hard to see how trade secrets can remain relevant when, in that ocean of data, businesses are making more money by disclosing and sharing data than by keeping them secret. When 80% of innovation is open innovation (and growing), the idea that it is possible to classify all those data and make sure that the confidential ones are not disclosed is rather fanciful.

Trademarks will stay, and probably increase their value – but that’s another story for another day.


