September 2018

Custom database development, data mining and using data as factual evidence

In November 2015 Wirebox were contacted by an international legal firm with over 1200 lawyers, specialising in multiple practice areas including Intellectual Property and Banking & Finance, and in sectors such as Technology and Communications and Retail.

The legal firm had been instructed by one of their international clients in the Communications sector in a legal case involving Intellectual Property, specifically the valuation of the client's patent portfolio.

There are literally millions of patents successfully registered within the sector. Many can be classified differently, under different owners within the same company, at different times, and all for different reasons and benefits. The value of each patent also varies for multiple reasons, such as the age of the patent and its value to the different companies using the technology.

Our client’s client had a dispute over the value of their Intellectual Property: they held live patents, and the patented technology was being used in millions of different devices. This dispute had become a legal matter. The technology required to analyse and present the value of such patents was to be used in a court case, so there was no room for error.

Wirebox built a database system that applied number-crunching data science analysis to publicly available data, namely 90 million patents, cross-checking and linking to multiple external databases to provide patent portfolio valuations for all the different mobile manufacturers at different dates in history. After building the database, Wirebox were asked to provide expert legal representation in the case. Evidence had been drawn from a pre-existing commercial database that supposedly provided simple functionality. Wirebox were asked to review that database and provide evidence to discredit it, based on the more recent and correct technology in the database developed by Wirebox.

Ten-figure sums are awarded in such cases in global communications, and the development work from Wirebox was instrumental in the success of the case.


This bespoke database was estimated to hold billions of records: there were over 90 million patents, and each patent would have at least 500 records. It was quickly noted that the quality of the data was paramount from the outset. Information and data for the database were to be linked from external sources, which were somewhat outside of our control and could be amended at any time. Patents get bought and sold, so owners' details change, and there are smaller changes too, such as the legal team representing a patent.
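The scale follows directly from the two figures quoted above. As a rough back-of-the-envelope check (the function name is ours, purely for illustration):

```python
# Lower-bound sizing from the figures quoted in the post.
PATENTS = 90_000_000        # "over 90 million patents"
RECORDS_PER_PATENT = 500    # "at least 500 records" per patent


def estimate_min_records(patents: int, records_per_patent: int) -> int:
    """Return the minimum number of records the database must hold."""
    return patents * records_per_patent


total = estimate_min_records(PATENTS, RECORDS_PER_PATENT)
print(f"{total:,} records (about {total / 1e9:.0f} billion)")
```

Both inputs are lower bounds, so the real record count would be at least this large.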

Data cannot be imported into the database without first being cleansed, which means understanding how it should be cleansed and linked within the system. Specific routines were written for this, along with documentation, to ensure the process worked smoothly and could always be referenced and understood.
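To give a flavour of what a cleansing routine can look like, here is a minimal sketch. It is not the routine Wirebox wrote: the field names (`owner`, `patent_number`), the suffix list and the normalisation rules are all assumptions chosen for illustration.

```python
import re
import unicodedata

# Hypothetical list of legal suffixes to strip; a real list would be longer.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "corp", "co", "gmbh", "plc"}


def normalise_owner(raw: str) -> str:
    """Reduce an owner name to a comparable key (illustrative rules only)."""
    # Fold accents and case so differently typed names compare equal.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.casefold()
    # Replace punctuation with spaces and trim the ends.
    text = re.sub(r"[^a-z0-9]+", " ", text).strip()
    words = text.split()
    # Drop trailing legal suffixes such as "ltd" or "inc".
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def cleanse(records):
    """Split incoming rows into clean rows and rejects that need review."""
    clean, rejected = [], []
    for row in records:
        owner = normalise_owner(row.get("owner", ""))
        number = row.get("patent_number", "").strip().upper()
        if owner and number:
            clean.append({"patent_number": number, "owner_key": owner})
        else:
            rejected.append(row)  # kept aside rather than silently dropped
    return clean, rejected
```

Keeping rejected rows separate, rather than discarding them, means every import decision can be traced later, which matters when the output is used as evidence.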

The system, which was to be used in a court case, obviously had to be highly reliable and allow users to gain information about specific intellectual property quickly and efficiently. We’ve seen users work with Excel for large databases before, and it can be done, but a report that our database could produce in minutes would take weeks in Excel, and the volume of reporting would be minimal. An Excel user might get one scenario as a result, whereas the database we provided generated multiple scenarios.

As part of the development, the data processing and pre-processing were designed to handle different types of data, which meant inventing new ways to calculate figures. Figures were recalculated whenever a specific patent interacted with or mentioned another patent: what value does each patent have, and how is it referenced?
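The idea of recalculating a figure when one patent mentions another can be sketched with a toy model. The scoring formula here (a base score plus a fixed bonus per incoming citation) is an assumption for illustration only and is not the calculation used in the case.

```python
from collections import defaultdict


class CitationIndex:
    """Toy model: score = base score * (1 + bonus * incoming citations)."""

    def __init__(self, citation_bonus: float = 0.1):
        self.citation_bonus = citation_bonus  # hypothetical weighting
        self.base = {}                         # patent id -> base score
        self.cited_by = defaultdict(set)       # patent id -> citing patent ids
        self.score = {}                        # cached scores

    def add_patent(self, pid: str, base_score: float) -> None:
        self.base[pid] = base_score
        self._recalculate(pid)

    def add_citation(self, citing: str, cited: str) -> None:
        """Record that `citing` mentions `cited`; refresh only that score."""
        self.cited_by[cited].add(citing)
        if cited in self.base:
            self._recalculate(cited)

    def _recalculate(self, pid: str) -> None:
        count = len(self.cited_by[pid])
        self.score[pid] = self.base[pid] * (1 + self.citation_bonus * count)


index = CitationIndex(citation_bonus=0.5)
index.add_patent("A", 10.0)
index.add_patent("B", 4.0)
index.add_citation("B", "A")  # B mentions A, so only A is recalculated
```

Recalculating only the affected patent, instead of the whole data set, is what keeps this kind of update practical at a scale of billions of records.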

MySQL, an Open Source relational database management system, was used on the server because of its fast access, and it was the perfect backbone for the structure of the database. Implementing the rules, which differ between territories and countries, was done in Python. Python also allowed us to create graphical outputs and relationship graphs. Everything but the raw number crunching was completed in Python.
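One way to keep territory-specific rules in Python is a small registry that maps each territory to a rule function. This is only a sketch: the territory codes `AA` and `BB`, the rules themselves and the field names are made up, and the rows stand in for results that would come from the database.

```python
from datetime import date

RULES = {}  # territory code -> function deciding whether a patent counts


def rule(territory):
    """Register a rule function for a territory code."""
    def register(fn):
        RULES[territory] = fn
        return fn
    return register


@rule("AA")  # made-up territory: a patent counts once it is granted
def counts_when_granted(patent, as_of):
    return patent["granted"] is not None and patent["granted"] <= as_of


@rule("BB")  # made-up territory: a patent counts from its filing date
def counts_when_filed(patent, as_of):
    return patent["filed"] <= as_of


def portfolio_size(patents, owner, as_of):
    """Count an owner's patents that qualify on a given date."""
    return sum(
        1
        for p in patents
        if p["owner"] == owner and RULES[p["territory"]](p, as_of)
    )


# Illustrative rows with fictitious owners.
patents = [
    {"owner": "MakerOne", "territory": "AA",
     "filed": date(2010, 1, 1), "granted": date(2012, 6, 1)},
    {"owner": "MakerOne", "territory": "BB",
     "filed": date(2011, 3, 1), "granted": None},
    {"owner": "MakerTwo", "territory": "AA",
     "filed": date(2009, 1, 1), "granted": date(2010, 1, 1)},
]
print(portfolio_size(patents, "MakerOne", date(2011, 12, 31)))
print(portfolio_size(patents, "MakerOne", date(2013, 1, 1)))
```

Passing the date in as a parameter is what allows the same portfolio to be evaluated at different points in history.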

By admin September 2018
