Data Science Workshops for Legal Professionals


We provide Basic to Advanced Interactive Workshops in:
• Data Analytics
• Machine Learning Algorithms & Artificial Intelligence
• What Data Science is and what its impact is & will be on Legal Services
• How to adapt your practice to an increasingly Data-Driven world
• How to use Data Science to boost Productivity, Efficiency, Speed and Cut Costs

Our Workshops include but are not limited to:
• Interactive demonstrations using in-house Company Data
• Professionals participating in analyzing their own Company Data to unearth insights
• Learning how to build Statistical Models that Predict Outcomes and Classify eventualities
• Core Data Science Techniques being taught to participants, varying from Basic to Advanced levels
• Learning how to apply these Data Science techniques to enhance everyday Legal Processes

Data Transmutation vs. Data Analytics: Round 1


I became a Data Science practitioner in probably the most peculiar, counter-intuitive way possible. Before being confronted by the perennial Python-or-R dilemma, before being exposed to Data Analytics platforms, Math and Statistical theory, I concerned myself primarily with what I can best term “Data Transmutation”. I intuitively knew that before I could do Data Analytics, I had to somehow transmute the Data. I would primarily be mining Legal Data; its unstructured nature and textual rigidity would compel me to develop a way of modifying it into a form that is Analytics-receptive. This was before I even knew what ETL was.

I soon began quantifying Legal permutations and expressing them as mathematically weighted numerals for the purposes of efficient Data Mining. This method is designed specifically for Machine Learning Algorithms that require numerical attributes and weights for their calculations. I realized, however, that the method went beyond mathematically quantifying Legal Data: it was in fact altering the Data completely, and (to my surprise) not just Legal Data. That is how Data Transmutation, as I understand it, was born.

Data Transmutation creates a synthesis of multiple events that are first encapsulated into a single expression, then transposed into a math function and finally transmuted into a coherent data point. It condenses Data into mathematically calculable numerals and symbolic expressions that weight the factual permutations of events and occurrences. The results are highly potent data points; it is like condensing lite beer into a liquid with an alcohol content of 100%. This distillation of large Data sets into highly concentrated but rational data points is very helpful.

Data Transmutation is a bit like gene extraction. All data tells a story (some stories more exciting than others) and every story has a core phenotypical structure and genome. Data Transmutation is a way of delineating the Data into “DNA” strands that represent the foundational archetype of the story the Data is telling. Transmutations mine the primary consequence of events and occurrences by isolating the systemic functions of data, thereby extracting only the salient truths.

Think of Diffusion in Biology, in which a substance spreads from a region of high concentration to one of low concentration as it occupies a larger space. Through Diffusion, a gas loses its potency and efficacy as it spreads, which is very similar to what happens during the collection and architecture of Data. When you Extract, Transform and Load Data, you are essentially taking a series of events and fragmenting them into scalable features for the purposes of Algorithmic enquiry. This fragmentation dilutes the Data: it widens factual parameters, increases variation and ultimately “diffuses” the efficacy of the story. When Data is transmuted into a condensed form, however, factual parameters are not unnecessarily expanded, the features become more salient, the density remains the same and variation is kept at a healthy level.

There are of course Sampling, Feature Optimisation, Feature Generation (combinational vector creation) and many other tools, all of which perform the function of distilling data into a state of optimum lucidity. However, there is a difference between Transmutation and Segmentation, which is essentially what the above tools amount to. They minimize and optimally abridge the Data to be analysed; they do not fundamentally mutate it.

Discretization methods are ubiquitous on all good Analytics platforms; that is probably the closest you can come to changing the aesthetic identity of data points without using Data Transmutation. You could certainly use Discretization to convert numerical attributes (where some entries are “0”) into binary attributes detailing “Yes” or “No”. This, however, cannot be done without compromising the structural and probative integrity of those data points.
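To make the Discretization point concrete, here is a minimal sketch (a hypothetical pandas DataFrame, not real client data) of converting a numeric attribute into a binary “Yes”/“No” attribute:

```python
import pandas as pd

# Hypothetical data: number of prior breaches recorded per contract.
df = pd.DataFrame({"prior_breaches": [0, 3, 0, 1, 7]})

# Discretize the numeric attribute into a binary "Yes"/"No" attribute:
# any non-zero entry becomes "Yes".
df["has_prior_breach"] = df["prior_breaches"].apply(lambda n: "No" if n == 0 else "Yes")

print(df)
```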

Nevertheless, the definitive feature of Data Transmutation is the ability to mathematically calculate the value of a transmuted data point without using Analytics to do so. Consider, for example, a classification model for Gold. One of the data points under the attribute “Pressure and Temperature Data” has been Transmuted from 27.0 GPa of Shear Modulus (the original data point) into a mathematically weighted expression of (P+) 4.833 (the Transmutation Value). Because the data point has been delineated into a math function, it is possible to calculate the mathematically representative value of Shear Modulus, in short-hand form, without using any code or software: with pen and pad. Think of Machine Learning Algorithms that can produce formulas for their results or, at a very abstract level, even MapReduce; Data Transmutation works in a similar way.
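Our actual transmutation formulas are proprietary, so the sketch below only illustrates the property being described: a transformation defined as an explicit math function, which means the transmuted value can be computed, and reversed, with pen and paper. The logarithmic form, the scaling constant and the “(P+)” prefix here are invented for illustration and are not the real scheme.

```python
import math

def transmute(value: float, scale: float = 1.5) -> str:
    """Hypothetical transmutation: map a raw measurement to a signed,
    weighted short-hand expression via a simple logarithmic function."""
    weight = scale * math.log(value)          # explicit, hand-calculable function
    sign = "P+" if weight >= 0 else "N-"
    return f"({sign}) {abs(weight):.3f}"

def invert(expression: str, scale: float = 1.5) -> float:
    """Recover the original measurement from the transmuted expression."""
    sign, weight = expression.split()
    w = float(weight) * (1 if sign == "(P+)" else -1)
    return math.exp(w / scale)

print(transmute(27.0))           # "(P+) 4.944" under these made-up constants
print(invert(transmute(27.0)))   # approximately 27.0
```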

I am not saying that this method of altering data is a divine panacea; just like a Machine Learning Algorithm, there are conditions and parameters it must satisfy to perform optimally. All I am saying is that there is a way of changing data to facilitate a more advanced method of Machine Learning. Transmutation will inevitably lengthen the already protracted ETL process, but the rewards are bountiful. As Data Science practitioners we should let go of the fear of “corrupting” data. Change it as you see fit and you may be pleasantly surprised by the results.

Math: My Data Science Stimulus Package and its Guerrilla Analytics

Sometimes I don’t trust Data Science, probably because my duty of care is more pronounced on account of working mostly in Legal Analytics. You see, as an Analytics Practitioner in the Legal field my Data Science methodology cannot afford to yield wild guesses; these are people’s lives I’m dealing with. You have to be very careful with Legal Automation: if you build a Classification model for a State Prosecutor and you miss something, even a very small thing, the results will be cataclysmic. For Analytics practitioners in Law it’s not just about finding subtle or abstract insights that boost efficiency; you are venturing into the innermost sanctum of human life and its consequences. This is not a whimsical Analytics project of the kind some companies venture into because of all the hype around Data Analytics. You know the type: not really knowing what they want out of Analytics, but hoping they’ll know it when they find it in that elusive golden nugget their vast data holds.

 

At one of the major Banks, the minimum desired accuracy for a Classification Model is 85%. That is the standard for them, and for many Data Scientists as well. That level of accuracy is terrific in every other field except Law. In my line of work it would be a mistake to take a gamble on a model with a 15% margin for error; its probative value is simply inadequate.
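To make that objection concrete, a quick back-of-the-envelope calculation (the caseload figure is purely illustrative):

```python
# Illustrative only: what a 15% error margin means at scale.
accuracy = 0.85
caseload = 1000          # hypothetical number of matters classified per year
expected_errors = (1 - accuracy) * caseload
print(expected_errors)   # roughly 150 matters potentially decided on a wrong classification
```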

 

The statistical volatility of Data Science sometimes requires other instruments to supplement an Analytics process; for me that supplement is Math. Mathematics is a necessity in our Analytics practice rather than a peripheral, elective tool, which is unheard of for Lawyers, but it’s true. I simply cannot rely on Computational Algorithms alone; if I did, it would diminish the veracity of the results of my Analytics projects. This led me to develop a series of fairly elaborate Equations and Formulas that we solve before and after the Analytics process. These math functions have enabled a breed of “Guerrilla Analytics” that has become a staple for us. They are applied to a client’s data, and the results of those calculations are the values that make up a typical Data set for us. So while most Analytics practitioners will clean and architect structured and unstructured data and then model it while keeping the data values mostly as is, we employ a different approach. By the time we are done with our ETL process, the Data will be unrecognisable; the Equations we use transmute the Data into a form that only our math functions will recognize and can rationalize. For example, if we take a Data set of quarterly revenue and a particular entry is $40000, after our math calculations it will no longer be $40000 but something like “(P+) 7.33”, and that is not meant to denote its “weight” in Data Science terms either. If it is raw Legal Data, a particular averment in one of our clients’ Pleadings could be illustrated as “(N-) 0.333”, an answer our formulas arrived at. This is a painstaking but worthwhile process, and the calculations will often be done by hand (I’m old school like that). Other times they will take the form of an equation in a Matrix, again by hand, and be transformed into a computation thereafter.

 

One area that has benefited remarkably from the math equations is our Trial Simulations. Simulating a Legal Trial using Algorithms is an enormously difficult and complex task, one that you simply cannot embark on competently using traditional Data Science tools alone. Postponements, the introduction of new Evidence, the uncovering of new facts and cross-examinations are all factors that can single-handedly derail any Analytic Model on any Data Science platform you can think of. Surprises like these are far beyond any Parameter adjustment or Machine Boosting technique. This is especially the case when practicing real-time Litigation Analytics in an actual Trial. If something happens unexpectedly, you need a short-hand technique to quantify those sorts of permutations right then and there: a way of summarily deploying Analytics in the quickest way possible. Unfortunately, in a situation like this there is no time for an ETL process, Data Cleansing or Architecture; this is Guerrilla Analytics, and the math functions we’ve developed make it happen.

 

Now I know that I face imminent attack by Data Science purists when I say that sometimes I don’t trust Data Science on its own, but Machine Learners have their own peculiar biases and dispositions, some of which can jeopardize Legal Analytics. I would never use a Support Vector Machine alone to build a Predictive Model for the Attorney General of a country, one which informs the decision to prosecute a citizen or not. In an instance such as this, it would be absolutely criminal (maybe even literally) to pursue Data Science recklessly or without some sort of supplementary tool.

 

Data Science forms the very substratum of an Analytics Practitioner’s work; it’s what sets us apart from Statisticians or Mathematicians. In some instances, however, we cannot rely on it alone; we need to employ other measures to increase its definitiveness. In any event, I am sure many Data Scientists use math and other means to augment the potency of their Analytics, some not even scientific at all. It is undeniably prudent to do so where necessary, especially in fields that demand a higher standard of accuracy and care.

Data Science: Uber for Law.

In my final year of Law School I did a Practical Legal Studies course. It is compulsory at the University I attended, and students are required to pass it in order to graduate. The course entails working as a trainee Attorney under the supervision of an experienced Attorney at the University’s Law Clinic, which provides free legal services for indigent people, the poorest of the poor. This particular Law Clinic is the oldest and largest in the country, and I had the privilege of being trained by some of the country’s best and most experienced Lawyers. That said, the year I spent there made me abhor conventional legal practice and all its instruments.

Students are broken up into pairs. My partner and I were initially given five cases, some of which had already been there for years; we inherited them from previous students and their supervising attorneys. Ultimately my partner and I would bequeath some of our very own cases to the students coming to work there the next year, and the cycle would repeat itself. As the year progressed we took on a few new cases of our own, and by the second semester we had nine in total. I loved consulting with clients; my partner and I enjoyed coming up with solutions that made their faces light up. When that was said and done, however, we became what most Attorneys are: “glorified scribes”, “paper pushers”… the part that the producers of the best Legal Dramas leave out. The time-consuming, exhaustive and iterative tasks were stifling creativity and innovation. The endless drafting and insurmountable case-load was haemorrhaging meaningful productivity and mental sharpness, but most importantly it was eroding the quality of legal services to our clients.

This narrative of cumbersome legal practice is not peculiar to Law Clinics; it is ubiquitous in all spheres of legal practice, in both the public and private sectors. Case backlogs, understaffed departments with sometimes tight budgets, and overworked and sometimes underpaid Lawyers are increasingly alienating technologically adept clients.

The inherent inefficiencies of law will result in the proliferation of Data Science in Law in the form of the Algorithmic Lawyer. It will happen just as the inefficiencies of the taxi industry ushered in the meteoric rise of Uber. Uber epitomizes disruptive innovation introduced to a field that grew too comfortable in its business model, much like the relative comfort that the field of law finds itself in now. The ease with which Uber uprooted the status quo in public transportation with a very simple but novel idea was ingenious; some might even say cruel. Conventional legal practice harbours the kind of technological ineptitude that makes it unsustainable, especially in a generation of people looking to technology to better their lives. Clients will find that the Algorithmic Lawyer formulates the kind of mechanized processes that are unimaginable in conventional legal practice. Eventually, legal processes that cannot assume this mechanized form will become increasingly archaic, out of touch and unattractive in the minds of clients. There is an innate quality in every consumer to demand the very best at the very best price, not below-par service at an extortionate fee and at a huge inconvenience; it is this quality that gave rise to Uber’s success. Society as a whole is yearning for a Legal Uber, something that will do for Legal services what Uber did for public transportation. People want Legal services that are convenient, affordable and, most importantly, at the forefront of cutting-edge technology.

I’m afraid the Legal Practitioner no longer has the luxury of scepticism, because whether technological advances will be adopted in law is unfortunately not up to lawyers; it is up to their clients. Clients are growing more and more restless and tired of their commercial interests and transactions operating solely within the limited confines of traditional law, even though innovative alternatives exist.

I don’t think lawyers quite appreciate the impending disruptiveness of legal innovation in the form of Data Analytics. So, in the briefest and simplest way, allow me to loosely outline its magnitude. The fact that an algorithmic lawyer can draft pleadings in seconds instead of hours is disruptive. The fact that he can predict the arguments of his opponent is disruptive. The fact that he can scientifically augment his chances of success and diminish yours is disruptive. The fact that he can practice in a field of law he knows nothing about and still be competitive is disruptive. The fact that he can practice in multiple, unrelated areas of law, in mere hours, for a myriad of clients is disruptive; his productivity is akin to that of a factory assembly line. The fact that he uses Algorithms that simulate trials to calculate hundreds of ways to either win a case or formulate a commercial fix is disruptive. Finally, the jugular: the fact that he spends less time on a matter means that he will charge his client much less, despite having magnified the scope and quality of ordinary legal services ten-fold. Advanced Pattern Recognition, Decision Science, Artificial Intelligence: these are concepts that are tragically foreign to traditional Lawyers. If you provide Analytics services to a very weak Lawyer (and there are many), he becomes a highly competent one; equally, if you provide Analytics services to a highly competent Lawyer, he will become the very best.

Allow me to quell any fears amongst lawyers of having to learn math or an entirely new scientific field. Being an Algorithmic Lawyer simply means that you make use of Data Analytics in your practice; you don’t “do” Data Analytics. You simply have to make use of the technology that enables you to practice algorithmic law by securing affordable Analytics services from a company that specializes in Legal Analytics, preferably an Analytics firm made up of other Lawyers. That way they have an intuitive understanding of the inner workings and intricacies of legal practice and can gently guide you into much-needed technological reform. Legal technology is providing simple, easy-to-use tools that allow any Lawyer to become an Algorithmic one; don’t allow the “Legal Uber” to make you an obsolete Lawyer.

Magnum Opus Dolus Eventualis


Someone on Twitter once remarked that law cannot be reduced to algorithm; he was commenting on my article “Alchemy and Algorithmic Lawyers”, and he said that it was “impossible”. As a means of substantiating his claim he added that he had more than twenty years’ experience as a scientist and programmer. Personally, I prefer not to make very grand, generalized claims of impossibility, simply because impossibility is a fallacy that people create as an opium to pacify our innate desire to be different. It is purely a means of making the status quo easier to ingest, thereby making it slightly more palatable. Fortunately he is wrong: Law can be reduced to algorithm; in fact the very substratum of law and its application is inherently algorithmic in any event. Laws and Algorithms are architected on identical paradigms, or at the very least very consubstantial ones. It is true that the way “traditional” law is architected is devoid of any computational or math-based processes. I put the emphasis on the word “traditional”, however, because as I have shared in previous blogs, I have succeeded in delineating law into a series of computations for the purposes of algorithm-based machine learning. “Big Law” (a derivative of Big Data) is also growing rapidly and is beginning to uncover and illustrate the algorithmic acuteness of law.

Anyway, enough waffling; let us get down to the Blade Runner and how his conviction of Culpable Homicide (or manslaughter) instead of Murder (Dolus Eventualis) bears on Data Science. To jog your memory, Oscar Pistorius is the athlete who shot and killed his girlfriend through a locked bathroom door on Valentine’s Day. He was convicted of culpable homicide and sentenced to five years in prison in October last year. He is expected to be released on 21 August and placed under house arrest as per a recommendation from the Department of Correctional Services. The state appealed the decision of the High Court and the Supreme Court of Appeal will hear the matter in November.

Dolus Eventualis (D.E) arises where the accused foresees the real possibility of death and acts recklessly towards that possibility, or reconciles himself with it. D.E speaks to the mental state of the accused at the time of the alleged criminal act (a pivotal aspect in its general interpretation and in Pistorius’ case). D.E can therefore be broken up into two elements: foreseeing a real possibility of death, and reconciling oneself to that possibility. I will not delve into an in-depth legal analysis of D.E with regard to the Pistorius case because such analyses have been done exhaustively.

This article serves the purpose of merely illustrating, at a very simple level, how we go about building Analytic Models for legal principles like Dolus Eventualis. It is also an attempt to quell any suspicions (among scientists and programmers with 20 years of experience) that advanced algorithmic law is impossible. It is purely an abstract account of how we build predictive and classification models for contentious legal rules and decisions, so that ultimately what you have is legal principles being validated and tested statistically by computational means. It serves the need for certainty and consistency within the operation of law.

There are areas of this article that Data Scientists will understand and Lawyers won’t; there are also technical areas of law that Data Scientists may not fully grasp. Where you do not understand, I would ask that you try to contextualize the basic premise of what you are reading within your respective field. Before reading further, I would also urge you to read my previous article, “Data Science: The numbers game law almost lost”, lest you express great shock at the ease with which I speak of math-based numeric values vis-à-vis case law and other legal data.

As stated before (ad nauseam), we have developed a math-based metric conversion system that converts raw legal data into coherent and stratified numerical values. Applying this conversion system is the first step in classifying a principle like D.E. Our system has a built-in ETL (extract, transform, load) methodology that is applied to all raw legal data relating to D.E. At the risk of revealing our trade secrets, I will briefly explain the ETL process. When ETL is applied to the case law dealing with D.E, the facts of all those cases are extracted and the values of the correlating outcomes are imputed onto those facts, so that fact “x” will have outcome “y”; facts can have multiple outcomes, and therefore multiple outcome values. The process is useful because one can segment all “y” outcomes and compute their correlation with all “x” facts, sometimes in totally different areas of law, which makes exploratory analytics in legal data much easier. That is just a rough summation of the ETL process our conversion system goes through. It is an iterative process and includes a number of Analytic models that use genetic algorithms.

When the conversion system is applied to case law it breaks up the facts, and the subsequent rulings thereof, into very small particles and factual permutations (fragmentation). The system then analyses and weights those particles and factual permutations according to the legal principle set in the subsequent ruling. For example, Humphreys vs S (2013) is a very important case with regard to D.E. In this case, all the actions of the minibus taxi driver leading up to and after the collision with the train will be fragmented into particles, each analysed algorithmically and attributed a calculable numerical value. For instance, the driver overtaking a line of stationary vehicles will be given a weight. Note that the weight is stratified according to its practical legal implication: the court found that the driver overtaking those vehicles was of “peripheral relevance”, therefore this permutation will be given a lesser value in relation to the other permutations. This process makes producing thresholds for the purposes of machine learning possible.
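The conversion system itself is proprietary, but the fragmentation-and-weighting step can be sketched in outline. The weights and fact labels below are invented; the only grounded detail is that a permutation the court treated as being of “peripheral relevance” receives a lesser weight than the decisive ones:

```python
# Hypothetical fragmentation of case facts into weighted "particles".
# Weights are illustrative, stratified by the legal relevance the court attached to each fact.
particles = {
    "overtaking a line of stationary vehicles": 0.10,  # court: "peripheral relevance" -> lesser weight
    "central fact particle B": 0.45,                   # placeholder for a fact the court treated as decisive
    "central fact particle C": 0.45,                   # placeholder
}

# Each particle carries a calculable value; their sum feeds into the "quantum" step described next.
case_value = sum(particles.values())
print(case_value)  # 1.0 under this made-up weighting
```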

The next step our conversion system goes through is calculating the sum of the numerical values of all the cases that have ever dealt with D.E (and their permutations). This sum serves as what we call a “quantum”. All the basic math functions the conversion system applies (addition, multiplication, subtraction, division) take place strictly within the parameters of this quantum. The calculations within this quantum are based on the premise of duality and of equal and opposite reactions. That is to say, when it subtracts, an opposite and equal addition is made; when it adds, an equal and opposite subtraction is made. The quantum is therefore never exceeded.
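A minimal sketch of that zero-sum bookkeeping, assuming a hypothetical two-party ledger; every deduction from one side is mirrored by an equal addition to the other, so the quantum is never exceeded:

```python
class Quantum:
    """Hypothetical zero-sum ledger: the total (the "quantum") stays constant."""

    def __init__(self, total: float):
        self.total = total
        self.sides = {"plaintiff": total / 2, "defendant": total / 2}

    def transfer(self, amount: float, from_side: str, to_side: str) -> None:
        # Every subtraction is paired with an equal and opposite addition.
        self.sides[from_side] -= amount
        self.sides[to_side] += amount
        assert abs(sum(self.sides.values()) - self.total) < 1e-9  # quantum never exceeded

q = Quantum(total=10.0)
q.transfer(1.5, "plaintiff", "defendant")
print(q.sides)  # {'plaintiff': 3.5, 'defendant': 6.5}
```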

This is done so that what we at Gotham Analytics call “Combative Analytics” can be applied. We use the term “combative” because in legal matters there is always a dispute in which two divergent interests compete against one another, either between a plaintiff and a defendant or between the state and the accused. That is precisely why the conversion system works on the premise of duality when producing the values. You cannot produce coherent values where an objective quantum is undermined or is not adequately maintained. The quantum can therefore best be described as the referee in the dispute or (at the risk of being blasphemous) a judge.

I find that fragmentation allows for the quantification and discovery of very subtle insights in legal data. For instance, one of the issues the court elucidates in the Humphreys case is the conflation of the test for D.E with that of aggravated or conscious negligence. This could be one of the points the Supreme Court of Appeal relies on in making its judgement; it would therefore be prudent to quantify it when building any Analytic model for D.E. Once the conversion is complete, the numeric values can be used for algorithm-based machine learning.

A myriad of machine learners are used for classification purposes, all taught D.E to varying degrees from legal data. Three models are built. The first classifies strictly numerical values, weights and thresholds (there is a difference); it includes Neural Networks, Regressions and a few Discriminative classifiers. The second model analyses more varied data, with both numeric and other attributes, some very surprising (for example dates, for the purposes of time series algorithms); it includes tree induction, correlation, segmentation, association rules and more machine learners. The third amalgamates the results of the first and second models into a single grand model. It includes simulations, rule induction algorithms, meta-learning schemes and many other Analytic functions. We create an ensemble of machines and sub-processes feeding data to each other back and forth, converging and deviating continually.
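The exact composition of the models is not something I will disclose, but a toy scikit-learn version of the three-tier idea (numeric learners, a tree-based learner, then a single combining model) might look like the following; the data and features are placeholders:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import StackingClassifier

# Placeholder data: rows are fragmented case "particles", columns are converted numeric values.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # stand-in label for "requirement satisfied"

# Tier 1: strictly numerical learners (a neural network and a regression-style classifier).
# Tier 2: a learner suited to more varied attributes (tree induction here).
# Tier 3: a single grand model that amalgamates the results of the others.
grand_model = StackingClassifier(
    estimators=[
        ("net", MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)),
        ("reg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
)
grand_model.fit(X, y)
print(grand_model.predict(X[:5]))
```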

Once the computations of the three models are complete, requirements, factors for consideration and other legal reasoning summations can be reduced to a series of calculable numeric thresholds. Exceeding or falling short of these thresholds indicates the degree to which a certain requirement in the court’s reasoning has been satisfied. This adds mathematical precision to judgments in a consistent, unbiased and scientifically objective manner. It is these thresholds that form the basis of many of our predictive and automated compliance models. Numeric thresholds for the D.E requirements of foreseeing a real possibility of death and reconciling oneself to that possibility can now be produced using the results of the machine learners. Building numeric thresholds for Mens Rea enquiries (where the law enquires into the cognitive state of the accused at the time of the commission of the crime) is quite a feat. Mathematically quantifying what the courts deem to be the prescribed mental state of the accused for D.E to succeed or fail is a jurisprudential breakthrough, to say the least.
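In outline, the threshold test reduces to comparing a computed score against a learned cut-off for each requirement. The numbers here are invented purely to show the shape of the check:

```python
# Hypothetical learned thresholds for the two D.E requirements.
thresholds = {
    "foresaw_real_possibility_of_death": 0.62,
    "reconciled_with_that_possibility":  0.55,
}

# Hypothetical scores produced for a given set of case facts.
scores = {
    "foresaw_real_possibility_of_death": 0.71,
    "reconciled_with_that_possibility":  0.48,
}

for requirement, threshold in thresholds.items():
    satisfied = scores[requirement] >= threshold
    margin = scores[requirement] - threshold
    print(f"{requirement}: {'met' if satisfied else 'not met'} (margin {margin:+.2f})")
```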

It is even possible for certain algorithms to produce a formula for calculating whether Oscar’s conduct was reasonable and commensurate with the perceived imminent threat, which is a fundamental element of a reliance on private or putative self-defence (which was one of Oscar’s defences). The requirements of Dolus Eventualis can be delineated into math-based formulas to be applied to the facts of any matter. We can measure mathematically how far Oscar exceeded the bounds of reasonableness in his attack as a result of the perceived threat. We can even produce numerical value thresholds to calculate how divergent his actual conduct was from the legally prescribed conduct of a person who foresees a real possibility of death. I often marvel at how Data Science platitudes like searching for an outlier, or the anomaly detection techniques in very basic Analytics software, can be the difference between whether someone is found guilty of murder or not. Unfortunately most in the legal fraternity do not know this, which is tragic: something as basic as a standard deviation can be used as an argument in mitigation or aggravation of sentence, and indeed by a judge in handing one down. Time and time again I see how something so facile in the field of advanced Analytics can be so crucial to the veracity of legal outcomes like that of the Pistorius case.
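Since the point leans on anomaly detection and the standard deviation, here is the kind of elementary check it has in mind, applied to invented sentencing figures; a z-score flags a sentence that sits far outside the range of comparable matters:

```python
import statistics

# Hypothetical sentences (in months) handed down for comparable offences.
sentences = [60, 72, 66, 58, 70, 64, 68, 62]
proposed = 120  # sentence being argued about

mean = statistics.mean(sentences)
sd = statistics.stdev(sentences)
z = (proposed - mean) / sd

# A large |z| marks the proposed sentence as an outlier relative to comparable cases,
# the sort of figure counsel could raise in mitigation or aggravation.
print(f"mean={mean:.1f}, sd={sd:.1f}, z-score={z:.2f}")
```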

I am also perplexed at how the law can speak about a “balance of probabilities” as a burden of proof in civil matters and not be at least mildly interested in statistical probability theorems. One scholar said of Bayes’ theorem that it is to the theory of probability what the theorem of Pythagoras is to geometry. I am not saying Naïve Bayes is an all-encompassing cure-all, because it is not; all I am saying is that we should at least use it as a tool, as a guideline alongside the sanctity of human discretion. As a law student studying case law, the phrase “a balancing approach” is hammered into your head when interpreting a judge’s decision. Judges regularly employ balancing approaches, especially where divergent interests are being considered and it is necessary to calculate which interest is more reasonable on the facts.
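As a concrete, if simplified, illustration of the point about probability theory: Bayes’ theorem updates a prior belief about liability in the light of a piece of evidence. All of the probabilities below are invented:

```python
def bayes_update(prior: float, p_evidence_if_liable: float, p_evidence_if_not: float) -> float:
    """Posterior probability of liability given one piece of evidence (Bayes' theorem)."""
    numerator = p_evidence_if_liable * prior
    denominator = numerator + p_evidence_if_not * (1 - prior)
    return numerator / denominator

# Invented figures: prior of 50/50, evidence three times likelier if the defendant is liable.
posterior = bayes_update(prior=0.5, p_evidence_if_liable=0.6, p_evidence_if_not=0.2)
print(f"posterior probability of liability: {posterior:.2f}")  # 0.75, above a balance of probabilities
```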

The court is tasked with the huge responsibility of finding some sort of equilibrium between two opposing arguments or interests, and yet it does not consider seeking a math-based equilibrium point on a basic line graph that gives a statistical account of both parties’ interests.
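A toy version of what such an equilibrium point could look like, assuming each party’s interest has been scored as a simple linear function of some shared variable (all values hypothetical):

```python
# Hypothetical: each party's scored interest as a linear function of a shared variable x
# (for instance, the size of an award). The equilibrium is where the two lines cross.
# plaintiff(x) = 0.8 * x + 1.0   (interest rises with x)
# defendant(x) = -0.5 * x + 6.2  (interest falls with x)

a1, b1 = 0.8, 1.0
a2, b2 = -0.5, 6.2

x_eq = (b2 - b1) / (a1 - a2)        # solve 0.8x + 1.0 = -0.5x + 6.2
y_eq = a1 * x_eq + b1
print(f"equilibrium at x={x_eq:.2f}, interest level={y_eq:.2f}")
```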

Below is a small part of a formula created from the results of a regression algorithm used in a classification model. It measured the respective thresholds each of the parties scored in relation to the principle of Estoppel under Administrative Law. The formula was extracted from the well-known case of City of Tshwane vs. RPM Bricks. The Model is as robust and statistically accurate as possible; it is simply a mathematical account of the precedent the case set:

- 0.6182529010134109 * (-0.28122611157540334 * Party Value + 0.8082030429003213 * Value)
+ 0.45123082530789793 * (-0.03263760806368616 * Party Value + 0.8082030429003213 * Value)
- 0.6159847913194462 * (-0.28122611157540334 * Party Value - 0.20673012012673037 * Value)
+ 0.45347425335361813 * (-0.03263760806368616 * Party Value - 0.20673012012673037 * Value)

“Party Value” and “Value” are the values that our conversion system would produce. A formula such as this could be the summation of every piece of reported and unreported case law, every Ratio Decidendi, every Obiter Dictum, every facet of a court’s reasoning and every relevant legislative provision, reduced to a coherent formula. Analytics practitioners will know that something like this is rather elementary; these are basic, entry-level Analytics. When applied to the field of Law, however, it becomes the supreme panacea to all legal uncertainty.
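Since the formula is quoted in full above, it can be evaluated directly; the sketch below simply wraps it in a function. The input values are placeholders for whatever “Party Value” and “Value” the conversion system would produce:

```python
def estoppel_score(party_value: float, value: float) -> float:
    """Evaluate the regression-derived formula quoted above (coefficients copied verbatim)."""
    return (
        -0.6182529010134109 * (-0.28122611157540334 * party_value + 0.8082030429003213 * value)
        + 0.45123082530789793 * (-0.03263760806368616 * party_value + 0.8082030429003213 * value)
        - 0.6159847913194462 * (-0.28122611157540334 * party_value - 0.20673012012673037 * value)
        + 0.45347425335361813 * (-0.03263760806368616 * party_value - 0.20673012012673037 * value)
    )

# Placeholder inputs; real values would come from the conversion system.
print(estoppel_score(party_value=1.0, value=2.0))
```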

Sure, Data Science is prone to error, like any other science or art, but so is law. If Law were impervious to error there would be no such thing as an appeal. The point is to formulate means of analysis that deviate as far as possible from the inevitability of error. What was extremely worrying about the Pistorius case was the proliferation of the view that people with money can get away with murder, literally. Perhaps society can have greater confidence in a justice system that uses mathematical and statistical aids, aids that are devoid of subjective susceptibilities and dispositions.

Whether he is guilty of murder (Dolus Eventualis) or not from a Data Science point of view, we will not say, because that piece of information is proprietary. Whatever the Supreme Court decides, Legal Practitioners, Judges, even Commercial entities, and especially the general public should know that Data Science can provide a transcendent system of justice: a system of justice that is truly impartial, precise, efficient and expedient; one that cares nothing about your net worth, influence or celebrity status; a justice system that the world has never had the privilege of seeing before, until now. That is my Magnum Opus.

Data Science: The numbers game Law almost lost.

On the face of it, Analytics and Law are manifestly divergent fields of practice. One need only consider the nature of Algorithms that require numerical attributes for their calculations and the textual rigidity of substantive law to realize this. The very first obstacle one will encounter in applying Analytics to Law is the absence of calculable numerical variables in raw legal data. No judicial precedent, statute or common law principle has ever been reduced to a mathematically sound numerical expression; raw legal data is simply not Analytics-receptive.
There are, however, some methods of mining raw legal data, like powerful Text Analytics, that make it possible to build reasonably accurate classification, sentiment analysis and many other models. There are also methods like Discretization (e.g. Nominal to Numerical conversion) that try to facilitate this kind of machine learning, in Neural Networks for example. A few more techniques are available but, in my humble opinion, are not worth mentioning, precisely because the results they yield when applied to raw legal data are catastrophic.
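A minimal example of the kind of Text Analytics classification referred to above, using scikit-learn on invented snippets of legal text; a real model would be trained on full judgments and pleadings:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented training snippets labelled by outcome (1 = claim upheld, 0 = dismissed).
texts = [
    "the defendant failed to disclose material facts before conclusion of the contract",
    "the plaintiff did not prove any breach of the agreement",
    "misrepresentation induced the contract and the claim succeeds",
    "the action is dismissed with costs for want of evidence",
]
labels = [1, 0, 1, 0]

# TF-IDF features feeding a simple classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["material facts were withheld from the insurer"]))
```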

Law is incredibly nuanced and has intricacies peculiar to it alone; you need to be able to factor in those intricacies as mathematically adept numerals to assist accurate machine learning. To simply apply text analytics alone, or simplistic variations of machine learning, is overly facile.

We have seen the use of purely aesthetic numerals as legal data sets for Algorithmic processes. By aesthetic I mean a completely superficial value, say 78, used to denote something like contractual breach in a predictive model. The results were a statistical calamity, to say the least; it had solved the numerical anomaly at a surface level, but not at an authentic one. With this sort of weighting, something like contractual breach is simply expressed as “x”; however, “x” has to have a legitimate value in that it has to be calculable mathematically and you have to be able to solve for it. Other variables therefore have to be factored into a calculation that ultimately has breach, or “x”, as a result. Superficial weighting does not do that, and this kind of representative weighting has an adverse effect on the statistical integrity of the algorithm and/or machine learner. Legal data architecture necessitates a collaborative approach between this sort of representative weighting and actual math-based weighting.
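The contrast can be put in a few lines: an “aesthetic” numeral is an arbitrary code, whereas a calculable value is derived from, and can be solved for in terms of, other variables. The variables and formula here are invented:

```python
# "Aesthetic" weighting: 78 merely labels contractual breach; nothing can be solved from it.
breach_label = 78

# Calculable weighting (hypothetical): breach is a function of other measurable variables,
# so given any two quantities the third can be recovered algebraically.
days_late, materiality = 30, 0.8
breach_value = 0.05 * days_late * materiality   # breach = f(days_late, materiality)
days_late_recovered = breach_value / (0.05 * materiality)
print(breach_value, days_late_recovered)        # 1.2 30.0
```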

Before even purporting to mine legal data competently, the numerical anomaly has to be reconciled; legal analytics is simply inconceivable without this reconciliation. This is extremely hard, which is probably why weighting systems in legal technology are virtually non-existent. We spent a year developing a math-based metrics system for law. The system facilitates the conversion of raw legal data (for example a section in an act) into numerical values for the purposes of algorithm-based machine learning. Amongst other things, a system such as this one required calculable and inter-dependent variables, stratified weighting schemes, proportionality and, most importantly, numerical values that are mathematically apportioned according to their peculiar legal implications. What has resulted is a metric system that can numerically quantify legal permutations not only at a purely epidermal level, but at a systemic one as well. For the first time a common law principle like the duty to disclose material facts in insurance contracts can be mathematically quantified to a value of, say, “3.22” for the purposes of an algorithmic process. Once this has been done, the values can be used for data pre-processing, architecture and analysis.

At first we intended to use the metrics system for internal purposes only; however, we have since resolved to provide these metric conversion systems to other Legal Technology firms and Analytics firms in general. We feel that the data science field needs alternative means of analyzing raw legal data in the most statistically robust manner. Without a capable metrics system, it is impossible to unearth very subtle insights in legal analytics; any legal data mining without such a system is unfortunately very limited. Decision science and diagnostics, advanced predictive modeling, algorithmic trial simulations, pattern recognition and automated policy enforcement are some of the advanced areas of our practice that simply would not be possible without a math-based metric system for legal data. Many Data Science ventures into law fail before they even begin because no numeric conversion system exists for their legal data. For lawyers, Data Science is a numbers game they almost lost; fortunately, they are now beginning to win it.