I. Introduction
The vast majority of the population has accepted that life without technology is no longer possible. The amount of information that reaches us, whether true or false, increases every day. Data is at the centre of all informed decision-making, and the legal industry is no exception.
Lawyers deal with large amounts of data in their day-to-day work, such as the facts of a case and the applicable legal rules. This data is normally contained in legal documents, legal acts, legal research papers, case law and so on. Consider, for example, a merger or acquisition in which the companies have to perform due diligence covering financial, legal, technical, management and other company matters. Such a transaction involves a great deal of information, many different experts and considerable financial cost. In the legal due diligence process alone, a lawyer might deal with employment contracts, pending or threatened litigation, claims and so on, amounting to thousands of pages of documentation. It takes time to collect and process all the necessary data, and even more to structure it in a useful way. Now imagine processing that data at high speed. This is not as abstract as it seems: data science is not so new that positive results have not yet been seen.
II. What is data science
To understand how data science can help legal professionals, we first have to understand what data science actually is. Data science is an interdisciplinary field that combines mathematics and statistics, analytics, programming, artificial intelligence and machine learning to extract knowledge from structured and unstructured data, including big data. In essence, it involves collecting, analysing and interpreting data to gain insights that can later be used in the decision-making process.
It is important to understand that data science is not the same thing as data analytics. Data analytics is the process of examining raw data sets to find trends and draw conclusions that support informed decisions. While both fields deal with data, data analytics is considered the more focused of the two, because it is concerned with answering concrete questions based on existing data. Data science, which includes data analytics, does not look for a specific answer; its work focuses on developing strategies, predictions and so on from massive amounts of data, searching for similarities and preparing data that can be analysed later.
Another important distinction is the one between data science and business intelligence. Business intelligence is a procedural and technical infrastructure that collects, stores and analyses the data produced by a company's activities to help organisations make better data-driven decisions. The two fields seem very similar: both are data-oriented and some of their techniques overlap, e.g. data visualisation. However, there are important differences, even though the two are easily confused. First, data science is oriented towards predicting future trends, while business intelligence focuses on analysing past events. Second, data science can manage large and unstructured data, while business intelligence relies on well-structured data. Business intelligence has greater potential in day-to-day work, as it requires fewer technical skills. Because the two fields overlap, those already using business intelligence can transition to data science relatively easily.
Since lawyers are generally not trained in software development and informatics, this is an area in which they could use the help of data scientists. There are three ways to do so: external consultants, in-house data scientists whose primary responsibilities are not legal work, or in-house data scientists within the legal department. Today it is also possible to learn about legal data science on various online platforms that enable lawyers and other legal professionals to become familiar with computing and gain expertise in the field.
III. Legal Data Science Examples
Generally speaking, data science can help with predictive analytics, document and legal analysis, case management, data visualisation and more.
Predictive analytics is mainly used for predicting the outcome of litigation based on existing historical data. The results may not only indicate the likely outcome (win or lose), but may also point to similar cases for strategy, indicate the chances of an out-of-court settlement, and predict the cost and even the length of the litigation. Bear in mind that this is of great help in continental legal systems, where there is no binding precedent and the rate of identical outcomes is drastically lower.
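As a rough illustration of the idea, the sketch below summarises a handful of past cases to estimate a win rate and an average duration per case type, and looks up the most similar past cases by claim amount. The records, field layout and function names are invented for this example; a real system would use far more data and a properly validated statistical model.

```python
from collections import defaultdict

# Invented toy records: (case_type, claim_amount_eur, duration_months, won)
HISTORY = [
    ("employment", 10_000, 8, True),
    ("employment", 25_000, 12, False),
    ("employment", 12_000, 9, True),
    ("contract", 50_000, 18, False),
    ("contract", 40_000, 15, True),
]

def summarise(history):
    """Return per-case-type win rate and average duration from past cases."""
    grouped = defaultdict(list)
    for case_type, _amount, months, won in history:
        grouped[case_type].append((months, won))
    summary = {}
    for case_type, rows in grouped.items():
        wins = sum(1 for _months, won in rows if won)
        summary[case_type] = {
            "cases": len(rows),
            "win_rate": wins / len(rows),
            "avg_months": sum(months for months, _won in rows) / len(rows),
        }
    return summary

def most_similar(history, case_type, amount, k=2):
    """Return the k past cases of the same type closest in claim amount."""
    same_type = [row for row in history if row[0] == case_type]
    return sorted(same_type, key=lambda row: abs(row[1] - amount))[:k]
```

With only a few cases per type, such figures are descriptive rather than predictive; they merely show the kind of question historical data can answer.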
Extracting data is not an easy task, even if you have seen many similar documents and know what you are looking for. Document analysis can help with searching for and identifying patterns, similarities and differences, since the data used as input needs to be classified. It can serve many purposes, such as identifying problematic clauses in a draft contract during the negotiation period so that consensus is reached more quickly. The best-known example of legal analytics is eDiscovery. eDiscovery software streamlines the eDiscovery process by holding all the data in one place for easy research, collaboration and sharing. Before eDiscovery software, collecting and sharing data for a case involved physically shipping documents in boxes. With eDiscovery software, legal teams can easily share information with law firms, clients and vendors, all from one platform.
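The clause-identification idea mentioned above can be sketched with simple pattern matching. The patterns, labels and sample draft below are hypothetical and chosen only for illustration; they are not a legal standard, and production tools typically rely on trained classifiers rather than a few regular expressions.

```python
import re

# Hypothetical patterns a reviewer might want flagged; not a legal standard.
RISK_PATTERNS = {
    "unlimited liability": re.compile(r"\bunlimited liability\b", re.IGNORECASE),
    "automatic renewal": re.compile(
        r"\bautomatic(?:ally)?\s+renew(?:al|s|ed)?\b", re.IGNORECASE
    ),
    "unilateral termination": re.compile(
        r"\bterminate\b.*\bwithout (?:cause|notice)\b", re.IGNORECASE
    ),
}

def flag_clauses(contract_text):
    """Return (clause_number, label) pairs for clauses matching a risk pattern.

    Assumption: each non-empty line of the text is one clause.
    """
    clauses = [line.strip() for line in contract_text.split("\n") if line.strip()]
    hits = []
    for number, clause in enumerate(clauses, start=1):
        for label, pattern in RISK_PATTERNS.items():
            if pattern.search(clause):
                hits.append((number, label))
    return hits

# Invented sample draft, one clause per line.
DRAFT = """The Supplier accepts unlimited liability for any damage.
This Agreement shall automatically renew for successive one-year terms.
Either party may assign this Agreement with prior written consent.
The Customer may terminate this Agreement at any time without notice."""
```

Calling `flag_clauses(DRAFT)` flags clauses 1, 2 and 4 and leaves clause 3 alone, giving the reviewer a short list to look at first.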
Case preparation is a unified process in which different steps need to be connected to obtain the most appropriate datasets. The process starts with identifying electronically stored information, followed by systematically preserving, collecting, processing, reviewing and analysing the data for production and presentation. Combined with predictive analytics techniques, this makes a more precise case strategy possible, from selecting the best lawyer for the case, to constructing the arguments that have proved to be key, to anticipating the possible outcomes.
Case management, on the other hand, plays a beneficial role in a lawyer's time management. Being a lawyer is a business, and to keep a business running you need more than one client. Keeping track of dates and deadlines is of the utmost importance: each case has its own court hearing date, its own appeal period and so on. It is convenient not to have to keep track of these on paper, and to have them tracked while you are preparing something else.
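Deadline tracking of this kind is easy to automate. The sketch below computes the last day of an appeal period and lists the deadlines due within the coming week. The counting rule (calendar days, starting the day after service, weekend deadlines moved to Monday) is an assumption made for the example; actual procedural rules differ by jurisdiction and also take public holidays into account.

```python
from datetime import date, timedelta

def appeal_deadline(judgment_served, appeal_period_days):
    """Return the last day of an appeal period counted in calendar days.

    Assumption: the period starts the day after service, and a deadline
    falling on a weekend moves to the following Monday. Public holidays
    are ignored in this sketch.
    """
    deadline = judgment_served + timedelta(days=appeal_period_days)
    while deadline.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        deadline += timedelta(days=1)
    return deadline

def upcoming(deadlines, today, within_days=7):
    """Return (case, deadline) pairs due within the window, soonest first."""
    limit = today + timedelta(days=within_days)
    due = [(case, d) for case, d in deadlines.items() if today <= d <= limit]
    return sorted(due, key=lambda item: item[1])
```

For example, a judgment served on Friday 1 March 2024 with a 15-day appeal period would nominally expire on Saturday 16 March, so `appeal_deadline(date(2024, 3, 1), 15)` returns Monday 18 March 2024.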
Data visualisation is a good way to keep clients informed. Laws should be written simply and comprehensibly so that everyone can understand them. We all know that this is not the case in practice, and clients may need simplified language. It is in their best interest to be able to understand and follow the process.
Data science can help lawyers not only with their legal work but also with the business itself. By using data science algorithms for digital advertising, lawyers and other legal professionals have an opportunity to attract new clients.
IV. Concerns
If it is so ideal, why is data science not common among legal practitioners, and what are the problematic aspects of its use?
Truth be told, law is considered a very traditional profession. From faculties and their teaching style to legal practitioners and clients (especially those in need of free or lower-cost legal assistance), few are willing to invest in such technologies. Some would say this is because it increases the cost of the service, at least in the beginning; some do not keep up with the technological revolution and do not trust computers; and others have various reasons of their own.
As mentioned earlier, lawyers usually do not have the educational background required for data science, and data scientists do not have the legal background needed to identify and interpret the relevant data. Such a 'relationship' requires cooperation, which leads to high costs, especially in the beginning and when external data scientists are engaged. In addition, to benefit from data science one needs a sufficient number of legal analytics use cases.
Another problem that can arise is bias in the data, which occurs when some data is weighted more heavily or represented more strongly than other data. This leads to lower accuracy and raises issues of discrimination, ethics and fairness. Furthermore, bias does not stem only from the input data: algorithmic bias is bias created by the algorithm itself rather than by the input. There are ways to minimise the likelihood of biased decisions, but they depend mainly on human choices: defining the goal (the narrower the better), carefully considering and understanding the collected data in use, and asking diverse questions.
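One simple first check for the kind of data bias described above is to measure how each group is represented in the dataset. The sketch below does this for a single attribute; the attribute name, the sample records and the 25% threshold are arbitrary assumptions for illustration, and a balanced dataset by this measure can still be biased in other ways.

```python
from collections import Counter

def representation(records, attribute):
    """Return each group's share of the dataset for one attribute."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def underrepresented(shares, threshold=0.25):
    """Return groups whose share is below an assumed, arbitrary threshold."""
    return sorted(group for group, share in shares.items() if share < threshold)

# Invented sample: past decisions labelled with the court that issued them.
RECORDS = (
    [{"court": "A"}] * 7
    + [{"court": "B"}] * 2
    + [{"court": "C"}] * 1
)
```

Here `underrepresented(representation(RECORDS, "court"))` reports courts B and C, signalling that a model trained on these records would mostly reflect the practice of court A.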
In the EU, where the GDPR plays a major role in relation to data, data security is a major concern. Private data should be anonymised so that the necessary data can be processed without it being possible to identify a private person or company from the chosen input. There are questions that need to be answered before data is collected and processed, such as: what is the legal ground for collecting and processing the data, and for how long is the data kept? In general, data should be protected from theft and any other possible threat. This requires identifying the risks and the datasets at risk, as not all datasets require the same level of security.
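As a minimal sketch of the first step towards anonymisation, the function below replaces e-mail addresses, phone-like numbers and a supplied list of names with placeholder tags. The regular expressions and the `known_names` parameter are simplifying assumptions for this example. Removing direct identifiers in this way does not by itself guarantee anonymity, since a person may still be identifiable from the remaining context.

```python
import re

# Deliberately simple patterns for illustration; real identifiers vary widely.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s/-]{6,}\d")

def redact(text, known_names=()):
    """Replace e-mail addresses, phone-like numbers and listed names with tags."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    for name in known_names:
        # Names must be supplied by the caller; this sketch does not detect them.
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return text
```

Applied to the invented sentence "Contact Ana Kovac at ana.kovac@example.com or +385 1 234 5678." with `known_names=["Ana Kovac"]`, the function returns "Contact [NAME] at [EMAIL] or [PHONE]."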
V. Conclusion
It is important to emphasise the need to educate lawyers in data science, as they are the ones with the skills needed to interpret the relevant data and obtain accurate results. It is understandable that small law offices do not invest in such technology, since their datasets may not reveal the full potential of this form of decision-making, although some techniques could be very useful for client interaction. By adopting data science to its full potential, repetitive legal work could be done faster, and lawyers could dedicate more time to analysis and informed decision-making rather than to searching for relevant data. In this way efficiency increases, clients get faster responses, and results become more predictable and more accurate.