A partner is someone who helps where they can, not because it helps them, but because it helps you. At Legal Outsourcing 2.0, we are blessed by having many wonderful partners. In business, as in life, we are all known by the company we keep. We are particularly proud of our partners. They not only make a difference for us, they help us make a difference for our clients. For that reason, we periodically do a bit of bragging about our partners.

LO2 operates in three areas: 1) helping legal tech companies create their AI solutions and performing services for them on behalf of their clients; 2) performing traditional litigation document reviews; and 3) doing data breach notification document reviews.

Today, we shine the spotlight on one of our partners, Canopy (www.canopyco.io), which has developed a tool designed specifically for the use case of data breach analysis and review, as opposed to recycling a litigation document review tool for another purpose. As they say in the construction trades, if the only tool you have is a hammer, everything looks like a nail. We recognize that the tasks at hand in a litigation document review and a data breach notification review are fundamentally different. We use the Canopy tool and recommend it to clients when we do data breach reviews, which is the highest compliment we can pay.

I was able to impose on my friend, Canopy’s Founder and CEO, Ralph Nickl, and interview him for this piece.

HARRY BUCK

Ralph, before you founded Canopy, you spent many years in the eDiscovery space. What attracted you to the space and what stops did you make along the way?

RALPH NICKL

I was indoctrinated into the eDiscovery industry by a fantastic group of people at True Data Partners, one of the first three pure-play eDiscovery service companies. I was particularly interested in the fast pace of eDiscovery, the importance or newsworthiness of some of the engagements, and the ability to apply technology to deliver real value to the customer. I went on to co-found Global Colleague, an offshore document review company. Once that team matured, I pursued leadership positions in pure eDiscovery software companies, as these types of companies are most aligned with my skill set. Prior to eDiscovery, I worked as a software engineer and then a technical architect in the Accenture Communications and High-Tech practice. As data breach incidents grew in number and sophistication, it became clear that my dual backgrounds in tech development and eDiscovery were ideal for inventing a new capability to aid in responding to a cyber incident.

HARRY BUCK

What led you to believe that using litigation document review tools for data breach review was not getting it done?

RALPH NICKL

I had become increasingly aware that most eDiscovery tools were falling short when applied to problems outside of traditional litigation and investigation. To solve those traditional problems, eDiscovery software was designed to find documents that tell a story relevant to specific issues. At first blush, it made sense to apply the search, analytics, and review capabilities of eDiscovery software to find protected information that has been compromised. But when you consider that the goal of data breach response is to produce a consolidated list of compromised personally identifiable information, beliefs, health, and financial information about individual employees, clients, patients, or other individuals, it’s clear that a purpose-built solution is warranted. When using standard eDiscovery solutions, what do you do with this protected information once it is discovered? How do you compile that information across hundreds of reviewers? And how do you shift from a paradigm of case-specific discovery to universal discovery based on common definitions of PII?

When we first looked at the data breach review problem, our idea was just to provide a coding panel that could be integrated side-by-side with any eDiscovery tool. As we examined the entire process of creating the list of affected individuals, we found there were many more opportunities to provide value, because so much time was lost trying to bend the review platform to the work that was being done. That inspired us to remove all the typical eDiscovery relevancy review functionality and replace it with features to detect, extract, resolve, and relate protected-data (personally identifiable information, protected health information, etc.) about affected individuals found in the documents. We established partnerships with data breach response leaders like LO2 so that we could jointly change the paradigm and shape how this sort of work gets done going forward.

HARRY BUCK

Everybody in the Legal Tech space now claims that they use AI for one type of feature or another. In which areas did you apply AI in developing your tool, and how has it helped any particular functionality of your tool?

RALPH NICKL

We are using machine learning in several areas of our application, as well as applying similar techniques to running our business. Currently, we have applied machine learning and natural language processing to classify images, detect protected-data, and resolve entities. We shift the discovery approach from a case-based to a universal model by taking advantage of the fact that protected-data elements (e.g., date of birth, driver’s license numbers, mental or physical condition) have common definitions that do not vary case-by-case. Because we train our machine learning model across every project, the application gets smarter with every project, yielding more comprehensive and precise results. This is unlike Technology Assisted Review, where models are primarily trained or optimized project-by-project. Canopy’s machine learning just fits in with the way our cyber security analysts and legal reviewers work today. We have a culture at Canopy of challenging the status quo. I am a bit of a contrarian, and the Canopy team is not interested in building a “me-too” solution.
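To illustrate why common definitions matter, here is a minimal sketch, not Canopy’s implementation. It swaps in simple regular expressions where Canopy describes machine learning and natural language processing, and the pattern names, the labels, and the Social Security number example are assumptions made for illustration. The point it shows is that a detector for an element such as a date of birth is defined once and reused on every project, rather than rebuilt case by case.

```python
import re

# Hypothetical, deliberately simple patterns (assumptions for illustration).
# Real detection needs far more care than a regular expression provides.
PATTERNS = {
    "date_of_birth": re.compile(
        r"\b(?:DOB|Date of Birth)[:\s]+(\d{1,2}/\d{1,2}/\d{4})", re.IGNORECASE
    ),
    "ssn": re.compile(r"\b(\d{3}-\d{2}-\d{4})\b"),
}


def detect_protected_data(text):
    """Return (element_type, matched_value) pairs found in the text."""
    hits = []
    for element_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((element_type, match.group(1)))
    return hits


# The same definitions apply to any document from any project.
hits = detect_protected_data("Jane Doe, DOB: 04/12/1985, SSN 123-45-6789")
```

Because the definitions do not depend on the issues in a particular case, improvements to a detector carry over to every later project.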

HARRY BUCK

As any Founder and CEO of an early-stage tech company knows, your choice of CTO is critical. Oran Sears, CEDS, is your CTO. How did the two of you join forces, and what was it about him that led you to believe he was the right fit?

RALPH NICKL

You are exactly right. Picking your business partner is a “bet your business” and “bet your quality of life” choice. Oran and I had previously worked together at two different companies. We had a familiar and honest working relationship. Most important to me was Oran’s passion for designing an exceptional product, his leadership qualities, his proven ability to work with clients and ascertain the precise pain points they’re experiencing, and his experience with eDiscovery software. My hope was that if I provided Oran the opportunity to design and deploy a product to solve a unique and business-worthy problem, it would result in something extraordinary. He hasn’t let me down.

HARRY BUCK

You launched Canopy a little over a year ago. How long did it take you to develop the software before it was launched?

RALPH NICKL

We were about nine months into development before we had a baseline product we could proudly offer to our early adopters. Everything we develop is feature-driven, since we don’t have the development debt most companies are facing when going to the cloud. From the beginning, we had a modern, containerized, cloud-ready but portable architecture and a continuous deployment model. This design enables us to move fast.

HARRY BUCK

What was the biggest surprise you encountered in developing the tool?

RALPH NICKL

As I alluded to earlier, we have a strict rule not to build things that have been built before, and not to spend months building what we could buy. It was a difficult decision for us to build our own processing, and it was a surprise to me how far software has come in the area of processing loose files since the early 2000s. Really, processing is a basic capability of any machine learning exercise dealing with unstructured data. Having the flexibility to defensibly process data, elastically scale processing, and integrate our features into the processing pipeline is essential.

HARRY BUCK

What has been your biggest surprise since you launched the tool?

RALPH NICKL

Using your analogy, perhaps the biggest surprise to me is how many incident response teams are subjecting themselves to the struggles that come from using the wrong tool for the job. Using eDiscovery software to review data after a breach results in missed deadlines and higher costs — even after attempting to use plugins and workarounds. Fortunately, once we demonstrate those drawbacks, our clients quickly change gears and reap the rewards.

HARRY BUCK

What are the specific pain points that your tool was designed to address?

RALPH NICKL

A Data Breach Discovery project presents dozens of problems, which required us to invent dozens of new capabilities. Here is a short list of pain points: determining if compromised data contains any protected-data element; building and managing profiles of affected individuals; loading structured tables of PII contained within unstructured data; efficiently identifying images with PII/PHI; de-duplicating the list of affected individuals, even if an individual’s name has changed or another key element is slightly different; and determining if the information gathered has met the notification threshold of a particular jurisdiction.
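As a rough illustration of the de-duplication pain point, the sketch below merges records of affected individuals when they share an identifier number, or when they share a date of birth and have closely matching names. It is an assumption-laden example, not Canopy’s method: the field names (`name`, `dob`, `license`, `source`), the 0.85 similarity threshold, and the matching rules are all hypothetical.

```python
from difflib import SequenceMatcher


def same_person(a, b, name_threshold=0.85):
    """Hypothetical matching rule: shared license number, or same DOB plus similar name."""
    # A shared identifier is treated as a match even if the name has changed.
    if a.get("license") and a.get("license") == b.get("license"):
        return True
    if a.get("dob") and a.get("dob") == b.get("dob"):
        ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
        return ratio >= name_threshold
    return False


def deduplicate(records):
    """Greedily merge records into profiles, keeping every source document id."""
    profiles = []
    for record in records:
        match = next((p for p in profiles if same_person(p, record)), None)
        if match is None:
            profile = {k: v for k, v in record.items() if k != "source"}
            profile["sources"] = [record["source"]]
            profiles.append(profile)
        else:
            # Fill in fields the existing profile lacks; keep the first-seen values.
            for key, value in record.items():
                if key != "source" and not match.get(key):
                    match[key] = value
            match["sources"].append(record["source"])
    return profiles


records = [
    {"name": "Jane Smith", "dob": "1985-04-12", "license": "D1234567", "source": "doc-1"},
    {"name": "Jane Doe", "dob": "1985-04-12", "license": "D1234567", "source": "doc-2"},
    {"name": "Jon Smyth", "dob": "1990-01-01", "source": "doc-3"},
    {"name": "John Smyth", "dob": "1990-01-01", "source": "doc-4"},
]
profiles = deduplicate(records)
```

This greedy pass compares each record against every existing profile and depends on record order, so it only conveys the idea. Keeping the source document ids on each profile is what lets a quality-control reviewer trace an entry back to where it was found.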

HARRY BUCK

What are the specific improvements in functionality that Canopy offers in the data breach setting, compared to the old document review platforms?

RALPH NICKL

I think document review is a convenient term that means a lot of different things. We are not providing a review tool for relevancy. Instead, we are providing a document review tool to extract affected individuals and build their sensitive information profiles. We highlight protected-data, and you can click to add this information to the individual’s profile, eliminating copy and paste or transcription errors. All the while, the system is intelligently looking ahead at the rest of the data for the same patterns and deduplicating profiles entered by hundreds of reviewers. There are literally dozens of additional features that we introduced to solve the Data Breach Discovery problem. As I mentioned previously, a reviewer can import tables of PII into the master list of entities during first pass review. Canopy classifies images containing protected-data, such as an image of a driver’s license or passport, and presents all of the images in a gallery view to allow a data analyst to quickly determine if they contain PII or not. Canopy provides a detailed entity view that allows a quality control user to trace all of the discovered protected-data for an individual back to each source document.

HARRY BUCK

We all know time is money. What types of improvements have your users reported in the time it takes to perform a review using your tool, as opposed to tools designed for litigation document reviews?

RALPH NICKL

We eliminate the manual impact assessment phase, saving approximately 80 to 160 hours of analysis. We provide a data breach impact assessment report automatically, in hours (the time it takes to process data), instead of requiring a team of people to spend a week to ten days.

When compared to a review-all-documents approach, our clients have been able to reduce review cost by up to 90% by reducing the number of documents promoted to review. To do this, a team uses Canopy’s analytics to surgically segregate documents into three groups: those that do not contain protected-data, those that contain protected-data, and those that require more analysis. Of course, culling is limited by the percentage of documents containing protected-data elements. Also, during this phase, there is an ability to automate some of the review by providing known lists of protected-data.

We’ve improved the document review speed by at least 20% for documents without tables/spreadsheets.  Taken together, the ability to reduce the volume of data reviewed and increase the efficiency of the review process itself represents the greatest cost savings.

For spreadsheets or documents containing tables, reviewers have been able to map and load tables with ten thousand rows of protected-data in less than five seconds, as opposed to the hour that is typical with a traditional review.

We reduce the one- to two-week effort to clean up and normalize the list of entities for notification by over 80%. We also provide unique tools to review and quality-control the final list of individuals.

Ultimately, Canopy reduces the amount of time and resources involved in post-data breach discovery while increasing the quality of the list of affected individuals. With Canopy, incident response teams can meet regulatory deadlines while reducing their costs.
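The three-group segregation described in this answer can be sketched as follows. This is an illustrative example only: it assumes each document already carries a hypothetical score between 0 and 1 estimating how likely it is to contain protected-data, and the `low` and `high` cut-offs are made-up values, not figures from Canopy.

```python
def triage(scores, low=0.1, high=0.9):
    """Split document ids into three groups using hypothetical score cut-offs."""
    groups = {"no_protected_data": [], "protected_data": [], "needs_analysis": []}
    for doc_id, score in scores.items():
        if score >= high:
            groups["protected_data"].append(doc_id)
        elif score <= low:
            groups["no_protected_data"].append(doc_id)
        else:
            groups["needs_analysis"].append(doc_id)
    return groups


def culled_fraction(groups):
    """Fraction of documents that never need to be promoted to review."""
    total = sum(len(doc_ids) for doc_ids in groups.values())
    return len(groups["no_protected_data"]) / total if total else 0.0


# Hypothetical scores for four documents.
groups = triage({"a": 0.02, "b": 0.95, "c": 0.50, "d": 0.05})
fraction = culled_fraction(groups)
```

The culled fraction can never exceed the share of documents that truly contain no protected-data, which is why the achievable saving varies from one incident to the next.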

HARRY BUCK

What is next, in terms of building out the functionality of your tool?

RALPH NICKL

We will continue to build out Canopy’s data breach discovery software to enable service providers, legal teams, and consultants to conduct thorough post-data breach response analysis and review with unprecedented efficiency.

We are exploring a couple of other proactive protected-data use cases where our partners have expressed that we have the basis for a killer app.  We feel that applying Canopy’s technology to protected-data at risk is the next logical step.

Harry, thank you for the opportunity to speak with you about our partnership! It’s been wonderful to support LO2 as you’ve formalized your Data Breach Notification Document Review offering.

RALPH NICKL

About Ralph

Ralph Nickl has more than 25 years of experience working with startups to deliver business solutions…

Harry Buck

About Harry

CEO of Legal Outsourcing 2.0 – Director of Growth and Partnerships – Legal Technology Salesperson and enthusiast…