Pricing Legal Services Accurately With Data Analytics Technology

A Case Study by Legal Decoder, Inc.



Pricing legal services has become one of the legal industry’s biggest challenges for both clients and law firms. Pricing misfires cause management headaches and maddening business outcomes. To put the economic magnitude of the pricing challenge in context, every year $60 billion is essentially “up for grabs” between clients and their law firms.

Both clients and law firms strive for the same things: pricing predictability, economic visibility, operating efficiency and value delivery.  Neither side is happy with the uncomfortable feeling of not really knowing what is fair to pay and not really knowing what is fair to charge.

Pricing (in)accuracy has a myriad of knock-on effects that impact business operations and performance.

Does the price estimate fall within industry norms and reflect a realistic budget?  Will the client balk at the quoted price? Will other corporate strategic initiatives have to be postponed if there are cost overruns?  Is the firm’s target realization rate attainable?

All are critical questions, and accurate answers make a huge difference to the bottom line.  Any seasoned legal professional knows that the pricing challenge cannot be solved with rudimentary tactical measures such as adjusting billing rates, implementing alternative fee arrangements (AFAs) and utilizing requests for proposals (RFPs). Clients now insist on budgets and have become more aggressive about pushing back on bills that miss the budget. Law firms that don’t take control of their on-the-ground pricing choices run the risk of having to reduce bills in order to retain major clients.  Forward-thinking legal industry leaders on both sides demand a systematic edge to accurately price legal services while optimizing quality, resource control, cost, profitability and value.


Legal Decoder developed the LSA Pricing Engine to address and overcome the challenges of accurately pricing legal services.  The LSA Pricing Engine intelligently analyzes law firm billing, time entry and timekeeper data, programmatically quantifying line items and categorizing them into proprietary legal taxonomies based on the data in each billing entry. Leveraging natural language processing (NLP) and proprietary algorithms, it programmatically replicates how a legal professional would approach the pricing process. On a line item by line item basis, the LSA Pricing Engine analyzes WHO (legal professional credentials) did WHAT (Work Elements identified in the narrative) and HOW LONG it took, and measures those results against over 28 billion unique algorithms.

Once analyzed, the Work Element in a line item narrative is automatically assigned by the Pricing Engine to one of a series of sub-branches and branches (Tasks, Activities, Detailed Activities) in one of Legal Decoder’s proprietary taxonomies, which cover many areas of law (e.g., M&A, Patent Litigation, IPOs, Commercial Litigation, Patent Prosecution, Internal Investigations, Employment Litigation and others).  Where applicable, Legal Decoder’s taxonomies align with the Uniform Task Based Management System (UTBMS) codes, a series of codes used to classify legal services performed by a legal vendor in an electronic invoice submission. A key differentiator of the LSA Pricing Engine is that it does NOT require a legal professional to assign a code or change workflow or billing habits.  Historic data can be processed whether or not it has been previously coded.

Clients and law firms can leverage the power of the Pricing Engine to better understand the nature of the legal work performed and whether it is being assigned to the right level of legal professional and handled in an appropriate amount of time.

The graphic below (See Figure 1) illustrates how raw data and Work Elements in a single narrative billing entry are analyzed and then transformed into a structure which informs pricing decisions.



[Figure 1 diagram, levels in the order shown: Commercial Litigation > L160 – Settlement / Non-Binding ADR > Settlement and Resolution > Motion to Compel Arbitration > narrative entry: “Begin drafting motion to compel arbitration”.]

Figure 1 – Pricing Engine Commercial Litigation Taxonomy Tree – L160
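To make the structure in Figure 1 concrete, the following is a minimal sketch of how a single narrative entry could be mapped to a taxonomy path. It is an illustration only: the `TaxonomyPath` type, the `RULES` table and the `classify` function are hypothetical names, and simple phrase matching stands in for the NLP and proprietary algorithms that the Pricing Engine actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaxonomyPath:
    """One path through a taxonomy tree (field names are hypothetical)."""
    practice_area: str
    utbms_code: str
    branch: str
    work_element: str


# Hypothetical phrase-to-path rules. The single entry mirrors Figure 1;
# a real engine would rely on NLP rather than literal phrase matching.
RULES = {
    "motion to compel arbitration": TaxonomyPath(
        practice_area="Commercial Litigation",
        utbms_code="L160 - Settlement / Non-Binding ADR",
        branch="Settlement and Resolution",
        work_element="Motion to Compel Arbitration",
    ),
}


def classify(narrative: str) -> Optional[TaxonomyPath]:
    """Return the first taxonomy path whose phrase appears in the narrative."""
    text = narrative.lower()
    for phrase, path in RULES.items():
        if phrase in text:
            return path
    return None  # left unclassified


path = classify("Begin drafting motion to compel arbitration")
print(path)
```

A narrative that matches no rule returns `None`, which corresponds to a line item that would need further handling before it can inform pricing.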


In late 2017, Legal Decoder partnered with 10 law firms and commercial clients using the Pricing Engine to analyze their billing data from seven specific areas of law: commercial litigation, patent litigation, M&A, bankruptcy, internal investigations, IPOs and inter partes reviews.  Law firms and clients alike recognized that the best predictor of future pricing is properly analyzed historic billing data.

For participants, the goal of the pilot program was to transform mountains of raw data into structured, actionable information for pricing. Pilot participants embraced the Pricing Engine’s “bottom-up” approach which precisely pinpoints what a legal professional is doing and then aggregates data into meaningful phases when building a pricing model.  In the pilot program, the Pricing Engine processed hundreds of thousands of line items from a vast array of legal matters.  The nature of the legal work and the duration of the task were automatically detected, analyzed and categorized, granularly by Work Element, task and by phase.

After billing data was processed by the Pricing Engine, pilot participants could view and analyze data using the Pricing Engine dashboard or their preferred business intelligence tools.

The Pricing Engine answered questions like: “How many billable hours does it take to complete Task X?”; “What are the most common work activities?”; “What is the estimated cost through Y phases of a matter?”; “What is the incremental cost of tasks in a given phase?”; and “How many associates and paralegals will be needed for the due diligence phase?”  These are just a sample of the many queries and actionable insights that can be surfaced.


Everything Starts with Work Elements

In the data pilot, the Pricing Engine identified and aggregated discrete Work Elements within each participant’s data.  Work Elements are very specific “legal” things worked on, handled or produced by a legal professional (e.g., motion in limine, asset purchase agreement, expert depositions, due diligence reports, FERC application, owner’s affidavits, wills, court hearings and so on).  Some Work Elements may occur just once during a matter and others may recur. While no cases, transactions or matters are identical, every matter has hundreds or thousands of recurring Work Elements that make up its anatomy. By selecting and aggregating projected Work Elements, pilot participants could develop a “bottom-up” pricing model and, then, generate alternative pricing models by adding or deleting other Work Elements.  For pilot participants, the Pricing Engine produced metrics around hundreds of thousands of Work Elements to inform pricing.  The table below (See Figure 2) provides just a few metrics and examples of the Pricing Engine’s output where a legal professional’s time correlates to a Work Element.

Figure 2 – Average Hours and Seniority Level Outputs for selected Work Element
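The “bottom-up” idea described above can be sketched in a few lines: a price is the sum of projected Work Elements, and an alternative model is produced by adding or removing elements. Everything in this sketch is an assumption made for illustration, including the `WorkElementEstimate` type, the `price_model` function, and the hours and rates, none of which come from the pilot data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class WorkElementEstimate:
    name: str
    avg_hours: float    # historical average hours (hypothetical values)
    hourly_rate: float  # rate for the seniority level expected to do the work


def price_model(elements: Iterable[WorkElementEstimate]) -> float:
    """Bottom-up price: sum of hours x rate over the selected Work Elements."""
    return sum(e.avg_hours * e.hourly_rate for e in elements)


# Hypothetical base scope for a matter.
base = [
    WorkElementEstimate("Motion in limine", avg_hours=12.0, hourly_rate=400.0),
    WorkElementEstimate("Expert deposition", avg_hours=20.0, hourly_rate=500.0),
]

# Alternative model: the same scope plus one additional Work Element.
alternative = base + [
    WorkElementEstimate("Due diligence report", avg_hours=30.0, hourly_rate=300.0),
]

base_price = price_model(base)
alt_price = price_model(alternative)
print(base_price, alt_price)
```

Keeping each Work Element as its own record is what makes the comparison cheap: the incremental cost of a scope change is simply the difference between two sums.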

Division of Labor (aka, “What do lawyers really do during the day?”)

After Work Elements in a matter have been identified, the Pricing Engine helps determine an appropriate division of labor, essentially answering the question “Who should do what?” when it comes to the activities required for each Work Element. Here again, intelligently analyzed historical data is crucial to understanding the optimal resources and time required for each Work Element’s activities. Putting aside the business activities required to run a law practice (firm management, business development, employee training, etc.), the core activities in the practice of law are straightforward. Attorneys read, write, analyze, investigate, communicate, meet, advocate, negotiate and manage a matter’s progress (See Figure 3).

Identifying individuals most suitable for each work element activity is critical to pricing and matter management and a key output of the Pricing Engine.

All participants in the data pilot affirmed that legal work should be delegated to the most competent, lowest cost legal professional. They also concurred that forecasting workload allocation, before now, has been an ad hoc, anecdotal art as opposed to a strategic, data driven science.

Today, the division of labor among “document drafting,” “document review and revision” and “document negotiation” should be an automated, strategic analysis, not a rote exercise. After all, these activities require different skill sets and expertise levels. The ability to leverage historic data using intelligently designed technology is game-changing for pricing. Data certainly should not blindly replace judgment and experience when pricing matters. However, as pilot participants saw, data surely amplifies and enhances the pricing analysis with meaningful guidelines and benchmarks. When crunching pilot participants’ data, the Pricing Engine surfaced the following statistics and baselines for the division of labor amongst various levels of legal professionals:

Figure 3 – Division of Labor – Activities Undertaken by Seniority Level


Communicating occurred most frequently in all timekeepers’ line items, showing up in nearly one-third of time entries, with partners accounting for nearly 30% of communications activities.  For pricing purposes, this data can be easily translated into actionable information.  If, for instance, a fixed fee billing arrangement is being used on a matter, a managing attorney should consider streamlining team communications using process management techniques, enhanced communication technologies and highly structured meeting schedules.  Insights of this nature improve cost-effectiveness and realization rates.


Writing was the second most recurring activity overall in timekeeper line items and the most frequently recurring activity amongst associates. The critical takeaway from the data is that mid-level associates (3rd – 6th years) should take the laboring oar when it comes to drafting something anew, as the data showed them to be most efficient. The data indicated that partners should be projected to spend, on average, approximately 17.0% of their time on drafting and revising documentation.


Reading is the third most frequently occurring activity and is spread equally amongst partners, associates and paralegals at around a 15.0% rate of occurrence. While “reading” ranks third in terms of frequency, it is the most time-consuming activity, particularly for associates. This is unsurprising, as associates are heavily involved in document review and due diligence. Being able to evaluate data in terms of both frequency and hourly volume is critical to pricing precision and is easily handled by the Pricing Engine.
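The distinction between how often an activity occurs and how many hours it consumes can be illustrated with a short sketch. The line items below are invented for the example, and `activity_shares` is a hypothetical helper, not part of the Pricing Engine; it assumes the entries have already been classified by activity and that total hours are greater than zero.

```python
from collections import defaultdict

# Hypothetical, already-classified line items: (activity, hours billed).
line_items = [
    ("communicate", 0.5),
    ("communicate", 0.5),
    ("write", 2.0),
    ("read", 5.0),
]


def activity_shares(items):
    """Return {activity: (share of entries, share of hours)}."""
    if not items:
        return {}
    counts = defaultdict(int)
    hours = defaultdict(float)
    for activity, h in items:
        counts[activity] += 1
        hours[activity] += h
    total_entries = len(items)
    total_hours = sum(hours.values())  # assumed to be greater than zero
    return {
        activity: (counts[activity] / total_entries, hours[activity] / total_hours)
        for activity in counts
    }


shares = activity_shares(line_items)
for activity, (by_count, by_hours) in shares.items():
    print(f"{activity}: {by_count:.1%} of entries, {by_hours:.1%} of hours")
```

In this toy data set, communicating accounts for half of the entries but only one-eighth of the hours, while reading is a quarter of the entries and the majority of the hours. An estimate built on entry counts alone would miss that.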

Even though countless legal professionals (especially seasoned associates and partners) can handle multiple aspects and activities associated with many Work Elements across several areas of law, the Pricing Engine helps to pinpoint the IDEAL professional for the job, not just a capable body.  This fuels a law firm’s ability to optimize efficiency for clients and internal resources on every aspect of a matter while maximizing realization rates and profitability.


In virtually every context, understanding the “standard(s)” is important.  The UTBMS codes are no exception: they are useful for top-level categorization of legal billing data and for drawing high-level inferences that shed some light on pricing.  Even so, pilot participants identified several challenges inherent to the UTBMS codes and looked to the Pricing Engine to overcome them.  The first challenge identified by pilot participants was that most billing data was not UTBMS coded.  Pilot participants next indicated that when data was coded, more often than not, it was coded incorrectly (or a generic, catch-all code was used). Lastly, the most tech- and data-savvy professionals sought an automated solution that offered greater granularity, specificity and flexibility than UTBMS codes alone, along with the ability to process huge amounts of data – foundational elements of the Pricing Engine.

In the pilot program, the Pricing Engine’s automated, bottom-up approach put together small pieces of billing data to create the bigger pricing picture using one of Legal Decoder’s proprietary taxonomies.  Work Elements were assigned to over 800 sub-branches which rolled up to 120 branches within Legal Decoder’s proprietary taxonomies and ultimately tagged with the proper UTBMS code number (or Legal Decoder equivalent).

While hundreds of thousands of different pricing metrics can be produced by the Pricing Engine, for ease of understanding, consider Figure 4 for a sample analysis of one UTBMS category, “Document Production” (or L320), derived from pilot program data.

A user can see the percentage of hours attributed to different UTBMS categories within the pilot program data set to guide high level personnel planning and pricing. Indeed, a thoughtful scan of four of the five most common categories tells a pricing professional that process management of recurring underlying tasks is a needed skill set.  Thus, any pricing estimate must account for legal professional resources who can cost effectively engage in high volume, time intensive projects.

The UTBMS codes stop here, allowing only for inference and conjecture rather than a strategic pricing analysis.

Figure 4 – Most Common UTBMS Categories – All Practice Areas in Pilot Program

Legal Decoder’s Pricing Engine, however, digs deeper by further subdividing the category, “L320 – Document Production,” into more focused areas like “Review for Production,” “Evaluation of Produced Materials,” “Privilege and Responsiveness” and “Requests for Production, Objections and Response.”   Leveraging natural language processing (NLP) and Legal Decoder’s proprietary algorithms, the Pricing Engine quickly surfaced Work Elements associated with Document Production, such as “Draft and revise RFP”; “First level review for privilege/responsiveness” and “Update company document production collection” and assigned them to Legal Decoder’s proprietary sublevels that roll up to L320.

With data properly structured, pilot participants could evaluate the average time per matter spent on specific Document Production Work Elements and then, with statistically validated probability, project that “Evaluation of Produced Materials,” “Requests for Production, Objections & Responses” and “Review for Production” should take 22.8 hours, 21.9 hours and 51.6 hours, respectively (See Figure 5).


Figure 5 – Selected L320 Document Production Activities – Average Hours per Matter
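An average-hours-per-matter metric of the kind shown in Figure 5 could be derived roughly as follows. This is a sketch under stated assumptions: the matter identifiers and hours are invented, `avg_hours_per_matter` is a hypothetical helper, and the average is taken only over matters in which a sub-category actually appears, which may differ from how the Pricing Engine computes its figures.

```python
from collections import defaultdict

# Hypothetical classified line items: (matter id, sub-category, hours).
items = [
    ("M1", "Review for Production", 40.0),
    ("M1", "Review for Production", 20.0),
    ("M2", "Review for Production", 30.0),
    ("M1", "Evaluation of Produced Materials", 10.0),
]


def avg_hours_per_matter(rows):
    """Average total hours per matter for each sub-category.

    Assumption: a matter counts toward a sub-category's average only if it
    has at least one line item in that sub-category.
    """
    totals = defaultdict(float)  # (sub-category, matter) -> total hours
    for matter, sub_category, hours in rows:
        totals[(sub_category, matter)] += hours

    per_sub_category = defaultdict(list)
    for (sub_category, _matter), hours in totals.items():
        per_sub_category[sub_category].append(hours)

    return {sub: sum(v) / len(v) for sub, v in per_sub_category.items()}


averages = avg_hours_per_matter(items)
print(averages)
```

The two-step aggregation matters: line items are first summed within each matter and only then averaged across matters, so a matter with many small entries is not over-weighted.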

After pilot data was categorized by the Pricing Engine into a Legal Decoder subcategory (e.g., “Requests for Production, Objections & Responses”), pilot participants could then naturally develop pricing projections by assigning and modeling professional resources based on past work allocation at the Work Element (or higher) level (See Figure 6).


Figure 6 – Percentage of Hours by position for Requests for Production, Objections & Responses
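As an illustration of how a projection might follow from past work allocation, the sketch below spreads a projected number of hours across positions and applies a rate to each. The 21.9 hours is the average reported above for “Requests for Production, Objections & Responses”; the percentage split, the hourly rates and the `project_fee` function are all hypothetical and are not taken from Figure 6.

```python
# From the text: average hours per matter for this sub-category.
PROJECTED_HOURS = 21.9

# Hypothetical share of hours by position (must sum to 1.0).
hour_share = {"Partner": 0.20, "Associate": 0.60, "Paralegal": 0.20}

# Hypothetical hourly rates in dollars.
hourly_rate = {"Partner": 800.0, "Associate": 450.0, "Paralegal": 200.0}


def project_fee(total_hours, shares, rates):
    """Split projected hours by historical share and price each slice."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("hour shares must sum to 1.0")
    return {
        position: total_hours * share * rates[position]
        for position, share in shares.items()
    }


by_position = project_fee(PROJECTED_HOURS, hour_share, hourly_rate)
total_fee = sum(by_position.values())
print(by_position, round(total_fee, 2))
```

Changing the split (for example, shifting hours from partners to associates) and re-running the projection is one simple way to compare staffing scenarios before quoting a price.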


The Pricing Engine has been architected to overcome a major concern of many legal professionals that relates to the quality or hygiene of the narrative entries in billing records. 

Block billing, idioms, acronyms and other billing hygiene challenges pose no problem for the Pricing Engine.  The Pricing Engine parses narratives and learns the lexicon and conventions that legal professionals use repetitively in narrative entries.

This allows otherwise uncategorizable data to be assigned to a category, which provides some measure of insight into what a legal professional has been handling.  For example, billing data that does not otherwise fit neatly into a bucket under the UTBMS M&A codes is categorized into a “Deal Management” category created by Legal Decoder, which mostly consists of process management tasks and communications (See Figure 7).

Figure 7 – MJ00 Deal Management category – Average hours per line item entry

The overarching goal of the Pricing Engine is to structure data, even dirty data, in some manner so that actionable information can be gleaned from it.   For instance, a closer look at the Unclassified Tasks within the M&A Deal Management sub-category below shows that Partner-level lawyers billed the most hours that cannot be classified (See Figure 8).

Figure 8 – Unclassified tasks by position

Unclassified line items reduce the accuracy of any analysis of how the work performed contributed to the overall cost or price of a specific matter.  Multiplied tens, hundreds or thousands of times, unclassified line items can drastically alter the picture of a matter’s body of work and distort the pricing of services for similar matters.  Recording work (and timekeeping in general) is a chore that most people do not enjoy, but improving the clarity of narrative entries helps manage the price and cost of services going forward.


Accurate pricing strategies and improved cost controls are paramount goals of legal industry leaders.  Towards this end, the Pricing Engine empowers users with a legal domain-informed solution to intelligently analyze complex billing data.  Through automation, the Pricing Engine completes in minutes tasks that usually require months of manual work. Data analyses produced by the Pricing Engine can be used to track and model pricing and control costs.  Pricing teams can compare pricing against matters of similar type and scope.  Legal professionals can intelligently determine, based on analysis of past data, whether a $1.3M fee proposal covering a matter through discovery and motion practice is achievable.  At long last, pricing predictability, economic visibility, operating efficiency and value delivery are entirely attainable.

About the Authors

Jason Chi


General Manager | Legal Decoder, Inc

Jason helps lead Product Development at Legal Decoder working with system design, product marketing, and data analysis.  He also leads Client Operations by working with clients on data and business requirements as well as leading internal operations (establishing processes, implementing new technology).  Jason brings 15+ years of work experience in software engineering and management consulting on large enterprise systems and IT modernization.

Jason graduated with a bachelor’s degree from the Kelley School of Business at Indiana University.

Joe Tiano


Founder and Chief Executive Officer | Legal Decoder, Inc

After practicing law for nearly 20 years, Joe founded Legal Decoder because he saw that legal industry leaders lacked the analytic tools and data to effectively price and manage the cost of legal services. Together with Chris Miller, his Co-Founder, Joe set out to build an intelligent, data-driven technology company that would revolutionize the way that legal services from outside counsel are priced and economically evaluated. Previously, Joe was a Partner at Pillsbury Winthrop Shaw Pittman, LLP and Thelen LLP, where he grew and managed all aspects of a multimillion-dollar cross-border finance practice. Entrepreneurship has run through Joe’s veins since his early days as a venture capital lawyer representing a myriad of transformative technology companies.  He is a prolific writer who has published numerous articles on substantive legal issues and the legal industry in general. Joe graduated from Georgetown University in 1992 with a Bachelor of Science degree in Business Administration and received his J.D. from the University of Pittsburgh School of Law in 1995.