Featured Guest
Josh Becker

Josh Becker, LexisNexis Head of Legal Analytics and Chairman, Lex Machina. Josh is a long-time recognized thought leader on...

Your Host
Bob Ambrogi

Bob Ambrogi is a lawyer, legal journalist, and the publisher and editor-in-chief of A former co-host of Lawyer...

Episode Notes

The use of analytics has permeated various industries, from baseball to banking, but could it be invaluable to lawyers as well? In this episode of Law Technology Now, host Bob Ambrogi talks to Josh Becker from Lex Machina about legal analytics, machine learning, and the impact these new technologies will have on the legal industry. They discuss the various uses of data analytics and machine learning for lawyers, including showcasing expertise, determining strategy, and mining data.

Josh Becker is the CEO of Lex Machina, a company that provides intellectual property litigation data and analytics to companies and law firms.

Special thanks to our sponsor, Thomson Reuters.


Law Technology Now

Moneyball for Lawyers: Changing the Game with Analytics



Intro: You are listening to the Legal Talk Network.


Monica Bay: Hello. I am Monica Bay.

Bob Ambrogi: And I am Bob Ambrogi.

Monica Bay: We have been writing about law and technology for more than 30 years.

Bob Ambrogi: That’s right. During that time we have witnessed many changes and innovations.

Monica Bay: Technology is improving the practice of law, helping lawyers deliver their services faster and cheaper.

Bob Ambrogi: Which benefits not only lawyers and their clients, but everyone.

Monica Bay: And moves us closer to the goal of access to justice for all.

Bob Ambrogi: Tune in every month as we explore new legal technology and the people behind the tech.

Monica Bay: Here on Law Technology Now.


Bob Ambrogi: Welcome to Law Technology Now on the Legal Talk Network. This is your host Bob Ambrogi. Today we are going to be talking with Josh Becker, the CEO of Lex Machina. But before we do that, I want to just take a moment to thank the sponsor for today’s program, Thomson Reuters. Demystifying artificial intelligence can be done in seven simple steps. AI will create change but managing change doesn’t just happen. Visit to learn more.

Well, it’s baseball season, so what better topic to talk about than Moneyball? Moneyball of course was the name of the book and the movie about how the Oakland A’s, when they didn’t have the money to build a powerhouse team, instead turned to analytics to do it. The Moneyball method that they pioneered is now widely used throughout baseball.

You might say that Lex Machina is the Oakland A’s of law. They really helped pioneer the use of analytics by lawyers in the legal field, the kind of Moneyball for lawyers as some people have called it. So today we are going to talk to the CEO of Lex Machina, Josh Becker.

Josh, thanks for joining us today.

Josh Becker: Thanks for having me.

Bob Ambrogi: And this is kind of a double treat for me, because I just got to interview you recently at Legaltech in New York and hear a little bit about what you have been up to. But if you could just kind of start by telling our listeners who may not know what Lex Machina is, what it does?

Josh Becker: Sure. Well, your Moneyball analogy is very apt. It’s funny, we are seeing lots and lots of analytics deepening its use in sports and around this time, which is something we used to talk about when we were going out and fundraising pitch for Lex Machina. So it’s funny that you bring that up but —

Bob Ambrogi: Well, there’s a reason for that because I was listening to an interview last night with the new Red Sox General Manager and they asked him about that, they asked how are you going to use analytics? I don’t even remember what he said because I was thinking about this, but it’s all over the place as you say.

Josh Becker: Yeah, exactly, for sure. Yeah, Lex Machina helps lawyers with a couple of key use cases. We talk about win business, win cases. So it’s using data to demonstrate your expertise and to compete now on data, not just on kind of reputation or relationships maybe from the past, but to compete on data to showcase your expertise.

And secondly, after winning business is winning cases. So using data to determine the best strategy in front of a judge, to size up your opposition, to understand how long it’s going to take to do your budgeting, all that part of winning cases. So it’s really about mining lots and lots of data. We have I think over a million cases in the database now across a number of different verticals, as we will discuss I am sure, and combining that data and finding patterns that can be helpful to attorneys, patterns of behavior, again, about opposition, about judges, about timing, about any number of things that can help lawyers do their job better.

Bob Ambrogi: I want to talk about all of that as you say, but I wonder if I could ask you to just step backward a little bit and tell us a little bit about how Lex Machina got started. This came out of Stanford Law School and its Computer Science Department as well, right?

Josh Becker: Yeah. Well, it was a little bit unique, having been in Silicon Valley for a long time, being on the venture capital side as well as on the entrepreneur side, I say very often you have cool technology that’s looking for some problem to solve, whereas here was very different.

You actually had big companies, Microsoft, Apple, Intel, Genentech, Cisco, a bunch of others who were starting to get fed up with the rise of patent lawsuits, at the very beginning of that rise that we saw in 2006-2007, and turned to Stanford and said there’s got to be better data out there. Tell me more about who this judge I am in front of is, what’s their pattern of behavior, who is this party that’s suing me, what should my strategy be?


And they collectively, along with a couple of law firms, actually gave $3 million to Stanford to build essentially the database of patent litigation, and they soon realized $3 million was a lot for an academic project, but really a drop in the bucket for what they were trying to achieve, because standard machine learning and natural language processing techniques didn’t work.

But because with Stanford they are able to enlist Andrew Ng, who is now one of the top machine learning guys in the world and Chris Manning, who then at the time was one of the top natural language processing guys in the world, and get their expertise eventually; it took a lot to get them on board, but get their expertise devoted to this problem and developing a data classification system that works for law, right, that can weave through and understand the legal language that we can extract out the key information, because really this is mostly unstructured data.

It’s literally mining through lawsuits themselves and answers and complaints and this and that to kind of figure out, okay, when was this motion filed, when did the ruling happen, what happened, who was involved, all of those pieces of information.

Bob Ambrogi: And that actually started I think way back in 2006, but then it went private a couple of years later, and I think if I have it right, you joined in 2011 as CEO.

Josh Becker: In 2010 — really it got going probably 2007, it was spun out in 2010, and then I took over in mid-2011, yeah.

Bob Ambrogi: And you came from an interesting background yourself and not really a background in legal technology, but certainly a background as an entrepreneur. Tell us a little bit about your background before you came to Lex Machina.

Josh Becker: Sure. I did do a JD/MBA but I did not practice. I graduated in 1999 from Stanford and at that time, if you recall back, it was the dot-com mayhem and there was so much going on, and I had already been involved with one startup that actually we took public in 1998; it was called, which is actually not a gambling site; I wish it was sometimes, but it was a tech job board essentially is really what it was.

So we were building websites and then we wanted to become more of a technology company, so there was content for techies, and then ultimately, actually bought this small job board from and that really became the company.

So I had background on that and I had worked on another company of my own that had, let’s call it moderate success, it lasted for a while and then died. So I had been pitching VC. So I was interested in the VC side of things and so I got to eventually go work for a venture capital firm and then have my own venture capital firm.

So I have got an interesting perspective. That helps me a lot with the Legal Tech Accelerator that I really started for Lexis and helped run the Legal Tech Accelerator because I have experience both on the venture side and everything from angel side, because I also helped start the Stanford Angels & Entrepreneurs, so everything from the angel side and the venture side, to the entrepreneur side as well.

So when I came across Lex Machina through the Stanford Angels & Entrepreneurs, I thought it was a really cool idea and then they asked me to come on board as the Interim CEO and that was about seven years ago when I started that process.

Bob Ambrogi: So what did you think was cool about it, what attracted you to it?

Josh Becker: Well, I thought it was really this fundamental concept that you laid out, it was the Moneyball notion. It was applying analytics to law. Many people thought we were crazy. They said, wait, you are going to try to sell data to lawyers, okay, that doesn’t seem like a great business idea.

And we knew it would take a while, but we had a real fervent belief in the mission really in bringing openness and transparency to the law and a feeling that this would work out over time. Like we felt like eventually all firms were going to have to be adopting this kind of technology and we knew it was just going to be able to figure out how can we get from here to there, right, crossing the chasm as they say or the valley of death, saying it in other ways.

You can get a few initial successes, a few early adopters like Jim Yoon at Wilson Sonsini, who is fantastic, but again, how do we get enough customers to get it to the point where we can really raise some money, because this is all expensive to do. This is not cheap. You need data. You need — a PACER is obviously not free and if you are trying to download every commercial case for the last 10 years, that costs a lot of money.

Let alone the team, bringing on the engineering team and also hiring Karl Harris, who is really now running — actually Karl is really running the day-to-day operations now at Lex Machina. I am starting to step back and take a broader view of legal analytics and do some thinking and writing about legal analytics as a whole and where is it going. So Karl is really stepping up. But to bring Karl and hire a whole team, so yeah, not cheap to do. I think we loved the original idea, and we knew that eventually it would prevail, but the question mark was, could we eke along and raise enough money as we went to get there.

Bob Ambrogi: So that probably is what brought you to 2015 when you were acquired by LexisNexis. What was it about LexisNexis that you thought then, as CEO, would be a good fit for Lex Machina?


Josh Becker: Yeah. Well, first of all, there’s a quality of the data, the quality and quantity of data. For us, we wanted to expand. As you know and I think you maybe alluded to, we really started out just with patent litigation and then we expanded on our own to the rest of IP. But really started out just with patent and to expand to other areas we needed access to lots and lots of lawsuits, millions, literally millions of lawsuits, which as we said, cost a lot of money on PACER.

So with Lexis, they have that database, I think it’s 150 times the size of Wikipedia they say, and it’s doubling every three years, and just 13 million documents daily. So I think it seems pretty clear that to me, and I haven’t seen it disputed, Lexis I think has the largest quantity of data, and I think the highest quality as well, and that was really important to us.

And then all of the rest was just making sure that they really understood analytics, that they were committed to analytics and that they were ready to execute on analytics, and we became convinced that they were, and that’s why ultimately it was a good fit.

Bob Ambrogi: I am not sure that everybody listening to this program necessarily even understands what analytics means as a concept, as a tool. Could you kind of give us the expert dummies version of what it is that you are doing?

Josh Becker: Sure. Well, analytics is a broad term as you say, and the other thing that gets thrown around of course now is AI; AI for law, AI, and what does AI mean, right?

And so when I think about AI is really the two disciplines maybe of, if that’s the right word that I mentioned, machine learning and natural language processing.

Machine learning is really the ability to mine through lots and lots of data and identify patterns. That’s what machine learning is at the end of the day.

So we are not telling you, Mr. or Mrs. attorney, this is exactly how this case is going to turn out, and you certainly have the benefit of your experience, but no human can recall or read through the last thousand cases in the District of Delaware of this type and recall how many times the plaintiff was ruled for, how many times the defendant was ruled for, and then be able to mentally slice down and look at just ANDA cases, which is a subset of patent cases, or to look at a certain kind of damages or a certain timeframe.
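The tally described here, counting rulings for each side across a set of cases and then slicing down to a narrower subset, can be sketched in a few lines of Python. This is only an illustration: the case records, the field names, the "subtype_a" label, and the `win_counts` helper are all hypothetical and are not taken from Lex Machina's product or data.

```python
from collections import Counter

# Hypothetical case records; every field and value is invented for illustration.
CASES = [
    {"district": "D. Del.", "case_type": "patent", "subtype": "subtype_a", "winner": "plaintiff"},
    {"district": "D. Del.", "case_type": "patent", "subtype": "subtype_a", "winner": "defendant"},
    {"district": "D. Del.", "case_type": "patent", "subtype": None, "winner": "plaintiff"},
    {"district": "E.D. Tex.", "case_type": "patent", "subtype": None, "winner": "plaintiff"},
    {"district": "D. Del.", "case_type": "patent", "subtype": "subtype_a", "winner": "plaintiff"},
]


def win_counts(cases, **filters):
    """Count winners among the cases whose fields match every given filter."""
    matching = [c for c in cases if all(c.get(k) == v for k, v in filters.items())]
    return Counter(c["winner"] for c in matching)


# All patent cases in one district, then the narrower subset within it.
all_delaware = win_counts(CASES, district="D. Del.", case_type="patent")
subset_only = win_counts(CASES, district="D. Del.", subtype="subtype_a")
print(all_delaware)
print(subset_only)
```

Adding another keyword argument (a damages type or a date range stored as a field, for example) narrows the slice further without changing the helper.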

So machine learning is really that ability to identify those patterns, and again, natural language processing, the other thing I mentioned is the ability to parse language, to recognize, okay, well, if this term is mentioned in these number of words, it most likely means that this is a time when the motion was filed or whatever the case is. So that’s a way to think about it. Maybe from a technology standpoint, hopefully that’s helpful.
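The proximity idea mentioned here, that two terms appearing within a certain number of words of each other probably signal a particular event, can be sketched as a simple rule. This is a hand-written stand-in, not a trained model and not the system described in the interview; the `mentions_within` helper, the five-word window, and the sample docket entries are assumptions made for illustration.

```python
import re


def mentions_within(text, term_a, term_b, max_words=5):
    """Return True if term_a and term_b occur within max_words tokens of each other."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    pos_a = [i for i, t in enumerate(tokens) if t == term_a]
    pos_b = [i for i, t in enumerate(tokens) if t == term_b]
    return any(abs(i - j) <= max_words for i in pos_a for j in pos_b)


# Invented docket-entry text, used only to exercise the rule.
ENTRIES = [
    "MOTION for Summary Judgment filed by Defendant Acme Corp.",
    "ORDER granting extension of time to answer the complaint.",
]

motion_filed = [mentions_within(e, "motion", "filed") for e in ENTRIES]
print(motion_filed)  # [True, False]
```

A production system would more likely learn such signals from labeled examples than hard-code them, but the rule shows the kind of pattern being recognized.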

And analytics is just taking that data and interpreting it in various ways. So in the case of Lex Machina, as I mentioned, it’s the two primary use cases I mentioned; get the case, win the case.

But there are other ones as well. I have been out now in sort of my new — sort of stepping back and thinking deeply about legal analytics. I have been talking to a lot of managing partners and going into law firms, meeting with senior folks. I had a couple of meetings yesterday, and it’s really talking about these kind of use cases. But one that comes up is lateral hiring.

It turns out our data is very helpful if you are a senior executive at a law firm and you are trying to do lateral hiring, because now you are not relying just on recruiter to even say like — which I think sometimes is the best they can do is, okay, here is the firms the person worked at. We can actually say, here’s all the cases they have done. Here’s the client they have had. Here’s the work they have done for these clients.

And yes, they have worked for Google, but not in the last 18 months, and in the last 18 months Google has chosen these other five law firms or attorneys. And so there’s lots of data that’s useful for that use case as well.

So I would say when we think about analytics, it can be a pretty broad term, but the way I think of it is really by those use cases. Again, in the case of Lex Machina, it’s the Moneyball. It’s not the substantive law; it’s really data about the players, about the judges, about the law firms, about the attorneys.

In the case of something like Ravel Law, which I know you know, they actually are going to the substantive law. So they are doing analytics on their published opinions themselves. So they are really parsing the language very deeply with lots of linguistic analysis on board to determine what’s the best language to use in front of this judge, how can you kind of speak in a way this judge can hear.

And then there’s also analytics in other areas, like in contract analytics and things like that. So analytics itself is a broad term, so I try to — and hopefully that’s helpful, I try to think about it in terms of the use cases.

Bob Ambrogi: Yeah, I know, it’s interesting. I mean you said earlier in this interview that people thought you were crazy to try and sell data to lawyers, but really the way I think of it is you are not really — you are not selling data, I mean the data was already there, the PACER database has been sitting there for, what, a couple of decades now, what you are really selling is sort of intelligence or insight into that data in a way that it was never available before.


Josh Becker: Yeah. I think people would take advantage of PACER, but they would do it to download one case, or they would say — you might ask your paralegal, hey, go to PACER and download the last 20 cases in front of this judge, and then you just sit there and read through.

And that’s the way that we actually thought about it in Lex Machina and we came up with a product called Case List Analyzer. It’s like, okay, we realized people were just printing out case lists and reading through, so we said, okay, we will automatically let — you pick a case list, you form your own case list; cases involving Samsung as a plaintiff in the last two-and-a-half years in the Eastern District of Texas with this law firm, whatever it is, and then we will automatically parse through those cases you identified and generate some insights for you.

Bob Ambrogi: Josh, stay with us, we are going to take a short break right now. We will be back in just a few moments to continue our discussion of analytics in law.


Bob Ambrogi: Nowadays there are as many definitions of artificial intelligence as there are companies trying to pitch AI solutions. So how do law firms know how and when to incorporate artificial intelligence? More and more law firms are starting to leverage AI across a broad range of applications, legal research, litigation strategy, e-discovery, self-help, online legal services, dispute resolution models, and contract review and analysis.

Visit to see how Thomson Reuters is helping legal professionals like you understand the impact and opportunities of this revolutionary technology and how to use it to deliver your best work faster and more accurately than ever.


Bob Ambrogi: Welcome back to Law Technology Now. This is Bob Ambrogi and I am speaking with Josh Becker, the CEO of Lex Machina. And the last time I saw you was in New York at Legaltech and you were about to go off and deliver a keynote address on the topic of data-driven law practice. What do you mean by that? What are you talking about when you talk about data-driven law?

Josh Becker: Good question. I mean that’s — to us I think that’s sort of the key phrase that we think about. Question is, are we replacing lawyers? Are lawyers going to be replaced? Are they just fundamentally changing law in some way? And we say, no, you will always have legal research and reasoning. What we are trying to do is help people make data-driven decisions.

So if you are a lawyer practicing for many years, you may have some intuition about the way to go in a certain case and now you can test that intuition versus the data and say, okay, I think or I have heard this judge kind of behaves the following way, let me go through the data, and you may end up still making the same decision you were going to make before, or you might make a completely different decision, but now it’s a data-driven decision.

So that’s the way that we think about it, it’s sort of keying up those data-driven insights. And I think you can say Ravel is the same way, keying up those data-driven insights to help make better decisions. And I think lawyers that are now on board are sort of deeply engaging with analytics or are thinking about that. Just thinking about your conversation with Jim Yoon the other day is in the sense of engaging with a client around the data to help make the decision collectively about which tack to take in a certain case.

Bob Ambrogi: And I should say that Ravel; you have mentioned Ravel a couple of times is a legal research platform that was also acquired by LexisNexis.

So you have talked a couple of times about the use case. There’s the litigation use case that you were just kind of alluding to in terms of knowing how long a particular judge takes to handle a particular kind of case. What else? What are some of the other litigation scenarios for how you would use analytics?

Josh Becker: Yeah, early case assessment. So if you are a corporate customer or a law firm — think of it from the in-house side, a lawsuit comes in, you are trying to assess, okay, how serious is this, how long is it likely to take, how much is it likely to cost, and which attorney should I use for this?

And data helps — our data particularly helps with each piece of that, and hopefully that’s clear, but each piece of that decision, which attorney — maybe I have a stable of seven attorneys or seven law firms that I tend to use, which one has the most experience in front of this judge and in this kind of case, or maybe there’s some boutique out there that I can find that has a lot of great experience in this kind of matter in front of this judge. The timing analytics and everything we talked about.

And same thing, the law firms now are thinking about it much the same way. The matter comes in, they are trying to think can I make money on this, maybe it’s a fixed fee, maybe it’s not, but they are trying to think about how do I staff this case, how do I bid on it, how do I present our firm and myself in the best light to win this case, and then together work out and really understanding the judge and do we even want to be in this district. Maybe we want to transfer, and that’s another great use case for data; the venue shopping use case.


Bob Ambrogi: Yeah. And since you have been acquired by LexisNexis you have been branching out. You started as you said earlier in intellectual property. If I have got my tally right here, you have now added securities, antitrust, commercial litigation, employment, bankruptcy and product liability litigation. Are there others that I have missed there? And I know you are continuing to build out into other areas and there may have already been others that I have lost track of.

Josh Becker: In December we launched the Chancery Court of Delaware. We actually I think have a webcast today about that I think. I was talking to him last night and he is like, man, we have so many amazing insights from the data here, like people have just been flying blind for so long. But major, major — a lot of money at stake, big cases in the Chancery Court of Delaware and so we just launched that court in December and then we will have more this year. But yeah, I think you got the list pretty well I think.

Bob Ambrogi: So the other use case I just want to talk a little bit more about that you mentioned earlier is the marketing use case. Could you just expand on that a little bit, explain how analytics, how this docket analytics, data analytics can be used for a law firm in marketing?

Josh Becker: Yeah, absolutely. ALM did a survey and they asked people who have used legal analytics, is this helpful in demonstrating expertise, and actually 100% of the respondents said, yes, it’s helpful for demonstrating expertise.

And again, if you just think about it that way, so they are responding to a case, they know the company, they know what district the lawsuit was filed in, they may know the judge, and now we have a chance to use data to say — first of all, in the past people might have emailed around their firm, hey, who knows this judge, who has got experience here in front of this judge, in this firm?

I was meeting with an attorney yesterday who was saying when he started practicing they didn’t even have that. They didn’t have an intranet, so like they would literally just photocopy an article about the judge and that was what they had to go on.

And so now you can use it to slice and dice the data to say hey, great, we don’t think this, we know this is this judge’s tendencies, this is statistically how they have handled these kinds of cases and here’s what we believe the strategy should be based on that, and here’s our experience that proves it.

And not only have we appeared in front of that judge, you could even go deeper sometimes and say, hey, do you know that we have got a 30% faster time to trial in front of this judge in these kinds of cases than other firms that you tend to work with? Do you know that we have a better success rate in front of this judge than other firms you tend to work with?
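A claim such as "30% faster time to trial" comes down to comparing a typical duration for one firm against another. A minimal sketch follows, assuming the median is the measure of "typical" (the interview does not say which statistic is used); the firm names, day counts, and helper functions are hypothetical.

```python
from statistics import median

# Hypothetical (firm, days from filing to trial) pairs before a single judge.
RECORDS = [
    ("Firm A", 420), ("Firm A", 350), ("Firm A", 490),
    ("Firm B", 600), ("Firm B", 560), ("Firm B", 640),
]


def median_days_by_firm(records):
    """Group durations by firm and return each firm's median time to trial."""
    by_firm = {}
    for firm, days in records:
        by_firm.setdefault(firm, []).append(days)
    return {firm: median(days) for firm, days in by_firm.items()}


def percent_faster(ours, theirs):
    """Percentage by which `ours` is shorter than `theirs`."""
    return 100 * (theirs - ours) / theirs


medians = median_days_by_firm(RECORDS)
speedup = percent_faster(medians["Firm A"], medians["Firm B"])
print(medians)             # {'Firm A': 420, 'Firm B': 600}
print(round(speedup, 1))   # 30.0
```

With only a handful of cases per firm, a figure like this is fragile, so any real comparison would need enough cases in the slice to be meaningful.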

So you are now using data, slicing and dicing data to make you and your firm look as good as possible and build out a strategy and hopefully win the business.

Bob Ambrogi: So how does this change the practice of law going forward? I mean where are we going to be — what’s going to be different about how lawyers are practicing over the next five or ten years because of using analytics in their practices?

Josh Becker: Yeah. Well, I think if you look at this, there has been some good analysis done; McKinsey did a study, and then there was Levy out of MIT and Remus out of UNC had another study, and we have a pie chart that I have been showing that looks at what percent of time that lawyer’s activity is on different tasks. And a lot of it, 44% I think according to this is actually on advising strategy, that kind of thing. That’s obviously not going to get replaced. That’s fundamental to what lawyers do.

And about 26% of it is research and analysis, but they are actually — but sometimes there might even be more time spent on that now. There is something called the rebound effect that we talked about on our panel where they show when they made light bulbs more efficient, people actually used more electricity. They would leave their lights on longer. And so because clients have an insatiable demand for information, they are going to — there is more and more analysis that’s going to be done.

So what I think is this is just going to be part of law, like just as analytics is now part of baseball. Now every team has had to adopt and they find other ways to compete, and that competition is partly deeper analytics.

So I was talking to one lawyer, his client has a technology that the Houston Astros use, which is you attach to your bat and it determines bat speed and all these things and all kinds of data and additional data real time on the players. So you will continue to deepen in that and then just compete on your expertise, your knowledge, and your instincts and judgment really.

So I think it will change to some extent where I think people who adopt data will have an advantage earlier and will flourish, but I think it sort of has just become something like water. It’s just part of the practice of law going forward. It’s like of course we would check the data and it just becomes part of the tools in the toolbox that you use to practice.

Bob Ambrogi: I know. I think we are up against time limit here and I don’t want to hold you, but I want to ask you, if you could just speak real quickly to the question of how does a law firm kind of go about budgeting for, picking an analytics tool, how do they, as sort of consumers of analytics, how do they get started down this road? What do they need to know about buying analytics and budgeting for analytics?


Josh Becker: Yeah, I think it’s an interesting question. I know when we started people said, we don’t have the budget for legal analytics, like where is the money going to come from. On the other hand, profits per partner at most big firms are doing quite all right, so there is money there.

Bob Ambrogi: So take it out of the partners’ pockets you are saying?

Josh Becker: It’s going to increase. You invest a little, you will make more, so it will increase profits per partner in the long run, but the point is that the money is there somewhere.

But yeah, I mean it’s good to be thoughtful and I think you have to feel like you have — you can’t just say, oh, I bought an analytic solution, you have got to think about what the use cases are. So you want data that you can use in marketing, again, to get the case, to win business for your firm. You want data that’s going to help you then win the case and understand the judge better and understand and do that opinion analysis as well, the stuff I think that Ravel does so well.

You are also going to want to look at analytics to mine your internal data. Hey, we have done a thousand licensing agreements for stage two pharma companies, let’s mine our data to figure out what market is for this kind of clause.

So I think you do have to look at it in a few different categories, but it’s pretty easy to get going. I mean, again, for Lex Machina, it’s just a — it’s an interface. It’s sort of data-as-a-service. That’s why we say it’s a partner tool, it’s not something — would you outsource a Google search? No. You wouldn’t say, hey, go to the Google and look up this for me. You would just do it.

And I think that’s what people should demand and I think that’s what hopefully this generation of analytics tools can provide is something that is an easy interface and an easy thing for people to adopt into their workflow, because at this point it’s really an adoption question and how do firms get deep penetration of these tools. And I think obviously the easier they are to use and the more that these firms allow themselves to be integrated into the workflow, it’s going to be easier and easier for folks to adopt.

Bob Ambrogi: We have been talking to Josh Becker, CEO of Lex Machina. That will do it for this episode of Law Technology Now. On behalf of everybody at the Legal Talk Network, thanks for listening.


Outro: If you would like more information about what you have heard today, please visit, subscribe via iTunes and RSS, find us on Twitter and Facebook or download our free Legal Talk Network App in Google Play and iTunes.

The views expressed by the participants of this program are their own and do not represent the views of nor are they endorsed by Legal Talk Network, its officers, directors, employees, agents, representatives, shareholders, and subsidiaries. None of the content should be considered legal advice. As always, consult a lawyer.



Episode Details
Published: April 19, 2018
Podcast: Law Technology Now
Category: Legal Technology
Law Technology Now

Law Technology Now features key players in the legal technology community discussing the top trends and developments in the legal technology world.
