Seven Use Cases Where AI can be a Hero to Digital Forensics

Featured Guest

Shashi Angadi

Shashi Angadi is CTO at Exterro and one of its original co-founders. Shashi focuses on the technology...

Your Hosts

Sharon D. Nelson

Sharon D. Nelson, Esq. is president of the digital forensics, managed information technology and cybersecurity firm Sensei...

John W. Simek

John W. Simek is vice president of the digital forensics, managed information technology and cybersecurity firm Sensei...

This Episode

Published:	August 24, 2023
Podcast:	Digital Detectives
Category:	Legal Technology & Data Security

Episode Notes

AI is capable of doing so many things in the legal tech world, not least of which are its uses in digital forensics. Sharon Nelson and John Simek welcome Shashi Angadi to discuss AI’s current applications in cybersecurity—including threat detection, analysis, evidence gathering, and more. Shashi offers real-world examples of AI at work and shares his thoughts on what we might expect in the future of generative AI technologies.

Shashi Angadi is CTO at Exterro and one of its original co-founders.

Special thanks to our sponsor PInow.

Transcript

[Music]

Intro: Welcome to Digital Detectives. Reports from the battlefront. We’ll discuss computer forensics, electronic discovery and information security issues and what’s really happening in the trenches, not theory, but practical information that you can use in your law practice right here on the Legal Talk Network.

Sharon D. Nelson: Welcome to the 152nd edition of Digital Detectives. We’re glad to have you with us. I’m Sharon Nelson, President of Sensei Enterprises, a digital forensics, managed cybersecurity, and managed information technology firm in Fairfax, Virginia.

John W. Simek: And I’m John Simek, Vice President of Sensei Enterprises. Today on Digital Detectives, our topic is Seven Use Cases Where AI can be a Hero to Digital Forensics. Our guest today is Shashi Angadi, the CTO at Exterro and one of the original co-founders of Exterro. Shashi focuses on the technology direction, vision, and innovation at Exterro to sustain the challenges of the legal GRC industry. Since the beginning of Exterro, Shashi has been leading the technology innovation engine at Exterro with disruptive innovations for the Legal GRC industry using AI and other advanced technologies. Shashi has been instrumental in transforming Exterro from a 4-member team to an industry leader in the Legal GRC space.

Welcome to the podcast today, Shashi.

Shashi Angadi: John and Sharon, thanks for having me on this podcast. It’s a pleasure to be here.

John W. Simek: Shashi, why don’t we get started here, and why don’t you provide our listeners with a little background about yourself and your primary duties as the CTO of Exterro?

Shashi Angadi: So my name is Shashi Angadi, and I am the Chief Technology Officer at Exterro. I’m one of the original co-founders of the company. I have an undergraduate computer science degree from Bangalore University and an MBA from Rice. Before starting Exterro, I worked in large corporations like US Bank and consulted for companies like Nike, Xerox, and AutoZone. At Exterro, I run the technology and strategy for the organization. That mainly includes product architecture, engineering, infrastructure services, security, and compliance. I also look at new opportunities and innovative solutions to transform the business and keep it ahead of the competition. I live in beautiful Sarasota, Florida with my wife and two kids.

Sharon D. Nelson: When did Exterro begin using Artificial Intelligence and what were its functions initially?

Shashi Angadi: It was when we started looking at the data, when Exterro started moving into providing solutions in the Legal GRC and data arena. A couple of things came to our mind, right.

You can consider it a motto for how we look at problems and how we provide solutions to our customers. One is do more with less, and the second is get to the facts of the case faster, cheaper, easier. Both of these are related to each other and to our topic of discussion today, AI use cases. Why I want to mention this is that over the past few years a data explosion has happened. So think about it, right. A few years back, when you went and purchased your iPhone or your computer, people were talking in gigabytes, right. You would have 8 gigabytes, or at most a 64-gigabyte phone.

But today, when you look at it, the phones are about 500-plus gigabytes or even 2 terabytes, and the computers are the same. Any other device goes the same way: they can carry a lot of data and they have a lot of processing power. But if you look at an investigation department, whether in forensics, IT security and infrastructure, or e-discovery itself, the budgets have not grown 100, 200, 300 percent, and the number of personnel who deal with this day to day has not increased either, right.

So how do you make sure that you can do your work with the limited budget you have? That’s when we started realizing that we should use the computing power, the processing power, that exists, and look at AI for some of the solutions, so that however much the data grows, we would be able to process it quickly and at scale, right. And that’s when we said we are going to start using AI and provide those solutions for our customers.

And initially, when we started, the basic, fundamental things we brought in were document classification and clustering. At that time those were the simple algorithms that were available, and they gave the investigators and the people who work in e-discovery, forensics, or IT security a certain place to start, right.

(00:05:00)

And then gradually, on the forensic side, we built something called Image Recognition, where you can literally go back and classify images and also search them. In layman’s terms: get me all the images which have guns in them, or get me all the images of cars with license plates. Those are some of the quick wins we started with, but again, we have kept innovating and getting to a lot of other use cases using AI.

John W. Simek: Let’s talk about the six use cases where AI is a hero to digital forensics. So why don’t we start with automated log analysis?

Shashi Angadi: As an engineer, as a CTO, and as someone working in the security area as well, I can say log analysis is the lifeline of any IT infrastructure or security team, and of engineering teams as well.

Today, you have application logs, you have system logs across different operating systems, you have firewalls, you have networks, you have the logs that the computers themselves generate. So many logs are being generated that you cannot go back and look at them manually.

Back in the days of client-server applications and fat-client applications, that was all fine. But now we are talking about multiple devices and heterogeneous operating systems, and then the cloud and other data sources. When you look at the whole thing, it’s going to be impossible for any human being to go through those logs to find problems.

So where AI can come in is with something like anomaly detection, right. You can have models that you’ve built, and with them you can look at the patterns in what these logs are generating and see whether there were potential security breaches, system failures, or critical issues that require immediate attention, right.

And then AI can also be used for something like log imputation, right. Sometimes there could be missing entries. It all depends on how much information the developer chose to log, right. So some entries could be missing, but you still have to go back and look at what really went wrong, and AI can help you fill in some of those values, right.

And there is also something like realistic log data generation, where generative models can help build synthetic log files for training your own algorithms. And then there is contextual analysis, right. When you want to go back and understand what really went wrong, whether it’s in your operating system or whether there was a threat or cyber incident, you need a timeline of all the logs showing when each one was generated, right. The timeline gives you exactly what happened, when it happened, and where it happened.

Some of these things are tough for us to generate manually, and that’s where AI really plays a crucial role, with the things that I described earlier. Overall, when you apply AI to log analysis, it can improve the reliability of your systems for your end users. It can help maintain your security posture, especially in cloud computing environments, to prevent data breaches. And overall, I think this log analysis is going to be very useful for important business decision-making.
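To make the anomaly detection idea above concrete, here is a minimal Python sketch. It is not Exterro’s method and not a trained model: it assumes a simple stand-in technique in which digits are masked so that similar log lines collapse into one template, and any new line whose template was never seen during a known-normal period gets flagged. The function names and the sample log lines are hypothetical.

```python
import re
from collections import Counter


def to_template(line: str) -> str:
    """Replace runs of digits with a placeholder so similar lines share one template."""
    return re.sub(r"\d+", "<N>", line)


def build_baseline(lines):
    """Count how often each template appears in logs from a known-normal period."""
    return Counter(to_template(line) for line in lines)


def flag_unusual(lines, baseline, min_seen=1):
    """Return lines whose template appeared fewer than min_seen times in the baseline."""
    return [line for line in lines if baseline[to_template(line)] < min_seen]


# Hypothetical sample data, invented for illustration only.
normal_logs = [
    "user 12 logged in",
    "user 40 logged in",
    "disk usage 70 percent",
]
new_logs = [
    "user 99 logged in",
    "failed password for root from 10.0.0.5",
]

baseline = build_baseline(normal_logs)
for suspicious in flag_unusual(new_logs, baseline):
    print("needs attention:", suspicious)
```

A production system would learn far richer patterns than digit masking, but the shape is the same: learn what normal looks like, then surface only the lines a human actually needs to read.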

Sharon D. Nelson: It sounds fascinating. One of the biggest things we have seen, though maybe not the only thing, is that AI began to become part of the malware scene. It’s amazing how fast that happened. How are you operating with respect to malware detection? What are you doing?

Shashi Angadi: From a malware detection standpoint, the legacy systems would be looking more at signature-based detection, right. So you’re looking at a hash, and you can check it against some sort of database to see whether it’s malware. You would be looking at checksumming and application allowlisting, some of those passive technologies.

But look at what has happened recently. If you have looked at it, there was a project where engineers went into the dark web and literally found a malware generation toolkit. So hackers are becoming very sophisticated. They are running it like a big enterprise, where they sell these malware and ransomware toolkits, which people can use to generate malware and pay for through Bitcoin accounts and things like that. That means you can’t go with some sort of static method, right. You always need to look at dynamic methods, where you have algorithms analyzing based on the behavior of your network, right, and looking at whether there is malicious activity happening.

(00:10:00)

So with malware detection, what I call it is a good behavior versus bad behavior analysis, right. You cannot just do an analysis of the bad behavior, saying, oh look, there is something going on in the network and I’m going to go back and isolate this malware, or there is software which is behaving maliciously and I’m going to take that out. The newer models we are looking at try to understand what the good behavior is. Because, as I said, people can generate malware. Anybody can pick up that malware toolkit and start generating.

So instead of going after the bad behaviors and understanding them, let’s look at what the good behaviors of the system are. Because from an IT infrastructure and security standpoint, we know which systems are allowed and what our patterns are going to be in terms of network, access, and everything. Then let’s build a model, and anything that doesn’t fit into this AI model is probably a bad actor, right? Those are some of the things that we are looking at. So rather than doing what we used to call malware trapping, now we can effectively start going out and doing threat hunting. That’s what everybody is looking for. So those are some of the places where we are building these models and augmenting the cybersecurity use cases that our forensic toolkits provide.
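The contrast drawn above between static signatures and a model of good behavior can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any product: the “model” here is just a fixed set of allowed (process, port) pairs rather than anything learned, and every sample name, hash input, and port number is hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def signature_check(data: bytes, known_bad_hashes: set) -> bool:
    """Static method: True only if this exact file is already in the bad-hash database."""
    return sha256_hex(data) in known_bad_hashes


def outside_baseline(events, allowed_behavior: set):
    """Good-behavior method: return every (process, port) event not in the allowed set."""
    return [event for event in events if event not in allowed_behavior]


# Hypothetical data, invented for illustration only.
known_bad = {sha256_hex(b"old-malware-sample")}
allowed = {("backup_agent", 443), ("mail_client", 993)}
observed = [("backup_agent", 443), ("unknown_tool", 4444)]

# A freshly generated variant has a new hash, so the signature lookup misses it.
print(signature_check(b"freshly-generated-variant", known_bad))
# The baseline approach still surfaces the activity, because it is not known-good.
print(outside_baseline(observed, allowed))
```

The design point is the one made in the conversation: a bad-hash list only catches what has been seen before, while a definition of normal catches anything that falls outside it, at the cost of having to maintain that definition.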

John W. Simek: Before we move on to our next segment, let’s take a quick commercial break.

[Music]

Adriana Linares: Are you looking for a podcast that was created for new solos? Then join me, Adriana Linares each month on the New Solo Podcast. We talk to lawyers who have built their own successful practices and share their insights to help you grow yours. You can find New Solo on the Legal Talk Network or anywhere you get your podcasts.

[Music]

Jared Correia: They say the best things in life are free, which either means Legal Toolkit Podcast is pretty awesome or we’re totally committed to the wrong business model. You’ll just have to tune in to find out which it is. I’m Jared Correia, in each episode, I run the risk of making a total ass of myself so you can have a laugh, learn something new, and why not, maybe even improve your law practice. Stop believing podcasts can’t be both fun and helpful. Subscribe now to the Legal Toolkit. Go ahead, I’ll wait.

Sharon D. Nelson: Welcome back to Digital Detectives on the Legal Talk Network. Today, our topic is Seven Use Cases Where AI Can Be a Hero to Digital Forensics. Our guest today is Shashi Angadi, the CTO at Exterro and one of the original co-founders of Exterro. Shashi focuses on the technology direction, vision and innovation at Exterro to sustain the challenges of the legal GRC industry. Since the beginning of Exterro, Shashi has been leading the technology innovation engine at Exterro with disruptive innovations for the legal GRC industry using AI and other advanced technology. He has been instrumental in transforming Exterro from a four-member team to an industry leader in the legal GRC space.

John W. Simek: Well, Shashi, can you talk a little bit about how the artificial intelligence is applied to image and video analysis?

Shashi Angadi: One of the places we use AI, in some of the things that we are building into our product, is media identification and classification. What I mean by that is you can train a model to identify and categorize the types of multimedia files: images, audio recordings, video recordings. Within audio recordings, we would be looking at whether it’s a message left on a voicemail or audio from a phone conversation, and we will be able to identify some of these things. And within the videos themselves, based on language detection, we can go back and classify them by the different languages spoken, and this basically helps to organize the evidence, right?

When we look at evidence, as I said, there is such a data explosion. Whether it’s audio or video, think about the number of selfies, images, all the stuff on your phones. So classification makes it easier for the investigators to focus on relevant materials. They can get rid of the junk and just focus on what is more relevant. The second one would be duplicate and similar image detection. There are a lot of images which are similar. Again, as I said, we need to get to the facts of the case faster, cheaper, and easier. And you could have a lot of images that are duplicates. Some of them can come from your phone, some of them can come from your computers, but they are all the same images.

(00:15:00)

Some of them may simply be resized, right? So identifying those duplicate images and getting rid of them is all going to be done through some sort of AI algorithm. Then there is audio transcription and analysis, so speech recognition. We can transcribe audio recordings into text so people can search and analyze the spoken content, right? And key phrases: how can customers look for key phrases within the audio and video in an investigation? For some of those things too, there are algorithms available, and we are using them to do this analysis as well.
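The point about resized duplicates is worth a small illustration, because an exact file hash would treat a resized copy as a different file. The sketch below assumes a classic, non-AI technique called an average hash, written in plain Python on tiny grayscale grids; it is a stand-in chosen for clarity, not the algorithm any particular forensic tool uses. All function names and pixel values are hypothetical.

```python
def block_average(pixels, size):
    """Shrink a grayscale image (list of rows) to size x size by averaging blocks."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=4):
    """One bit per block: 1 if the block is brighter than the image's mean, else 0."""
    flat = [value for row in block_average(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming(hash_a, hash_b):
    """Number of differing bits; a small distance suggests near-duplicate images."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def upscale_2x(pixels):
    """Double each pixel in both directions to simulate a resized copy."""
    return [[p for p in row for _ in range(2)] for row in pixels for _ in range(2)]


# Hypothetical 8x8 images: dark left / bright right, and dark top / bright bottom.
original = [[10] * 4 + [200] * 4 for _ in range(8)]
resized = upscale_2x(original)
different = [[10] * 8 for _ in range(4)] + [[200] * 8 for _ in range(4)]

print(hamming(average_hash(original), average_hash(resized)))
print(hamming(average_hash(original), average_hash(different)))
```

Because the hash is computed on a shrunken version of the image, the original and its resized copy land on the same bits, while a genuinely different picture does not.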

Sharon D. Nelson: Shashi, how can you use AI in natural language processing to identify suspicious patterns? How does that happen?

Shashi Angadi: Multiple ways. When you talk about natural language processing, it’s basically analyzing text-based data: emails, chats, documents, and all that stuff, right? First there is text analysis and entity recognition, and what I mean by that is named entity recognition. Say that somewhere there is a person, right? Somebody might be calling me Shashi, somebody might be calling me Mr. Angadi, somebody might be calling me Shashi Angadi, or it might say the CTO. How do you identify that these are all pointing to the same person, right? Sometimes in the document it could even just say he or she. So you identify these entities, and you can also organize them into the places, events, and everything else involved in a case and tie them all together, right? That’s one of the use cases we are building using natural language processing. Then there is sentiment analysis, right? That is determining the emotional tone of the text in its context, right?

One example I can give: the word bomb can mean a lot of things, right? I can say the movie bombed at the box office. The word bomb is in that sentence, and what matters is the context in which it is used, right? Somebody can also say the movie theater was bombed. The context is completely different, right? It’s the same word. So when you’re analyzing a lot of this data, you cannot just go by keyword searches and all that stuff. You really need to look at the context in which the word is being used. Those are some of the things that NLP can provide.

And then again, language translation, right? There are so many multilingual communications happening, because this is a global world. People are communicating across the globe, and they could be speaking multiple languages. I myself can speak five languages. So when you’re looking at it, how do you understand what those messages say, right?

What is being talked about? With language translation, NLP can come to the aid and make the life of every investigator much easier, right? There is also something called text summarization, right? Let’s say you have a thousand-page document, or you have thousands of emails and chats. When you go back and look at it, what are these all talking about? Is this even relevant, right? AI algorithms can come up with a summary for you, so you can understand what is really happening in the case, right? Those are some of the things which are there, and there are multiple other use cases. But predominantly, these are the things we have been actively working on, and these are the things happening within the forensic industry.
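The “bomb” example from this answer can be reproduced in a few lines of Python to show why a plain keyword search is not enough. The sketch assumes a deliberately crude context check, a hand-written list of cue words, purely to show the gap; a real NLP system would use a trained language model instead. The cue list, function names, and sentences other than the two quoted in the conversation are hypothetical.

```python
import re

# Hypothetical cue words suggesting the figurative "flopped" sense of the word.
FIGURATIVE_CUES = {"box", "office", "reviews", "audience", "ratings"}


def keyword_hits(sentences, keyword):
    """Return every sentence containing a word that starts with the keyword."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\w*", re.IGNORECASE)
    return [sentence for sentence in sentences if pattern.search(sentence)]


def looks_figurative(sentence):
    """Toy context check: does the sentence share any word with the cue list?"""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return bool(words & FIGURATIVE_CUES)


sentences = [
    "The movie bombed at the box office.",
    "The movie theater was bombed.",
    "We watched a movie last night.",
]

hits = keyword_hits(sentences, "bomb")
for sentence in hits:
    label = "likely figurative" if looks_figurative(sentence) else "review urgently"
    print(f"{label}: {sentence}")
```

The keyword search returns both sentences with equal weight; only the context step separates the film review from the possible threat, which is the distinction being made above.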

John W. Simek: Can AI be used to identify other patterns? For example, there are those of us who have spent way too much time looking at Wireshark dumps. Well, I’m old enough that I remember Network General Sniffer dumps, too. But how can AI be used in network traffic analysis to prevent cyberattacks?

Shashi Angadi: One thing, as I spoke about earlier, is again anomaly detection, right? AI models can learn the normal patterns of network traffic, and they can detect anomalies that deviate from the expected behavior. When we run our systems, we know exactly, within the US, how they behave, where the data is coming from, where the transfers are occurring, and all that stuff. If suddenly you start seeing unusual spikes, odd traffic patterns, or suspicious activities, then potentially there could be a cyberattack, right? So some of those things can be done using AI. I’ll just give one more example. Of late, you may have heard about SIEM and SOAR solutions, right? Everybody wants to do something proactive. Our customers deploy a lot of our endpoints, FTK endpoints, on computers and everything, —

(00:20:05)

— and whenever they see some sort of anomaly happening on one of the devices, they have a sort of playbook with which they can go back and understand what’s really going on. If there is some unusual activity on a machine, they can take the machine off the network and start analyzing what was really going on, right? This can all be done through AI. The next one would be traffic classification, depending on the type of protocols, user behavior, and all that stuff. Obviously, people leverage it to do bandwidth optimization. There is also packet inspection, deep packet analysis. As I said, cyberattacks are becoming more sophisticated.

People can go and hide something within your packets. With AI you can go back and analyze whether this is the right thing or not, right? There’s so much data that human eyes cannot see it all, and even if you write it to a log file, you cannot analyze some of those things yourself, right? And also, when something bad happens, a cyber incident or a hack, you can do what I call traffic forensics. When you’re doing post-incident analysis, you can literally try to reconstruct some of this network traffic. What were the patterns leading up to this incident, so that you can come up with the root cause and impact, right? At the end of the day, there are tons of use cases for AI in keeping your organization safe, especially from cyberattacks, and network traffic analysis is one of the use cases where it’s going to be predominantly used.
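The “learn normal traffic, then flag unusual spikes” idea from this answer can be sketched with basic statistics. The Python below assumes a simple z-score test against a baseline window, which is a textbook stand-in rather than the learned models discussed in the conversation. The function name, the threshold of 3.0, and the traffic figures are all hypothetical.

```python
from statistics import mean, stdev


def find_spikes(baseline, observed, threshold=3.0):
    """Flag observed values more than `threshold` standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    spikes = []
    for index, value in enumerate(observed):
        deviation = abs(value - mu)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is unusual.
            is_spike = deviation > 0
        else:
            is_spike = deviation / sigma > threshold
        if is_spike:
            spikes.append((index, value))
    return spikes


# Hypothetical megabytes transferred per minute, invented for illustration.
baseline_mb = [100, 110, 95, 105, 90, 100]
observed_mb = [102, 98, 400, 101]

for minute, volume in find_spikes(baseline_mb, observed_mb):
    print(f"minute {minute}: {volume} MB is far outside the normal range")
```

In a SOAR-style playbook, a flagged minute like this would be the trigger for the next step described above, such as isolating the machine and beginning analysis.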

Sharon D. Nelson: You’ve sold me, Shashi, on the fact that AI is a vital component in gathering evidence. But to move to another example, can you use AI for forensic triage?

Shashi Angadi: Absolutely, right? It starts whenever a case comes in and you want to start collecting the data, right? With AI inside your network, which I already touched upon a little bit, if there are models in place, you can identify and potentially collect digital evidence proactively from various sources: computers, mobile devices, cloud storage, the network, right? It also helps in evidence prioritization, right? You can quickly sample the data and understand what this case is all about, right? So let’s just take some relevant pieces, and the investigators can at least get a good start, right? Rather than trying to collect everything and then figuring out what’s going on, or ending up still working out what else to collect, they can get a head start.

And again, when the model analyzes this data, it could come up and say, “Hey, look, we thought this person had one phone, but based on what I can see, this person has been talking on more than one device.” Maybe he’s using multiple devices. So AI is going to provide some of those answers, pointing out missing pieces of information that you can go back and collect, which you might otherwise never know about. And, obviously, there is media analysis. We talked about multimedia files and images, and there are also things like CSAM images or nudity, which people don’t want to see. You can flag them automatically. Obviously there is hash matching and data deduplication too, so you don’t have to do a lot of that by hand. There are tons of things, right?

So, as I said, the whole motto is to get to the facts of the case faster, cheaper, easier. AI is going to be one of the most important tools for triaging, right? If it provides a data timeline, reconstructing how things happened and when, it gives the investigators a head start.
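The timeline reconstruction mentioned here is, at its simplest, a merge of timestamped records from several devices into one ordered sequence. The Python sketch below shows only that mechanical core, with no AI involved, and assumes every source already reports ISO 8601 timestamps in the same time zone. The device names, timestamps, and event descriptions are hypothetical.

```python
from datetime import datetime


def build_timeline(sources):
    """Merge (timestamp, description) records from several sources into one sorted list."""
    events = []
    for source, records in sources.items():
        for timestamp, description in records:
            events.append((datetime.fromisoformat(timestamp), source, description))
    return sorted(events)


# Hypothetical evidence records, invented for illustration only.
evidence = {
    "laptop": [
        ("2023-08-01T09:15:00", "USB drive inserted"),
        ("2023-08-01T09:05:00", "user login"),
    ],
    "phone": [
        ("2023-08-01T09:10:00", "message sent"),
    ],
}

for when, source, description in build_timeline(evidence):
    print(when.isoformat(), source, description)
```

Real triage tools also have to reconcile clock drift and time zones across devices, which this sketch deliberately ignores.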

John W. Simek: Well, before we move on to our final segment, let’s take a quick commercial break.

[Music]

Jud Pierce: Workers Comp Matters is a podcast dedicated to exploring the laws, the landmark cases and the true stories that define our workers’ compensation system. I’m Jud Pierce and together with Alan Pierce, we host a different guest each month as we bring to life this diverse area of the law. Join us on Workers Comp Matters on the Legal Talk Network.

[Music]

Christopher T. Anderson: If you’re a lawyer running a solo or small firm and you’re looking for other lawyers to talk through issues you’re currently facing in your practice, join the Un-Billable Hour’s Community Roundtable, a free virtual event on the third Thursday of every month. Lawyers from all over the country come together and meet with me, Lawyer and Law Firm Management Consultant, Christopher T. Anderson, —

(00:25:00)

— to discuss best practices on topics such as marketing, client acquisition, hiring and firing, and time management. The conversation is free to join but requires a simple reservation. The link to RSVP can be found on the Un-Billable Hour page at legaltalknetwork.com. We’ll see you there.

Sharon D. Nelson: Welcome back to Digital Detectives on the Legal Talk Network. Today, our topic is Seven Use Cases Where AI Can Be a Hero to Digital Forensics. We’ve introduced Shashi before. So, I’m going to do the condensed version here. Our guest is Shashi Angadi, the CTO at Exterro and one of the original co-founders of the company. Shashi focuses on the technology, direction, vision and innovation at Exterro to sustain the challenges of the legal GRC industry.

John W. Simek: My final question, Shashi, is in your opinion, what effect is generative AI going to have on Digital Forensics over time? And I know that’s kind of a crystal ball thing, but let’s say the next, gosh, maybe we’d be rich if we could forecast the future, right? What about the next, let’s say six months to a year?

Shashi Angadi: Yeah. I mean, probably we will not get to anything quite like Minority Report. I don’t know if you have seen that movie. But certainly, what we can think about with generative AI is that it will be able to analyze patterns in historical data to predict crimes or other issues before they happen, right? Just think about what happened in the Paris riots, right? Some incident happened, and then the rioting started, right? How did it start, right? Probably it started off with some WhatsApp messages or something on social media, right?

And we know that similar things have happened within the US itself, right, before the elections, after the elections, whatever, right? So one of the key things is that when there are red flags, people would be able to look at them and say, “Hey, what is this going to impact, and what are the things we should do?” That gives a good edge to the law enforcement community to put things in place so that they can avoid the damage that could occur, right? That’s one of the things that we think will happen. Obviously, there is also crime analysis. You have all this historical data, historical cases. It can probably help law enforcement say, “Maybe this is similar to a certain case. Maybe you need to look at collecting a certain type of evidence, or you can take a certain path,” right?

Another thing that could happen is with audio. If the audio is not clear, how do you go back and enhance it to understand what was really being spoken, right? Sometimes you don’t get the full audio. So what might really have been said? Or say you are looking at images, right? Somebody has taken an image at a given instant, there is something in the background, and only a partial image is there. You can complete that image and get the context of what was really happening there, right? The same goes for video. You also have all this CCTV footage. Some of it could be garbled, and some of it may not be clearly visible. How do you enhance those things for use in the investigation? Those things will be there.

I think some of these things are great for the investigation. On the flip side, I know you’ve been hearing about deepfakes, where people can create a false narrative. They can generate a video or audio of some person, and people can believe that somebody said something, right, which may not be true. Those things will show up. I think the good actors will again have to build AI to beat the bad AI, right? It’s going to be interesting. Generative AI has shown a lot of great things, but there will also be a lot of bad things coming out of it. At the end of the day, I feel the good actors will prevail. It’s going to be a challenge, but I think it’s going to present a lot of opportunities as well.

Sharon D. Nelson: You’re right about the challenge and, certainly, we’re going to see it within the next couple of years. That’s for sure. John had asked you, Shashi, about the next six months and year. I mean, going further out, we’ve seen a lot of people say it’s only going to move a certain distance in the next six months to a year. But where do you see AI going with digital forensics after that?

Shashi Angadi: I think everything kind of becomes proactive, right? This is the time, right now, especially in the cybersecurity space. I think there are going to be a lot of improvements there —

(00:30:03)

— because the attacks are going to become more sophisticated and the proliferation of data is going to be pretty huge. Different mobile systems and other systems are going to come in, right, and you’re not just talking about computers and mobile phones now. You’re looking at smart devices, your iPhones and your Apple Watches, and you’re also looking at drones, IoT devices, and medical devices. There is a whole slew of things that are going to come in, so you need to be able to analyze that data as well. We are still only scratching the surface here, but I think AI is going to help analyze some of this data as well. And as I said, in cybersecurity there is threat hunting. People will become more proactive. Systems will show up that can effectively try to prevent these cyber incidents from happening. I think this has just started now, right? We can’t come in and say this is at a mature phase. This has just started, and I think it will go on for a few more years.

Sharon D. Nelson: Well, I have no doubt of that. We’ve so much enjoyed having you with us, and you certainly have hammered into our heads that things will be faster, easier, and cheaper. We got that line down, but my favorite line of today was that there’s so much human eyes cannot see. Last night John and I actually watched the movie I, Robot, and that movie sort of stands for that proposition, that there’s so much the human eyes cannot see. So I think you’ve taught us a lot about what might be coming in the future for AI. It is sort of remarkable, and it was wonderful to have a guest who is so very knowledgeable in all of this. So thank you so much for being with us today, Shashi.

Shashi Angadi: Thanks, Sharon, and thanks, John. I have been a fan of Digital Detectives. Whenever I go and look at knowledgeable articles and knowledgeable podcasts, I increase my knowledge every single time I find new things out there. So, awesome work from both of you, and thanks for having me, too.

John W. Simek: Well, that does it for this edition of Digital Detectives. And remember, you can subscribe to all the editions of this podcast at legaltalknetwork.com or on Apple Podcasts. And if you enjoyed our podcast, please rate us on Apple Podcasts.

Sharon D. Nelson: And you can find more about Sensei’s digital forensics, managed technology, and managed cybersecurity services at senseient.com. We’ll see you next time on Digital Detectives.

Outro: Thanks for listening to Digital Detectives on the Legal Talk Network. Check out some of our other podcasts on legaltalknetwork.com and in iTunes.

[Music]