Secure AI Labs

-
Video details
Anne Kim, Cofounder & CEO
Ryan Davis, Cofounder
Secure AI Labs
-
Interactive transcript
[MUSIC PLAYING]
RYAN DAVIS: So my name is Ryan Davis. I'm a co-founder of a company called SAIL. I founded SAIL with Anne here at MIT while studying at the MIT School of Management. My background's in bioengineering, and I lead a lot of business development in the company.
ANNE KIM: And I'm Anne, Ryan's co-founder. My background is in computer science and molecular biology here at MIT in both my undergraduate as well as graduate work.
And so the origin of this company is that throughout my undergrad, I did a lot of research in bioinformatics. But at the end of the day, we're not trying to solve a problem of making neural nets 10% faster and gaining speeds of 30 minutes. We're talking about a bottleneck of months of internal review boards when you're talking about hospital data or patient records or molecular data in a pharmaceutical company.
And so with that in mind, the real question and the real problem in health care was access to data. And so I focused my graduate work in doing privacy and security research with Sandy Pentland in the MIT Media Lab. And so we founded this technology about two years ago with Manolis Kellis in the computer science and AI group at MIT. And about a year ago, I met Ryan at that PITCH competition in order to really accelerate our business and growth.
So the technology has the end goal of being able to work on encrypted data and encrypted algorithms in use. And the two major technologies we rely on are federated learning, as well as secure enclaves. So federated learning protects the data by actually sending the algorithms to the data and keeping the data where it is and only extracting the results of the analysis from the data set in order to do an aggregate analysis. And so that's how you collect the data.
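The federated pattern described here can be sketched in a few lines. This is a minimal illustration, not SAIL's actual platform or API: a toy one-parameter model is sent to each data silo, trained locally, and only the updated weights are averaged centrally.

```python
# Minimal sketch of federated learning: the "algorithm" (a model update
# step) travels to each data silo; only aggregate results leave.
# All names and data here are illustrative, not SAIL's actual API.

def local_update(weights, data):
    """Run gradient steps on a silo's private data (toy linear model y = w*x)."""
    w = weights
    for x, y in data:
        pred = w * x
        grad = (pred - y) * x   # squared-error gradient
        w -= 0.01 * grad
    return w

def federated_round(global_w, silos):
    """Send the model to each silo; average the returned weights."""
    updates = [local_update(global_w, silo) for silo in silos]
    return sum(updates) / len(updates)

# Two hospitals hold y = 2x data locally; the raw records never move.
hospital_a = [(1.0, 2.0), (2.0, 4.0)]
hospital_b = [(3.0, 6.0), (4.0, 8.0)]

w = 0.0
for _ in range(200):
    w = federated_round(w, [hospital_a, hospital_b])
print(round(w, 2))  # converges toward the true slope, 2.0
```

The key property is visible in `federated_round`: only `updates` (model weights) cross silo boundaries, never `hospital_a` or `hospital_b` themselves.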
How do you protect the algorithms? By using enclaves, and these are actually hardware that is predeployed on all modern computers, laptops, servers, and even phones. And what this ensures is the integrity of the operating system in the environment where you're going to deploy these algorithms, where the data lives.
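The integrity guarantee enclaves provide can be illustrated with a toy "measurement" check. This is a simplified sketch of the attestation idea, not real SGX or TrustZone APIs: the data owner hashes the approved code and environment, and only releases data when a running environment reports the same measurement.

```python
import hashlib

# Illustrative sketch of enclave attestation (not real SGX/TrustZone APIs):
# an enclave's "measurement" is a hash of the code and environment it runs;
# a data owner releases data only if the measurement matches what was agreed.

def measure(code: bytes, environment: bytes) -> str:
    """Hash code plus runtime environment, like an enclave measurement."""
    return hashlib.sha256(code + b"|" + environment).hexdigest()

# The measurement both parties agreed on ahead of time (names made up).
APPROVED_MEASUREMENT = measure(b"aggregate_stats_v1", b"os=hardened-linux")

def release_data(code: bytes, environment: bytes) -> bool:
    """Data owner's gate: only the approved algorithm in the approved
    environment ever touches the data."""
    return measure(code, environment) == APPROVED_MEASUREMENT

print(release_data(b"aggregate_stats_v1", b"os=hardened-linux"))  # True
print(release_data(b"exfiltrate_records", b"os=hardened-linux"))  # False
```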
[MUSIC PLAYING]
RYAN DAVIS: One of the biggest problems with data being exchanged between companies is that it's a vital resource for their operations. It contains many private, as well as confidential types of information. This information has to be protected and maintained in a private and confidential manner, while still providing value to partners and collaborators. In order to share this information, eventually somebody is going to have to hand a hard drive over to IT or through legal, and this can be a long, cumbersome process. There needs to be a streamlined third-party intermediary to allow information to be used in a respectful, private, and confidential manner.
Sure, so data today is exchanged physically. It's aggregated in the data lakes. It's put in the data clouds. What people don't understand is that this information is all of a sudden exposed and is out of the purview and control of IT and of companies that need to be able to protect this information. What we provide is a solution that can allow companies to gain insights or train algorithms from this data without exposing it in a way that has to sacrifice privacy or confidentiality.
A lot of industries have very sensitive analyses they want to perform on data, and so data sharing all of a sudden means research sharing. And this is where the friction in collaboration has actually doubled. We find that sometimes, the sensitivity of the data actually pales in comparison to the sensitivity of the algorithms or the queries that companies want to run on each other's information.
Pharmaceutical biotech companies is a great example of this. They don't want to tell each other the patients they're going after, the drugs of interest, the potential gene variants of interest. And so being able to protect this code, this analysis is just as vital as being able to protect the privacy and confidentiality of the data that they're accessing.
ANNE KIM: Yeah, and this is baked in in not only the methodology of the code and what it's actually searching for, but also in even deep neural networks, you'll find that you can have a lot of hallmarks of the data sets you've trained on. And to extract these is actually quite trivial, just given a couple of examples and given the sort of access that you have to an algorithm as a collaborator.
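The leakage Anne describes can be demonstrated with a toy membership-inference attack. This is an illustrative sketch with made-up data, not any specific published attack: an overfit model is noticeably more confident on records it has seen, so a collaborator with query access can test whether a given record was in the training set.

```python
# Toy illustration of why trained models carry hallmarks of their
# training data: an overfit model is more confident on examples it has
# seen, so a collaborator with query access can test membership.

training_set = {"patient_a": 1, "patient_b": 0}  # made-up records

def overfit_model(record):
    """Returns (prediction, confidence); memorizes its training data."""
    if record in training_set:
        return training_set[record], 0.99   # near-certain on seen records
    return 0, 0.55                           # barely better than chance

def infer_membership(record, threshold=0.9):
    """Membership inference: high confidence suggests a training record."""
    _, confidence = overfit_model(record)
    return confidence > threshold

print(infer_membership("patient_a"))  # True  -> was in the training set
print(infer_membership("stranger"))   # False
```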
RYAN DAVIS: So a lot of the research that we started out with, and a lot of the reason that [INAUDIBLE] was able to patent this technology and help us co-found this company, is because he had to access information that exists in many different data silos that cannot be combined. Great examples of this are genomic data and phenotypic data, as well as behavioral or psychology information for any type of CNS disease.
CNS specifically is a very big interest for our company, as well as for many pharmaceutical companies, but they can't get the data necessary to be able to correctly diagnose, treat, and cure these diseases. However, if you can actually perform analysis in a way that keeps data private and confidential, but also within its data silos, then you start combining insights and discoveries from all this different information, so you can actually cure disease in a new way that wasn't previously possible.
ANNE KIM: Right. And so by using this accessibility to heterogeneous data sets, you're now able to leverage modalities like images plus diagnosis plus electronic health records plus ancestry in order to give a fuller picture of personalized medicine for people with CNS diseases.
ANNE KIM: So at Secure AI Labs, we're not just putzing around with academic work. We're doing some real things with some of the customers that we're talking to. So one of the examples is a pharmaceutical company here in the Cambridge area that is actually an international pharmaceutical company.
And so they have the problem of being able to access internal data that spans many different countries, some of which are in the European Union, which is under the purview of GDPR. And then meanwhile, they have obvious sites in the US, which is under HIPAA. So how are you able to federate different data sets across different countries within the same company?
The way that you do that is, you have to use our platform in order to be compliant with what the lawyers want you to do. And so we were able to do a federated learning analysis on microbiome data across different enclaves in our platform. And because these enclaves can sit anywhere, whether that's here or in Switzerland, you are able to stay compliant with these regulations.
For example, in GDPR, you can't move the data outside of the country of origin. So federated learning is perfect for this because we send the microbiome algorithm to the data set in Switzerland, to the data set in Cambridge, and then we extract the analysis from it. And in a way, we have mimicked what you are able to do if the data sets were hypothetically moved to one place. But instead, we've enabled, through technology, compliance.
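This cross-border pattern can be sketched concretely. The site names and measurement values below are made up for illustration: each site computes local sufficient statistics (sum and count) on data that never leaves its country, and only those aggregates travel, yet the pooled result equals what a centralized analysis would have produced.

```python
# Sketch of the GDPR-friendly pattern described above: raw values stay
# in their country of origin; only (sum, count) aggregates cross the
# border. Site names and microbiome values are illustrative only.

sites = {
    "switzerland": [0.42, 0.51, 0.38],   # measurements that must stay local
    "cambridge":   [0.47, 0.44],
}

def local_stats(values):
    """Runs at each site (e.g. inside an enclave); raw values are not returned."""
    return sum(values), len(values)

# Only the (sum, count) pairs travel between sites.
totals = [local_stats(v) for v in sites.values()]
pooled_mean = sum(s for s, _ in totals) / sum(n for _, n in totals)

# Identical to the mean we'd get if the data were hypothetically combined.
print(round(pooled_mean, 3))
```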
Another example is we're working with a digital health company, specifically an arm of a telecommunication company that has a huge network of customers with phones. These phones are personal data repositories of a lot of information about a person that indicate a lot of biomarkers for health-- so mental health. How many times you're opening your phone, how long you're on your phone, natural language processing around the text in your phone, this is very sensitive and private information but could be a huge boon to any sort of analysis done on this distributed network.
And so how are you able to bridge the gap between need for this high granularity data and obvious concerns about privacy? By using our platform, we are able to keep the data on your phone, train an algorithm across all of these phones, and then we just extract analysis that will tell us about the specific patterns in mental health that we're looking for about the population in the network of phones and not about individuals. And so by doing that, we can build these incredibly powerful algorithms on mental health through this phone network while providing security and privacy.
And in addition to that, it's not only for the gain of research in mental health. We also give back the value to the customers that actually provided the data. And so what this means is, deployed on my own phone, I can have an algorithm trained. It goes to some researcher at the Broad Institute who's working with this telecom company. And then this algorithm is sent back to my phone in order to diagnose whether I need to be aware of any sort of mental health concerns. And that is extremely valuable for someone who makes health a priority in their life.
Another example is that we're working with hospitals here in the Boston area in order to make data sharing and data analysis more accessible internal, as well as external to these hospitals, without violating any privacy of patients, while also being able to ensure security in these large data sets. Normally, we've heard anecdotes about hard drives being sent through the mail in order to transmit these huge, massive data sets, and that is not a secure solution at all. So by sending the algorithms to the data instead, this platform is going to be the solution for that sort of analysis.
[MUSIC PLAYING]
RYAN DAVIS: So the company really started when Manolis Kellis had this problem of accessing enormous amounts of data that was extremely private and sensitive. These were actually genomic databases. And so working with Anne, he was able to patent this technology. And from there, Anne was able to work with an engineering team to build out our platform.
A major milestone with our company was actually getting into the MIT delta v program to actually spin this company out of the labs at MIT into a commercial entity. From there, we've been in the process of fundraising, as well as acquiring these first early customers who really see the value in what we're building.
ANNE KIM: Technological milestones would be actually working on top of this enclave technology that we have and then not only building in federated learning architecture for the companies that we work with, like the pharmaceutical companies, but also adding in differential privacy, which is a new way of protecting data at a higher level.
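The differential privacy layer Anne mentions can be sketched in its simplest form: adding calibrated Laplace noise to an aggregate before release, so the published result reveals only a bounded amount about any one individual. The epsilon value and patient data below are illustrative, not parameters from SAIL's system.

```python
import math
import random

# Sketch of differential privacy as an added protection layer: Laplace
# noise scaled to sensitivity/epsilon is added to an aggregate before
# release. Epsilon and the toy patient records are illustrative.

def dp_count(records, predicate, epsilon=1.0):
    """Release a count with Laplace noise. One person changes the true
    count by at most 1, so the sensitivity is 1."""
    true_count = sum(1 for r in records if predicate(r))
    # Sample Laplace(0, 1/epsilon) via the inverse CDF
    # (ignoring the measure-zero edge case u == -0.5).
    u = random.random() - 0.5
    noise = -(1 / epsilon) * math.copysign(1, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

# 300 toy records; every third one carries the flag, so true count = 100.
patients = [{"flag": i % 3 == 0} for i in range(300)]

noisy = dp_count(patients, lambda r: r["flag"])
print(noisy)  # close to 100, but never the exact count
```

Smaller epsilon means more noise and stronger privacy; the analyst trades a little accuracy for a formal guarantee about individuals.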
Some of the milestones that we've covered technologically were first kind of discovering what kind of tech stack that we wanted to work with and the technologies that we wanted to patent. And so we did that. And then after, we were not done, because there are a lot of challenges in working with resource-constrained environments, like secure enclaves.
And so what we've been doing for the past two years is building a virtual machine that allows us to bridge the gap between high-security languages like C++ and working in these environments to a data scientist who wants to work with an IDE in Python, or Jupyter Notebook, or something. And so that has been a tremendous amount of work and a huge milestone.
And then on top of that, being able to leverage federated learning across multiple instances of enclaves is quite difficult. You have to think about how you're going to parse out the analysis that you're going to distribute to different enclaves that have different data sets and then combine that in an analysis-preserving way. Further, actually having these enclaves talk to each other in a secure way requires a lot of cryptographic innovations that we have been working on.
RYAN DAVIS: To accomplish that, we've been growing our team. We've actually added additional senior engineering members so that we can actually deploy this technology and get it into the hands of our customers so that they can extract the value that we provide.
[MUSIC PLAYING]
ANNE KIM: We have an awesome senior engineer named [INAUDIBLE]. He is incredible, very diligent and creative, very, very smart, has a background in computational chemistry with a PhD, and has a real enthusiasm and interest, a genuine interest in the technology that we're working on.
But even at a higher level, I think talking about the culture of our company is really important, that we are solving a huge problem. It's data access for the means of being able to enable research at an accelerated pace. I gave the example before. We're not talking about a 30 minute gain in making your neural net faster. We're talking about months of gain by cutting through a lot of red tape that your lawyers are going to put in, so that you are going to be compliant and privacy-preserving.
So that's huge for research. And by answering that question, we've drawn a lot of talented people who are inherently focused on and interested in that mission, as well as the technology. Because I think it's really easy to see the correlation here at MIT of people who are mission-driven, as well as just nerds who really like working with hard problems. Hard problems are really fun for us to work on.
And so we have that sort of culture fomented in the office that we are interested in solving these big problems and trying to find elegant solutions for them. And it's not just about how many hours you work or how many meetings or how many emails you're sending. It's about getting things done and getting things done with purpose and intentionality.
RYAN DAVIS: So a large part of coming out of MIT is building out the core functionality of our technology. But STEX25 has been instrumental in allowing us to stretch our legs and work with new industry partners. We've been able to expand from health into insurance. We're starting to talk to [INAUDIBLE] and finance companies, and this is really instrumental because we're starting to see the enormous value that we're creating here.
We obviously started in bioinformatics because that's where data is most private and sensitive, and that's why we created this great product and platform that can benefit all of industry. But now we're finally, with the help of STEX25 and the ILP, able to see how this is going to scale to solve enormous problems across many industries.
[MUSIC PLAYING]
ANNE KIM: I think a huge difference is just what the workflow of working with companies looks like when you are commercializing academic work. It means that you're trying to find decision-making units and understand motivations and product/market fit, which is quite challenging. The research that I did is maybe just 10% of what that is, and 90% of it is learning these new skills and being able to adapt and learn quickly.
RYAN DAVIS: We're diligently working with the companies to be able to find a specific use case to solve their problems. It's a broad platform and does some very amazing work of getting the right data faster to the people that need it, but finding the exact use case to help the individuals in those companies is what we're really drilling down on and what's going to be instrumental in the success and growth of our company.
So we are looking for partners that really want to accelerate and find new insights from the data that they sit on and also to be able to access valuable insights that sit outside their organization. It's very common that companies don't know what they don't know. And we want to help them understand where does the risk lie in the research and their operations as a way of getting access to the right data at the right time.
ANNE KIM: So in the future, I hope that the Secure AI Labs platform is integral to the data science workflow, that as a data scientist, you're going to open up your Secure AI Labs platform, access data easily, and do the analysis you need to without any of the headache or burden of spending too much time and energy thinking about compliance and privacy. Those are, at the end of the day, of utmost importance, but they're something a data scientist should have seamlessly integrated into their experience of getting work, analysis, and research done.
RYAN DAVIS: So many industries are actually hampered because they have not learned to adapt to new regulations that are dearly needed for privacy and security. We want to see industries grow and accelerate as a result of what we're building. We want to put this tool in the hands of data scientists so that they are able to find the insights and create the analysis necessary to drive their companies forward, while still being able to meet the regulatory needs of the communities that these industries serve.
[MUSIC PLAYING]