Feature Labs

-
Interactive transcript
MAX KANTER: Hi, I'm Max Kanter, the co-founder and CEO of Feature Labs. Prior to starting Feature Labs, I was a student and machine learning researcher at MIT's Computer Science and Artificial Intelligence Lab. I was particularly interested in how we could take the advances of machine learning in academia and bring them to industry.
So when I joined a research group, I was particularly attracted to the work that [INAUDIBLE] was doing in making machine learning easier to use. His group was particularly interesting because he worked with a wide range of industries on a diverse set of problems. As we were going through these problems with our sponsors, we started to notice that our biggest challenge wasn't coming up with a solution for any individual problem, but rather the time that it took.
We used every available tool at our disposal, from existing database technology and ways of storing our data, at one end, to open source technology for training our machine learning algorithms, at the other. What we found was that going from the databases to the machine learning algorithms was where we spent most of our time, and that was the process of feature engineering, or extracting the explanatory variables that made our machine learning algorithms work. As software engineers ourselves, we looked at the available tools and knew there was a better way. That's why we started Feature Labs, so that we could automate the process of going from raw data to machine learning models.
My interest in data science, software engineering, and entrepreneurship comes from the fact that they're all very similar. They're all about finding problems and creating something new. In particular, data science is about taking the raw data that companies work with and finding new answers to the questions that we have.
Entrepreneurship, meanwhile, is about identifying a problem, creating a solution, and bringing it to market. I think both entrepreneurship and data science are particularly interesting because of how much attention each one of them is getting these days. This really challenges me and forces me to look at problems differently than everybody else and find the solutions that others are missing.
Feature Labs is unique because of our advanced technology for automated feature engineering. The way this works is through our Deep Feature Synthesis algorithm, or DFS, which can automatically extract features from relational and time series data sets. DFS works by taking a repository of feature engineering building blocks called primitives. These primitives are not tied to any specific data set, but rather are defined in terms of the types of data they take in and the types of data they output. This allows us to take any customer's data set and immediately begin applying primitive functions to it.
What's more, because we know each primitive function's output type, we're able to stack the primitives on top of each other, creating the complex features that human data scientists would create on their own. Beyond that, our customers are able to extend our primitive library with custom primitives that are specific to their domain, allowing them to not only take advantage of our automation but also build highly accurate models.
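The idea of typed, stackable primitives can be illustrated with a small sketch. This is a toy illustration only, not Feature Labs' implementation of Deep Feature Synthesis: the `Primitive` class, the type labels, and the `stacked_features` helper are all hypothetical names chosen for this example, and the data is invented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Primitive:
    """A building block defined by its input and output types, not by a data set."""
    name: str
    input_type: str   # type of data the primitive takes in
    output_type: str  # type of data the primitive outputs
    func: Callable


# Hypothetical primitive library: each one aggregates a list of numbers into one number.
PRIMITIVES = [
    Primitive("sum", "numeric_list", "numeric", sum),
    Primitive("mean", "numeric_list", "numeric", lambda xs: sum(xs) / len(xs)),
    Primitive("max", "numeric_list", "numeric", max),
]


def applicable(primitives: List[Primitive], input_type: str) -> List[Primitive]:
    """Select primitives by type alone, so the same library fits any data set."""
    return [p for p in primitives if p.input_type == input_type]


def stacked_features(orders: List[List[float]]) -> Dict[str, float]:
    """Build depth-2 features for one customer from the item amounts of their orders."""
    features = {}
    for inner in applicable(PRIMITIVES, "numeric_list"):
        # One number per order; together these form a new list of numbers,
        # so a second primitive can be stacked on top of the first.
        per_order = [inner.func(items) for items in orders]
        for outer in applicable(PRIMITIVES, "numeric_list"):
            name = f"{outer.name}({inner.name}(items.amount))"
            features[name] = outer.func(per_order)
    return features


# One customer with two orders; each order is a list of item amounts (invented data).
features = stacked_features([[10.0, 5.0], [20.0]])
print(features["mean(sum(items.amount))"])  # average order total for this customer
```

Because the primitives are selected by type rather than by column name, adding a domain-specific primitive to the list would automatically produce new stacked features without changing the loop.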
[MUSIC PLAYING]
-
Interactive transcript
MAX KANTER: Feature engineering is one of the required steps to go from raw data to a predictive model that you're able to deploy. Feature engineering is the process of taking a raw data set and transforming it to extract the explanatory variables that you'll feed into your machine learning model. For example, imagine you had every transaction a customer had made in the past and you're trying to predict how much they'll spend in the future. You need to extract variables like, how much have they spent in the past? How long has it been since their last purchase? What was their average order size? And the list of potential variables you could extract goes on.
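As a rough illustration of the three variables mentioned above, here is a minimal hand-written sketch in plain Python. The field names, the `customer_features` helper, and the sample transactions are assumptions made up for this example; they do not come from any Feature Labs product.

```python
from datetime import date

# Hypothetical raw data: one row per past transaction.
transactions = [
    {"customer_id": "c1", "date": date(2018, 1, 5), "amount": 40.0},
    {"customer_id": "c1", "date": date(2018, 2, 10), "amount": 60.0},
    {"customer_id": "c2", "date": date(2018, 2, 20), "amount": 15.0},
    {"customer_id": "c1", "date": date(2018, 3, 1), "amount": 20.0},
]


def customer_features(rows, customer_id, as_of):
    """Extract explanatory variables for one customer, using only data up to `as_of`."""
    history = [
        r for r in rows
        if r["customer_id"] == customer_id and r["date"] <= as_of
    ]
    if not history:
        return None  # no past transactions, so nothing to extract
    total = sum(r["amount"] for r in history)
    last_purchase = max(r["date"] for r in history)
    return {
        "total_spent": total,                                    # how much have they spent?
        "days_since_last_purchase": (as_of - last_purchase).days,
        "average_order_size": total / len(history),
    }


c1 = customer_features(transactions, "c1", date(2018, 3, 31))
print(c1)
```

The `as_of` cutoff matters when the goal is to predict future spending: features should only use transactions that happened before the moment the prediction is made.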
Automated feature engineering is a way for organizations to take their raw data sets and extract those variables that make their machine learning models work. Without the right features, machine learning models aren't able to make accurate predictions. And without accurate predictions, companies aren't able to realize the value of their data.
One of the most unique things about Feature Labs is our open source software called Featuretools. Featuretools is a way for any data scientist to try out automated feature engineering. We released Featuretools after working on it for many years. We were excited to release it because it filled a big gap in the data science market today. Working on Featuretools really was a labor of love for all of us at Feature Labs. Based on our many years as data scientists in the trenches, we knew that the technology we had built was going to change the way that people built predictive models. And we were really excited to make that available to anyone in the world for free.
One of the most exciting things about Featuretools is the community that has formed around it. We see people who are new to machine learning learning to build their first predictive models with it. For example, we teamed up with MIT's Office of Digital Learning on their big data and advanced analytics class. And in the section where professional learners are learning about predictive modeling, they use Featuretools to extract variables that they'll use to ultimately train their models. Beyond that, we see people increasing the rate at which they're building predictive models using Featuretools, whether these are consultants building models and proofs of concept for their customers, or enterprises who have more problems to solve than resources to do it.
Feature Labs is great for enterprises who are trying to accelerate their machine learning process. The way we accomplish this is by structuring the normally ad hoc process of defining a prediction problem, extracting features, and training a machine learning model. All of our enterprise customers are able to take advantage of our technology so that they can more quickly see results from their machine learning endeavors.
One such company was Spanish bank BBVA. They came to Feature Labs wanting to use our technology to build better fraud detection models. The way they accomplished this was not by improving their machine learning modeling, but by figuring out new and better explanatory variables to feed into the algorithms themselves.
Feature Labs was able to take their 900 million transactions and apply our algorithms to extract 150 new features to feed into the model. These are features like, what was the last country that a customer made a purchase in and what is the country they're currently making a purchase in? Other variables were things like, did they swipe their card or did they use the chip?
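To make those two kinds of features concrete, here is a small hand-written sketch. It is not the code used for BBVA: the field names (`card_id`, `time`, `country`, `entry_mode`), the `add_fraud_features` helper, and the sample rows are all hypothetical.

```python
def add_fraud_features(transactions):
    """Annotate each transaction with features derived from the card's history."""
    last_country = {}  # card_id -> country of that card's most recent purchase
    enriched = []
    for t in sorted(transactions, key=lambda row: row["time"]):
        previous = last_country.get(t["card_id"])
        enriched.append({
            **t,
            "previous_country": previous,
            # True only when there is an earlier purchase and it was made elsewhere.
            "country_changed": previous is not None and previous != t["country"],
            # Chip versus swipe, encoded as a boolean.
            "used_chip": t["entry_mode"] == "chip",
        })
        last_country[t["card_id"]] = t["country"]
    return enriched


# Invented sample data: `time` is just an ordering key here.
sample = [
    {"card_id": "A", "time": 1, "country": "ES", "entry_mode": "swipe"},
    {"card_id": "A", "time": 3, "country": "FR", "entry_mode": "chip"},
    {"card_id": "B", "time": 2, "country": "ES", "entry_mode": "chip"},
]

for row in add_fraud_features(sample):
    print(row["card_id"], row["previous_country"], row["country_changed"], row["used_chip"])
```

Processing transactions in time order means each row only sees purchases that came before it, which keeps the feature usable at the moment a new transaction has to be scored.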
All of these variables got fed into the model to increase the accuracy of the predictive models that BBVA was building. Compared to their existing system, these new features led to a 54% reduction in their false positive rate when identifying fraud. Even more, when we did the financial analysis of the impact this has on BBVA's business, we found the potential for millions of dollars of savings across all the transactions their customers process every single day.
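For readers unfamiliar with the metric, the arithmetic behind a relative reduction in false positive rate can be sketched as follows. The counts below are invented purely to show the calculation and are not BBVA's figures; the sketch also assumes the 54% is a relative reduction (new rate compared with old rate), which the transcript does not spell out.

```python
def false_positive_rate(false_positives, true_negatives):
    """Share of legitimate transactions that were wrongly flagged as fraud."""
    return false_positives / (false_positives + true_negatives)


def relative_reduction(old_rate, new_rate):
    """Fraction by which the new rate is lower than the old one."""
    return (old_rate - new_rate) / old_rate


# Hypothetical counts over 100,000 legitimate transactions.
old_rate = false_positive_rate(false_positives=1000, true_negatives=99000)
new_rate = false_positive_rate(false_positives=460, true_negatives=99540)

reduction = relative_reduction(old_rate, new_rate)
print(f"old FPR {old_rate:.4f}, new FPR {new_rate:.4f}, reduction {reduction:.0%}")
```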
Machine learning is a topic that's getting a lot of attention today, but many times it's difficult to chart a path of success from raw data to actually deploying and operationalizing the model that your business uses. For any ILP members considering machine learning projects, we recommend they find software that can help them structure and automate the process so that they can more rapidly and repeatedly create predictive models.
Feature Labs has worked with many enterprises that want to adopt machine learning and increase the rate at which they deploy predictive models. Across all those companies, the common thread is interesting, broad data sets that have untapped potential. If a company has questions that it wants to answer but doesn't have the data science resources to answer them, Feature Labs is a great piece of software to help accelerate and automate that process.
Feature Labs was started in 2015 after spinning out our technology from MIT. When we first started the company, the key question we wanted to answer was, how do we take this advanced technology and turn it into a product that is easy to use, reliable, and enterprise ready? We spent the first year and a half of the company working on taking that technology to market. This involved not only maturing the code base but also putting it to the test against some of the hardest problems that we found on the market.
One of our early customers that gave us the confidence that our technology was ready was Accenture. At Accenture, we were working with a team that had never deployed a predictive model before but had a valuable data set to work with. The problem that they were working on was building an AI project manager. Essentially, how can we take all of our historical information about how we manage projects and build a predictive model to alert us to potential problems while we're implementing a project?
After we had worked with Accenture and many other enterprises, we got the confidence that our technology was ready to be brought to everyone. We went out and raised a funding round from great investors here in Boston and San Francisco in order to accelerate our efforts to bring this technology to every enterprise in the world.
[MUSIC PLAYING]
-
Interactive transcript
[MUSIC PLAYING]
MAX KANTER: Feature Labs has its roots at MIT. Prior to starting Feature Labs, I was a student and machine learning researcher at MIT's Computer Science and Artificial Intelligence Lab. I worked in a group with Kalyan Veeramachaneni that focused on building tools for applied machine learning and data science. The group was particularly interesting because we worked with industry sponsors on real-world problems.
While we were working on the problems, we had access to a diverse range of industries and data sets that we could do our research on. What we found while we were working on that was that our biggest challenge wasn't building accurate machine learning models, but rather the time it took us to get to those solutions. While we were working on those problems, we used every tool available to us. This ranged from databases and ways of storing and warehousing our data to using open source technologies for machine learning.
However, in between those two things we realized there was a gap, and this gap was all the work that data scientists do to transform and extract the features from their data that make the machine learning models work. As engineers ourselves, we knew there was a better way. And I focused my graduate research on building that better way.
In 2015, we published the results of our research on an algorithm called Deep Feature Synthesis. We were blown away by the interest that industry had in it, and we ended up getting published in over 200 applications around the world. At that point, Kalyan and I started Feature Labs so that we could bring that powerful technology to the market and help every company use machine learning.
The biggest problem with machine learning today is not that it doesn't work but that companies struggle to use it. Feature Labs recommends that any company trying to apply machine learning develop a structured and repeatable process for going from raw data to their predictive model. To support this process, one of the most important things to have is the right tools. Beyond that, it's important to approach these projects not as research projects but as actual deployable products that you want to create.
Too often, we see companies start a machine learning project and end with a PowerPoint presentation or a research paper. If machine learning is going to have an impact for companies, they need to focus on deploying those models and actually seeing real results. This is why having a structured process and the correct tooling is really important because it enables them to take the code they wrote in development and put it into production.
One of the biggest keys to success we've seen across our customers is taking the people who understand the domain that they're working on and getting them involved in the process. This is because it doesn't matter how accurate your model is if it doesn't solve a real problem for the business. By using automation technology, we can lower the bar to entry for domain experts and companies to get involved with machine learning.
One of the best ways a company can build their first machine learning model is by focusing on the quickest path to value. Oftentimes, this means not starting with state-of-the-art technology but instead focusing on building that first, simple model that you can deploy. Automation technology helps us with this because it reduces the time to building that first model, validating the use case, and actually quantifying the importance of accuracy improvements.
Based on our experience with our customers, we developed Machine Learning 2.0, which is a new paradigm for developing and creating new machine learning products and services. Machine Learning 2.0 is a set of seven steps that enterprises can follow to go from raw data to building a model to validating that model and ultimately deploying it. Machine Learning 2.0 is intended to help companies very rapidly, and in a structured manner, create new machine learning products in a way that never was possible before.
We call it Machine Learning 2.0 because it's a departure from the status quo. Today, many companies approach machine learning projects as research endeavors. This means that the goal is to publish a paper or to create a presentation. However, in ML 2.0 the goal is to create a deployed machine learning model that's impacting how your business operates.
Feature Labs is really excited to be part of the MIT ecosystem, as well as ILP's STEX25 program. MIT has been core to Feature Labs' work from the beginning, and we're really excited to partner with them again. There are a lot of companies coming out of MIT every year, and we were particularly honored to be invited to STEX25 and be recognized for the advances Feature Labs is making, taking our technology to market, and helping our customers build machine learning models. We were excited to accept the invitation to join so that we could better connect with all of the industry sponsors coming to MIT.
The last three years at Feature Labs have been focused on taking the advanced technology developed at MIT and turning it into a product that enterprises could rely on when building predictive models. I'm really excited for the future of Feature Labs because we're working on furthering the automation that's possible in the machine learning process. We can't wait to announce the new products we are working on to help companies go from raw data sets to a deployed model while getting everyone on their data science team involved in the process.
[MUSIC PLAYING]
-
Interactive transcript
MAX KANTER: The problem with machine learning today is not that it doesn't work but that most companies struggle to use it. The challenge that every enterprise adopting machine learning faces is that going from raw data to a deployable model is a time-consuming, error-prone, and, most importantly, human-driven process. I'm Max Kanter. I'm the co-founder and CEO of Feature Labs, and I'm excited to share with you how Feature Labs is building automated tools that make machine learning easier to use.
One of the tools that we built is for automating feature engineering, which is the process of extracting explanatory variables from raw data sets that make machine learning algorithms work. For example, imagine you're a marketing department and you're trying to predict what a customer will buy next. You have access to fine-grained detail about the purchases a customer has made, the promotional emails that they've opened, and every page they've looked at on your website.
In order to build a predictive model, you need to work with your data scientists to extract the explanatory variables or features that you'll feed into your machine learning algorithm. These are features, like how much has a customer spent in the past? How long has it been since their last purchase? And how often do they open up those emails that you send them? Each one of these variables exists in your raw data set, but a data scientist has to go in and find it.
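The point that each variable already exists in the raw data, spread across several sources, can be sketched by hand. The tables, field names, and the `marketing_features` helper below are hypothetical and use invented data; they only illustrate combining purchases, emails, and page views into one row of features per customer.

```python
# Hypothetical raw sources, one list of rows per source.
purchases = [
    {"customer_id": "c1", "amount": 30.0},
    {"customer_id": "c1", "amount": 50.0},
]
emails = [
    {"customer_id": "c1", "opened": True},
    {"customer_id": "c1", "opened": False},
    {"customer_id": "c1", "opened": True},
    {"customer_id": "c1", "opened": False},
]
page_views = [
    {"customer_id": "c1", "page": "/shoes"},
    {"customer_id": "c1", "page": "/cart"},
    {"customer_id": "c2", "page": "/home"},
]


def rows_for(rows, customer_id):
    """Filter one raw source down to a single customer's rows."""
    return [r for r in rows if r["customer_id"] == customer_id]


def marketing_features(customer_id):
    """Combine several raw sources into one feature row for a customer."""
    bought = rows_for(purchases, customer_id)
    sent = rows_for(emails, customer_id)
    opened = [e for e in sent if e["opened"]]
    return {
        "total_spent": sum(p["amount"] for p in bought),
        # None when no emails were sent, to avoid dividing by zero.
        "email_open_rate": len(opened) / len(sent) if sent else None,
        "pages_viewed": len(rows_for(page_views, customer_id)),
    }


print(marketing_features("c1"))
```

Every line in `marketing_features` is a decision a data scientist would otherwise make by hand, which is the work that automated feature engineering aims to take over.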
Feature Labs is particularly unique because we have the most popular and advanced software for automating the feature engineering process. One of the impactful use cases we applied this technology to recently was improving predictive models for credit card fraud detection. This is an extremely important problem because one industry report estimated that over $100 billion a year was lost because of rejecting transactions that were legitimate.
This is why we weren't surprised when Spanish bank BBVA wanted to use our technology. By applying our automated feature engineering to over 900 million transactions that they record, they were able to decrease the number of transactions that they incorrectly classify as fraudulent by over 50%.
This is just one of the many examples of how feature engineering can help build accurate predictive models. All of our products have the goal of helping you streamline the process of taking advantage of machine learning. If you're interested in seeing how you can do this with your own data, please get in touch with us today. We'd love to have you try out our software.
[MUSIC PLAYING]