Interview with Adele Cutler: Remembering Leo Breiman
Salford Systems has maintained long-term relationships with data mining visionaries like Random Forests co-developer Dr. Adele Cutler. On a recent visit to San Diego, she spent time with Salford Systems' staff discussing plans for the future development of Random Forests, offering an introductory session on Random Forests, and sharing some personal memories of her time working with Dr. Leo Breiman on Random Forests.
Dr. Cutler shared the chance origin of their collaboration on Random Forests in an interview with Salford Systems' marketing staff. Here is a brief portion of the interview:
Q: How did you come to work with Dr. Breiman?
A: Leo was my advisor at U.C. Berkeley. I did my Ph.D. at Berkeley with Leo from 1983-88, but we didn't work on decision trees at all in that time. I knew basically what they were but I didn't understand them very well. Leo and I worked on optimization problems and archetype analysis when I was at Berkeley.
Q: When did you first become interested in decision trees?
A: After I left Berkeley I became a faculty member at Utah State University and worked on mixture models for a while. That was fun while it lasted, but I began to feel like the applications weren't really there. So I went to Leo one day and I said "Look Leo, I've come to the end of what I want to do with mixture models. Is there anything you can recommend as a direction for me to follow?"
He said, "Adele, you need to get into neural networks." He was enthusiastic about neural nets at that time (in the mid-90s).
So I began attending conferences to learn more about neural nets, and it was very interesting to learn and listen to some of these very bright people. They weren't statisticians; there were only a couple of statisticians interested in that field at that time.
The Random Forests collaboration and my real start to working in [decision] trees came in a cab. It was a stretch limo actually! I was going to a conference and I had a habit of bumping into him (Leo) at the airport attending all of these conferences on neural nets. We were trying to hail a cab to the conference hotel and they didn't have a cab available so they gave us a stretch limo at the regular rate. He was telling me about some work he was doing, and it was the early Random Forests. And I started telling him about some of the experiments that I'd been doing that were using [decision] trees.
So we took my project at the time, Perfect Random Trees, and his project Random Forests and immediately stopped working on everything else and began collaborating on RF.
Q: How would you describe working with Leo? (audio clip below)
Random Forests is widely available now, and is documented as an excellent benchmark tool for data scientists and analysts. Much of the insight provided by RandomForests is generated by methods applied after the trees are grown, and includes new technology for identifying clusters or segments in data as well as new methods for ranking the importance of variables. The method was developed by Leo Breiman of the University of California, Berkeley, and Adele Cutler of Utah State University, and is licensed exclusively to Salford Systems. Ongoing research is being undertaken by Salford Systems in collaboration with Professor Adele Cutler, the surviving co-author of RandomForests. RandomForests is a collection of many CART trees that are not influenced by each other when constructed. The combined predictions of the individual trees (a vote for classification, an average for regression) determine the overall prediction of the forest. The algorithm is best suited for the analysis of complex data structures embedded in small to moderate data sets containing fewer than 10,000 rows but potentially millions of columns.
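For readers who want to see the idea of independently built trees and a combined prediction in concrete form, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Salford Systems RandomForests implementation: each "tree" is reduced to a single-split stump rather than a full CART tree, each stump is trained on its own bootstrap sample using one randomly chosen column, and the names fit_stump, fit_forest, and predict are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Pick the threshold on one column that misclassifies the fewest rows.

    Assumption: a one-split stump stands in for a full CART tree.
    """
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        # Each side predicts its most common label; an empty right side
        # falls back to the left side's label.
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Grow each stump on its own bootstrap sample, independently of the others."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))       # one randomly chosen column
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest


def predict(forest, row):
    """Combine the stumps' predictions by majority vote."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    rows = [(1.0, 1.0), (1.5, 0.8), (0.9, 1.2), (5.0, 5.0), (5.5, 4.8), (4.9, 5.2)]
    labels = ["a", "a", "a", "b", "b", "b"]
    forest = fit_forest(rows, labels)
    print(predict(forest, (0.0, 0.0)), predict(forest, (9.0, 9.0)))
```

No stump sees what any other stump learned, which mirrors the statement that the trees are not influenced by each other when constructed; only the final vote brings them together.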