Featured Interview: Dr. Tom Kehler

May 5, 2023

Overview

Today we feature an interview with Dr. Tom Kehler. Dr. Kehler started his career in solid-state physics at Drexel University, and later began researching computational linguistics and expert systems in the 1970s and 1980s. He led the AI research group at Texas Instruments, before becoming the CEO of IntelliCorp Inc. He then served as CEO of Connect Inc. Currently, he is the chief scientist for CrowdSmart.ai, a start-up dedicated to helping organizations reduce risk and make better decisions by combining the best of human and artificial intelligence. He was recently interviewed by Christianity Today, and in April presented his ongoing work at CrowdSmart.ai at the 2023 Missional AI conference.

Background

M. Schwarting: Could you tell me a little about your faith background? You were once drawn to work at Wycliffe Bible Translators, is that correct?

T. Kehler: I grew up as a preacher’s kid but got serious about my faith when I was about sixteen. I made a deep commitment and at that point began to think about my future. I was interested in amateur radio technology, so I got an amateur radio license when I was around eleven. My dad was involved with using amateur radio to communicate with missions. This was in the 1960s when we did not have cell phones – people would communicate between the mission field and their relatives using amateur radio. There is an organization called the Jungle Aviation and Radio Service, which is part of the Summer Institute of Linguistics (SIL). With an interest in radio technology, in the early stages of computational technology, I started to wonder what could be done with the technology. I was interested in empowering indigenous people. What I loved about Wycliffe and SIL was its strong academic bent. I knew I had a deep interest in mathematics, so I started reading mathematical linguistics and got in touch with people in that sector. I had an opportunity to get to know some of the early pioneers who were applying early computing techniques to text processing and natural language. I knew Joe Grimes at Cornell¹ and Kenneth Pike, who was the first president of SIL.

I went to Drexel University pursuing an electrical engineering path. I was hoping to find something that combined radio communications and data communications, especially as radioteletype² gained traction. But God closed a door on my electrical engineering path, and I lost my scholarship. But God had a plan. One day I was wandering the halls of the solid-state physics lab and a professor there took great interest in me. He got me into graduate school early and I was able to test out of courses. My education was funded, and I had a research stipend. For a long time, I could not figure out why God led me to a PhD in solid-state physics on the study of thin-film ferromagnetism. After I graduated, I reached out to Kenneth Pike, and I got to meet Cameron Townsend³ as well. I took an internship with SIL the summer of 1972, but I did not know what I was going to do within Wycliffe or SIL.

After I finished my PhD, I met with the president of Wycliffe and SIL. Since there was already a division on radio and engineering, I proposed creating a division on computing. Unfortunately, in 1972 there was no interest. Instead, God opened the door for me to become a professor at a local university. That is when I started my studies in computational linguistics and language modeling. I developed a simulation system for modeling natural language grammars, which was published in the late 1970s. We formed a computer advisory board for Wycliffe, where we tried to be helpful. I started researching natural language processing and knowledge representation. Eventually, I was invited to lead the AI group at Texas Instruments (TI). I worked with many people from MIT and the MIT Media Lab. I even got to present to Marvin Minsky⁴, although I was scared out of my mind. Many of my colleagues eventually followed me to IntelliCorp, which was founded by Ed Feigenbaum⁵.

M. Schwarting: What led you to leave TI, and what did you do at IntelliCorp?

T. Kehler: I had invited Feigenbaum to be a speaker to TI about expert systems. Soon afterward, he convinced me to come work with him and several other AI luminaries at IntelliCorp. TI was very good to me and offered me a huge package to stay, but I have never been heavily swayed by money. I am more concerned with the impact I can have.

IntelliCorp was like the Google of the 1980s. We hired the top people out of Carnegie Mellon, MIT, Stanford, and from all over the globe. At one point we had around 27% PhDs on staff. We were an AI brain trust that went public. I got drafted into being the CEO, mainly because I had a gift for recruiting talent. I had people following me from TI to IntelliCorp, which resulted in a wonderful community of industry experts, academicians, and researchers. During that period, I published several papers on knowledge representation. The AI winter hit right around when I left IntelliCorp⁶. I had an opportunity to become the CEO of a company spinning out of Apple called Connect, which was developing an early e-commerce platform. America Online (AOL) was just getting started, and I would converse with Steve Case⁷ every few days to figure out how to make the e-commerce market happen. Connect went public in 1997, and right after that, I left. Afterward, I started on the journey of what became CrowdSmart. It began as a company exploring ways to use machines to learn and weigh the issues important to a group. Consider asking a million people an open-ended question, then building a model of their values and personal priorities. Could you figure out what is personally relevant to a large group of people? That is what led to the creation of CrowdSmart.

CrowdSmart and Bayesian Intuitions

M. Schwarting: When you ask an open-ended question, and everyone answers differently, one tool might be Latent Dirichlet Allocation (LDA), which is often used for modeling the topics that a group emphasizes. In what other ways can AI drive people towards co-creation and collaboration?

T. Kehler: At the start of my work, I had gone down a route of symbolic AI. These were hand-crafted approaches to codifying human knowledge into programs. The problem was knowledge acquisition; that is, the codifying process was manual. When “black box” AI came into being, it was all about statistical learning from data with no connection to human cognition. It is driven purely by what the mathematics tell us. As Judea Pearl points out⁸, that is an animal level of intelligence based on sensory correlation. From 2016 to the present, we have seen that a data-only feedback loop can lead to some very bad results. For example, recommender systems put people in their own content bubbles. That is very dangerous for human beings, because it plays to our fundamental flaw. One might consider a path towards sin as “I know how to define good for myself”, which places me over everybody else. Christianity is just the opposite: God over me. We now see a great divide in the country, which I believe was, to a large degree, caused by AI.

I have been a part of the third generation of AI. I want to solve the problem of human knowledge acquisition not in terms of the individual, but for a group. Can I build a model that captures the collective thoughts about an outcome? The underlying principles start with going back to Reverend Bayes and Bayesian logic.⁹ I think Thomas Bayes in his own humble explorations discovered something very important: we have a belief, we encounter evidence, then we decide whether to hold on to our beliefs or figure out how new evidence informs our updated beliefs. The result is a new set of beliefs that have either been modified by the evidence you have observed, or not. Karl Friston has taken this logic into a broad theory of the brain¹⁰, and Judea Pearl took this logic when discussing how to transition from expert systems into probabilistic learning. These two individuals highly influenced my thinking about this new model, which is to bring people together and learn how they align on ideas. The way our system works is that you can only promote the ideas of others, but not your own.

We used this framework to predict which startup companies will be successful at raising money in the future. This is a good expert systems problem, with lots of demand for information. We would ask experts questions like: “What do you think about the product market fit, and why?” Once experts gave their reasons, we would synthesize that information using an AI learning algorithm and share that information with them. Experts could only determine where they aligned relative to the group. We analyze feedback in the form of a Bayesian belief network. An expert’s belief, and the strength of that belief, correspondingly influence the outcome. This becomes a powerful tool to present a cognitive model of a decision. This model removes the standard paradigm of statistical learning alone and allows us to tap into humans as the source of knowledge. This is why we call it a third generation of AI, because it takes statistical learning to derive the priorities, but the priorities are driven by people. In a way, this is the essence of the scientific method.
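
Editor's note: The belief-updating idea Dr. Kehler attributes to Bayes can be illustrated with a minimal sketch. It is not CrowdSmart's algorithm, which the interview does not specify; the function name `bayes_update` and all of the numbers are hypothetical and chosen only for illustration. Each piece of evidence is summarized by how likely it would be if the hypothesis were true versus false, and the belief is revised accordingly.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(hypothesis | evidence) using Bayes' rule.

    prior: belief in the hypothesis before seeing the evidence (0..1).
    p_evidence_if_true: probability of the evidence if the hypothesis holds.
    p_evidence_if_false: probability of the evidence if it does not hold.
    """
    numerator = p_evidence_if_true * prior
    marginal = numerator + p_evidence_if_false * (1.0 - prior)
    if marginal == 0.0:
        raise ValueError("evidence has zero probability under both cases")
    return numerator / marginal


# Hypothetical example: belief that a startup will raise money,
# revised after three made-up pieces of evidence.
belief = 0.30
evidence = [(0.8, 0.4), (0.7, 0.5), (0.2, 0.6)]  # (if true, if false)
for likelihood_true, likelihood_false in evidence:
    belief = bayes_update(belief, likelihood_true, likelihood_false)
    print(f"updated belief: {belief:.3f}")
```

Evidence that is equally likely either way leaves the belief unchanged, which matches the remark that beliefs are "either modified by the evidence you have observed, or not."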

Concerns and Possible Solutions for Large Language Models

M. Schwarting: The rapid innovation in the space of large language models seems to correlate with troubling cutbacks in AI ethics teams at large companies like Twitter and Microsoft. What are your thoughts on our ethical understanding of ChatGPT and its ramifications?

T. Kehler: I discussed this a bit in my interview with Christianity Today. I believe that, for the first time in history, we have taken misinformation generated by second-generation AI and have put it on steroids. It is now possible with ChatGPT to create massive amounts of misinformation that sounds extremely intelligent, which is dangerous. An important facet of the issue is dealing with data provenance, and academic publishing is a great analogy. Google’s idea for PageRank stemmed from the mathematics of citation research¹¹. An eigenvalue problem drove the rankings. At CrowdSmart, we use a similar approach with the ideas of a group, with each idea retaining its provenance. I believe that retaining data provenance is critical to the future of AI. OpenAI relies on a whack-a-mole strategy, where any mistakes must be corrected over and over. Furthermore, if an LLM gave a wrong answer, who ought to be held accountable? If you query ChatGPT on its trustworthiness, it will say it is not trustworthy. From a legal perspective, if we are going to regulate one thing, let us require that the model can provide data provenance mapping to an accountable source.
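
Editor's note: The eigenvalue idea behind citation-style ranking can be sketched with power iteration, a standard way to approximate the dominant eigenvector of a link matrix. This is an illustrative sketch only, not Google's or CrowdSmart's implementation; the function name `pagerank`, the damping value of 0.85, and the tiny three-node graph are assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Approximate PageRank scores by power iteration.

    links: dict mapping each node to the list of nodes it cites.
    Returns a dict of scores that sum to 1.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it cites.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node that cites nothing spreads its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: A and B both cite C, and C cites A.
example_links = {"A": ["C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example_links)
for node, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(node, round(score, 3))
```

In this toy graph, C ranks highest because two nodes cite it, and B ranks lowest because nothing cites it. Keeping the `links` mapping alongside the scores is one simple way to retain where each score came from.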

M. Schwarting: There are many technical hurdles to providing accurate and reliable data provenance in generative models. Do you think that reliable data provenance has a hope of being taken on, both legally and technically?

T. Kehler: I think it has a very simple fix; namely, building a better training set. We have been lax about constructing a training set, and instead just scrape everything. Some have proposed watermarks¹² to trace sources, but I believe more innovations will occur in this space. Historically, we have a process in human society where we speak to logic and evidence. Furthermore, there is nothing in faith that does not point to evidence for the reason for that belief. Faith has a somewhat Bayesian quality. For example, we have hope because of the resurrection of Jesus Christ, and if that is not true then we do not have hope. Hope is predicated on a verifiable piece of evidence: the large set of people who witnessed the resurrection. It is fundamental to society that we have a sense of the truth value of the information that we are building from.

M. Schwarting: Suppose we train a transformer, and we make sure that everything we gave it has proper provenance. Does that mean that I cannot ask it anything that would require a subjective answer? How do we decide what is objective truth?

T. Kehler: You raise an important question about current transformer architectures, and I think they do have some limitations. Transformers are an extremely powerful computational architecture, but they also have their limits. The architecture that I think has a lot of power is something that looks more probabilistic, which leads to adherence to evidence. There are efforts going on at MIT and Stanford to do this, which are more like the Friston model. This architecture is rooted in physical laws as well, where surprisal is like energy. Think of pulling a spring apart: the more data disagrees with a belief, the greater the potential energy for the spring to do work. This underlying physical principle that a system seeks equilibrium applies to how people share ideas. We are constantly sensing and resolving, sensing and resolving. Life is about bringing certainty and order out of chaos. There are fundamental principles built into cooperation, which I think speaks to the wisdom of logos. I think we are tapping into something very fundamental about how nature and society work. I think the Friston model for AI is based more heavily on the laws of nature and will ultimately win out.
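
Editor's note: In information theory, the surprisal of an observation is the negative logarithm of the probability a model assigned to it. The sketch below illustrates only that quantity, not Friston's full framework; the function name `surprisal` and the example probabilities are hypothetical.

```python
import math


def surprisal(probability):
    """Return the surprisal, in bits, of an event with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


# The less likely an observation was under the current belief,
# the larger the surprisal (the "stretch" in the spring analogy).
for p in (1.0, 0.5, 0.125, 0.01):
    print(f"p = {p}: surprisal = {surprisal(p):.2f} bits")
```

An observation the model fully expected carries zero surprisal, while one it considered very unlikely carries a large value, which is the tension a belief-updating system would work to reduce.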

Acknowledgments

A big thanks to Dr. Tom Kehler for his time to carry out this interview, and thanks to Marcus Schwarting for hosting the interview. Thanks to Haley Griese for proofreading, editing, and publishing this work.

References

¹ Grimes, Joseph E. Sentence Initial Devices. Summer Institute of Linguistics Publications in Linguistics, Publication Number 75. 1986.

⁸ Pearl, Judea, and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.

¹⁰ Friston, Karl J., Andrew P. Holmes, Keith J. Worsley, J‐P. Poline, Chris D. Frith, and Richard SJ Frackowiak. “Statistical parametric maps in functional imaging: a general linear approach.” Human Brain Mapping 2, no. 4 (1994): 189-210.

¹² Grinbaum, Alexei, and Laurynas Adomaitis. “The Ethical Need for Watermarks in Machine-Generated Language.” arXiv preprint arXiv:2209.03118 (2022).

² Radioteletype was an early precursor of modern cell phone communications networks. It allowed messages to be passed wirelessly over long distances in the radio frequency domain.
³ Cameron Townsend was the founder of Wycliffe Bible Translators and SIL.
⁴ Marvin Minsky was an MIT professor known as the inventor of head-mounted graphical displays and confocal microscopy. He wrote several noted books on AI, computer science, and the theory of mind. He co-founded the AI Laboratory at MIT.
⁵ Edward Feigenbaum is known as the “father of expert systems” and created EPAM, one of the first computer models that demonstrated how humans acquire knowledge.
⁶ The second AI winter lasted from 1987 to 1993. During this period, the market for “expert system” (or LISP machine) software and hardware collapsed and funding initiatives in AI were cut or cancelled. IntelliCorp was significantly affected by this downturn.
⁷ Steve Case served as CEO and chairman of AOL from 1991 to 2003.
⁹ Statisticians tend to divide into two camps: frequentist and Bayesian. The latter group relies on a formula introduced by Reverend Thomas Bayes.
¹¹ PageRank, named after Google co-founder Larry Page, is an algorithm that weights the importance of a website based on its links to and from other websites. It was the driver behind Google’s initial search engine. The algorithm relies on a large matrix eigendecomposition. Eigendecomposition is computationally expensive (O(n³) where n is the number of websites), and is no longer a part of Google’s search.
