The Pentagon's chief digital and artificial intelligence officer, Craig Martell, is alarmed by the potential for generative artificial intelligence systems like ChatGPT to deceive and sow disinformation. His talk on the technology at the DefCon hacker convention in August was a huge hit. But he's anything but sour on reliable AI.
Not a soldier but a data scientist, Martell headed machine-learning at companies including LinkedIn, Dropbox and Lyft before taking the job last year.
Marshalling the U.S. military’s data and determining what AI is trustworthy enough to take into battle is a big challenge in an increasingly unstable world where multiple countries are racing to develop lethal autonomous weapons.
The interview has been edited for length and clarity.
Q: What is your main mission?
A: Our job is to scale decision advantage from the boardroom to the battlefield. I don’t see it as our job to tackle a few particular missions but rather to develop the tools, processes, infrastructure and policies that allow the department as a whole to scale.
Q: So the goal is global information dominance? What do you need to succeed?
A: We are finally getting at network-centric warfare -- how to get the right data to the right place at the right time. There is a hierarchy of needs: quality data at the bottom, analytics and metrics in the middle, AI at the top. For this to work, most important is high-quality data.
Q: How should we think about AI use in military applications?
A: All AI is, really, is counting the past to predict the future. I don’t actually think the modern wave of AI is any different.
Q: Pentagon planners say the China threat makes AI development urgent. Is China winning the AI arms race?
A: I find that metaphor somewhat flawed. When we had a nuclear arms race it was with a monolithic technology. AI is not that. Nor is it a Pandora’s box. It’s a set of technologies we apply on a case-by-case basis, verifying empirically whether it’s effective or not.
Q: The U.S. military is using AI tech to assist Ukraine. How are you helping?
A: Our team is not involved with Ukraine other than to help build a database for how allies provide assistance. It’s called Skyblue. We’re just helping make sure that stays organized.
Q: There is much discussion about autonomous lethal weaponry, like attack drones. The consensus is humans will ultimately be reduced to a supervisory role, being able to abort missions but mostly not interfering. Sound right?
A: In the military we train with a technology until we develop a justified confidence. We understand the limits of a system, know when it works and when it might not. How does this map to autonomous systems? Take my car. I trust the adaptive cruise control on it. The technology that is supposed to keep it from changing lanes, on the other hand, is terrible. So I don’t have justified confidence in that system and don’t use it. Extrapolate that to the military.
Q: The Air Force's "loyal wingman" program in development would have drones fly in tandem with fighter jets flown by humans. Is the computer vision good enough to distinguish friend from foe?
A: Computer vision has made amazing strides in the past 10 years. Whether it’s useful in a particular situation is an empirical question. We need to determine the precision we are willing to accept for the use case and build against that criteria – and test. So we can’t generalize. I would really like us to stop talking about the technology as a monolith and talk instead about the capabilities we want.
Q: You are currently studying generative AI and large-language models. When might it be used in the Department of Defense?
A: The commercial large-language models are definitely not constrained to tell the truth, so I am skeptical. That said, through Task Force Lima (launched in August) we are studying more than 160 use cases. We want to decide what is low risk and safe. I’m not setting official policy here, but let’s hypothesize. Low-risk could be something like generating first drafts in writing or computer code. In such cases, humans are going to edit, or in the case of software, compile. It could also potentially work for information retrieval, where facts can be validated to ensure they are correct.
Q: A big challenge with AI is hiring and retaining the talent needed to test and evaluate systems and label data. AI data scientists earn a lot more than what the Pentagon has traditionally paid. How big a problem is this?
A: That's a huge can of worms. We have just created a digital talent management office and are thinking hard about how to fill a whole new set of job roles. For example, do we really need to be hiring people who are looking to stay at the Department of Defense for 20-30 years? Probably not. But what if we can get them for three or four? What if we paid for their college and they pay us back with three or four years and then go off with that experience and get hired by Silicon Valley? We're thinking creatively like this. Could we, for example, be part of a diversity pipeline? Recruit at HBCUs (historically Black colleges and universities)?