Large Language Models Won’t Save Army’s Data Overload

Tue, 07/16/2024

Data is critical for a connected Army, but managing datasets will require technologies that are useful at the tactical edge.


Written by:

Jordan McDonald

Young Bang, former principal deputy assistant secretary of the Army (Acquisition, Logistics and Technology), speaks during the Under Secretary of the Army’s Digital Transformation Panel in Washington, D.C., Sept. 10, 2023. Photo Credit: U.S. Army photo by Henry Villarama

Data is critical for a globally connected U.S. Army, but as the service ingests data from disparate sources and battlefields all over the world, Army leaders want to ensure it is not inundated with data that it cannot use.

“There’s too much damn data out there, and we can’t overload our warfighters and our leaders with too much data,” Army Principal Deputy Assistant Secretary for Acquisition, Logistics and Technology Young Bang said at an AUSA Cyber & Information Advantage Hot Topic event earlier this month.

Michael Diorio, senior vice president of global operations at Dataminr, added that data can be collected from sources as diverse as sensors, tweets, videos, photos, audio broadcasts or transponders. As billions of data points are aggregated in one place, the combined information can give a clear picture of a location or event.

“It’s been talked about for years that data is the oil, but at the same time, there wasn’t necessarily the technology to really analyze all this data, look at causal inference and then correlations. And so what we do with our large language models and foundation models is really look at the strength of signal that is happening around an event,” Diorio said.

Edward Kao, research scientist at MIT Lincoln Laboratory, said that to generate an information advantage for the Army, the public and private sectors will need to harness the potential of generative AI.

“Our adversaries will be using and are already using generative AI. I think there’s ethical issues about using generative AI to put out content, but in terms of using generative AI at least to automate, the understanding of information landscape is critical. I don’t think we really have a choice there,” Kao said.

He added that generative AI is unlikely to be usable as a fully automated tool in the short term, and that the interaction between humans and AI will be “critical.”

“I think what a human analyst offers is far more than just giving approval to the content. I think a human analyst actually provides the coloring and the context and the mission interpretation that I just don’t think we can expect a machine to be able to do that,” Kao said. “I think the machine is really good at aggregating information, so when the human applies that interpretation, you’re doing it at a mass scale. It’s to scale up that interpretation, but not actually making that interpretation.”

Stephen Riley, of the Army engineering team at Google, said that despite data overload, large language models (LLMs) will rarely be effective for the Army.

“Ninety percent of the time, don’t do it. It’s the easy button, I know, but using LLMs, that is boiling the ocean to make yourself a cup of coffee. You don’t have the compute resources to run effective LLMs down at the tactical edge,” Riley said.

Riley encouraged leaders to look toward the “old ways of doing things” like knowledge graphs as a more effective way to manage and aggregate data, rather than jumping for complex technologies that require more compute power.

“You could actually encode all of those [automatic data processes], all the operations stuff, all the intel stuff. We could encode that into a knowledge graph, which requires, I’ll just be hyperbolic, infinitely less compute power. That’s something you could deploy forward on a pretty small box.”
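To make the knowledge graph idea concrete, here is a minimal sketch in Python of the kind of structure Riley describes: facts stored as subject-relation-object triples and queried by following links. It is an illustration only, not a description of any Army or Google system. The class name, relation names and entities such as sensor_7 and platoon_a are all hypothetical.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory store of (subject, relation, object) triples."""

    def __init__(self):
        # subject -> relation -> set of objects
        self._edges = defaultdict(lambda: defaultdict(set))

    def add(self, subject, relation, obj):
        """Record one fact, e.g. ("sensor_7", "reports_to", "platoon_a")."""
        self._edges[subject][relation].add(obj)

    def objects(self, subject, relation):
        """Return objects directly linked to subject by relation, sorted."""
        return sorted(self._edges.get(subject, {}).get(relation, set()))

    def reachable(self, start, relation):
        """Follow one relation transitively from start, breadth-first."""
        seen, queue = [], [start]
        while queue:
            node = queue.pop(0)
            for nxt in self.objects(node, relation):
                if nxt not in seen:
                    seen.append(nxt)
                    queue.append(nxt)
        return seen


# Hypothetical facts, invented purely for illustration.
kg = KnowledgeGraph()
kg.add("sensor_7", "reports_to", "platoon_a")
kg.add("platoon_a", "reports_to", "company_b")
kg.add("company_b", "reports_to", "battalion_c")
kg.add("sensor_7", "observes", "grid_1234")

# Who ultimately receives data from sensor_7?
chain = kg.reachable("sensor_7", "reports_to")
print(chain)
```

Answering a question here is a lookup and a short traversal over stored facts, with no model inference involved, which is consistent with Riley's point that such a structure could run on a small forward-deployed box.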

Riley added that LLM hallucination is also a real issue, and warned that human oversight of an LLM does not automatically make the dataset behind it valid.

“Google is biased. Everybody’s biased. You’ve got to look out for that too,” Riley said. “Who is the one that’s the gatekeeper of the shifting of the Overton Window? Whose values are you implicitly encoding in the data set that you’re now using?”

Riley warned that as the government acquires AI technology and datasets gathered from the commercial world, it is incumbent on the government to “demand to see where the data came from.”

“We have already seen cases where companies building large LLMs have sourced data from other companies that say they have a bunch of data. And it turns out they source from other companies that are given some pretty bad stuff. Maybe not deliberate misinformation, but stuff that absolutely would not comply with our nation or Army values,” Riley said.
