Expert Talks

Generative AI

Dr. David Schuster, PDS™

Lead Analytic Methodologist at the US Department of Defense

Welcome to the second panel conversation in the Expert Talks series presented by the Data Science Council of America (DASCA) on the topic ‘Generative AI’. We’re joined by Dr. David Schuster, PDS™, Lead Analytic Methodologist at the US Department of Defense.

Born in Albuquerque, New Mexico with an early passion for two things: science and soccer, Dr. Schuster received his BS in Physics Magna Cum Laude from the University of Arizona in 2004, his MS in Physics from the University of Miami in 2006, and his PhD in Applied Physics from the Colorado School of Mines in 2011. He was certified as Principal Data Scientist (PDS™) by DASCA in 2023. His thesis work involved computational simulation and Bayesian statistical analysis of ultra-high energy cosmic rays to identify composition. Originally having planned to stay in academia, he changed course and entered the defense industry working first as a contractor and then in the civil service for the Department of Defense. He was appointed Chief Data Officer of NORAD and USNORTHCOM in 2021 working on policy, strategy, governance, and security for the entire Command data catalog. His current role is Lead Analytic Methodologist tasked with developing new analytic techniques focused on positioning the Department for the ML/AI challenges of the 21st century.

The moderator, Kshama Malavalli, holds a bachelor's degree in physics from Cornell University and has a keen interest in exploring the various facets of data science, especially as applied to physics.

Q: Could you share the role of lead analytic methodologist, citing an example or typical project challenges?

Dr. David Schuster: The analytic methodologist role, a relatively new position in the US government, involves developing analysis methodologies for continuous innovation. Recognizing the need to stay ahead of adversaries, the government fosters a culture of innovation and improvement. The methodologist stays informed about state-of-the-art technologies, adopting cutting-edge approaches to accomplish agency missions. Essentially, the role ensures the adoption of advanced techniques, keeping innovation at the forefront.

Q: Can you elaborate on your role as Chief Data Officer?

DS: As Chief Data Officer, my responsibilities included developing policies, white papers, and governance documents for handling the substantial influx of data into the command. This involved making decisions on technological solutions and software to manage the data. NORAD and USNORTHCOM, being Unified Combatant Commands, are responsible for defense and security in North America, covering areas like military threats, defense support to civil authorities, and assistance in natural disasters or special events.

Q: How has generative AI influenced your work, from its initial introduction to its current applications?

DS: I first encountered generative AI around six or seven years ago, during the deepfake concerns, and initially considered it a curiosity. Now working for the Department of Defense, my focus is on defensive strategies against potential misuse by adversaries, particularly in coordinating responses to natural disasters.

Public perception is divided, with 75% anxious about job replacement and 25% excited about AI's productivity enhancements. I see generative AI as revolutionary, unlocking growth opportunities. In my work, it aids in code generation, offering skeleton code for projects like anomaly detection. I emphasize a symbiotic relationship between human intuition and AI's computational strengths.

While the future of true artificial general intelligence is uncertain, ongoing AI advancements raise intriguing questions about emergent qualities and the evolving human-AI dynamic.

Q: What would you say sets Generative AI apart?

DS: In my view, a classifier or an anomaly detector operates within strict, predefined parameters, providing specific outputs. On the other hand, generative AI works within broader, more flexible boundaries, simulating creativity by generating varied outputs based on loosely defined parameters. For instance, in image creation, a request for a general concept like a cat with a top hat may yield good results, but more complex or specific prompts might challenge the AI due to its training limitations. Unlike humans or artists who can adapt to unusual requests, AI's responses depend heavily on its training data and prompt instructions.

Q: Are there unique approaches or surprises in handling generative AI tools compared to other common AI paradigms, such as regressors, classifiers, or neural networks, in your past or current work?

DS: Speaking generally, generative AI is less commonly used due to limited expertise outside cutting-edge research labs. Classical ML and AI applications like classifiers and regressors are more prevalent and established. My principal concern with generative AI revolves around data integrity, crucial for making accurate data-driven decisions. There's also interest in using generative AI to create convincing training data for other algorithms. Modeling and simulation has been a part of the technological enterprise for decades. In some ways, generative AI is the next logical step from that, loosening the parameters and using more sophisticated computational models to provide more degrees of freedom. And what wows people is the feeling of: it sounds like me. There's some anthropomorphizing happening there, which raises ethical considerations, particularly in areas like AI companionship and potential societal impacts. Generative AI introduces additional ethical questions compared to more classical AI applications.

Q: What excites you most about current applications of generative AI, and how do you envision leveraging it to enhance data analysis and collection methods and to address the challenges posed by the massive influx of data?

DS: I find the growing volume of data to be a significant challenge, and making sense of it is crucial for effective decision-making. Generative AI, in my view, holds promise in assisting with tasks such as data analysis and training data generation. It has the potential to ease our workload and contribute to the development of new algorithms and technologies. While maintaining a focus on data integrity, I see AI as a valuable tool to handle the immense amount of data and extract relevant information.

Q: Do you view classical ML and AI models as precursors to generative AI, given the concerns around deepfakes about five to six years ago? Are you aware of any research or efforts involving classical models in this context?

DS: I'm not an AI researcher; I'm a physicist who plays an AI person on TV. When considering the evolution of generative AI, I think about it in terms of decision trees, which are fundamental and easy to grasp. Classical image classification involves training a model to recognize cats or dogs, for instance. The idea is to reverse this process, asking the model to generate content like a cat image.

This evolution from simpler classifiers to more sophisticated models like deep learning and large language models fascinates me. It's crucial to understand that these tools have limitations, and they should be viewed as aids to improve productivity rather than threats to jobs. Concerns about data integrity, privacy, and ethical use are valid and will require legislation to provide guidelines and guardrails. As technology advances, these considerations become even more critical, and finding the right balance will be an ongoing challenge.
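As a rough illustration of the "reverse the process" idea in this answer, the sketch below fits one simple statistical model and uses it in both directions: feature to label (classification) and label to new feature (generation). It is a toy under stated assumptions, not how image generators are built: the single "body weight" feature, the data values, and the function names `fit`, `classify`, and `generate` are all hypothetical.

```python
import math
import random
import statistics

# Hypothetical toy data: one made-up feature (body weight in kg) per animal.
TRAINING_DATA = {
    "cat": [3.8, 4.2, 4.5, 5.0, 4.1],
    "dog": [18.0, 22.5, 25.0, 30.0, 27.5],
}


def fit(data):
    """Estimate a mean and sample standard deviation for each label."""
    return {
        label: (statistics.mean(values), statistics.stdev(values))
        for label, values in data.items()
    }


def classify(model, x):
    """Classifier direction: feature -> label with the highest Gaussian log-density."""

    def log_density(mu, sigma):
        # Gaussian log-density up to a constant shared by all labels.
        return -math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2

    return max(model, key=lambda label: log_density(*model[label]))


def generate(model, label, rng):
    """Generative direction: label -> new feature value drawn from the fitted Gaussian."""
    mu, sigma = model[label]
    return rng.gauss(mu, sigma)


model = fit(TRAINING_DATA)
rng = random.Random(0)  # seeded so repeated runs draw the same sample

print(classify(model, 4.0))          # asks "which animal is this?"
print(generate(model, "cat", rng))   # asks "give me a plausible cat"
```

The same fitted parameters serve both calls; the generated value can only resemble what the training data looked like, which mirrors the point about outputs depending heavily on training data.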

Q: How do you stay updated on developments in data science, AI, statistics, and other relevant fields given the continuous advancements?

DS: To stay updated, I approach new topics with the mindset that I'm not the smartest in the room and leverage my research skills, delving into the scientific literature. For theoretical understanding, I explore academic papers on platforms like arXiv and Google Scholar. On the practical side, it involves internet searches, forums like Stack Overflow, and a trial-and-error approach to find solutions for specific use cases. Collaboration within my team, attending conferences, and engaging with professionals contribute to continuous learning. Additionally, I encourage newcomers to jump in, start with practical projects, and seek mentors, as the field's democratization allows anyone with a computer to explore and learn.

Q: Is there any topic or story you haven't discussed yet that you'd like to address before we conclude our conversation?

DS: My wife, a graphic designer, frequently uses generative AI in her work. While some designers fear it could replace their jobs, she embraces the tools. For tasks like retouching photos in brochures, she can instruct AI to fix specific issues quickly, saving time. This highlights that generative AI isn't limited to data science; it impacts various industries. Staying optimistic and seeing the positive aspects of these tools is crucial. As for coding, you can ask a tool like ChatGPT for specific code snippets: a Python developer might request an autoencoder using TensorFlow and Keras for anomaly detection in two-dimensional image data. It provides a starting point that requires further refinement based on specific use cases. For learning, I recommend starting with scikit-learn in Python and not overcomplicating things. Using AI-generated code as a starting point can help learners get to the 50-yard line before delving into the intricacies.
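To give a sense of what such a starting point looks like, here is a minimal, dependency-free sketch of reconstruction-error anomaly detection on small two-dimensional images. It is deliberately not the TensorFlow/Keras autoencoder mentioned in the answer: as a stand-in, it "reconstructs" every image as the per-pixel mean of the normal training images and flags images whose error is unusually large. The 4x4 gradient images, the noise level, the threshold rule (mean plus three standard deviations), and all function names are assumptions made for illustration.

```python
import random
import statistics


def mean_image(images):
    """Per-pixel mean of equally sized 2D images (lists of rows)."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / n for c in range(cols)]
        for r in range(rows)
    ]


def reconstruction_error(image, reconstruction):
    """Mean squared error between an image and its reconstruction."""
    diffs = [
        (p - q) ** 2
        for row, rec_row in zip(image, reconstruction)
        for p, q in zip(row, rec_row)
    ]
    return sum(diffs) / len(diffs)


def fit_threshold(images, reconstruction, k=3.0):
    """Threshold = mean + k * std of the errors seen on normal images."""
    errors = [reconstruction_error(img, reconstruction) for img in images]
    return statistics.mean(errors) + k * statistics.pstdev(errors)


def is_anomaly(image, reconstruction, threshold):
    """Flag an image whose reconstruction error exceeds the threshold."""
    return reconstruction_error(image, reconstruction) > threshold


def make_normal_image(rng, size=4):
    """Hypothetical 'normal' image: a diagonal gradient plus small noise."""
    scale = 2 * (size - 1)
    return [
        [(r + c) / scale + rng.uniform(-0.05, 0.05) for c in range(size)]
        for r in range(size)
    ]


rng = random.Random(0)
normal_images = [make_normal_image(rng) for _ in range(50)]
template = mean_image(normal_images)          # stand-in for a trained model
threshold = fit_threshold(normal_images, template)

# Hypothetical anomalous image: the gradient runs the opposite way.
odd_image = [[1.0 - (r + c) / 6 for c in range(4)] for r in range(4)]
print(is_anomaly(odd_image, template, threshold))
```

In a real project the `template` lookup would be replaced by a trained model's reconstruction of each input, while the error-and-threshold logic stays much the same, which is the kind of refinement the answer describes.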

Q: What is the main message or key takeaway you'd like viewers to have after watching this episode, whether it's a single message or multiple points?

DS: The key message I'd like viewers to take away from this episode is to stay optimistic about the future. It's crucial to maintain perspective in a world filled with information overload. Unplugging from technology, connecting with loved ones, and embracing the positive aspects of AI can contribute to a happy and fulfilling life. Whether you're just starting or are well into your journey as a data professional, staying optimistic and continually learning in the field can lead to a rewarding career. So, stay positive, stay connected, and keep learning!

Thank you, Dr. Schuster, for generously sharing your knowledge, expertise, and time with us today. Your detailed responses have been immensely valuable.
