Built-in Prejudices
All AI models carry biases from the data they are trained on, so it is important to recognise these biases and decide which ones need to be addressed. Research has shown that popular Gen-AI models can reproduce racial and gender biases in clinical scenarios (Zack et al., 2024). However, other studies suggest that Gen-AI performs at a level comparable to clinicians, with little variation in performance across patient race or ethnicity. Together, these findings show that bias is a complex issue that needs to be considered carefully.
AI systems can reflect and amplify human biases from their training data unless they are actively designed to avoid this. Most models represent the dominant culture and language of their data, as well as the viewpoints of their creators.
Where the Data Comes From
AI models need large amounts of data for training. Because these datasets are so big, it's hard to verify that all data is properly licensed and ethically sourced. Major AI companies like OpenAI (ChatGPT) and Google do not fully disclose their training data sources, making it difficult to know what is included.
If an AI tool accidentally reproduces unlicensed content, you could face legal problems. Additionally, parts of AI development, such as data labelling and content filtering, may involve unfair labour practices in different countries, raising ethical concerns about how these technologies are made.
Zack, T., Lehman, E., Suzgun, M., Rodriguez, J. A., Celi, L. A., Gichoya, J., Jurafsky, D., Szolovits, P., Bates, D. W., Abdulnour, R.-E. E., Butte, A. J., & Alsentzer, E. (2024). Assessing the potential of GPT-4 to perpetuate racial and gender biases in health care: A model evaluation study. The Lancet Digital Health, 6(1), e12–e22. https://doi.org/10.1016/S2589-7500(23)00225-X
One of the critical aspects of using Generative AI (Gen-AI) in research is the responsible management of sensitive data. Australian researchers work within a strict regulatory framework designed to protect personal information and intellectual property.
Defining and Classifying Sensitive Data
Before engaging with Gen-AI tools, researchers must carefully identify and classify their data. Sensitive data can include personal information, confidential research data, and unpublished intellectual property, among other categories.
Never Upload Sensitive Data to Gen-AI Tools!
Strategies for Secure Data Handling with Gen-AI
Implementing robust data governance is essential to mitigating the risks associated with Gen-AI: classify data before use, keep sensitive material out of external tools, and screen anything before it is entered into a Gen-AI service.
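As an illustration only, the sketch below shows one possible pre-upload screening step: a small Python check that flags common identifiers (email addresses and Australian-style phone numbers) in text before it is pasted into any external Gen-AI service. The screen_for_identifiers function and its patterns are hypothetical and far from exhaustive; they are not a substitute for proper de-identification tools or your institution's approval processes.

import re

# Illustrative patterns only; real de-identification requires far more than regex.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone_au": re.compile(r"(?:\+61|0)[2-478](?:[ -]?\d){8}"),
}

def screen_for_identifiers(text: str) -> dict:
    """Return any likely identifiers found in the text, grouped by type."""
    findings = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            findings[label] = matches
    return findings

if __name__ == "__main__":
    draft = "Participant 12 (jane.doe@example.edu.au, 0412 345 678) reported..."
    flagged = screen_for_identifiers(draft)
    if flagged:
        # Stop: do not upload until the text has been properly de-identified.
        print("Do NOT upload - possible identifiers found:", flagged)
    else:
        print("No obvious identifiers found; still follow institutional policy.")

If a check like this flags anything, the safest response is to stop and consult your institution's data governance guidance rather than attempt ad-hoc redaction.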
Further Reading: